
System Documentation
Installation and Test
- Register as a new user, then download the distribution.
- Unzip the file. If your unzip tool has a "smart" option for unpacking
text files, disable it; it can corrupt some of the data files in the distribution.
- Run the file flinger_test.bat in the directory where you unpacked the
distribution. This runs the program and loads an example script called
fl_test.scm, which is written in Festival's scripting language, Scheme.
You should hear some talking and singing, synthesized from the
MIDI file in festival\examples\midi\ogi.mid. You can open this
file in your favorite MIDI composition program.
Interactive Command Line Mode
Flinger is a customized version of the Festival TTS system, so it
can do everything Festival itself can do. For instance,
it can speak a sentence and save it to a file (e.g., in Microsoft RIFF
format) as follows:
festival> (utt.save.wave (SayText "Hello world.") "test.wav" 'riff)
- All of the commands in fl_test.scm can also be typed at the
festival command line, for example:
festival> (Flinger.sing "festival/examples/midi/ogi.mid" nil)
(Yes, the parentheses are important!)
- To save the output to a WAV file, type
festival> (utt.save.wave (Flinger.sing "festival/examples/midi/ogi.mid" nil) "test.wav" 'riff)
A file test.wav should be created in the standard Microsoft 'riff' format.
- To exit the program, type
festival> (quit)
and press Enter. (Yes, the parentheses are important!)
Scripts
Festival and Flinger use Scheme, a Lisp-like scripting language.
The example script fl_test.scm will get you started. For more
information, see the section on Scheme in the Festival manual.
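As a starting point for your own scripts, here is a minimal sketch that uses only the commands shown in the previous section. The file name my_song.scm and the output name my_song.wav are made up for illustration, and the sketch assumes Flinger has already been set up the way fl_test.scm sets it up.

```scheme
;; my_song.scm (hypothetical file name, not part of the distribution)

;; Path to the example MIDI file that ships with Flinger.
(define midi_file "festival/examples/midi/ogi.mid")

;; Speak a short announcement before the song.
(SayText "Here is the song.")

;; Sing the MIDI file and keep the resulting utterance in a variable.
(define song (Flinger.sing midi_file nil))

;; Save the sung utterance as a RIFF (wav) file.
(utt.save.wave song "my_song.wav" 'riff)
```

One way to run it would be to type (load "my_song.scm") at the festival prompt, started from the directory where you unpacked the distribution so that the relative MIDI path resolves.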
Global Singer Parameters
Global Singer Parameters are, in a sense, the properties of the
"singer" who will perform your MIDI file. Once defined, they stay
fixed for the whole rendering of a MIDI file.
The test script fl_test.scm contains a code
block that illustrates how to change these parameters, which are
explained below:
(Flinger.init
 (list
  '(vibrato_freq 5.0)
  '(vibrato_max 0.030)
  '(maxbend_semi 1.0)
  '(portamento_time_default 0.050)
  '(drift_freq1 4.7)
  '(drift_freq2 7.1)
  '(drift_freq3 12.7)
  '(drift_ampl 0.003)
  '(transpose -5)
  '(consonant_stretch 1.0)
  '(phone_delimiter "|")
  '(dump_info 0)
  ))
- vibrato_freq: frequency of vibrato
- vibrato_max: maximum depth of vibrato (at max range of MIDI Modulation controller) as a percentage of the frequency of a note
- maxbend_semi: frequency modification in semitones at max excursion of MIDI pitch bend controller
- portamento_time_default: time of glide between notes (seconds)
- drift_freq1,2,3: frequencies of quasi-random drift of frequency -- models the inability of humans to sing a steady note
- drift_ampl: amplitude of drift, as percentage of note frequency
- transpose: shift of note frequencies (in semitones) from those given in MIDI file
- consonant_stretch: stretch of consonants in slowly sung syllables (experimental)
- phone_delimiter: in the phoneme input mode (yet to be documented), the delimiter used to separate phoneme symbols
- dump_info: set to "1" to dump out lots of debugging information that may or may not be helpful
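To see how these parameters combine, the sketch below defines a different "singer" with faster, shallower vibrato and no transposition, then renders the example MIDI file. The parameter names are the ones listed above; the three changed values are illustrative guesses rather than recommended settings, and the full list is passed because this document does not say whether Flinger.init accepts a partial one.

```scheme
;; A hypothetical alternative singer. Lines marked "changed" differ
;; from the values shown in fl_test.scm; everything else is unchanged.
(Flinger.init
 (list
  '(vibrato_freq 6.0)              ; changed from 5.0: faster vibrato
  '(vibrato_max 0.015)             ; changed from 0.030: shallower vibrato
  '(maxbend_semi 1.0)
  '(portamento_time_default 0.050)
  '(drift_freq1 4.7)
  '(drift_freq2 7.1)
  '(drift_freq3 12.7)
  '(drift_ampl 0.003)
  '(transpose 0)                   ; changed from -5: sing at written pitch
  '(consonant_stretch 1.0)
  '(phone_delimiter "|")
  '(dump_info 0)))

;; Render the example MIDI file with the new settings.
(Flinger.sing "festival/examples/midi/ogi.mid" nil)
```

Because the parameters are fixed for a whole rendering, comparing two singers means calling Flinger.init again before each call to Flinger.sing.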
MIDI-Controllable Parameters
Using MIDI controllers to modify the characteristics of the voice
dynamically makes the difference between a very machine-like
rendering of the melody and something pretty. The following MIDI
controllers are used:
- Modulation - controls vibrato depth
- Pitch Bend - controls pitch bend depth
- Volume (planned for future) - controls volume
- Expression (planned for future) - controls vocal effort characteristics
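The arithmetic behind the two controllers in current use can be sketched in a few lines of Scheme. This is an illustration only: the linear scaling of controller values is an assumption, not Flinger's documented behavior, and the three function names are made up for this example. What is standard is that the MIDI Modulation controller carries 7-bit values (0 to 127), that pitch bend carries 14-bit values (0 to 16383, centered at 8192), and that a shift of s semitones multiplies frequency by 2^(s/12).

```scheme
;; Frequency ratio for a shift of `semitones` (may be negative).
;; Computes 2^(semitones/12) using only exp and log.
(define (semitones_to_ratio semitones)
  (exp (* (/ semitones 12.0) (log 2.0))))

;; ASSUMPTION: vibrato depth grows linearly with the Modulation
;; controller value (0..127), reaching vibrato_max at 127.
(define (vibrato_depth mod_value vibrato_max)
  (* vibrato_max (/ mod_value 127.0)))

;; ASSUMPTION: pitch bend (0..16383, center 8192) maps linearly
;; to a shift of up to maxbend_semi semitones in either direction.
(define (bend_semitones bend_value maxbend_semi)
  (* maxbend_semi (/ (- bend_value 8192) 8192.0)))
```

Under these assumptions, (vibrato_depth 64 0.030) gives about 0.015, and (semitones_to_ratio -5) gives about 0.749, which is the frequency ratio implied by the transpose value of -5 in fl_test.scm.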
Voices and Voice Characteristics
Installing new voices
You can get other voices from the OGIresLPC download site. In particular, the AEC, TLL, and JPH
voices should work well. Voices for Mexican Spanish, British English, and German are also available there,
but installing these is more complicated and requires
some knowledge of Festival's internal workings.
To install a voice:
- Download one of them, e.g., voice_aec_di-2.0.tar.gz.
- Unpack this in the SAME PLACE you unpacked the Flinger distribution.
- In a Scheme (.scm) script such as fl_test.scm, add the line
(voice_aec_diphone)
at the point where you want the voice to change. To change it
back to the default MWM voice, add the line
(voice_mwm_diphone)
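For example, the following sketch renders the same MIDI file with two voices and saves each result to its own file. It assumes the AEC voice has been installed as described above; the output file names are made up, and whether a voice switch preserves the current singer parameters is not stated here, so you may need to call Flinger.init again after switching.

```scheme
;; Example MIDI file from the Flinger distribution.
(define midi_file "festival/examples/midi/ogi.mid")

;; Render with the AEC voice (must be installed first).
(voice_aec_diphone)
(utt.save.wave (Flinger.sing midi_file nil) "ogi_aec.wav" 'riff)

;; Switch back to the default MWM voice and render again.
(voice_mwm_diphone)
(utt.save.wave (Flinger.sing midi_file nil) "ogi_mwm.wav" 'riff)
```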
Modifying vocal tract size
Using a relatively simple signal processing trick, you can make the
voice sound much deeper, which is especially helpful for getting the
quality of a deep baritone singer. This works with any voice. Add the following
lines to your .scm script:
(OGIresLPC.init++
 (list '(vqual_mod
         ((vt_global_warp_wave 1.0)
          (vt_voiced_warp_wave 0.90) ;;;<<<< change this
          (vt_global_warp_lsf 1.0)
          (vt_voiced_warp_lsf 1.0)))))
You can change the "vt_voiced_warp_wave" line marked above to
get various effects: a number between 0.0 and 1.0 will make
the voice deeper, while a number greater than 1.0 will simulate a smaller
vocal tract (e.g., a child).
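If you expect to try several values, a small helper can save retyping the whole block. The sketch below builds the same list structure as the block above but takes the warp factor as an argument. The name set_voiced_warp is made up for this example, and the sketch assumes OGIresLPC.init++ can be called more than once in a session.

```scheme
;; Hypothetical helper: set vt_voiced_warp_wave to `warp` and leave
;; the other three warp factors at 1.0, as in the block above.
(define (set_voiced_warp warp)
  (OGIresLPC.init++
   (list
    (list 'vqual_mod
          (list (list 'vt_global_warp_wave 1.0)
                (list 'vt_voiced_warp_wave warp)
                (list 'vt_global_warp_lsf 1.0)
                (list 'vt_voiced_warp_lsf 1.0))))))

;; Deeper voice, same value as the example above.
(set_voiced_warp 0.90)
```

Calling (set_voiced_warp 1.10) instead would move toward the smaller vocal tract described above.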
Advanced Capabilities
If you want to go further than what is covered here, the best
strategy is to learn about the capabilities of the Festival
TTS system from the Festival manual. Flinger uses only some of the standard TTS modules,
because singing presents a different set of language processing
and signal processing issues than TTS does.
You can edit the Scheme code yourself; see the files in festival\lib\,
and in particular flinger.scm and ogi_effect.scm in that
directory. In the future, we hope to provide a source code distribution so
that others can customize Flinger on a deeper level.
Licensing
If you are interested in commercial applications of this
technology, please contact Mike Macon at OGI.
Please send constructive comments or suggestions to
Mike Macon
Last modified: Wed May 24 17:16:21 PDT