Variable Speed Speech Synthesis
Title: President/Chief Software Engineer
Phone: (908) 240-7530
Email: soohapark@hotmail.com
Title: Chief Scientist
Phone: (908) 240-8196
Email: minkyul@hotmail.com
The objective of this proposal is to demonstrate the feasibility of developing variable speed speech synthesis technology. We plan to use open source text-to-speech (TTS) systems because they often provide flexibility and interoperability, which are essential for research-oriented work. To modify speaking speed, we plan to focus on time-domain time-scale modification algorithms, which provide good quality at lower computational complexity than other approaches such as sinusoidal models or vocoder-based approaches. We will test time-domain methods including synchronized overlap-add (SOLA), pitch-synchronous overlap-add (PSOLA), and waveform similarity overlap-add (WSOLA). We will apply a linear scaling factor, which modifies the duration uniformly regardless of whether the speech segment is a silence, a transient, or a sustained vowel. We will also apply nonuniform scaling, with different scaling factors for different parts of the speech. During the optional six-month period, we will focus on creating multiple voices by modifying voice type, gender, dialect (accent), and perceived emotion of the speech. Based on source-filter models, we will investigate algorithms for modifying source and filter characteristics, from which many different voices can be generated.
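To illustrate the kind of time-domain method named in the abstract, the following is a minimal sketch of WSOLA with a linear scaling factor. It is not the proposers' implementation: the function name `wsola_stretch`, the parameter names, and the default frame length and search tolerance are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def wsola_stretch(x, alpha, frame_len=1024, tolerance=256):
    """Stretch mono signal x in time by alpha (alpha > 1 slows speech down).

    Hypothetical helper for illustration; the defaults are assumptions.
    frame_len is assumed to be even.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal must be at least one frame long")
    hop_out = frame_len // 2          # synthesis hop: 50% overlap
    hop_in = hop_out / alpha          # analysis hop: linear (uniform) scaling
    # Periodic Hann window; at 50% overlap the shifted windows sum to one.
    window = np.hanning(frame_len + 1)[:-1]
    # Zero-pad so every search region and continuation stays in bounds.
    xp = np.pad(x, (tolerance, tolerance + hop_out))
    n_frames = int((len(x) - frame_len) / hop_in) + 1
    y = np.zeros((n_frames - 1) * hop_out + frame_len)

    shift = tolerance                 # first frame: taken at its nominal position
    prev = 0
    for k in range(n_frames):
        nominal = round(k * hop_in)   # start of the search region in xp
        if k > 0:
            # Segment that would naturally follow the previously copied frame.
            target = xp[prev + hop_out: prev + hop_out + frame_len]
            region = xp[nominal: nominal + 2 * tolerance + frame_len]
            # Pick the offset whose waveform best matches that continuation.
            shift = int(np.argmax(np.correlate(region, target, mode="valid")))
        prev = nominal + shift
        start = k * hop_out
        y[start: start + frame_len] += window * xp[prev: prev + frame_len]
    return y
```

Calling `wsola_stretch(x, 1.5)` would lengthen the signal by roughly half while leaving its pitch essentially unchanged, and `wsola_stretch(x, 0.5)` would roughly halve its duration. A nonuniform variant could, under the same assumptions, vary `hop_in` from frame to frame, for example compressing silences more than transients.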
* Information listed above is at the time of submission. *