Re: Text-to-Speech Software

Date: Tue Sep 29 11:09:38 1998
Posted By: Lori Holt, Graduate (Ph.D.) Student, Psychology, Ph.D., University of Wisconsin
Area of science: Computer Science
ID: 907027754.Cs

Message:


Dear Lance,

It is certainly possible to create synthesized speech that sounds very 
natural. We know quite a bit about the acoustic structure of individual 
speech sounds (phonemes) and can model this structure with a computer 
synthesizer.

However, going from natural-sounding synthesis of individual phonemes to 
natural-sounding text is not as easy as stringing phonemes together like 
beads on a necklace.

When we produce speech, our mouths are constrained in their movements. They 
can only move so far so fast! As a result, the way a particular phoneme is 
produced is highly dependent upon the phonemes that are spoken before and 
after it. (Try saying "dean" and "draw", feeling where your tongue hits the 
roof of your mouth as you speak. There is a subtle but important difference 
in the position where your tongue makes contact for "d".) This 
context-dependency is part of what gives speech a "natural" sound. 

Most simple text-to-speech devices just string phonemes together, 
ignoring all of these context-dependencies. The result is the familiar 
robotic, unnatural sound.
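
To make the difference concrete, here is a toy sketch in Python. It is only 
an illustration under my own assumptions, not how any particular device 
works: the phoneme symbols and the "unit" names are made up, and a real 
system would be selecting or generating audio rather than text labels. The 
first function strings phonemes together with no regard for context; the 
second labels each phoneme by its neighbors as well.

```python
# Toy illustration only: the phoneme symbols and unit names are invented
# for this sketch and do not come from any real synthesizer.

def naive_units(phonemes):
    """Pick one fixed unit per phoneme, ignoring its neighbors."""
    return [f"unit[{p}]" for p in phonemes]

def context_units(phonemes):
    """Pick one unit per phoneme in context: previous-current+next."""
    padded = ["#"] + list(phonemes) + ["#"]  # "#" marks a word boundary
    return [
        f"unit[{padded[i - 1]}-{padded[i]}+{padded[i + 1]}]"
        for i in range(1, len(padded) - 1)
    ]

dean = ["d", "iy", "n"]   # rough phoneme spelling of "dean" (assumed)
draw = ["d", "r", "ao"]   # rough phoneme spelling of "draw" (assumed)

# The naive approach uses the identical "d" in both words.
print(naive_units(dean)[0] == naive_units(draw)[0])      # True
# The context-aware approach uses a different "d" for each word.
print(context_units(dean)[0], context_units(draw)[0])    # unit[#-d+iy] unit[#-d+r]
```

The cost of the second approach is that the number of distinct units grows 
very quickly, since every phoneme needs a version for each pair of 
neighbors it can appear between.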

Many scientists are working to design better text-to-speech devices, but 
the problem has proven quite difficult. In English, especially, individual 
letters of the alphabet do not even represent the same sound in every word! 
(Compare the "i" in "bit" and "bite", for example.) The mapping from letter 
to sound is extremely complex and thus difficult to model in a 
text-to-speech device.
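
A small sketch shows why a one-letter-one-sound table is not enough. 
Everything here is an assumption made for illustration: the sound labels 
are informal, the table covers only four letters, and the single "silent 
e" rule is far smaller than any realistic rule set.

```python
# Toy letter-to-sound tables. The labels ("ih", "ay", ...) are informal and
# the tables are hypothetical; they cover only the letters used below.
LETTER_SOUND = {"b": "b", "i": "ih", "t": "t", "e": "eh"}
LONG_VOWEL = {"i": "ay"}

def letter_by_letter(word):
    """Map each letter to one fixed sound, ignoring context."""
    return [LETTER_SOUND[ch] for ch in word]

def with_silent_e_rule(word):
    """Add one context rule: a word-final 'e' (in words of three or more
    letters) is silent, and a vowel letter two positions before it takes
    its long sound."""
    last = len(word) - 1
    has_final_e = last >= 2 and word[last] == "e"
    sounds = []
    for idx, ch in enumerate(word):
        if has_final_e and idx == last:
            continue                       # the final "e" is not pronounced
        if has_final_e and idx == last - 2 and ch in LONG_VOWEL:
            sounds.append(LONG_VOWEL[ch])  # "i" in "bite"
        else:
            sounds.append(LETTER_SOUND[ch])
    return sounds

print(letter_by_letter("bite"))     # ['b', 'ih', 't', 'eh']  (wrong)
print(with_silent_e_rule("bit"))    # ['b', 'ih', 't']
print(with_silent_e_rule("bite"))   # ['b', 'ay', 't']
```

Even this one rule has exceptions ("give" keeps its short "i"), which is 
why the rules pile up so quickly.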

So will these difficulties ever be solved? My own view is that we've 
been approaching the problem from the wrong perspective. Instead of 
searching for engineering solutions, we might be better off continuing to 
work toward understanding how speech is produced and perceived by human 
speakers and listeners, in hopes of gaining clues about how to model these 
processes in computers.

If you'd like more information about this field of research, please feel 
free to contact me.

Lori L. Holt
llholt@facstaff.wisc.edu
