Department of Physics, Engineering Physics & Astronomy

Physical Modeling of Intonation in Speech

Prof. Gregory Kochanski
with Chilin Shih, Bell Labs, Lucent Technologies, NJ, USA

Date: Monday, September 30, 2002
Time: 10:30 AM
Location: Stirling 412A


Humans usually talk nearly as fast as possible, and the rate of speech is limited by how fast we can drive our muscles (e.g., the tongue, jaw, and vocal folds). Consequently, the muscle dynamics and the strategies the brain uses to control the speech muscles are critical to understanding this important communication system. I describe physics-based models of pitch dynamics in Mandarin (Chinese) and Cantonese speech. These are tone languages, in which a change of pitch can switch a syllable from one word to a completely different one. The models treat speech as an optimized communication system that attempts to simultaneously minimize the communication error rate and the effort required to produce speech.

Previous approaches have used ad hoc models, often drawn from machine learning systems, and have been largely unsuccessful at connecting acoustic parameters to features of the language. In contrast, the physics-based models yield a set of parameters that correspond to the linguistic concept of the "strength" or importance of a word. We have shown that speakers consistently use strength to mark boundaries in speech and to mark words with high information content. The strength values are among the first objective, quantitative measurements that can be compared to linguistic theories.

Professor Greg P. Kochanski is a candidate for the ATOP-funded Assistant Professor position in the Department.

Refreshments will be served in the lounge after the talk.