ELEC97080 (EE9-AML3-01) Speech Processing
Lecturer(s): Prof Patrick Naylor
To introduce students to the signal processing and statistical techniques that are used in processing speech signals and to give students an understanding of how these techniques are used in the coding, synthesis and recognition of speech.
To enable students to apply signal processing techniques appropriately for the coding, synthesis and recognition of speech signals. To enable students to use dynamic programming, statistical modelling and inference techniques in pattern recognition applications.
The human vocal and auditory systems. Characteristics of speech signals: phonemes, prosody, IPA notation. Lossless tube model of speech production. Time- and frequency-domain representations of speech; window characteristics and time/frequency resolution trade-offs. Properties of digital filters: mean log response, resonance gain and bandwidth relations, bandwidth expansion transformation, all-pass filter characteristics. Autocorrelation and covariance linear prediction of speech; optimality criteria in time and frequency domains; alternative LPC parametrisations. Speech coding: PCM, ADPCM, CELP. Speech synthesis: language processing, prosody, diphone and formant synthesis; time-domain pitch and speech modification. Speech recognition: hidden Markov models and associated recognition and training algorithms. Dynamic programming. Language modelling. Large vocabulary recognition. Acoustic preprocessing for speech recognition.
Exam Duration: 3:00 hrs
Coursework contribution: 0%
Closed or Open Book (end of year exam): Closed
Oral Exam Required (as final assessment): no
Prerequisite module(s): None required
Course Homepage: This course uses Blackboard
Please see the Module Reading List.