Abstract
In this paper, a number of alternative pre-processing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. It is demonstrated that there is considerable advantage in adding processing steps after the initial signal processing. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. Results demonstrate a 20% reduction in the misclassification rate when a linear discriminant analysis transformation is used to map a 43-variable feature set to a 10-variable linearly transformed feature set. Finally, the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.
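The abstract does not define the F-ratio it uses. As a rough illustration only, the sketch below assumes one common per-feature definition: the variance of the class means divided by the average within-class variance. The function names (`f_ratio`, `rank_features`), the feature names, the phoneme labels and all numbers are hypothetical toy values, not data or code from the paper.

```python
from statistics import mean, pvariance


def f_ratio(samples_by_class):
    """Assumed definition: variance of class means / mean within-class variance.

    samples_by_class maps a class label (e.g. a phoneme) to a list of
    scalar values of ONE feature observed for that class.
    """
    class_means = [mean(values) for values in samples_by_class.values()]
    between = pvariance(class_means)  # spread of the class means
    within = mean(pvariance(values) for values in samples_by_class.values())
    return between / within  # undefined if every class has zero variance


def rank_features(features):
    """Return feature names sorted from most to least discriminative."""
    return sorted(features, key=lambda name: f_ratio(features[name]), reverse=True)


# Toy, made-up values for two features and two phoneme classes.
toy = {
    "log_power": {"aa": [1.0, 1.2, 0.8], "s": [3.0, 3.2, 2.8]},
    "zero_crossing": {"aa": [1.0, 3.0, 2.0], "s": [2.0, 4.0, 3.0]},
}

if __name__ == "__main__":
    for name in rank_features(toy):
        print(name, round(f_ratio(toy[name]), 3))
```

A feature whose class means are far apart relative to the scatter inside each class gets a high ratio, which is the sense in which such a score can rank discriminatory power. Under this assumed definition each feature is scored independently, so correlations between features are ignored.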
Original language | English (US) |
---|---|
Pages | 631-634 |
Number of pages | 4 |
State | Published - 1992 |
Externally published | Yes |
Event | 2nd International Conference on Spoken Language Processing, ICSLP 1992 - Banff, Canada. Duration: Oct 13 1992 → Oct 16 1992 |
Conference
Conference | 2nd International Conference on Spoken Language Processing, ICSLP 1992 |
---|---|
Country/Territory | Canada |
City | Banff |
Period | 10/13/92 → 10/16/92 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language