ALTERNATIVE PRE-PROCESSING TECHNIQUES FOR DISCRETE HIDDEN MARKOV MODEL PHONEME RECOGNITION

Andrew Tridgell; Bruce Millar; Kim Anh Do

Research output: Contribution to conference › Paper › peer-review

Abstract

In this paper a number of alternative pre-processing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. It is demonstrated that there is considerable advantage in the addition of processing steps after the initial signal processing. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. Results have been obtained that demonstrate a 20% reduction in the mis-classification rate using a linear discriminant analysis transformation from a 43-variable feature set to a 10-variable linearly transformed feature set. Finally the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.
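
The F-ratio ranking mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the common definition of the F-ratio for a single feature: the variance of the per-class means divided by the average within-class variance. This record does not give the paper's exact formulation, so the definition, the function names (`f_ratio`, `rank_features`), the phoneme labels and all numbers below are assumptions made for illustration only, not results from the paper.

```python
from statistics import mean, pvariance


def f_ratio(samples_by_class):
    """F-ratio of one feature (assumed definition): variance of the
    class means divided by the mean of the within-class variances."""
    class_means = [mean(values) for values in samples_by_class.values()]
    between = pvariance(class_means)
    within = mean([pvariance(values) for values in samples_by_class.values()])
    return between / within


def rank_features(data):
    """Rank features by F-ratio, highest (most discriminative) first.

    `data` maps feature name -> {class label -> list of feature values}.
    """
    scores = {name: f_ratio(by_class) for name, by_class in data.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up values for two hypothetical phoneme classes.
toy = {
    "log_power": {"aa": [5.0, 5.2, 4.8], "s": [1.0, 1.2, 0.8]},
    "zero_crossing_rate": {"aa": [0.2, 0.4, 0.3], "s": [0.3, 0.5, 0.4]},
}

ranking = rank_features(toy)
for name, score in ranking:
    print(f"{name}: F-ratio = {score:.3f}")
```

In this toy data the class means of `log_power` are far apart relative to the spread within each class, so it ranks above `zero_crossing_rate`. A feature with a high F-ratio separates the classes well on its own; the measure says nothing about redundancy between features.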

Original language	English (US)
Pages	631-634
Number of pages	4
State	Published - 1992
Externally published	Yes
Event	2nd International Conference on Spoken Language Processing, ICSLP 1992 - Banff, Canada
Duration	Oct 13 1992 → Oct 16 1992

Conference

Conference	2nd International Conference on Spoken Language Processing, ICSLP 1992
Country/Territory	Canada
City	Banff
Period	10/13/92 → 10/16/92

ASJC Scopus subject areas

Language and Linguistics
Linguistics and Language

Cite this

@conference{530a6961c7bf489a999a930881b43b30,
title = "ALTERNATIVE PRE-PROCESSING TECHNIQUES FOR DISCRETE HIDDEN MARKOV MODEL PHONEME RECOGNITION",
abstract = "In this paper a number of alternative pre-processing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. It is demonstrated that there is considerable advantage in the addition of processing steps after the initial signal processing. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. Results have been obtained that demonstrate a 20% reduction in the mis-classification rate using a linear discriminant analysis transformation from a 43-variable feature set to a 10-variable linearly transformed feature set. Finally the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.",
author = "Andrew Tridgell and Bruce Millar and Do, {Kim Anh}",
note = "Funding Information: The authors gratefully acknowledge the support of the Australian Telecommunications and Electronics Research Board, ANUTech Pty Ltd and the Commonwealth of Australia who provided postgraduate scholarships to the first author. Publisher Copyright: {\textcopyright} 1992 2nd International Conference on Spoken Language Processing, ICSLP 1992. All rights reserved.; 2nd International Conference on Spoken Language Processing, ICSLP 1992 ; Conference date: 13-10-1992 Through 16-10-1992",
year = "1992",
language = "English (US)",
pages = "631--634",
}

TY - CONF
T1 - ALTERNATIVE PRE-PROCESSING TECHNIQUES FOR DISCRETE HIDDEN MARKOV MODEL PHONEME RECOGNITION
AU - Tridgell, Andrew
AU - Millar, Bruce
AU - Do, Kim Anh
N1 - Funding Information: The authors gratefully acknowledge the support of the Australian Telecommunications and Electronics Research Board, ANUTech Pty Ltd and the Commonwealth of Australia who provided postgraduate scholarships to the first author. Publisher Copyright: © 1992 2nd International Conference on Spoken Language Processing, ICSLP 1992. All rights reserved.
PY - 1992
Y1 - 1992
AB - In this paper a number of alternative pre-processing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. It is demonstrated that there is considerable advantage in the addition of processing steps after the initial signal processing. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. Results have been obtained that demonstrate a 20% reduction in the mis-classification rate using a linear discriminant analysis transformation from a 43-variable feature set to a 10-variable linearly transformed feature set. Finally the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.
UR - http://www.scopus.com/inward/record.url?scp=85135111087&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135111087&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85135111087
SP - 631
EP - 634
T2 - 2nd International Conference on Spoken Language Processing, ICSLP 1992
Y2 - 13 October 1992 through 16 October 1992
ER -
