An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition

Mohamed Kamal Omar; Ken Chen; Mark Hasegawa-Johnson; Yigal Brandman

Research output: Contribution to conference › Paper › peer-review

Abstract

This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined together to represent the distinctions among the phonemes. The overall phoneme recognition accuracy is slightly increased for the same length of feature vector for clean speech and at 10 dB compared to FFT-based Mel-frequency cepstrum coefficients (MFCC) by using acoustic features selected based on a maximum mutual information criterion. Using 16 different feature sets, the rank of the feature sets based on mutual information can predict phoneme recognition accuracy with a correlation coefficient of 0.71 compared to a correlation coefficient of 0.28 when using a criterion based on the average pair-wise Kullback-Leibler divergence to rank the feature sets.
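The two quantities the abstract relies on, a mutual information score between features and phoneme labels and a rank correlation between that score and recognition accuracy, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how the mutual information was estimated or which correlation coefficient was used, so the sketch uses a plug-in estimate over discrete (already quantized) feature values and a Spearman-style correlation of ranks. The function names `mutual_information`, `ranks`, and `rank_correlation` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Plug-in estimate of I(X; Y) in bits for discrete features X and labels Y.

    Assumption: feature values are already quantized to discrete symbols.
    """
    n = len(labels)
    joint = Counter(zip(features, labels))
    px = Counter(features)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def ranks(values):
    """Return 1-based ranks (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_correlation(a, b):
    """Pearson correlation computed on the ranks of a and b (Spearman-style)."""
    ra, rb = ranks(a), ranks(b)
    mean_a = sum(ra) / len(ra)
    mean_b = sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)
```

In this sketch, a feature that perfectly separates two equally likely phoneme labels scores 1 bit and a feature independent of the labels scores 0 bits. Scoring each candidate feature set this way and passing the scores together with measured accuracies to `rank_correlation` gives a coefficient of the same kind as the 0.71 and 0.28 quoted above, though not necessarily computed the way the authors computed theirs.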

Original language	English (US)
Pages	2129-2132
Number of pages	4
State	Published - 2002
Externally published	Yes
Event	7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration	Sep 16 2002 → Sep 20 2002

Other

Other	7th International Conference on Spoken Language Processing, ICSLP 2002
Country/Territory	United States
City	Denver
Period	9/16/02 → 9/20/02

ASJC Scopus subject areas

Language and Linguistics
Linguistics and Language

Cite this

An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition. / Omar, Mohamed Kamal; Chen, Ken; Hasegawa-Johnson, Mark et al.
2002. 2129-2132 Paper presented at 7th International Conference on Spoken Language Processing, ICSLP 2002, Denver, United States.

@conference{5afbe92a594643258fce35d604f3717f,
title = "An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition",
abstract = "This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined together to represent the distinctions among the phonemes. The overall phoneme recognition accuracy is slightly increased for the same length of feature vector for clean speech and at 10 dB compared to FFT-based Mel-frequency cepstrum coefficients (MFCC) by using acoustic features selected based on a maximum mutual information criterion. Using 16 different feature sets, the rank of the feature sets based on mutual information can predict phoneme recognition accuracy with a correlation coefficient of 0.71 compared to a correlation coefficient of 0.28 when using a criterion based on the average pair-wise Kullback-Leibler divergence to rank the feature sets.",
author = "Omar, {Mohamed Kamal} and Ken Chen and Mark Hasegawa-Johnson and Yigal Brandman",
year = "2002",
language = "English (US)",
pages = "2129--2132",
}

TY  - CONF
T1  - An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition
AU  - Omar, Mohamed Kamal
AU  - Chen, Ken
AU  - Hasegawa-Johnson, Mark
AU  - Brandman, Yigal
PY  - 2002
Y1  - 2002
N2  - This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined together to represent the distinctions among the phonemes. The overall phoneme recognition accuracy is slightly increased for the same length of feature vector for clean speech and at 10 dB compared to FFT-based Mel-frequency cepstrum coefficients (MFCC) by using acoustic features selected based on a maximum mutual information criterion. Using 16 different feature sets, the rank of the feature sets based on mutual information can predict phoneme recognition accuracy with a correlation coefficient of 0.71 compared to a correlation coefficient of 0.28 when using a criterion based on the average pair-wise Kullback-Leibler divergence to rank the feature sets.
AB  - This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined together to represent the distinctions among the phonemes. The overall phoneme recognition accuracy is slightly increased for the same length of feature vector for clean speech and at 10 dB compared to FFT-based Mel-frequency cepstrum coefficients (MFCC) by using acoustic features selected based on a maximum mutual information criterion. Using 16 different feature sets, the rank of the feature sets based on mutual information can predict phoneme recognition accuracy with a correlation coefficient of 0.71 compared to a correlation coefficient of 0.28 when using a criterion based on the average pair-wise Kullback-Leibler divergence to rank the feature sets.
UR  - http://www.scopus.com/inward/record.url?scp=33947688163&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33947688163&partnerID=8YFLogxK
M3  - Paper
AN  - SCOPUS:33947688163
SP  - 2129
EP  - 2132
T2  - 7th International Conference on Spoken Language Processing, ICSLP 2002
Y2  - 16 September 2002 through 20 September 2002
ER  -
