An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition

Mohamed Kamal Omar, Ken Chen, Mark Hasegawa-Johnson, Yigal Brandman

Research output: Contribution to conference, Paper, peer-review

8 Scopus citations

Abstract

This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting the acoustic features that are combined to represent the distinctions among the phonemes. For the same feature-vector length, acoustic features selected with a maximum mutual information criterion yield slightly higher overall phoneme recognition accuracy than FFT-based Mel-frequency cepstrum coefficients (MFCC), both for clean speech and at 10 dB. Across 16 different feature sets, ranking the feature sets by mutual information predicts phoneme recognition accuracy with a correlation coefficient of 0.71, compared to 0.28 when the ranking uses a criterion based on the average pairwise Kullback-Leibler divergence.
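As an illustration of the kind of criterion the abstract describes, the following is a minimal sketch of scoring scalar features by an estimate of their mutual information with phoneme labels. It is not the authors' method: the histogram (plug-in) estimator, the bin count, the function names `mutual_information` and `rank_features`, and the synthetic two-phoneme data are all assumptions made for this example. It also scores each feature on its own, whereas the paper selects features to be combined into one representation.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels, n_bins=8):
    """Plug-in (histogram) estimate of I(X; C) in bits for a scalar
    feature X and class labels C. Hypothetical helper for illustration."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    # Equal-width binning; the maximum value is clamped into the last bin.
    bins = [min(int((x - lo) / width), n_bins - 1) for x in values]
    n = len(labels)
    joint = Counter(zip(bins, labels))
    bin_counts = Counter(bins)
    label_counts = Counter(labels)
    mi = 0.0
    for (b, c), count in joint.items():
        # p(b, c) * log2( p(b, c) / (p(b) * p(c)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (bin_counts[b] * label_counts[c]))
    return mi


def rank_features(columns, labels, n_bins=8):
    """Sort feature names by estimated mutual information, highest first."""
    scores = {name: mutual_information(vals, labels, n_bins)
              for name, vals in columns.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Synthetic toy data (an assumption, not data from the paper): two phoneme
# labels, one feature whose mean depends on the label, one that does not.
rng = random.Random(0)
labels = [rng.choice(["aa", "iy"]) for _ in range(2000)]
columns = {
    "informative": [rng.gauss(0.0 if c == "aa" else 3.0, 1.0) for c in labels],
    "noise": [rng.gauss(0.0, 1.0) for _ in labels],
}
ranking, scores = rank_features(columns, labels)
print(ranking, scores)
```

On this toy data the label-dependent feature should score well above the independent one. A plug-in estimate of this kind is biased upward for small samples or many bins, so scores are best compared only between features estimated with the same sample size and binning.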

Original language: English (US)
Pages: 2129-2132
Number of pages: 4
State: Published - 2002
Externally published: Yes
Event: 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration: Sep 16 2002 to Sep 20 2002

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language