Modeling pronunciation variation using artificial neural networks for English spontaneous speech

Ken Chen, Mark Hasegawa-Johnson

Research output: Contribution to conference › Paper › peer-review

8 Scopus citations

Abstract

Pronunciation variation in conversational speech causes a significant number of word errors in large-vocabulary automatic speech recognition. Rule-based and decision-tree-based approaches have previously been proposed to model pronunciation variation. In this paper, we report our work on modeling pronunciation variation using artificial neural networks (ANN). The results we achieve are significantly better than previously published ones on two different corpora, indicating that ANNs may be better suited for modeling pronunciation variation than other statistical models that have been previously investigated. Our experiments indicate that binary distinctive features can effectively represent the phonological context. We also find that including a pitch accent feature in the input improves the prediction of pronunciation variation on a ToBI-labeled subset of the Switchboard corpus.
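As an illustration of the approach the abstract describes, the sketch below trains a tiny feedforward network that maps a binary distinctive-feature context vector, augmented with a pitch-accent bit, to the probability that the canonical phone is realized differently. This is not the authors' actual architecture or data; the feature count, network size, and the toy labeling rule (variation when the syllable is unaccented) are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 12   # hypothetical number of binary input features
HIDDEN = 8        # hypothetical hidden-layer size


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class VariationMLP:
    """One-hidden-layer network predicting P(surface phone != canonical)."""

    def __init__(self, n_in, n_hidden):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, n_hidden)
        self.b2 = 0.0

    def forward(self, x):
        # tanh hidden layer, sigmoid output (probability of variation)
        self.h = np.tanh(x @ self.W1 + self.b1)
        return sigmoid(self.h @ self.W2 + self.b2)

    def train_step(self, x, y, lr=0.5):
        p = self.forward(x)
        # gradient of binary cross-entropy w.r.t. pre-sigmoid output is p - y
        d2 = p - y
        dh = d2 * self.W2 * (1 - self.h ** 2)
        self.W2 -= lr * d2 * self.h
        self.b2 -= lr * d2
        self.W1 -= lr * np.outer(x, dh)
        self.b1 -= lr * dh


# Toy data: the last input bit stands in for pitch accent, and variation is
# made likelier when it is 0 (unaccented syllables reduce more often) --
# an assumption for illustration, not a finding of the paper.
X = rng.integers(0, 2, (200, N_FEATURES)).astype(float)
y = (X[:, -1] == 0).astype(float)

net = VariationMLP(N_FEATURES, HIDDEN)
for epoch in range(50):
    for xi, yi in zip(X, y):
        net.train_step(xi, yi)

preds = np.array([net.forward(xi) for xi in X])
accuracy = np.mean((preds > 0.5) == y)
```

In the paper's setting, the target would instead be the surface realization of each phone in hand-transcribed Switchboard data rather than a synthetic label.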

Original language: English (US)
Pages: 1461-1464
Number of pages: 4
State: Published - 2004
Externally published: Yes
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: Oct 4, 2004 – Oct 8, 2004

Other

Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 10/4/04 – 10/8/04

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
