Towards long-range prosodic attribute modeling for language recognition
Refereed conference paper presented and published in conference proceedings


Times Cited
Web of Science: 2 (as at 29/11/2020)

Other information
Abstract: As a high-level feature, prosody may be an effective feature when it is modeled over longer ranges than the typical range of a syllable. This paper is about language recognition with the high-level prosodic attributes. It studies two important issues of long-range modeling, namely the data scarcity handling method, and the model which properly describes prosodic boundary events. Illustrated by NIST language recognition evaluation (LRE) 2009, long-range modeling is shown to bring a 7.2% relative improvement to a prosodic language detector. Score fusion between the long-range prosodic system and a phonotactic system gives an EER of 3.07%. Exploiting boundary N-grams is the main contributing factor to global EER reduction, while different long-range prosodic modeling factors benefit the detection of different languages. Analysis reveals the evidence of language-specific long-range prosodic attributes, which sheds light on robust long-range modeling methods for language recognition.
All Author(s) List: Ng RWM, Leung CC, Hautamaki V, Lee T, Ma B, Li HZ
Name of Conference: 11th Annual Conference of the International Speech Communication Association 2010
Start Date of Conference: 26/09/2010
End Date of Conference: 30/09/2010
Place of Conference: Makuhari
Country/Region of Conference: Japan
Detailed description: organized by International Speech Communication Association
Year: 2010
Month: 1
Day: 1
Publisher: ISCA-INST SPEECH COMMUNICATION ASSOC
Pages: 1792 - 1795
ISBN: 978-1-61782-123-3
Languages: English-United Kingdom
Keywords: language recognition; long-range modeling; prosody
Web of Science Subject Categories: Engineering; Engineering, Electrical & Electronic; Telecommunications
