Exploration of phase and vocal excitation modulation features for speaker recognition
Refereed conference paper presented and published in conference proceedings



Other information
Abstract: Mel-frequency cepstral coefficients (MFCCs) are found to be closely related to the linguistic content of speech. Besides cepstral features, other resources in speech, e.g., the phase and the excitation source, are believed to contain useful properties for speaker discrimination. Moreover, magnitude-based features alone are insufficient to provide satisfactory and robust speaker recognition accuracy in real-world applications when large variations exist between the development and application scenarios. The AM-FM signal modeling technique offers an effective approach to characterizing and analyzing speech properties. This work is therefore motivated to capture the relevant phase and vocal excitation related modulation features to complement MFCCs. In the context of multi-band demodulation analysis, we present a novel parameterization of the speech and vocal excitation signals. A pertinent representation of the most dominant primary frequencies present in the speech signal is first built. It is then applied to frames of the speech signal to derive effective speaker-discriminative features. The source-related amplitude and phase quantities are also parameterized into feature vectors. The features are assessed in the context of a standard speaker identification and verification system. The complementary relationship between MFCCs and the modulation features is revealed by system fusion at the score level. © 2012 Springer-Verlag.
All Author(s) List: Wang N., Ching P.C., Lee T.
Name of Conference: 7th Chinese Conference on Biometric Recognition, CCBR 2012
Start Date of Conference: 04/12/2012
End Date of Conference: 05/12/2012
Place of Conference: Guangzhou
Country/Region of Conference: China
Detailed description: ed. by Wei-Shi Zheng, Zhenan Sun, Yunhong Wang, Xilin Chen, Pong C. Yuen and Jianhuang Lai.
Year: 2012
Month: 12
Day: 26
Volume Number: 7701 LNCS
Publisher: Springer Verlag
Place of Publication: Germany
Pages: 251 - 259
ISBN: 9783642355059
ISSN: 0302-9743
Languages: English-United Kingdom
Keywords: excitation modulation features, phase information, speaker recognition
