SuperLectures.com

LANGUAGE IDENTIFICATION USING A COMBINED ARTICULATORY PROSODY FRAMEWORK

Language Identification

Full Paper at IEEE Xplore

Presented by: John Hansen
Author(s): Abhijeet Sangwan, Mahnoosh Mehrabani, John Hansen (The University of Texas at Dallas, United States)

This study presents new advancements in our articulatory-based language identification (LID) system. The system automatically identifies language features (LFs) from a phonological feature (PF) based representation of speech. Whereas our baseline system extracts LFs from a static PF representation, the new system uses a dynamic PF representation. The new LFs outperform the baseline system by 11.8% absolute on a difficult 5-way classification task of South Indian languages. We also incorporate pitch- and energy-based features in the new system to leverage prosody in classification. In particular, we employ Legendre polynomial based contour estimation to capture shape parameters, which are then used in classification. Fusing the PF- and prosody-based LFs improves the overall classification result by 16.5% absolute over the baseline system. Finally, the proposed articulatory LID system is combined with a PPRLM (parallel phone recognition language model) system to obtain an overall classification accuracy of 86.6%.
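The Legendre polynomial contour estimation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the polynomial order, the segmentation, or any normalization, so the order of 3, the function name `contour_shape_params`, and the synthetic pitch contour below are assumptions made purely for illustration and are not the authors' implementation.

```python
import numpy as np
from numpy.polynomial import legendre


def contour_shape_params(contour, order=3):
    """Least-squares fit of a Legendre series to a 1-D contour.

    Returns order + 1 coefficients. The order is an assumed value;
    the abstract does not state which order the authors used.
    """
    contour = np.asarray(contour, dtype=float)
    # Map the frame indices onto [-1, 1], the interval on which
    # Legendre polynomials are orthogonal.
    x = np.linspace(-1.0, 1.0, len(contour))
    return legendre.legfit(x, contour, order)


# Synthetic (made-up) pitch contour in Hz: a steady rise over 50 frames.
frames = np.linspace(-1.0, 1.0, 50)
pitch = 120.0 + 20.0 * frames

coeffs = contour_shape_params(pitch, order=3)
# coeffs[0] tracks the mean level, coeffs[1] the overall slope, and the
# higher-order terms the curvature of the contour.
```

The same function could be applied to an energy contour. The resulting coefficient vectors are compact, fixed-length descriptions of contour shape, which is what makes them convenient as classifier inputs.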





  Lecture Information

Recorded: 2011-05-24 10:55 - 11:15, Panorama
Added: 2011-06-16 15:17
Number of views: 38
Video resolution: 1024x576 px, 512x288 px
Video length: 0:19:23
Audio track: MP3 [6.55 MB], 0:19:23