AN INVESTIGATION OF SUBSPACE MODELING FOR PHONETIC AND SPEAKER VARIABILITY IN AUTOMATIC SPEECH RECOGNITION

Acoustic Modeling

Full Paper at IEEE Xplore

Presented by: Richard Rose, Author(s): Richard Rose, Shou-Chun Yin, Yun Tang, McGill University, Canada

This paper investigates the impact of subspace-based techniques for modeling speaker variability and phonetic variability in automatic speech recognition (ASR). Many well-known approaches to speaker-space-based adaptation represent sources of variability as a projection within a low-dimensional subspace. A new approach to acoustic modeling in ASR, referred to as the subspace-based Gaussian mixture model (SGMM), represents phonetic variability as a set of projections applied at the state level in a hidden Markov model (HMM) based acoustic model. The impact of the SGMM in modeling these intrinsic sources of variability is evaluated for a continuous speech recognition (CSR) task where the performance of continuous density HMM (CDHMM) based ASR systems is already reasonably good. Speaker-independent SGMM-based ASR was shown to provide an 18% reduction in word error rate (WER) over the CDHMM and a 5% reduction in WER over unsupervised speaker adaptation in the Resource Management CSR domain.
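
To make the idea of "projections applied at the state level" more concrete, here is a minimal sketch in plain Python. It assumes the commonly described SGMM parameterization, in which each HMM state j has a low-dimensional vector v_j and the mean of Gaussian i in that state is obtained as M_i v_j, with mixture weights given by a softmax over w_i . v_j. The abstract does not spell out these formulas, so the function names (`state_means`, `state_weights`), the toy dimensions, and all numbers below are illustrative assumptions, not the authors' implementation.

```python
import math

def state_means(M, v):
    """Per-Gaussian means for one state: mean_i = M[i] @ v.

    M is a list of shared projection matrices (one per Gaussian, each D x S),
    v is the state-specific vector of length S. (Assumed formulation.)
    """
    return [[sum(row[k] * v[k] for k in range(len(v))) for row in Mi] for Mi in M]

def state_weights(w, v):
    """Per-Gaussian mixture weights for one state: softmax over w[i] . v.

    w is a list of shared weight-projection vectors (one per Gaussian).
    """
    scores = [sum(wi[k] * v[k] for k in range(len(v))) for wi in w]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example (hypothetical values): 2 shared Gaussians,
# 3-dimensional features, 2-dimensional state subspace.
M = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 0.5], [0.0, 0.0]],
]
w = [[0.0, 0.0], [0.0, 0.0]]
v_j = [1.0, 2.0]  # the only state-specific parameters in this sketch

means = state_means(M, v_j)
weights = state_weights(w, v_j)
```

In this sketch the matrices `M` and vectors `w` are shared by every state, so each state contributes only its short vector `v_j`. That sharing is what makes the model a subspace model under the stated assumption.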


  Lecture Information

Recorded: 2011-05-25 15:25 - 15:45, Panorama
Added: 2011-06-15 17:03
Number of views: 73
Video resolution: 1024x576 px, 512x288 px
Video length: 0:20:20
Audio track: MP3 [6.87 MB], 0:20:20