SuperLectures.com

IMPROVING TEXT-INDEPENDENT PHONETIC SEGMENTATION BASED ON THE MICROCANONICAL MULTISCALE FORMALISM

Speech Analysis

Full Paper at IEEE Xplore

Speaker: Vahid Khanagha, Authors: Vahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein Yahia, INRIA Bordeaux Sud-Ouest, France

In an earlier work, we proposed a novel phonetic segmentation method based on speech analysis under the Microcanonical Multiscale Formalism (MMF). The latter relies on the computation of local geometrical parameters called singularity exponents (SE). We showed that SE convey valuable information about the local dynamics of speech that can be readily and simply used to detect phoneme boundaries. In this paper, based on an error analysis of our original algorithm, we propose a two-step technique that better exploits SE to improve segmentation accuracy. In the first step, we detect the boundaries of the original signal and of a low-pass filtered version, and we take the union of all detected boundaries as candidates. In the second step, we use a hypothesis test over the local SE distribution of the original signal to select the final boundaries. We carry out a detailed evaluation and comparison over the full training set of the TIMIT database, which could be useful to other researchers for comparison purposes. The results show that the new algorithm not only outperforms the original one but is also significantly more accurate than state-of-the-art methods.
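The two-step structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how SE are computed, how boundaries are first detected, or which hypothesis test is used, so the sketch takes the SE sequence and the two boundary lists as given inputs and swaps in a two-sample Kolmogorov-Smirnov statistic with a fixed threshold as a stand-in test. The function names and the parameters `tol`, `half_window` and `threshold` are hypothetical.

```python
from bisect import bisect_right


def merge_candidates(boundaries_a, boundaries_b, tol):
    """Step 1 (assumed form): union of two boundary lists, given as sample
    indices. A candidate within `tol` samples of the last kept one is dropped."""
    merged = []
    for b in sorted(set(boundaries_a) | set(boundaries_b)):
        if not merged or b - merged[-1] > tol:
            merged.append(b)
    return merged


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDFs of x and y, evaluated at every pooled data point."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def select_boundaries(se, candidates, half_window, threshold):
    """Step 2 (stand-in test): keep a candidate only if the SE values in the
    windows to its left and right look like different distributions.
    `se` is a plain list of singularity exponents, one per sample."""
    kept = []
    for c in candidates:
        left = se[max(0, c - half_window):c]
        right = se[c:c + half_window]
        if left and right and ks_statistic(left, right) >= threshold:
            kept.append(c)
    return kept


# Toy input: the SE sequence changes level at sample 50 and nowhere else.
se = [0.0] * 50 + [1.0] * 50
# Hypothetical detections on the original and the low-pass filtered signal.
candidates = merge_candidates([50], [52, 80], tol=3)
print(select_boundaries(se, candidates, half_window=20, threshold=0.5))
```

On this toy input the candidate at sample 80 is rejected because the SE values on both sides of it are identical, while the candidate at sample 50 is kept. In practice the window length and the decision rule would have to follow the paper, which the abstract does not detail.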





  Lecture information

Recorded: 2011-05-25 11:10 - 11:30, Panorama
Added: 2011-06-15 16:35
Views: 24
Video resolution: 1024x576 px, 512x288 px
Video length: 0:19:28
Audio track: MP3 [6.58 MB], 0:19:28