SuperLectures.com

LINGUISTIC INFLUENCES ON BOTTOM-UP AND TOP-DOWN CLUSTERING FOR SPEAKER DIARIZATION

Full Paper at IEEE Xplore

Speaker Diarization

Speaker: Simon Bozonnet, Authors: Simon Bozonnet, Dong Wang, Nicholas Evans, Raphaël Troncy, EURECOM, France

While bottom-up approaches have emerged as the standard, default approach to clustering for speaker diarization, we have always found that the top-down approach gives equivalent or superior performance. Our recent work shows that significant gains in performance can be obtained when cluster purification is applied to the output of top-down systems, but that it can degrade performance when applied to the output of bottom-up systems. This paper demonstrates that these observations can be accounted for by factors unrelated to the speaker, and that such factors affect the performance of bottom-up clustering strategies more strongly than that of top-down strategies. Experimental results confirm that clusters produced through top-down clustering are better normalized against phone variation than those produced through bottom-up clustering, and that this accounts for the observed inconsistencies in purification performance. The work highlights the need for marginalization strategies that encourage convergence toward different speakers rather than toward nuisance factors such as those related to the linguistic content.





  Lecture information

Recorded: 2011-05-24 14:25 - 14:45, Panorama
Added: 2011-06-16 18:59
Views: 20
Video resolution: 1024x576 px, 512x288 px
Video length: 0:19:26
Audio track: MP3 [6.57 MB], 0:19:26