SuperLectures.com

CONCEPT-BASED CLASSIFICATION FOR MULTI-DOCUMENT SUMMARIZATION

Full Paper at IEEE Xplore

Spoken Document Processing

Speaker: Dilek Hakkani-Tür
Authors: Asli Celikyilmaz, University of California Berkeley, United States; Dilek Hakkani-Tür, Microsoft Corporation, United States

Documents inherently contain many concepts, reflecting both specific and generic aspects of their topic. To automatically generate a short summary of documents on similar topics, it is essential to discover the general aspects, because summaries usually contain general rather than specific concepts. This paper presents a semi-supervised extractive summarization model based on latent concept classification that differentiates between the two types of aspects, treating them as hidden concepts mentioned in the documents. A classifier is trained on hidden concepts discovered from documents and their corresponding human-generated summaries using a probabilistic Bayesian model, the summary-focused topic model. Experimental results based on ROUGE evaluations indicate that ranking candidate sentences by the latent summary concept classification improves the quality of the generated summaries.
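The ranking and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the per-sentence scores are made-up stand-ins for the output of some classifier that estimates how strongly a sentence expresses general, summary-like concepts, and `select_summary` and `rouge1_recall` are hypothetical helper names. The recall function is a simplified unigram version of ROUGE (whitespace tokenization, no stemming or stopword handling), not the official ROUGE toolkit.

```python
from collections import Counter


def select_summary(sentences, scores, max_words):
    """Greedily pick the highest-scoring sentences that fit a word budget.

    `scores` are assumed to come from a classifier (not implemented here)
    that rates how general / summary-like each sentence is.
    """
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        n_words = len(sentences[i].split())
        if used + n_words <= max_words:
            chosen.append(i)
            used += n_words
    # Restore document order so the extract reads naturally.
    return [sentences[i] for i in sorted(chosen)]


def rouge1_recall(candidate, reference):
    """Simplified ROUGE-1 recall: clipped unigram overlap / reference length."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / max(sum(ref.values()), 1)


# Illustrative data only; the scores are invented for this example.
sentences = [
    "The storm caused widespread flooding across the region.",
    "Mr. Smith lost his red bicycle near the river.",
    "Officials ordered evacuations and opened emergency shelters.",
]
scores = [0.9, 0.2, 0.7]

summary = select_summary(sentences, scores, max_words=16)
recall = rouge1_recall("the cat sat", "the cat sat on the mat")
```

With these invented scores, the two general sentences fit the 16-word budget and the anecdotal one is skipped. A greedy budgeted selection is only one plausible way to turn sentence scores into an extract; the paper itself may select sentences differently.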





  Lecture information

Recorded: 2011-05-27 14:05 - 14:25, Panorama
Added: 2011-06-07 19:17
Views: 34
Video resolution: 1024x576 px, 512x288 px
Video length: 0:16:08
Audio track: MP3 [5.51 MB], 0:16:08