DISTRIBUTED TRAINING OF LARGE SCALE EXPONENTIAL LANGUAGE MODELS
Language Modeling
Presenter: Bhuvana Ramabhadran; Authors: Abhinav Sethy, Stanley Chen, Bhuvana Ramabhadran, IBM, United States
Shrinkage-based exponential language models, such as the recently introduced Model M, have provided significant gains over a range of tasks. Training such models requires a large amount of computational resources in terms of both time and memory. In this paper, we present a distributed training algorithm for such models based on the idea of cluster expansion. Cluster expansion allows us to efficiently calculate the normalization and expectation terms required for Model M training by minimizing the computation needed between consecutive n-grams. We also show how the algorithm can be implemented in a distributed environment, greatly reducing both the memory required per process and the training time.
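The abstract describes cluster expansion only at a high level. As a rough, purely illustrative orientation to the normalization bottleneck it targets, the Python sketch below computes the normalizer Z(h) of a toy exponential n-gram model in two ways: a brute-force sum over the vocabulary, and a version that caches the history-independent unigram mass and corrects only the words whose higher-order features fire. The feature templates, weights, and caching scheme are assumptions made for this sketch; they are not Model M's feature set or the paper's cluster-expansion algorithm.

```python
import math

# Toy exponential (maximum-entropy) n-gram LM:
#   P(w | h) = exp(sum_i lambda_i * f_i(h, w)) / Z(h)
# Feature templates and weights below are illustrative assumptions only.

def features_for(history, word):
    """Toy indicator features: unigram, bigram, trigram."""
    feats = [("1g", word)]
    if len(history) >= 1:
        feats.append(("2g", history[-1], word))
    if len(history) >= 2:
        feats.append(("3g", history[-2], history[-1], word))
    return feats

def score(lambdas, history, word):
    """lambda . f(h, w) for the features that fire on this pair."""
    return sum(lambdas.get(f, 0.0) for f in features_for(history, word))

def normalizer_naive(lambdas, history, vocab):
    """Z(h) by brute force: one exponential per vocabulary word, per history."""
    return sum(math.exp(score(lambdas, history, w)) for w in vocab)

def normalizer_shared(lambdas, history, vocab, cache):
    """
    Reuse work across histories: the unigram-only part of Z(h) is identical
    for every history, so compute it once and cache it; then correct only
    the few words whose bigram/trigram features actually fire for this
    history.  (A simplification, not the paper's exact algorithm.)
    """
    if "uni" not in cache:
        cache["uni"] = sum(math.exp(lambdas.get(("1g", w), 0.0)) for w in vocab)
    z = cache["uni"]
    # Words touched by higher-order features of this particular history.
    touched = {key[-1] for key in lambdas
               if (key[0] == "2g" and len(history) >= 1
                   and key[1] == history[-1])
               or (key[0] == "3g" and len(history) >= 2
                   and key[1:3] == (history[-2], history[-1]))}
    for w in touched:
        z += (math.exp(score(lambdas, history, w))
              - math.exp(lambdas.get(("1g", w), 0.0)))
    return z

if __name__ == "__main__":
    vocab = ["the", "cat", "sat", "mat", "</s>"]
    lambdas = {("1g", "the"): 0.5,
               ("2g", "the", "cat"): 1.2,
               ("3g", "the", "cat", "sat"): 0.8}  # arbitrary demo weights
    h = ("the",)
    cache = {}
    print(normalizer_naive(lambdas, h, vocab))        # full sum over vocab
    print(normalizer_shared(lambdas, h, vocab, cache))  # same value, less work
```

The point of the second variant is that consecutive n-gram histories share most of their low-order mass, so the per-history work scales with the number of active higher-order features rather than with the vocabulary size; the cluster-expansion technique presented in the talk exploits this kind of structure in a more principled way, and the paper further distributes the computation across processes.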
Talk information
| Recorded | 2011-05-25, 16:35 - 16:55, Club H |
|---|---|
| Added | 2011-06-09 01:58 |
| Views | 47 |
| Video resolution | 1024x576 px, 512x288 px |
| Video length | 0:19:16 |
| Audio track | MP3 [6.58 MB], 0:19:16 |
Comments