SIGdial 2017

18th Annual SIGdial Meeting on Discourse and Dialogue

Challenges for Data-driven Dialogue Systems: Finding the Goldilocks Zone for Conversational Data

Oliver Lemon

I will review current approaches to data-driven dialogue systems, for both task-oriented dialogue and social chat, focusing on three main issues: synthetic data, big data, and noisy data. Drawing on some of our current projects, I will illustrate (1) the limitations of using synthetic data; (2) how linguistic knowledge, in the form of a semantic grammar, can be combined with machine learning to bootstrap dialogue systems from very small amounts of data; and (3) how our Amazon Alexa Challenge system has been built to avoid some of the problems posed by large amounts of real but noisy data. For more information, please visit www.macs.hw.ac.uk/InteractionLab