Speech Transcript - Shared Task Update Report

0:00:15	before the final discussion i just want to give an update
0:00:18	on the WOCHAT shared
0:00:20	task so we have been collecting a lot of chat-oriented dialogues collectively using
0:00:28	many chatbots
0:00:29	and i'd like to
0:00:32	share with you some of the results we've got
0:00:36	so the objective of the shared task
0:00:39	is to collect chat-oriented dialogue data that can be made available for research purposes
0:00:45	so it includes human-chatbot dialogue sessions and also human-human dialogue sessions covering
0:00:51	a wide variety of chatbot technologies and approaches
0:00:54	and also languages and cultural backgrounds i'm offering some japanese chat data and there are some
0:00:59	other people offering chinese and [unclear] dialogues
0:01:03	and another objective is to develop
0:01:06	a framework
0:01:08	for the automatic evaluation of chat-oriented systems so we perform subjective evaluation of
0:01:15	the chat data sessions at turn level
0:01:17	and we also crowdsource multiple annotations for the same utterance because it is very
0:01:22	subjective
0:01:24	and we are also applying
0:01:25	machine learning approaches to reproduce human annotations that is human
0:01:30	subjective evaluations
0:01:31	and there are
0:01:33	three
0:01:34	activities related to the shared task so the first one is chat data collection so we collect
0:01:40	human-chatbot and human-human chat-oriented dialogue sessions
0:01:43	and the second one is subjective evaluation so manual scoring and annotation at the turn level
0:01:50	of the collected data sessions
0:01:52	and the third activity is automatic metrics so we use machine learning techniques to generate
0:01:58	models that are
0:02:00	able to automatically reproduce the scoring given by human
0:02:04	annotators
0:02:05	and our main efforts are currently still focused on tasks one and two
0:02:12	and there are four roles
0:02:14	in our activity
0:02:16	so you can participate in one of these roles or in more than one role so
0:02:22	you can be a chatbot provider so the participant owns a chatbot engine and
0:02:27	wants to provide access to it
0:02:28	either by distributing a standalone version or via web access
0:02:33	and the second role would be data generator the participant is willing to
0:02:39	use one of these chatbots to generate dialogues
0:02:43	and the third role is data provider so you know in industry
0:02:48	it is difficult to
0:02:50	make everything accessible so
0:02:53	the participant owns
0:02:55	or has access to a chatbot but cannot provide access to it
0:02:59	but
0:03:00	she or he can generate data within the company or
0:03:04	the institution and provide the generated
0:03:07	dialogue
0:03:08	dialogues
0:03:09	and the last role is data annotator so the participant is
0:03:12	willing to annotate
0:03:14	some of the generated and provided dialogue sessions so there are four roles
0:03:18	and
0:03:19	in
0:03:19	so we are recruiting people
0:03:22	to participate in one of
0:03:24	these roles
0:03:26	and we currently have chatbots ready to be
0:03:30	used by anyone so there are six
0:03:34	chatbots Joker
0:03:35	IRIS
0:03:36	pyEliza Sarah TickTock and [unclear]
0:03:41	and we have been doing some annotation
0:03:44	so we are using several annotation schemes
0:03:47	so this is the first one and this is used quite often by the community
0:03:51	it is the appropriateness score so we have valid acceptable and invalid so
0:03:58	this is a three-way annotation scheme
0:04:01	and another annotation scheme we are using is the breakdown label scheme which is
0:04:07	more focused on the violations
0:04:09	by the chatbot
0:04:10	and it is breakdown possible breakdown or not a breakdown and in
0:04:15	this annotation scheme we are using quite a lot of human
0:04:19	annotators for this data for example we are using about twenty four
0:04:24	to thirty people to annotate a single utterance so that we can know
0:04:30	the distribution
0:04:33	of these labels because this is reflecting the subjective nature
0:04:38	of
0:04:39	chat-oriented dialogue systems
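[editorial illustration, not part of the talk] The per-utterance label distribution described above can be sketched as follows. This is a minimal sketch assuming each utterance simply carries a list of annotator votes; the function name `label_distribution` and the example votes are hypothetical and are not taken from the shared task's tooling or data.

```python
from collections import Counter

# The three breakdown labels named in the talk.
LABELS = ("breakdown", "possible breakdown", "not a breakdown")


def label_distribution(annotations):
    """Return the fraction of annotators who chose each label for one utterance."""
    if not annotations:
        raise ValueError("need at least one annotation")
    counts = Counter(annotations)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = len(annotations)
    # Counter returns 0 for labels nobody chose, so every label gets an entry.
    return {label: counts[label] / total for label in LABELS}


# Hypothetical votes from 24 annotators for a single system utterance.
votes = (
    ["breakdown"] * 6
    + ["possible breakdown"] * 12
    + ["not a breakdown"] * 6
)
dist = label_distribution(votes)
print(dist)
```

Keeping the full distribution, rather than collapsing it to a majority label, preserves the disagreement among annotators that the talk attributes to the subjective nature of chat.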
0:04:41	and we are also adding additional types
0:04:45	of annotation like positive and negative
0:04:48	or offensive tags
0:04:50	to utterances
0:04:51	so we are adding
0:04:53	[unclear]
0:04:55	annotations to this data
0:04:58	to briefly give you the
0:05:01	size
0:05:03	so we have been collecting
0:05:05	but are still not very successful in collecting a large amount of data
0:05:10	we have about
0:05:12	six hundred dialogues so far with
0:05:15	about twenty thousand turns
0:05:19	and we have over ten thousand annotations as well
0:05:24	but still we are making progress
0:05:28	and i'll just show you some results from
0:05:32	our annotations so these are the appropriateness score distributions
0:05:36	so humans are doing well but there are some invalid utterances from humans
0:05:41	as well and we are not disclosing
0:05:45	which bots are which but
0:05:47	bots have several differences some are good and some are bad and analysing them would
0:05:52	be a very interesting
0:05:53	thing to do
0:05:56	and we also [unclear] dialogue sessions are being collected
0:06:02	using IRIS and TickTock
0:06:03	and also using this data we are
0:06:07	organizing a dialogue breakdown
0:06:08	detection challenge at the Dialog State Tracking Challenge 6
0:06:12	and we are also annotating additional dialogue sessions
0:06:17	for the data of IRIS TickTock and Joker
0:06:21	and there will be an appropriateness score prediction task at the next WOCHAT workshop
0:06:27	so the next
0:06:29	steps
0:06:30	so we want to continue promoting the shared task activities because we haven't got enough
0:06:34	data and we want to improve the current chatbot systems and we
0:06:39	want to hold
0:06:41	the next workshop editions and other events so
0:06:45	the next
0:06:45	workshop will be
0:06:47	at the next IWSDS
0:06:51	and we have a [unclear] proposal out
0:06:53	so that we can have many dialogues and annotations during the summer okay
0:07:00	so that's about
0:07:01	the
0:07:04	task update and if you have any question please ask
0:07:06	now and if not we can go to the next discussion

Shared Task Update Report

Second WOCHAT Special Session on Chatbots and Conversational Agents (WOCHAT-SS)

Ryuichiro Higashinaka