Before the final discussion, I just want to give an update on the WOCHAT shared task. We have been collectively gathering a lot of chat-oriented dialogues using many chatbots, and I'd like to share with you some of the results we've got.
The first objective of the shared task is to collect chat-oriented dialogue data that can be made available for research purposes. This includes human-chatbot dialogue sessions and also human-human dialogue sessions, covering a wide variety of chatbot technologies and approaches, as well as languages and cultural backgrounds: I'm offering some Japanese chat data, and other people are offering Chinese and other-language dialogues. The other objective is to develop a framework for the automatic evaluation of chat-oriented systems. We perform subjective evaluation of the chat dialogue sessions at the turn level, and we crowdsource multiple annotations for the same items, because the evaluation is very subjective.
We are also applying machine learning approaches to reproduce the human annotations, that is, the human subjective evaluations.
There are three activities related to the shared task. The first is chat data collection: we collect human-chatbot and human-human chat-oriented dialogue sessions. The second is subjective evaluation: manual scoring annotation at the turn level of the collected dialogue sessions. The third is automatic metrics: we use machine learning techniques to build models that are able to automatically reproduce the scores given by human annotators.
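To make that third activity concrete, here is a minimal sketch of the kind of model involved: a text classifier trained to reproduce turn-level human labels. It assumes scikit-learn and uses made-up utterances with hypothetical labels (the three-way appropriateness scheme described below); the actual models and features used in the shared task are not specified here.

```python
# Minimal sketch: learn to reproduce human turn-level appropriateness
# labels from utterance text. Toy data; a real system would train on
# the collected shared-task annotations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples: (system utterance, human label).
utterances = [
    "Nice to meet you! What do you like to do on weekends?",
    "I like movies too. Which one did you watch recently?",
    "Banana banana banana.",
    "Yes.",
]
labels = ["valid", "valid", "invalid", "acceptable"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # simple lexical features
    LogisticRegression(max_iter=1000),
)
model.fit(utterances, labels)

# Predict a label (and a probability distribution) for a new turn.
print(model.predict(["What is your favorite food?"]))
print(model.predict_proba(["What is your favorite food?"]))
```

Since multiple annotators judge each turn, the model's predicted probabilities can also be compared against the distribution of human judgments rather than a single gold label.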
Our main efforts are still focused on tasks one and two. There are four roles in our activity.
You can participate in one of these roles, or in more than one. First, you can be a chatbot provider: a participant who owns a chatbot and wants to provide access to it, either by distributing a standalone version or via web access. The second role is data generator: a participant who is willing to use one of these chatbots to generate dialogues. The third role is data provider: in industry it is often difficult to give outside access to everything, so a participant who owns or has access to a chatbot but cannot provide access to it can still generate data within the company or institution and provide the generated dialogues. The last role is data annotator: a participant who is willing to annotate some of the generated and provided dialogue sessions. So there are four roles.
We are recruiting people to fill one or more of these roles.
We already have six chatbots ready to be used by anyone, including Joker, IRIS, pyEliza, Sarah, and TickTock.
We have been doing some annotation, using several annotation schemes. The first one, used quite often by the community, is the appropriateness score: a three-way annotation scheme with the labels valid, acceptable, and invalid.
Another annotation scheme we are using is the breakdown label scheme, which focuses more on the violations made by the chatbot: each utterance is labeled as a breakdown, a possible breakdown, or not a breakdown. In this annotation scheme we are using quite a lot of human annotators; for example, we use about twenty-four to thirty people to annotate a single utterance, so that we can obtain the distribution of these labels, because this reflects the subjective nature of chat-oriented dialogue systems.
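As a minimal sketch of that aggregation step, the following turns the judgments of a couple of dozen annotators into a label distribution for one utterance. The annotator judgments are made up, and the label set here is the three-way appropriateness scheme; the breakdown scheme would work the same way with its own labels.

```python
# Minimal sketch of turn-level label aggregation over many annotators.
from collections import Counter

def label_distribution(judgments):
    """Map a list of per-annotator labels to a probability distribution."""
    counts = Counter(judgments)
    total = len(judgments)
    return {label: counts[label] / total
            for label in ("valid", "acceptable", "invalid")}

# Hypothetical: 24 annotators judging one system utterance.
judgments = ["valid"] * 10 + ["acceptable"] * 9 + ["invalid"] * 5
print(label_distribution(judgments))
# -> {'valid': 0.4166..., 'acceptable': 0.375, 'invalid': 0.2083...}
```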
We are also adding further types of annotation, such as positive/negative or offensive tags, to the utterances, thereby enriching the annotations of this data.
To briefly give you the numbers: we have been collecting data, but we have still not succeeded in collecting a very large amount. We have about six hundred dialogues so far, with about twenty thousand turns, and we have over ten thousand annotations. Still, we are making progress.
Let me show you some results from our annotations. This is the appropriateness score distribution: the humans are doing well, but there are some invalid utterances from humans as well. We are not disclosing which bots are which, but the bots show quite different tendencies, some better and some worse, and analyzing them would be a very interesting thing to do.
We are also annotating new dialogue sessions being collected using IRIS and TickTock, and using this data we are organizing a dialogue breakdown detection task at the Dialog State Tracking Challenge 6 (DSTC6). We are also annotating additional dialogue sessions collected with pyEliza and Joker, and there will be an appropriateness score prediction task at the next WOCHAT workshop.
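One plausible way such a prediction task could be scored, given the per-utterance label distributions described earlier, is to compare a predicted distribution against the human one, for example with Jensen-Shannon divergence. The sketch below assumes SciPy and uses made-up numbers; it is not the task's official metric.

```python
# Sketch: score a predicted label distribution against the human one
# with Jensen-Shannon divergence (one common choice for this kind of
# distribution-matching evaluation).
from scipy.spatial.distance import jensenshannon

human = [10 / 24, 9 / 24, 5 / 24]   # valid / acceptable / invalid
predicted = [0.5, 0.3, 0.2]         # hypothetical model output

# SciPy returns the JS *distance*; square it to get the divergence.
js_divergence = jensenshannon(human, predicted, base=2) ** 2
print(f"JS divergence: {js_divergence:.4f}")
```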
As for the next steps: we want to continue promoting the shared task activities, because we haven't collected enough data yet; we want to improve the current data collection system; and we want to hold the next workshop editions and other events. The next workshop will be at the next IWSDS, and we have also proposed a summer event, so that we can collect many dialogues and annotations during the summer. Okay.
So that's the shared task update. If you have any questions, please ask now; if not, we can go to the next discussion.