Before the final discussion, I just want to give an update on the WOCHAT shared task. We have been collectively gathering a lot of chat-oriented dialogues using many chatbots, and I'd like to share with you some of the results we've got.
The first objective of the shared task is to collect chat-oriented dialogue data that can be made available for research purposes. This includes human-chatbot dialogue sessions and also human-human dialogue sessions, covering a wide variety of chatbot technologies and approaches, as well as languages and cultural backgrounds: I'm offering some Japanese chat data, and other people are offering Chinese and other-language dialogues. The other objective is to develop a framework for the automatic evaluation of chat-oriented systems. We perform subjective evaluation of the chat dialogue sessions at the turn level, and we crowdsource multiple annotations for the same items, because the evaluation is very subjective.
We are also applying machine learning approaches to reproduce the human annotations, that is, the human subjective evaluations.
There are three activities related to the shared task. The first is chat data collection: we collect human-chatbot and human-human chat-oriented dialogue sessions. The second is subjective evaluation: manual scoring annotation at the turn level of the collected dialogue sessions. The third is automatic metrics: we use machine learning techniques to build models that are able to automatically reproduce the scores given by human annotators.
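To make that third activity concrete, here is a minimal sketch of the kind of model involved: a text classifier trained to reproduce turn-level human labels. It assumes scikit-learn and uses made-up utterances with hypothetical labels (the three-way appropriateness scheme described below); the actual models and features used in the shared task are not specified here.

```python
# Minimal sketch: learn to reproduce human turn-level appropriateness
# labels from utterance text. Toy data; a real system would train on
# the collected shared-task annotations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples: (system utterance, human label).
utterances = [
    "Nice to meet you! What do you like to do on weekends?",
    "I like movies too. Which one did you watch recently?",
    "Banana banana banana.",
    "Yes.",
]
labels = ["valid", "valid", "invalid", "acceptable"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # simple lexical features
    LogisticRegression(max_iter=1000),
)
model.fit(utterances, labels)

# Predict a label (and a probability distribution) for a new turn.
print(model.predict(["What is your favorite food?"]))
print(model.predict_proba(["What is your favorite food?"]))
```

Since multiple annotators judge each turn, the model's predicted probabilities can also be compared against the distribution of human judgments rather than a single gold label.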
Our main efforts are still focused on tasks one and two. There are four roles in our activity.
You can participate in one of these roles, or in more than one. First, you can be a chatbot provider: a participant who owns a chatbot and wants to provide access to it, either by distributing a standalone version or via web access. The second role is data generator: a participant who is willing to use one of these chatbots to generate dialogues. The third role is data provider: in industry it is often difficult to give outside access to everything, so a participant who owns or has access to a chatbot but cannot provide access to it can still generate data within the company or institution and provide the generated dialogues. The last role is data annotator: a participant who is willing to annotate some of the generated and provided dialogue sessions. So there are four roles.
We are recruiting people to fill one or more of these roles.
We already have six chatbots ready to be used by anyone, including Joker, IRIS, pyEliza, Sarah, and TickTock.
We have been doing some annotation, using several annotation schemes. The first one, used quite often by the community, is the appropriateness score: a three-way annotation scheme with the labels valid, acceptable, and invalid.
Another annotation scheme we are using is the breakdown label scheme, which focuses more on the violations made by the chatbot: each utterance is labeled as a breakdown, a possible breakdown, or not a breakdown. In this annotation scheme we are using quite a lot of human annotators; for example, we use about twenty-four to thirty people to annotate a single utterance, so that we can obtain the distribution of these labels, because this reflects the subjective nature of chat-oriented dialogue systems.
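As a minimal sketch of that aggregation step, the following turns the judgments of a couple of dozen annotators into a label distribution for one utterance. The annotator judgments are made up, and the label set here is the three-way appropriateness scheme; the breakdown scheme would work the same way with its own labels.

```python
# Minimal sketch of turn-level label aggregation over many annotators.
from collections import Counter

def label_distribution(judgments):
    """Map a list of per-annotator labels to a probability distribution."""
    counts = Counter(judgments)
    total = len(judgments)
    return {label: counts[label] / total
            for label in ("valid", "acceptable", "invalid")}

# Hypothetical: 24 annotators judging one system utterance.
judgments = ["valid"] * 10 + ["acceptable"] * 9 + ["invalid"] * 5
print(label_distribution(judgments))
# -> {'valid': 0.4166..., 'acceptable': 0.375, 'invalid': 0.2083...}
```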
We are also adding further types of annotation, such as positive/negative or offensive tags, to the utterances, thereby enriching the annotations of this data.
To briefly give you the numbers: we have been collecting data, but we have still not succeeded in collecting a very large amount. We have about six hundred dialogues so far, with about twenty thousand turns, and we have over ten thousand annotations. Still, we are making progress.
Let me show you some results from our annotations. This is the appropriateness score distribution: the humans are doing well, but there are some invalid utterances from humans as well. We are not disclosing which bots are which, but the bots show quite different tendencies, some better and some worse, and analyzing them would be a very interesting thing to do.
We are also annotating new dialogue sessions being collected using IRIS and TickTock, and using this data we are organizing a dialogue breakdown detection task at the Dialog State Tracking Challenge 6 (DSTC6). We are also annotating additional dialogue sessions collected with pyEliza and Joker, and there will be an appropriateness score prediction task at the next WOCHAT workshop.
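One plausible way such a prediction task could be scored, given the per-utterance label distributions described earlier, is to compare a predicted distribution against the human one, for example with Jensen-Shannon divergence. The sketch below assumes SciPy and uses made-up numbers; it is not the task's official metric.

```python
# Sketch: score a predicted label distribution against the human one
# with Jensen-Shannon divergence (one common choice for this kind of
# distribution-matching evaluation).
from scipy.spatial.distance import jensenshannon

human = [10 / 24, 9 / 24, 5 / 24]   # valid / acceptable / invalid
predicted = [0.5, 0.3, 0.2]         # hypothetical model output

# SciPy returns the JS *distance*; square it to get the divergence.
js_divergence = jensenshannon(human, predicted, base=2) ** 2
print(f"JS divergence: {js_divergence:.4f}")
```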
As for the next steps: we want to continue promoting the shared task activities, because we haven't collected enough data yet; we want to improve the current data collection system; and we want to hold the next workshop editions and other events. The next workshop will be at the next IWSDS, and we have also proposed a summer event, so that we can collect many dialogues and annotations during the summer. Okay.
So that's the shared task update. If you have any questions, please ask now; if not, we can go to the next discussion.