0:00:15 | Before the final discussion, I just want to give an update on the WOCHAT shared task. |
---|
0:00:20 | We have been collectively collecting a lot of chat-oriented dialogues using many chatbots, and I'd like to share with you some of the results we've got. |
---|
0:00:36 | The objectives of the shared task are to collect chat-oriented dialogue data that can be made available for research purposes. |
---|
0:00:45 | It includes both human-chatbot and human-human dialogue sessions, covering a wide variety of chatbot technologies and approaches, and also languages and cultural backgrounds: I am offering some Japanese chat data, and other people are offering Chinese dialogues, and so on. |
---|
0:01:03 | Another objective is to develop a framework for the automatic evaluation of chat-oriented systems. |
---|
0:01:08 | We perform subjective evaluation of the chat dialogue sessions at the turn level, and we crowdsource multiple annotations for the same utterances because the judgement is very subjective. |
---|
0:01:24 | We are also applying machine learning approaches to reproducing the human annotations, that is, the human subjective evaluations. |
---|
0:01:31 | There are three activities related to the shared task. |
---|
0:01:34 | The first one is chat data collection: we collect human-chatbot and human-human chat-oriented dialogue sessions. |
---|
0:01:43 | The second one is subjective evaluation: manual scoring annotation, at the turn level, of the collected dialogue sessions. |
---|
0:01:52 | The third activity is automated metrics: we use machine learning techniques to train models that are able to automatically reproduce the scores given by human annotators. |
---|
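A minimal sketch of what this third activity amounts to, assuming a simple TF-IDF text classifier and invented example turns and labels (the shared task's actual models and data are not shown here):

```python
# Minimal sketch: learn to reproduce human turn-level appropriateness scores.
# The utterances and labels below are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical system responses with human appropriateness labels:
# 2 = valid, 1 = acceptable, 0 = invalid.
responses = [
    "hello, nice to meet you, how are you today?",
    "i like movies too, what is your favourite one?",
    "the weather the weather the weather",
    "yes",
]
labels = [2, 2, 0, 1]

# TF-IDF features plus a simple classifier stand in for the learned metric.
metric = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                       LogisticRegression(max_iter=1000))
metric.fit(responses, labels)

# Predicted scores can then be compared against held-out human annotations
# (e.g. via correlation) to see how well the metric reproduces them.
print(metric.predict(["that sounds great, tell me more"]))
```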
0:02:05 | Our main efforts are still focused on tasks one and two. |
---|
0:02:12 | There are four roles in our activity. |
---|
0:02:16 | You can participate in one of these roles, or in more than one. |
---|
0:02:22 | You can be a chatbot provider: a participant who owns a chatbot engine and wants to provide access to it, either by distributing a standalone version or via web access. |
---|
0:02:33 | The second role is data generator: a participant who is willing to use one of these chatbots to generate dialogues. |
---|
0:02:43 | The third role is data provider: in industry it is often difficult to give access to everything, so the participant owns, or has access to, a chatbot but cannot provide access to it; however, she or he can generate data within the company or institution and provide the generated dialogues. |
---|
0:03:09 | The last role is data annotator: a participant who is willing to annotate some of the generated and provided dialogue sessions. So there are four roles. |
---|
0:03:18 | So we are recruiting people to participate in one of these roles. |
---|
0:03:26 | We currently have chatbots ready to be used by anyone: there are six chatbots, including Joker, IRIS, pyEliza, Sarah, and TickTock. |
---|
0:03:41 | We have been doing some annotation, using several annotation schemes. |
---|
0:03:47 | The first one is used quite often by the community: the appropriateness score, with the labels valid, acceptable, and invalid, so it is a three-way annotation scheme. |
---|
0:04:01 | Another annotation scheme we are using is the dialogue breakdown label scheme, which focuses more on breakdowns caused by the chatbot; the labels are breakdown, possible breakdown, and not a breakdown. |
---|
0:04:15 | For this annotation scheme we are using quite a lot of human annotators: for example, about twenty-four to thirty people annotate a single utterance, so that we can obtain the distribution over these labels, because this reflects the subjective nature of chat-oriented dialogue systems. |
---|
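A minimal sketch of how those twenty-four to thirty judgements per utterance can be turned into a label distribution rather than a single majority label (the annotator counts below are invented for illustration):

```python
# Minimal sketch: aggregate crowd judgements for one system utterance into a
# label distribution, keeping the disagreement rather than forcing a majority.
from collections import Counter

# Hypothetical judgements from 30 annotators:
# "NB" = not a breakdown, "PB" = possible breakdown, "B" = breakdown.
judgements = ["NB"] * 14 + ["PB"] * 10 + ["B"] * 6

counts = Counter(judgements)
distribution = {label: n / len(judgements) for label, n in counts.items()}
print(distribution)  # {'NB': 0.466..., 'PB': 0.333..., 'B': 0.2}
```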
0:04:41 | We are also adding extra tags to the annotation, such as positive/negative or offensive tags on utterances, thus enriching the annotations of this data. |
---|
0:04:58 | To briefly give you the size: we have been collecting data, but we are still not very successful in collecting a large amount. |
---|
0:05:10 | We have about six hundred dialogues so far, with about twenty thousand turns, and we have over ten thousand annotations, but we are still making progress. |
---|
0:05:28 | I'll just show you some results from our annotations: this is the appropriateness score distribution. |
---|
0:05:36 | The humans are doing well, but there are some invalid utterances from humans as well. We are not disclosing which bot is which, but the bots show several differences, some better and some worse, and analysing them would be a very interesting thing to do. |
---|
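A minimal sketch of that kind of per-speaker comparison, on invented annotation records (the real shared-task data and bot identities are not shown):

```python
# Minimal sketch: appropriateness label distribution per (anonymised) speaker,
# mirroring the score-distribution comparison between bots and humans.
import pandas as pd

# Invented turn-level annotations for illustration.
annotations = pd.DataFrame({
    "speaker": ["bot_A", "bot_A", "bot_A", "bot_B", "bot_B", "human", "human"],
    "label": ["valid", "invalid", "acceptable", "valid", "valid", "valid", "invalid"],
})

# Normalised label counts per speaker give the distribution to compare.
distribution = (annotations.groupby("speaker")["label"]
                .value_counts(normalize=True)
                .unstack(fill_value=0))
print(distribution)
```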
0:05:56 | Further annotated dialogue sessions are being collected using IRIS and TickTock, and using this data we are organizing a dialogue breakdown detection task at the Dialog State Tracking Challenge 6 (DSTC6). |
---|
0:06:12 | We are also annotating additional dialogue sessions for the IRIS, TickTock, and Joker data, and there will be an appropriateness score prediction task at the next WOCHAT workshop. |
---|
0:06:27 | so the next |
---|
0:06:29 | steps |
---|
0:06:30 | so we want to continue promoting the shared task activities because we haven't got enough |
---|
0:06:34 | data and we want to improve the current chart but it can system and we |
---|
0:06:39 | want to hold the |
---|
0:06:41 | the next workshop editions hand other events so |
---|
0:06:45 | the next |
---|
0:06:45 | what should to be |
---|
0:06:47 | at next i w is the s |
---|
0:06:51 | and we have an and some it the proposed out |
---|
0:06:53 | so that we can have many dialogues and annotations during the summer okay |
---|
0:07:00 | So that's it for the shared task update. If you have any questions, please ask now; if not, we can go to the next discussion. |
---|