0:00:15 | so my name is a given to be some degree c and i'm currently a |
---|
0:00:19 | postdoctoral researcher |
---|
0:00:22 | and i'm going to present this work with a great level and phonetic nonbackchannel |
---|
0:00:29 | and |
---|
0:00:30 | first let me say a few words about the context of this work |
---|
0:00:34 | so this work is part of the european project |
---|
0:00:39 | i have a spell |
---|
0:00:40 | which aims at designing artificial retrieval of information assistants |
---|
0:00:46 | and these assistants take the form of virtual agents |
---|
0:00:51 | that are able to engage in a multimodal interaction |
---|
0:00:57 | involving verbal and nonverbal behavior |
---|
0:01:01 | these agents also aim at adapting to the user |
---|
0:01:07 | and adapting to for instance unexpected situations such as interruptions |
---|
0:01:12 | as well as to the socio-emotional state of the user |
---|
0:01:17 | and |
---|
0:01:18 | in this project we are interested in convergence and verbal alignment |
---|
0:01:25 | as shown by the communication accommodation theory |
---|
0:01:29 | convergence of behaviour is a very important feature of human-human interaction |
---|
0:01:36 | that occurs both at low level such as posture accent speech rate and at |
---|
0:01:43 | high level such as the mental emotional and cognitive level |
---|
0:01:48 | and in particular |
---|
0:01:52 | human |
---|
0:01:53 | human dialogue participants |
---|
0:01:56 | align at many linguistic levels such as the lexical syntactic and |
---|
0:02:01 | semantic ones |
---|
0:02:05 | and one consequence of successful alignment in dialogue is a certain |
---|
0:02:12 | repetitiveness |
---|
0:02:13 | and |
---|
0:02:16 | as a consequence there are there is going to be a |
---|
0:02:20 | some dialogue routines that are going to emerge between the dialogue participants |
---|
0:02:27 | under the form of lexical items for instance |
---|
0:02:31 | so on the slide you can see two examples of a dialogue which represent the |
---|
0:02:35 | same phase the introduction phase of a negotiation |
---|
0:02:40 | and in this |
---|
0:02:43 | in this examples |
---|
0:02:45 | the dialogue patterns |
---|
0:02:47 | are coloured and these patterns are the main focus of this work |
---|
0:02:52 | so on the left you can see that there are very few patterns |
---|
0:02:56 | in this case we say that the verbal alignment is very low on the contrary |
---|
0:03:01 | on the right example you can see that |
---|
0:03:03 | the participants align thanks to many dialogue routines |
---|
0:03:09 | such as nice to meet you how are you good |
---|
0:03:13 | in this case we are going to say that the verbal alignment is higher |
---|
0:03:18 | so the main focus of this work is to propose measures of verbal alignment |
---|
0:03:22 | based on this data which |
---|
0:03:27 | so why think about alignment for human-machine interaction so first |
---|
0:03:32 | we can see from human-human interaction that this is a subconscious phenomenon that naturally |
---|
0:03:38 | appears and it has been shown by previous work |
---|
0:03:42 | that speakers reuse lexical as well as syntactic structures from previous utterances |
---|
0:03:50 | on top of that |
---|
0:03:53 | verbal and temporal alignment may facilitate successful task-oriented conversations |
---|
0:03:59 | however in human-machine interaction |
---|
0:04:02 | it has been shown that linguistic alignment occurs |
---|
0:04:06 | and in particular users adopt the lexical items and syntactic structures from the system |
---|
0:04:12 | but this is only one way |
---|
0:04:15 | in most of the systems the user aligns with the system but the system is not able to |
---|
0:04:20 | align |
---|
0:04:23 | so in this work our goal is to provide a virtual agent with the ability |
---|
0:04:27 | to detect the alignment behavior of its human interlocutor |
---|
0:04:32 | and to align or not depending on the strategy with the user |
---|
0:04:37 | so the main motivation |
---|
0:04:39 | of using verbal alignment for an agent |
---|
0:04:45 | is that it provides a natural source of variation in dialogue and in particular for the |
---|
0:04:51 | natural language generation task |
---|
0:04:53 | it also makes it possible to take into account the socio-emotional behaviour of the |
---|
0:04:58 | user and works |
---|
0:05:00 | as a social glue |
---|
0:05:02 | and |
---|
0:05:03 | it's also a way of adapting without the need of an extensive user profile |
---|
0:05:10 | and what we expect from |
---|
0:05:13 | providing an agent with the ability to verbally align is to enhance |
---|
0:05:19 | the agent's believability likability and friendliness to improve |
---|
0:05:24 | interaction naturalness as well as to maintain and foster user engagement |
---|
0:05:30 | finally we aim at improving collaboration in task-oriented dialogue |
---|
0:05:36 | so |
---|
0:05:37 | in this work our approach is to provide measures characterizing verbal alignment |
---|
0:05:45 | that are going to be based on the transcript of the dialogue and on the shared |
---|
0:05:49 | expressions at the lexical level |
---|
0:05:52 | and our proposition stands on |
---|
0:05:55 | three main parts |
---|
0:05:57 | the first one is to extract |
---|
0:06:00 | the dialogue routines that is to say the shared expressions from the dialogue transcripts |
---|
0:06:05 | the second part is to build an expression lexicon from these shared expressions |
---|
0:06:11 | that keeps track of the expressions and some features of these expressions |
---|
0:06:17 | and then deriving measures of verbal alignment from the dialogue transcript and the |
---|
0:06:23 | expression lexicon |
---|
0:06:25 | let me say a few words about the automatic building of the expression lexicon |
---|
0:06:29 | so in this work we provide a model where we define |
---|
0:06:33 | a shared expression as a surface text |
---|
0:06:37 | pattern at the utterance level that has been produced by both speakers in dialogue |
---|
0:06:42 | so for instance you can see |
---|
0:06:45 | an example of dialogue |
---|
0:06:47 | on the left of the slide and in the middle |
---|
0:06:50 | there is a shared expression that's not gonna work for me |
---|
0:06:55 | which is used to reject a proposition in a negotiation dialogue and that is used |
---|
0:06:59 | by interlocutor a |
---|
0:07:02 | in that first turn and by interlocutor b in the first |
---|
0:07:08 | turn |
---|
0:07:09 | so this shared expression is part of the expression lexicon |
---|
0:07:14 | and has been initiated by a |
---|
0:07:18 | and so in this paper we present a framework of expressions that maybe and but |
---|
0:07:24 | the or not |
---|
0:07:26 | and we also provide |
---|
0:07:28 | a way of automatically extracting these shared expressions to build the expression lexicon |
---|
0:07:34 | automatically |
---|
0:07:36 | so this is an instance of sequential pattern mining |
---|
0:07:42 | and it involves the use of bioinformatics algorithms that are usually used to |
---|
0:07:49 | mine dna sequences |
---|
0:07:52 | so in short |
---|
0:07:54 | it involves the resolving of the multiple common subsequence problem through the |
---|
0:08:00 | generalized suffix tree data structure |
---|
0:08:03 | and through this |
---|
0:08:05 | process of sequential pattern mining we can build from the transcript of the dialogue the |
---|
0:08:10 | dialogue lexicon |
---|
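The lexicon-building step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method from the talk: it enumerates contiguous token sequences by brute force instead of using a generalized suffix tree, it keeps every shared sub-sequence, and the names `build_expression_lexicon` and `min_len` are hypothetical.

```python
def ngrams(tokens, min_len=1):
    """Yield every contiguous token sequence of at least min_len tokens."""
    n = len(tokens)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            yield tuple(tokens[i:j])


def build_expression_lexicon(dialogue, min_len=1):
    """Build a lexicon of shared expressions (hypothetical helper).

    dialogue: list of (speaker, utterance) pairs in turn order.
    Returns {expression: {"initiator": speaker, "freq": utterance count}}
    restricted to sequences produced by at least two different speakers.
    """
    seen = {}
    for speaker, utterance in dialogue:
        tokens = utterance.lower().split()
        # set() so that an expression counts once per utterance
        for expr in set(ngrams(tokens, min_len)):
            entry = seen.setdefault(
                expr, {"initiator": speaker, "speakers": set(), "freq": 0}
            )
            entry["speakers"].add(speaker)
            entry["freq"] += 1
    return {
        expr: {"initiator": e["initiator"], "freq": e["freq"]}
        for expr, e in seen.items()
        if len(e["speakers"]) >= 2
    }


example = [
    ("A", "that's not gonna work for me"),
    ("B", "ok what about two apples"),
    ("A", "no"),
    ("B", "that's not gonna work for me either"),
]
lexicon = build_expression_lexicon(example, min_len=2)
print(lexicon[("that's", "not", "gonna", "work", "for", "me")]["initiator"])
```

The brute-force enumeration is quadratic in utterance length, so it is only meant to make the definition concrete; the linear-time behaviour mentioned in the talk comes from the suffix tree, which this sketch does not implement.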
0:08:13 | then from the dialogue transcript and the expression lexicon we derive some aspects for one |
---|
0:08:19 | measures |
---|
0:08:21 | to characterize verbal alignment |
---|
0:08:23 | so the first measures are global on a single dialogue |
---|
0:08:29 | namely the expression lexicon size that is the number of unique |
---|
0:08:33 | shared expressions that are established between dialogue participants |
---|
0:08:37 | and the expression variety which is the expression lexicon size normalized by |
---|
0:08:43 | the length of the dialogue given as the total |
---|
0:08:47 | number of tokens in the dialogue |
---|
0:08:50 | we also derive |
---|
0:08:53 | measures that are specific to the speakers |
---|
0:08:57 | first the expression repetition measure |
---|
0:09:01 | which |
---|
0:09:04 | measure which gives the amount of token that is dedicated |
---|
0:09:09 | to the repetition of an expression by the user |
---|
0:09:13 | over the total amount of token |
---|
0:09:15 | and the initiated expression ratio which determines for a given speaker the number |
---|
0:09:23 | of expressions that have been initiated by him |
---|
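The four measures just described can be sketched in Python as below. This is one possible reading of the spoken definitions, not the authors' implementation. It assumes that the repetition ratio is normalized by the speaker's own token count, that a token counts as repetition when it belongs to an expression already produced in an earlier utterance, and that the initiated expression ratio is normalized by the lexicon size. The function name and the lexicon format (expression tuple mapped to its initiator) are hypothetical.

```python
def alignment_measures(dialogue, lexicon):
    """Compute the four measures on one dialogue (hypothetical helper).

    dialogue: list of (speaker, utterance) pairs in turn order.
    lexicon: {expression (tuple of tokens): initiating speaker}.
    """
    total = {}     # speaker -> number of tokens produced
    repeated = {}  # speaker -> tokens spent re-using an expression
    already_used = set()  # expressions produced in an earlier utterance

    for speaker, utterance in dialogue:
        tokens = utterance.lower().split()
        covered = [False] * len(tokens)
        used_now = set()
        for expr in lexicon:
            k = len(expr)
            for i in range(len(tokens) - k + 1):
                if tuple(tokens[i:i + k]) == expr:
                    used_now.add(expr)
                    if expr in already_used:  # a re-use, not the first production
                        covered[i:i + k] = [True] * k
        already_used |= used_now
        total[speaker] = total.get(speaker, 0) + len(tokens)
        repeated[speaker] = repeated.get(speaker, 0) + sum(covered)

    size = len(lexicon)
    n_tokens = sum(total.values())
    return {
        "lexicon_size": size,
        "expression_variety": size / n_tokens if n_tokens else 0.0,
        "repetition_ratio": {
            s: (repeated[s] / total[s] if total[s] else 0.0) for s in total
        },
        "initiated_ratio": {
            s: (sum(1 for init in lexicon.values() if init == s) / size
                if size else 0.0)
            for s in total
        },
    }
```

Under these assumptions a speaker who only re-uses the other speaker's expressions gets a high repetition ratio and a low initiated expression ratio, which is the kind of orientation the speaker-specific measures are meant to expose.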
0:09:31 | so to study the proposed framework we present in this paper a corpus-based contrastive |
---|
0:09:39 | study |
---|
0:09:40 | that stands on real interaction corpora involving human-human and human |
---|
0:09:46 | agent dialogues |
---|
0:09:47 | as well as artificial corpora which |
---|
0:09:52 | are used as a baseline |
---|
0:09:54 | and in this work we provide several studies comparing |
---|
0:09:59 | the real interaction corpora to our baseline |
---|
0:10:02 | comparing verbal alignment in the human-human corpus and the human-agent corpus and also studying |
---|
0:10:07 | some conditions on the human-agent corpus such as the negotiation type |
---|
0:10:13 | so let me say a few words about |
---|
0:10:15 | the negotiation corpora that we are using in this work |
---|
0:10:21 | so these negotiation corpora |
---|
0:10:26 | involve two participants that are required to find an agreement |
---|
0:10:32 | over the of the amount of |
---|
0:10:36 | okay they are they have to share |
---|
0:10:39 | and this negotiation task can be either integrative that is to say that it can |
---|
0:10:45 | turn out to be a win-win for both participants |
---|
0:10:48 | or competitive |
---|
0:10:51 | and |
---|
0:10:54 | these corpora are available in the human-human setting and in the human-agent setting |
---|
0:11:01 | you can see on the slide an image from the human-agent corpus |
---|
0:11:08 | in the human-agent setting |
---|
0:11:11 | the agent is controlled by a wizard of oz system |
---|
0:11:15 | that has been designed to be as natural as possible |
---|
0:11:19 | and this wizard of oz system involves more than eleven thousand possible utterances so |
---|
0:11:26 | the agent has a wide variety of utterances to express itself |
---|
0:11:35 | the human-human corpus involves eighty four dialogues while the human-agent corpus |
---|
0:11:41 | involves one hundred and fifty four dialogues |
---|
0:11:46 | from these corpora we constructed our baseline the surrogate |
---|
0:11:52 | corpora |
---|
0:11:53 | which have been designed to break the dynamic of the interactive alignment process |
---|
0:12:00 | and to do that we decided to break the coupling between utterances |
---|
0:12:04 | so starting from a real interaction dialogue |
---|
0:12:08 | what we have done is that we have kept |
---|
0:12:12 | all the utterances from a speaker |
---|
0:12:15 | while substituting all the utterances from the other speaker |
---|
0:12:20 | by utterances which were chosen from one pool |
---|
0:12:26 | from sorry from several pools |
---|
0:12:30 | the utterances are chosen randomly |
---|
0:12:32 | and the pools are specific |
---|
0:12:37 | for the human participant |
---|
0:12:39 | the human participant facing an agent and for the agent |
---|
0:12:44 | system |
---|
0:12:45 | so on the slide you can see an example of real dialogue from the corpus |
---|
0:12:50 | on the left |
---|
0:12:52 | and one randomized version where all the utterances from the human participant have been |
---|
0:13:00 | substituted by randomly chosen ones |
---|
0:13:03 | so the main idea of these corpora is to break the dynamic of the interactive alignment |
---|
0:13:10 | process |
---|
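The construction of a randomized baseline dialogue described above can be sketched in Python as follows. This is a hedged illustration only: the function name, the `seed` parameter and the flat list used as a pool are assumptions, and the talk does not say whether utterances are drawn with or without replacement (this sketch draws with replacement).

```python
import random


def randomize_dialogue(dialogue, replaced_speaker, pool, seed=None):
    """Return a randomized copy of a dialogue (hypothetical helper).

    Utterances of every other speaker are kept as they are; each utterance
    of `replaced_speaker` is substituted by one drawn at random from `pool`,
    a list of utterances specific to that kind of speaker.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    return [
        (speaker, rng.choice(pool) if speaker == replaced_speaker else utterance)
        for speaker, utterance in dialogue
    ]


real = [
    ("agent", "hello nice to meet you"),
    ("human", "nice to meet you too"),
    ("agent", "what would you like"),
    ("human", "i would like the lamps"),
]
human_pool = ["that works for me", "no deal", "how about the records"]
baseline = randomize_dialogue(real, "human", human_pool, seed=0)
```

Because the substituted utterances were not produced in response to the kept ones, any expression shared between the two speakers in such a dialogue is incidental, which is what makes it usable as a baseline for the measures.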
0:13:14 | so the first of the hypotheses that we are investigating in this |
---|
0:13:20 | work |
---|
0:13:21 | is that the dialogue participants should constitute a richer expression lexicon |
---|
0:13:27 | in the real interaction corpora than what would happen incidentally in the surrogate corpora |
---|
0:13:35 | in the artificial ones |
---|
0:13:37 | and so to investigate this hypothesis we looked at the expression variety |
---|
0:13:44 | measure from our model |
---|
0:13:47 | and |
---|
0:13:48 | what we found |
---|
0:13:50 | is that there is a significant difference between the human-human |
---|
0:13:56 | corpus and its artificial counterpart as well as for the human |
---|
0:14:00 | agent corpus and its |
---|
0:14:03 | artificial counterpart |
---|
0:14:06 | in the sense that the expression variety is higher in the real interaction |
---|
0:14:11 | corpora than in the surrogate ones |
---|
0:14:15 | so what we have observed is that we have provided |
---|
0:14:21 | some arguments to confirm this hypothesis in the sense that |
---|
0:14:27 | we have observed a richer expression lexicon in the real interaction corpora than |
---|
0:14:32 | in the artificial ones |
---|
0:14:34 | which have been designed to avoid |
---|
0:14:39 | the interactive alignment process and thus the constitution of an expression lexicon |
---|
0:14:47 | then we have been interested in the comparison of verbal alignment through the |
---|
0:14:54 | measures that we propose |
---|
0:14:57 | between the human-human corpus and the human-agent corpus |
---|
0:15:03 | so here what we expected is more verbal alignment from the human |
---|
0:15:10 | in the human-agent interaction |
---|
0:15:15 | than from the agent the main reason is that |
---|
0:15:18 | the agent even if it's a wizard of oz has not been designed |
---|
0:15:23 | to be able to align |
---|
0:15:24 | and the second reason is that |
---|
0:15:27 | the human participant may be influenced by the belief about the limitation of the communicative |
---|
0:15:32 | capabilities of the agent |
---|
0:15:35 | so to test this hypothesis we looked at the initiated expression ratio |
---|
0:15:42 | that we propose in our model as well as the expression repetition ratio |
---|
0:15:49 | and in the human-human interaction |
---|
0:15:53 | it turns out that there are no differences between the two speakers in that |
---|
0:15:57 | there is a symmetrical verbal alignment |
---|
0:16:01 | regarding these two measures |
---|
0:16:04 | both dialogue participants initiate |
---|
0:16:06 | approximately the same amount of expression |
---|
0:16:10 | and they repeat also the same amount of |
---|
0:16:14 | of expression |
---|
0:16:16 | however |
---|
0:16:17 | this is not the case in the human-agent setting |
---|
0:16:21 | and we observe here |
---|
0:16:25 | an asymmetry |
---|
0:16:28 | so |
---|
0:16:29 | this asymmetry |
---|
0:16:32 | is |
---|
0:16:35 | this asymmetry |
---|
0:16:38 | can be summarized by the fact that |
---|
0:16:42 | the human participants adopt more wizard of oz initiated expressions |
---|
0:16:48 | which is not surprising because the wizard of oz cannot |
---|
0:16:51 | easily adopt the human participant's expressions the human participants also dedicate |
---|
0:16:57 | more tokens to the repetition of expressions |
---|
0:17:01 | so a here |
---|
0:17:03 | this give some |
---|
0:17:07 | arguments to say that the human participant |
---|
0:17:10 | is influenced by its belief about the limitations |
---|
0:17:14 | of the communicative capabilities of the agents |
---|
0:17:17 | and it should be stressed that this asymmetry does not appear |
---|
0:17:23 | when considering the number of tokens produced by each speaker or when considering the |
---|
0:17:27 | change proportion |
---|
0:17:29 | is the proportion of vocabulary |
---|
0:17:35 | finally we looked at some conditions on the human-agent corpus and |
---|
0:17:42 | we have mainly focused on the negotiation type |
---|
0:17:47 | we wanted to see if there was an impact |
---|
0:17:50 | on the verbal alignment indicators |
---|
0:17:53 | given the type of negotiation so integrative negotiation which turns out to be a |
---|
0:17:59 | win-win and distributive |
---|
0:18:03 | negotiation |
---|
0:18:04 | which is a competitive one |
---|
0:18:06 | and what we found is that |
---|
0:18:08 | both negotiation types have a similar value for |
---|
0:18:16 | the expression variety |
---|
0:18:18 | that is to say that around |
---|
0:18:20 | the same amount of expressions |
---|
0:18:22 | are created in both dialogues but there is a clear difference in the |
---|
0:18:28 | expression repetition ratio |
---|
0:18:30 | which shows that |
---|
0:18:32 | in the competitive negotiation |
---|
0:18:36 | dialogue participants |
---|
0:18:39 | repeat |
---|
0:18:40 | more and they verbally align more |
---|
0:18:42 | than in win-win negotiation |
---|
0:18:48 | so |
---|
0:18:51 | what we provide here is arguments about the fact that |
---|
0:18:59 | competitive negotiation |
---|
0:19:01 | leads to more verbal alignment and one hypothesis is that |
---|
0:19:07 | the participants need to verbally align more on counter propositions |
---|
0:19:14 | so to conclude in this work we have proposed automatic and generic measures |
---|
0:19:20 | of verbal alignment based on sequential pattern mining at the level of surface |
---|
0:19:25 | text utterances |
---|
0:19:26 | that make it possible to characterize |
---|
0:19:30 | interesting aspects of verbal alignment such as the routinization process |
---|
0:19:35 | the degree of repetition between dialogue participants and the orientation of the verbal |
---|
0:19:39 | alignment |
---|
0:19:41 | we have contrasted human-human and human-agent |
---|
0:19:47 | verbal alignment showing that there is a symmetry in verbal alignment |
---|
0:19:53 | given our indicators |
---|
0:19:57 | in human-human interaction while there is an asymmetry in human-agent interaction |
---|
0:20:02 | and thus we wanted to empirically confirm some hypotheses from the |
---|
0:20:08 | literature |
---|
0:20:10 | and the perspective that we want to explore is to use the measures that |
---|
0:20:16 | we propose in a dialogue system and it should be stressed that the measures are based on |
---|
0:20:21 | very efficient algorithms that is to say |
---|
0:20:26 | linear complexity algorithms |
---|
0:20:31 | we would like also to investigate this |
---|
0:20:36 | more thoroughly and to do a qualitative analysis of verbal alignment between |
---|
0:20:40 | human-human interaction and human-agent interaction |
---|
0:20:43 | such as a functional analysis of the repetitions |
---|
0:20:47 | and finally we would like to investigate |
---|
0:20:51 | other comparable human-human and human-agent corpora |
---|
0:20:55 | to confirm our results |
---|
0:20:57 | thank you for your attention and i'm now ready to answer your questions |
---|
0:21:14 | thanks for the talk i was i was wondering several things about adopt actually one |
---|
0:21:20 | of them is i i'm not quite sure you said something that on |
---|
0:21:26 | way of the machine adapting to |
---|
0:21:30 | to the user there's nothing out there |
---|
0:21:34 | you have any idea why is nothing out there because when i looked into |
---|
0:21:40 | it was like slot filling kind of dialogue and that was difficult because you don't |
---|
0:21:44 | have a lot of data about user |
---|
0:21:46 | to make the system adapt to but in this kind of data it might be |
---|
0:21:50 | different and also the second question is whether |
---|
0:21:52 | the measures that you come up with would work for turn level |
---|
0:21:57 | so if you have the decision to change some lexical expression |
---|
0:22:00 | would those work to make changes at the turn level rather than |
---|
0:22:04 | several turn |
---|
0:22:06 | but like rather than taking into consideration example for |
---|
0:22:11 | so for the first question there are systems that are able to align |
---|
0:22:17 | there is some interesting work and it is pointed out in our paper |
---|
0:22:22 | the main disadvantage is that most of the systems are rule based |
---|
0:22:27 | and specific to some domain |
---|
0:22:30 | or to some tasks |
---|
0:22:32 | and the idea in providing measures is to go towards a more data driven way |
---|
0:22:38 | an automatic way of aligning |
---|
0:22:42 | but there are some system that i module |
---|
0:22:45 | and the second question |
---|
0:22:49 | so if i understand you well it is that if we change the granularity of where |
---|
0:22:55 | where we look for expressions |
---|
0:22:59 | so it would not be a problem i |
---|
0:23:03 | don't see any problem in using our measures it would be just changing the |
---|
0:23:07 | granularity |
---|
0:23:10 | of the units |
---|
0:23:13 | which we |
---|
0:23:14 | do you think you would get this you would keep the same accuracy |
---|
0:23:20 | i don't know we have one check because here we go to variable for a |
---|
0:23:26 | couple always very when the limited you challenge is |
---|
0:23:33 | if we look at |
---|
0:23:36 | i'm not sure to understand very well your point in fact |
---|
0:23:39 | we can talk a yes |
---|
0:23:49 | hello i am here but talk about how you are looking on the degree of |
---|
0:23:57 | repetition and what i didn't you are looking as repeated |
---|
0:24:01 | i think perhaps not counting |
---|
0:24:04 | probably so you get things like |
---|
0:24:08 | i'm interested in shares or whatever was and in the next one you're getting a |
---|
0:24:12 | time |
---|
0:24:14 | in content items as being the repetition |
---|
0:24:18 | in terms of being you know sort of |
---|
0:24:22 | alignment which i think in this case where the participants don't really have so much |
---|
0:24:27 | like what they say that phone first-person pronoun there is only one |
---|
0:24:34 | and |
---|
0:24:35 | you have similar ones for me i think that if you were doing alignments |
---|
0:24:40 | on the on that might also be the same sort of a problem |
---|
0:24:45 | i think that it's just one of the difficulties when |
---|
0:24:51 | we work with alignment is that it can be very |
---|
0:24:56 | you can very specific |
---|
0:24:59 | words such as the difference between what time i used adding all at what time |
---|
0:25:05 | and it is going to be very important in that case |
---|
0:25:10 | and in this work we have chosen to |
---|
0:25:15 | select all the expression |
---|
0:25:17 | and to count everything even though we are probably counting some |
---|
0:25:25 | expressions that are random and that are still going to happen even without verbal |
---|
0:25:31 | alignment |
---|
0:25:32 | but what we show by comparing to the surrogate corpora |
---|
0:25:37 | i think is that |
---|
0:25:41 | when people align |
---|
0:25:44 | they will create more expressions |
---|
0:25:49 | so |
---|
0:25:52 | i just if you were telling |
---|
0:25:57 | information for |
---|
0:25:59 | right |
---|
0:26:01 | i think you would want to understand some of these things are alignment in some |
---|
0:26:05 | ways |
---|
0:26:07 | so that you would be producing delays |
---|
0:26:09 | thinking |
---|
0:26:11 | and regarding that |
---|
0:26:13 | since |
---|
0:26:14 | the expression lexicon keeps track of expressions and their features such as the frequency |
---|
0:26:20 | such as the recency of an expression we can use |
---|
0:26:24 | these features to filter out |
---|
0:26:28 | uninteresting expressions |
---|
0:26:30 | but can you because i could be extremely frequent |
---|
0:26:34 | and it could be very recent as well |
---|
0:26:37 | mm |
---|
0:26:43 | i can just two |
---|
0:26:45 | to copy this behaviour |
---|
0:26:50 | we can choose to stop my sentences by the same expression that we use for |
---|
0:26:54 | instance i want to align |
---|
0:26:59 | thank you very much let's thank the speaker again |
---|