0:00:14 | Okay, so to start, I'll take a minute to tell you about our research and the background of what we're doing. |
---|
0:00:24 | So, we have an android robot named Erica, who was developed by Hiroshi Ishiguro's lab. |
---|
0:00:30 | Part of this project is for Erica to be able to perform simple roles and tasks as well as a human, so she is very realistic looking. |
---|
0:00:40 | We demonstrated Erica at conferences last year. |
---|
0:00:46 | In this work I'm going to describe Erica's dialogue model, and her role as an attentive listener. |
---|
0:00:53 | So, attentive listening: we had a bit of an example of this in the keynote this morning. |
---|
0:01:01 | Attentive listening is where Erica tries to listen to the user talk, and shows restrained interjections in the dialogue, so we want to stimulate more conversation, but driven primarily by the user. |
---|
0:01:16 | In this scenario we don't need to understand the conversation, so we don't try to do any complex natural language processing. |
---|
0:01:24 | The intended target is elderly people, people who may be suffering from social isolation, |
---|
0:01:31 | and a robot that can keep up this kind of attentive, human-like listening could help alleviate this as well. |
---|
0:01:40 | So here is an example. [A video of an interaction with Erica is played; the speech is in Japanese.] |
---|
0:02:05 | The point is that when the user says something, Erica responds to it in kind. |
---|
0:02:09 | So we want to give users the sense that she is not just doing a heuristic lookup, |
---|
0:02:15 | and have them feel like she is actually understanding what the user is saying. |
---|
0:02:21 | So this is not a typical dialogue system, in that it is heavily weighted toward continued listening. |
---|
0:02:28 | The essence of attentive listening is that Erica only speaks when she has something appropriate to say; the emphasis in each phase is on listening to the user. |
---|
0:02:38 | So the main novelty of our system is that we use a statement response system: we actually use the content of what the user says and generate some response to it. |
---|
0:02:49 | We also want to have this in an open domain, so we don't restrict the domain; the system should be able to respond to whatever the user says. |
---|
0:02:59 | The dialogue model we use is quite minimalistic: we don't use any very tricky models or training methods. |
---|
0:03:09 | What we want to do is generate simple and coherent responses, so I'll describe how we do that. |
---|
0:03:15 | Just to talk about Erica's environment: we have a Kinect sensor which tracks the user, so we know who is talking and where they are located. |
---|
0:03:26 | Rather than hand-held microphones we use a microphone array, because we want the user to be able to talk to Erica as they would to a human in a conversation. |
---|
0:03:36 | So the automatic speech recognition is done entirely through the microphone array, and the user is free to use their hands and so on. |
---|
0:03:46 | This is the general architecture of the system. |
---|
0:03:50 | We have speech processing and natural language processing, and one of the focuses is turn taking. |
---|
0:03:57 | The main thing is the response model, where we have two components: the statement response system, which produces responses to the user's utterances, |
---|
0:04:07 | and the backchanneling system, which produces backchannels. |
---|
0:04:11 | Included in this is also the turn taking that I'll describe in a later section, although we haven't actually implemented it yet; this is just the conceptual idea of what we want the system to do. |
---|
0:04:22 | If you watch the video I'll show, you'll see why we don't yet have a complete turn-taking model. |
---|
0:04:28 | The backchannel and statement response models actually run in parallel, so we can use them independently. |
---|
0:04:34 | So there are three features of the system, and the first is backchanneling. |
---|
0:04:38 | There are two types of backchannel prediction that we consider. The first is IPU-based prediction: when we receive an IPU (inter-pausal unit) from the ASR system, we make a decision on whether this is a good place to generate a backchannel. |
---|
0:04:56 | The other is a time-based system, where we continuously predict whether a backchannel is required, without waiting for an IPU. |
---|
0:05:03 | So we trained models for both of these types of backchanneling systems. |
---|
0:05:08 | For this we use a counseling corpus. In the counseling corpus we have many examples of this kind of attentive listening, |
---|
0:05:15 | where the counselor basically just listens passively to the other speaker and says something like "okay". |
---|
0:05:28 | This corpus is in Japanese, but the same basic idea applies. |
---|
0:05:33 | We want to predict both the backchannel timing and the form, so we consider just the Japanese backchannel forms that are the most common in the corpus. |
---|
0:05:46 | The features that we use are prosodic features, on which we calculate statistics, |
---|
0:05:53 | and lexical features, represented by word vectors. |
---|
0:05:59 | The IPU-based model uses all of the prosodic and lexical features within the IPU, |
---|
0:06:05 | whereas the time-based model just takes continuous sliding windows over the past, and we train both using a simple logistic regression model. |
---|
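To make this concrete, here is a minimal sketch of the two predictors in Python, assuming scikit-learn is available. The exact feature set, window length, step size, and toy training data are illustrative stand-ins, not the actual setup from the talk.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def prosodic_stats(f0, power):
    """Summary statistics over pitch (f0) and power tracks for one unit or window."""
    return [np.mean(f0), np.std(f0), float(f0[-1] - f0[0]),
            np.mean(power), np.max(power)]

def sliding_windows(f0, power, win=50, step=10):
    """Time-based variant: feature vectors over sliding windows of past frames."""
    for end in range(win, len(f0) + 1, step):
        yield prosodic_stats(f0[end - win:end], power[end - win:end])

# Toy training data, just to show the shape of the approach:
# X holds one feature vector per IPU (or per window), and y marks whether
# a backchannel followed it in the corpus.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = (X[:, 0] > 0).astype(int)

model = LogisticRegression().fit(X, y)  # the "simple logistic regression model"
p = model.predict_proba(X[:1])[0, 1]    # probability that a backchannel is due
```

The IPU-based model would compute one such feature vector per complete IPU, while the time-based model scores every sliding window as the audio comes in.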
0:06:12 | For the subjective experiment we selected ten different recordings, snippets, from the counseling corpus. |
---|
0:06:28 | We removed the original backchannels, and then generated backchannels using our system and synthesized them with text-to-speech. |
---|
0:06:36 | So we had three conditions: the IPU-based model, the time-based model, and we compared these with a ground truth condition. |
---|
0:06:45 | The ground truth condition replaces the counselor's voice with the synthesized voice, so |
---|
0:06:51 | there wouldn't be any effect from one condition having a human-like voice. |
---|
0:06:54 | But of course, when you replace these you lose the specific prosodic properties of the backchannel, |
---|
0:07:04 | so in this case it's not an exact ground truth; it's a kind of synthesized ground truth. |
---|
0:07:09 | The timing is right and the form is right, but the actual prosody of the utterance is different. |
---|
0:07:17 | Forty subjects listened to the snippets, each in a random condition, |
---|
0:07:23 | and they evaluated each of the snippets of recordings with the backchannels on Likert scales. |
---|
0:07:33 | So I'll give an example of the time-based prediction model: we applied the model to this particular recording. |
---|
0:07:40 | [A recording is played with the generated backchannels inserted; the speech is in Japanese.] |
---|
0:08:08 | Okay, so you can hear that Erica produces backchannels. I should also mention that |
---|
0:08:13 | for the time-based model we actually got quite poor results for predicting the form of the backchannel, |
---|
0:08:18 | so instead of using that prediction we just chose a random backchannel form, |
---|
0:08:25 | whereas the IPU-based model used its form prediction, which was better. |
---|
0:08:31 | So, to the results of the experiment. |
---|
0:08:33 | We found that the time-based model actually performs better than the IPU-based model. |
---|
0:08:38 | This is quite intuitive: we know that the IPU-based model needs some processing time to produce an IPU before it can produce a backchannel, |
---|
0:08:47 | and so it may delay the timing of a backchannel, which is quite important. |
---|
0:08:54 | The people who evaluated the system sensed this as well. |
---|
0:08:59 | So the conclusion from this was that the correct timing of the backchannel matters more than its form. |
---|
0:09:05 | Even though we used random forms of backchannels in the time-based model, its timing was better, so it was rated better. |
---|
0:09:12 | So we now use this time-based backchannel system for Erica. |
---|
0:09:18 | So next is the statement response system. |
---|
0:09:21 | The statement response system basically tries to generate a response based on the focus word that we extract from the speech of the user. |
---|
0:09:29 | The thing is, we don't want a handcrafted model for Erica based on keywords, because we are considering open-domain conversation. |
---|
0:09:39 | We can't get away with practically making handcrafted rules for all those keywords. |
---|
0:09:45 | So instead of doing that, we extract a focus word from what the user says, and then we can find an appropriate response. We have four types of responses. |
---|
0:09:56 | Depending on whether we can find a focus word or not, and whether we can find a question word which matches it, we choose between these four: |
---|
0:10:05 | a question on the focus word, a partial repeat of it with a rising tone, a question on the predicate, |
---|
0:10:12 | and, in the case that none of these conditions are met, we just play a formulaic expression. |
---|
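As a rough illustration of this four-way decision, here is a hypothetical sketch in Python. The extraction of the focus word, predicate, and matching question word is assumed to have happened upstream, and the English templates are stand-ins for the Japanese ones.

```python
def choose_response(focus, predicate, question_word):
    """Pick one of the four response types from what could be extracted."""
    if focus and question_word:
        return f"{question_word} {focus}?"   # question on the focus word
    if focus:
        return f"{focus}?"                   # partial repeat with rising tone
    if predicate and question_word:
        return f"{question_word} did you {predicate}?"  # question on the predicate
    return "I see."                          # formulaic fallback expression

# The kinds of examples from the talk, rendered in English for readability:
print(choose_response("curry", "ate", "what kind of"))  # question on the focus
print(choose_response("America", "went", None))         # partial repeat
print(choose_response(None, None, None))                # formulaic expression
```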
0:10:20 | To extract the focus phrase and the predicate, we use a conditional random field; this was done in previous work. |
---|
0:10:28 | Then, from the focus word, we use a word embedding model to match an appropriate question word. I'll give some examples of the types of response we can get. |
---|
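The embedding-based matching might be sketched as below, with tiny hand-made vectors standing in for real word embeddings such as word2vec. The vocabulary, vector values, and similarity threshold are all purely illustrative.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embedding table: food-like words cluster together, place-like words too.
emb = {
    "curry": np.array([1.0, 0.1]),
    "food":  np.array([0.9, 0.2]),
    "place": np.array([0.1, 1.0]),
}
# Each candidate question word is anchored to a concept vector.
question_words = {"what kind of": emb["food"], "where in": emb["place"]}

def match_question_word(focus, threshold=0.8):
    """Return the best-matching question word, or None if nothing is close enough."""
    best = max(question_words,
               key=lambda q: cosine(emb[focus], question_words[q]))
    return best if cosine(emb[focus], question_words[best]) >= threshold else None

print(match_question_word("curry"))  # "what kind of"
```

Returning None here is what would trigger the partial-repeat or formulaic fallback described above.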
0:10:36 | For the question on the focus, we identify the focus and use it to generate a question which targets it. |
---|
0:10:45 | So for example, if the user says "I ate curry yesterday", |
---|
0:10:52 | the focus word is "curry", and the system can take that and match |
---|
0:10:56 | a question word, so the response would be "What kind of curry?", and she extends the conversation this way. |
---|
0:11:03 | For the partial repeat with rising tone: for example, "Last year I went to America". |
---|
0:11:07 | The focus word extracted is "America", but we can't match it with a question word, |
---|
0:11:11 | so we just say something like "Oh, America?". |
---|
0:11:14 | For the question on the predicate: say the user says "I went out with a friend". |
---|
0:11:18 | We have a predicate but no focus word we can use, |
---|
0:11:23 | so Erica asks "Where did you go?". |
---|
0:11:26 | And lastly, if we can find no focus word or predicate to use, |
---|
0:11:32 | the system will just say a formulaic expression like "I see" or "Okay". So for example, if the user says "It was beautiful", |
---|
0:11:41 | the system will just say "Okay". |
---|
0:11:44 | To evaluate this, we actually used data from a previous experiment with Erica. |
---|
0:11:49 | In that previous experiment there was no statement response system, |
---|
0:11:54 | so Erica could not give direct responses to statements and fell back |
---|
0:12:00 | to formulaic polite expressions. |
---|
0:12:01 | So what we did was take this data and apply the statement response system to these unanswerable |
---|
0:12:07 | statements, |
---|
0:12:09 | and we checked whether a valid response could |
---|
0:12:13 | be generated for them. |
---|
0:12:15 | In the end we found that nearly fifty percent of these previously unanswerable statements could be responded to with the new system, so we believe these statements would have moved beyond formulaic |
---|
0:12:27 | expressions. |
---|
0:12:29 | These were judged by annotators to be |
---|
0:12:31 | coherent responses; that is, |
---|
0:12:34 | responses that could plausibly be stated |
---|
0:12:36 | in an |
---|
0:12:37 | actual conversation. |
---|
0:12:40 | Okay, so lastly I'll talk about turn taking. This is going to be quite brief, because we haven't actually implemented it yet; it's in progress. |
---|
0:12:51 | The concept of our turn-taking system is that, rather than trying to predict a binary decision, take the turn or not take the turn, we use a continuous decision. |
---|
0:13:03 | Because we know the probability, we can set |
---|
0:13:07 | thresholds for different actions, |
---|
0:13:09 | and we can scale Erica's response accordingly. |
---|
0:13:12 | If you look at this very simple diagram, it |
---|
0:13:18 | goes from not taking the turn |
---|
0:13:19 | and generating a backchannel, which indicates not taking the turn, |
---|
0:13:23 | to generating a filler, which indicates that we might |
---|
0:13:27 | take the turn, |
---|
0:13:29 | and finally to actually taking the turn and making an utterance. |
---|
0:13:34 | So backchannels |
---|
0:13:36 | indicate no intention of taking the turn, and fillers indicate that turn taking is likely. |
---|
0:13:42 | The benefit of this is that these are not fully committed actions, so we don't actually |
---|
0:13:45 | take the turn at that time; |
---|
0:13:47 | we say something |
---|
0:13:49 | in preparation for taking it, |
---|
0:13:50 | but the user can still react to this. So for example, if Erica produces a |
---|
0:13:54 | filler, |
---|
0:13:55 | maybe the user hasn't finished speaking, so they continue talking, and it doesn't |
---|
0:14:00 | stop the conversation |
---|
0:14:02 | the way a full response would. |
---|
0:14:05 | So we had this concept, but we wanted to know how we should actually |
---|
0:14:09 | set the thresholds; this was just to extract the real |
---|
0:14:12 | values we want to use. |
---|
0:14:15 | So we trained a turn-taking model based on logistic regression. |
---|
0:14:21 | We used prosodic and lexical features, and we analyzed the likelihood scores and the |
---|
0:14:25 | frequency of decisions, |
---|
0:14:27 | so that we could find suitable values for the two thresholds, t1 and t2. |
---|
0:14:31 | We found that below a threshold of about |
---|
0:14:35 | 0.45, |
---|
0:14:37 | Erica should stay completely silent; |
---|
0:14:39 | she just does not take the turn. |
---|
0:14:42 | Above a threshold of 0.95, we say okay, we are confident, the turn should |
---|
0:14:48 | be taken. |
---|
0:14:50 | But in the middle, because we are not quite sure, we generate a |
---|
0:14:54 | filler or backchannel to try |
---|
0:14:58 | to make the user yield the turn, or to signal "okay, no, you can |
---|
0:15:02 | continue". |
---|
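As a sketch, this graded scheme is just a pair of threshold tests on the model's turn-taking probability, using the 0.45 and 0.95 values mentioned above; the action names are illustrative.

```python
def turn_action(p_take_turn, t1=0.45, t2=0.95):
    """Map the turn-taking probability to a graded level of commitment."""
    if p_take_turn < t1:
        return "stay silent"         # confident the user keeps the turn
    if p_take_turn < t2:
        return "filler/backchannel"  # unsure: produce an uncommitted action
    return "take turn"               # confident enough to respond fully

print(turn_action(0.30))  # stay silent
print(turn_action(0.70))  # filler/backchannel
print(turn_action(0.99))  # take turn
```

The middle band is the interesting one: the uncommitted action probes whether the user will yield the turn without Erica fully committing to speak.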
0:15:04 | So this is something that we wish to achieve, and this is the basic idea. |
---|
0:15:09 | The basic algorithm of the attentive listening system is very simple. |
---|
0:15:14 | Basically, while the user is speaking continuously, we produce backchannels |
---|
0:15:18 | using the backchannel system, |
---|
0:15:20 | which gives us the appropriate timing for them. |
---|
0:15:23 | When we get a result from the speech recognition system, we do |
---|
0:15:27 | dialogue-act tagging on the result. |
---|
0:15:29 | If the speech is a question, we answer it directly, |
---|
0:15:33 | because we can manage this with keyword matching against a database; this is a simple way of doing the natural |
---|
0:15:40 | language processing. |
---|
0:15:42 | If it's not a question, we know that it's a statement, |
---|
0:15:45 | and then we can use the statement response system to generate the response, |
---|
0:15:48 | based on |
---|
0:15:50 | the rules I described. |
---|
0:15:52 | The thing is, because the user may still be talking, and we can see incremental |
---|
0:15:57 | ASR results, |
---|
0:16:00 | we can |
---|
0:16:01 | overwrite our previous response, so that we actually only respond to the last part of the speech, |
---|
0:16:08 | and then, when we detect that the speech is finished, Erica says the response. |
---|
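The loop described above might look like the following sketch, where every helper (the dialogue-act check, the question answering, the response generation) is a hypothetical stand-in. The key point is that incremental ASR results keep overwriting the pending response, which is only spoken once the user's speech ends.

```python
def attentive_listening_loop(events, generate_response, is_question, answer):
    """events: stream of ('speaking' | 'asr' | 'end', payload) tuples."""
    pending = None
    spoken = []
    for kind, payload in events:
        if kind == "speaking":
            spoken.append("un")                       # backchannel at model timing
        elif kind == "asr":                           # incremental ASR result
            if is_question(payload):
                pending = answer(payload)             # answer questions directly
            else:
                pending = generate_response(payload)  # overwrite with latest
        elif kind == "end" and pending:
            spoken.append(pending)                    # commit only at end of speech
            pending = None
    return spoken

out = attentive_listening_loop(
    [("speaking", None), ("asr", "I ate curry"), ("asr", "it was spicy"), ("end", None)],
    generate_response=lambda s: f"Oh, {s.split()[-1]}?",
    is_question=lambda s: s.endswith("?"),
    answer=lambda s: "Yes.",
)
print(out)  # only the last ASR segment gets a response
```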
0:16:13 | So I'll give an example of |
---|
0:16:16 | the system in action. |
---|
0:16:17 | You'll see that the latency is actually still a bit of an issue, |
---|
0:16:22 | but the responses themselves are reasonable. |
---|
0:16:28 | [A video of the system in action is played; the user speaks in Japanese and Erica responds.] |
---|
0:17:04 | So here she extracts the focus word from what the user says, |
---|
0:17:23 | and generates a question from it. |
---|
0:17:35 | And here she couldn't find the focus word, and because there was no match, she just used a formulaic expression. |
---|
0:17:43 | So you can see that the latency is still a problem; there is a lag of two or three |
---|
0:17:47 | seconds between responses, which is not actually that good, |
---|
0:17:50 | because in an attentive listening system you want people to keep talking |
---|
0:17:53 | and feel that the robot is actually listening. |
---|
0:17:56 | But we can see that the response generation system gives reasonably good responses, |
---|
0:18:01 | so we hope that users will keep continuing the conversation like this. |
---|
0:18:06 | So that is the attentive listening system. |
---|
0:18:11 | We conducted a pilot study; |
---|
0:18:13 | we only used three subjects, just as part of our development iterations, to see how it went. |
---|
0:18:18 | One big problem we have is that |
---|
0:18:20 | when we tell users to interact with Erica, they really don't |
---|
0:18:25 | know how to begin the interaction. |
---|
0:18:27 | Usually they are able to start, but after a couple of easy questions they |
---|
0:18:33 | kind of don't know what to say. |
---|
0:18:35 | So in this case |
---|
0:18:37 | we had to actually explicitly tell them what to say. |
---|
0:18:41 | So first we got them to read from scripts taken from an existing corpus, so |
---|
0:18:46 | they would say things that were taken from a previous |
---|
0:18:51 | Wizard-of-Oz experiment. |
---|
0:18:54 | After that, we instructed them to tell Erica a story and |
---|
0:18:57 | keep talking as long as possible, or as long as they wanted to, in |
---|
0:19:01 | a free-talk scenario. |
---|
0:19:03 | So once they had used the script, they kind of understood what to do; otherwise |
---|
0:19:09 | it would have been a difficult scenario for them. |
---|
0:19:11 | Then a separate group of judges listened to the audio of the interactions, and |
---|
0:19:16 | evaluated each of Erica's backchannels and utterances according to their timing and their |
---|
0:19:21 | coherence. |
---|
0:19:23 | In |
---|
0:19:24 | the results, we found that the backchannel timing was quite appropriate; we found it |
---|
0:19:29 | quite acceptable. |
---|
0:19:30 | But as you can see from the video, and as was noted by the participants, the |
---|
0:19:34 | statement responses were too slow. |
---|
0:19:37 | So this is something we need to work on. |
---|
0:19:41 | On the other hand, in terms of the responses that were generated |
---|
0:19:46 | by the statement response system, |
---|
0:19:48 | more than half of them were judged coherent, so we think this is quite |
---|
0:19:52 | reasonable. We |
---|
0:19:53 | find that |
---|
0:19:56 | coherent responses |
---|
0:19:58 | keep the conversation going at a reasonable level. |
---|
0:20:04 | Now, just some examples from the free interactions: for these, we |
---|
0:20:08 | didn't tell people what to say; they just talked with Erica, |
---|
0:20:11 | and we often got amusing exchanges like these. |
---|
0:20:14 | As you can see in the first one, |
---|
0:20:18 | the user is talking about aliens; |
---|
0:20:21 | I don't know why, but it was about |
---|
0:20:25 | aliens. |
---|
0:20:26 | And Erica actually |
---|
0:20:27 | noticed that the focus word was "aliens" and asked "Aliens from |
---|
0:20:31 | where?". |
---|
0:20:32 | This was quite surprising for the user, because it felt like Erica was really |
---|
0:20:37 | engaged and listening to what they were saying. |
---|
0:20:40 | Obviously we didn't create |
---|
0:20:43 | specific responses about aliens, since this topic doesn't come up much, but |
---|
0:20:47 | it shows that we can |
---|
0:20:50 | apply the statement response system in a wide variety of contexts. |
---|
0:20:55 | Another one that's maybe less useful: |
---|
0:20:58 | the user asked Erica where she got her shoes or her watch, |
---|
0:21:02 | and she said "I see", |
---|
0:21:04 | which is maybe a bit strange. |
---|
0:21:05 | And then the user stated "It's very rainy", |
---|
0:21:08 | and the robot said "Where is the rain?". This was |
---|
0:21:11 | a bit strange in the context of the conversation, but it |
---|
0:21:16 | was interesting to the user. |
---|
0:21:20 | So now just the conclusions and future work. |
---|
0:21:24 | From the demonstrations, we find that the thing that most impresses users is a reply with a |
---|
0:21:28 | coherent question, |
---|
0:21:29 | and we can extend the conversation that way. |
---|
0:21:33 | And even incoherent or strange statements can be interesting or funny. |
---|
0:21:37 | Even though we sometimes get cases like "Where is the rain?", |
---|
0:21:42 | which doesn't really make sense, |
---|
0:21:44 | users |
---|
0:21:45 | can find it quite interesting that Erica comes up with this |
---|
0:21:48 | kind of thing. |
---|
0:21:49 | So maybe the responses don't always have to be |
---|
0:21:53 | completely correct. |
---|
0:21:56 | The backchannel predictions work quite well, and the randomness of the backchannel forms is |
---|
0:22:01 | not so severe; |
---|
0:22:02 | people still find the backchannels useful. |
---|
0:22:05 | But latency is the biggest problem with our system at the moment. |
---|
0:22:10 | So future work will be to reduce this latency. |
---|
0:22:16 | And, going on from the keynote today, we know that emotional dialogue, and responding to emotion, |
---|
0:22:21 | is very important as well. |
---|
0:22:23 | So we want to increase the range of responses Erica can generate and do some |
---|
0:22:28 | emotion recognition as well, |
---|
0:22:30 | so that when the user talks about how they feel, Erica can |
---|
0:22:33 | actually generate a good empathetic response. |
---|
0:22:37 | So, thank you. Any questions? |
---|
0:22:45 | Thank you. |
---|
0:22:46 | And now we have some time for questions. |
---|
0:22:56 | Thank you for the talk. Going back to one slide: |
---|
0:23:00 | you claim that the backchannels are generally well timed and that the randomness is not a severe issue. It seems to me that |
---|
0:23:07 | if you built a system that just produced backchannels at some random interval, |
---|
0:23:14 | especially in Japanese, |
---|
0:23:17 | it would work just as well. |
---|
0:23:19 | What I would like to see is a comparison of two systems: |
---|
0:23:23 | the one that you have, and one with just randomly timed backchannels, and see whether |
---|
0:23:28 | people |
---|
0:23:29 | can tell the difference between them. |
---|
0:23:29 | Actually, what I didn't mention is that in this subjective experiment we did have another condition which generated randomly timed backchannels, and the evaluations showed that the trained models were rated much better than the random one as well. |
---|
0:23:48 | Great, thank you. |
---|
0:23:58 | I was actually wondering: the dialogue model seems to encourage rather short |
---|
0:24:06 | utterances, and I think this kind of feedback-giving behaviour is |
---|
0:24:13 | more likely to occur when the user really tells a long story or |
---|
0:24:19 | something. Did you try to encourage the people to behave in |
---|
0:24:24 | that way, or was it more |
---|
0:24:28 | that it |
---|
0:24:28 | really depended on the user's side? |
---|
0:24:31 | We had to do this because Japanese people are quite reluctant about |
---|
0:24:35 | telling stories to a robot. |
---|
0:24:39 | What we saw is that people come in, maybe they've never seen an android before, they stand |
---|
0:24:42 | in front of Erica, |
---|
0:24:44 | they say one sentence, |
---|
0:24:45 | and they wait for the response, and it goes back and forth like that. |
---|
0:24:50 | But for people like the examples that I gave, |
---|
0:24:55 | where we just said, "Okay, |
---|
0:24:59 | talk with Erica however you want", they actually did tell, say, a |
---|
0:25:02 | long story, talked about their day or whatever, and these were the people who were most |
---|
0:25:08 | impressed. |
---|
0:25:10 | If you talk with a kind of stream of consciousness, then you're going to |
---|
0:25:14 | get responses which are maybe more interesting; otherwise you mostly get |
---|
0:25:19 | question-response exchanges. |
---|
0:25:22 | But it is a very difficult thing to fix; |
---|
0:25:24 | it is a tricky problem. |
---|
0:25:26 | I guess the robot could also start to produce longer sentences herself, |
---|
0:25:31 | and maybe that would serve as a kind of example to the users? |
---|
0:25:36 | We're hoping to try things like that: |
---|
0:25:38 | thinking of ideas on how to get the robot to elicit longer utterances |
---|
0:25:41 | from the users. |
---|
0:25:42 | She could actually say "tell me a story about yourself" or something, rather than leaving it |
---|
0:25:48 | entirely to the user. |
---|
0:25:51 | Well, I think we will look at that. |
---|
0:26:01 | Thanks for an interesting talk. |
---|
0:26:03 | Did you also implement nonverbal behaviour, so that she can |
---|
0:26:12 | make the backchanneling smoother? |
---|
0:26:14 | Ah, yes; |
---|
0:26:15 | your question is about |
---|
0:26:17 | whether we use nonverbal behaviour? |
---|
0:26:20 | So, we haven't actually studied that properly. |
---|
0:26:24 | I think she does some nodding at random timings when she produces a backchannel, |
---|
0:26:28 | but we are looking at ways to, for example, |
---|
0:26:31 | decide whether we should only do a nod, or a nod plus the verbal backchannel, or just the verbal utterance. |
---|
0:26:37 | We need to look at the research on how these three |
---|
0:26:40 | are distributed and how they actually come across |
---|
0:26:44 | to the user. So at the moment only verbal backchannels are available, |
---|
0:26:47 | but in the future we will probably try to add other modalities like these as well. |
---|
0:26:55 | Thank you. Please thank our speaker once again. |
---|