0:00:21 | or it but so slowly start them accession my name is for the province crevices |
---|
0:00:27 | evaluation very to session |
---|
0:00:31 | first speaker today |
---|
0:00:33 | is gonna be michelle cohn |
---|
0:00:37 | we're gonna have a three talks in the session |
---|
0:00:39 | which run until lunchtime |
---|
0:00:43 | so we shall |
---|
0:00:45 | thank you |
---|
0:00:49 | can you hear me okay |
---|
0:00:51 | hi i'm michelle cohn i'm a postdoc at u c davis working jointly with |
---|
0:00:56 | department of linguistics |
---|
0:00:57 | computer science and psychology and today i'll be presenting a project i did with |
---|
0:01:02 | arbit chen and zhou yu |
---|
0:01:04 | so more and more humans are talking to voice activated artificially intelligent devices like amazon |
---|
0:01:10 | alexa to complete daily tasks |
---|
0:01:12 | like setting a timer turning on the lights |
---|
0:01:15 | and a new aspect through the amazon alexa prize competition is the ability to |
---|
0:01:20 | engage real users in social chitchat through these systems many of you here have competed or |
---|
0:01:26 | are competing but for those of you who don't know about it |
---|
0:01:30 | the amazon alexa prize is a competition to create a socialbot that can converse coherently |
---|
0:01:36 | and engagingly with humans on a range of topics like food music technology animals |
---|
0:01:42 | and so on |
---|
0:01:43 | and what's unique |
---|
0:01:45 | at least for researchers in academia is the ability to deploy the chatbot right in |
---|
0:01:50 | the wild and something dan bohus talked about yesterday |
---|
0:01:54 | so during the competition anyone with an amazon echo |
---|
0:01:57 | could say let's chat |
---|
0:01:58 | and get one of the competing chatbots |
---|
0:02:01 | you may be familiar with some other teams from twenty eighteen |
---|
0:02:05 | including one from k t h fantom advised by gabriel skantze and led by patrik |
---|
0:02:10 | jonell |
---|
0:02:12 | but today i'm going to be talking about gunrock the socialbot developed at |
---|
0:02:16 | u c davis advised by zhou yu and led by arbit chen my two |
---|
0:02:20 | coauthors |
---|
0:02:21 | and gunrock is special as it won first place in the twenty eighteen competition |
---|
0:02:26 | you can see zhou and arbit here |
---|
0:02:30 | so when i met zhou and arbit in july last summer the gunrock team was |
---|
0:02:34 | about halfway through the competition and i was working on other projects related to how |
---|
0:02:38 | humans talk to voice ai so i was |
---|
0:02:40 | interested in seeing how |
---|
0:02:41 | users would engage with a socialbot like gunrock |
---|
0:02:45 | so we started to collaborate recording these user interactions you can see my microphone there |
---|
0:02:51 | but we noticed something as we listened to how these interactions unfolded |
---|
0:02:56 | alexa's speech was relatively flat |
---|
0:02:59 | and really lacked the dynamism in human interaction |
---|
0:03:02 | where speakers vary their speech just to show their excitement |
---|
0:03:05 | their interests and their understanding |
---|
0:03:08 | and this is important |
---|
0:03:09 | as users for example were offering information about their favourite movie alexa really didn't |
---|
0:03:14 | sound like she cared |
---|
0:03:16 | and others have noticed this flatness in the alexa voice as well here's an echo review |
---|
0:03:20 | where they mentioned that it would be nice if alexa didn't sound so monotone |
---|
0:03:25 | and that she needs to have a little more expression when she speaks |
---|
0:03:29 | and another where they say that they're having a lot of fun with her |
---|
0:03:33 | but her monotone productions can make things difficult for us to understand so this flatness |
---|
0:03:38 | could also affect users' ability to understand her speech |
---|
0:03:42 | so this led to several research questions the first was how can we improve alexa's |
---|
0:03:47 | expressiveness in a social dialogue system like gunrock |
---|
0:03:50 | especially given the time constraints of being in a competition |
---|
0:03:54 | so we know from work on human interaction that cognitive emotional expression is important for |
---|
0:04:00 | the quality of our interactions with others |
---|
0:04:03 | we see that readily in people's faces such as happiness and excitement |
---|
0:04:07 | we need to go to the vast a museum or contemplation and interest |
---|
0:04:12 | but we also see that in the way we produce and perceive speech so for |
---|
0:04:16 | example how emotionally expressive we are relates to perceptions of speaker enthusiasm in human |
---|
0:04:22 | conversation |
---|
0:04:23 | so this is something we wanted to mimic in alexa's speech |
---|
0:04:27 | so how do we make alexa more expressive well one option is to |
---|
0:04:31 | completely overhaul the prosody |
---|
0:04:33 | we really didn't have that as an option we weren't controlling the tts models |
---|
0:04:37 | in the competition which are given by amazon |
---|
0:04:40 | we can adjust the tts in minor ways using ssml |
---|
0:04:44 | but again we were on a time crunch and |
---|
0:04:46 | we also wanted to very carefully specify where cognitive |
---|
0:04:51 | emotional expression would be inserted |
---|
0:04:54 | so we asked whether we could add discrete units of cognitive emotional expression or voice |
---|
0:04:59 | emojis to improve expressiveness of the alexa voice |
---|
0:05:04 | so we identified two that we were interested in expressive interjections and these are ones |
---|
0:05:09 | that were pre-recorded by the alexa voice |
---|
0:05:12 | here's an example |
---|
0:05:13 | wow is a |
---|
0:05:15 | and filler words like or |
---|
0:05:20 | and they're relatively easy to add in the alexa skills kit just with a |
---|
0:05:24 | simple ssml tag to adjust expressiveness |
---|
0:05:27 | here for a speechcon an interjection |
---|
0:05:31 | or to add in a pause to make the filler words sound more natural |
---|
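The two kinds of tag mentioned here can be sketched as small string helpers. This is a minimal illustration, not the Gunrock code: the helper names, the 300 ms pause and the "slow" rate are assumptions chosen for the example, while `say-as interpret-as="interjection"`, `break` and `prosody` are standard Alexa SSML tags.

```python
from xml.sax.saxutils import escape


def interjection(word: str) -> str:
    """Wrap a word in the SSML tag Alexa uses for pre-recorded interjections."""
    return f'<say-as interpret-as="interjection">{escape(word)}</say-as>'


def filler(word: str, pause_ms: int = 300, rate: str = "slow") -> str:
    """Slow a filler word down and follow it with a short pause.

    The default pause length and rate are illustrative assumptions.
    """
    return f'<prosody rate="{rate}">{escape(word)}</prosody><break time="{pause_ms}ms"/>'


def speak(*parts: str) -> str:
    """Join already-tagged fragments into one SSML document."""
    return "<speak>" + " ".join(parts) + "</speak>"


# An interjection placed in the phrase-initial slot of a template.
ssml = speak(interjection("wow"), escape("tell me more about it"))
print(ssml)
```

Keeping the tagging in helpers like these means the template text itself stays untouched, which matches the idea of adding discrete units rather than re-tuning the prosody of the whole utterance.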
0:05:34 | so this is modeled off of human |
---|
0:05:37 | interaction where |
---|
0:05:39 | individuals signal their cognitive emotional states |
---|
0:05:41 | using these smaller response tokens |
---|
0:05:44 | so for this project we focus on these two types of voice emojis interjections |
---|
0:05:49 | and fillers |
---|
0:05:50 | and interjections can signal different things |
---|
0:05:53 | like the speaker's emotion |
---|
0:05:55 | but also how interested or surprised they are about information |
---|
0:05:59 | or whether what we're hearing about is newsworthy |
---|
0:06:03 | the other type of voice emojis are fillers |
---|
0:06:05 | like and |
---|
0:06:07 | which can also signal information about the speaker |
---|
0:06:10 | such as the speaker needing more time to collect their thoughts and consider a topic their degree |
---|
0:06:15 | of uncertainty about a topic and even their level of understanding |
---|
0:06:20 | so while our first research question was how do we add expressiveness our second is |
---|
0:06:25 | how will people respond to alexa's expressiveness |
---|
0:06:29 | theories of computer personification such as clifford nass's computers are social actors framework propose that |
---|
0:06:35 | when a person senses a cue of humanity in the system we automatically treat it like |
---|
0:06:39 | a person so here are question is really theoretically important in considering the degree to |
---|
0:06:44 | which users personify voice ai |
---|
0:06:48 | will users develop greater rapport with an |
---|
0:06:51 | expressive alexa |
---|
0:06:53 | or will it be creepy falling into the uncanny valley |
---|
0:06:56 | the idea that the more similar a nonhuman entity like a robot or alexa is to |
---|
0:07:01 | a person the more people like it up to a point where they find it |
---|
0:07:05 | incredibly creepy |
---|
0:07:07 | so here's an overview of the rest of the talk |
---|
0:07:10 | first we'll go over some prior work looking at interjections and fillers in human computer |
---|
0:07:14 | interaction |
---|
0:07:15 | then i'll go over a study we did with our alexa prize chatbot gunrock |
---|
0:07:19 | and then go over some conclusions and future directions |
---|
0:07:23 | so there are actually very few studies that have tested adding interjections and exclamations in |
---|
0:07:28 | the dialogue system |
---|
0:07:29 | and there's been a lot greater focus on overall prosodic adjustments to a phrase or utterance |
---|
0:07:35 | one study did test the impact of non-linguistic affective bursts |
---|
0:07:40 | so buzzes and beeps |
---|
0:07:42 | in a robot a nao robot and they found that |
---|
0:07:45 | kids sixty years old readily attribute emotion to those noises |
---|
0:07:51 | and while not using interjections per se syrdal and colleagues found that speech trained on |
---|
0:07:55 | a corpus of positive exclamations like great |
---|
0:07:59 | resulted in higher listener ratings |
---|
0:08:01 | in a seven utterance simulated dialogue |
---|
0:08:04 | but they observed no such effect when the tts was trained on negative exclamations |
---|
0:08:08 | like dear or oops |
---|
0:08:10 | so really overall adding interjections is an |
---|
0:08:13 | understudied area in human computer interaction |
---|
0:08:17 | and there's a bit more work looking at adding filler words but the findings have |
---|
0:08:21 | been mixed |
---|
0:08:22 | so on the one hand some studies have found a facilitatory effect |
---|
0:08:26 | for example users have reported having a greater sense of engagement |
---|
0:08:30 | with the robot if that robot uses filler words |
---|
0:08:34 | and in another study independent raters gave higher naturalness ratings |
---|
0:08:38 | for human computer conversations |
---|
0:08:40 | when that voice included filler words |
---|
0:08:43 | but others have found no positive effect of introducing filler words or even a negative effect |
---|
0:08:47 | for some listeners |
---|
0:08:49 | so it's really an open question as to how humans might respond to voice ai |
---|
0:08:53 | systems |
---|
0:08:54 | using interjections and fillers |
---|
0:08:56 | whether these voice emojis for example might be beneficial or detrimental to user |
---|
0:09:01 | experience |
---|
0:09:03 | okay so now gunrock |
---|
0:09:07 | here's the overall architecture i'm just gonna provide a brief overview there's a technical report |
---|
0:09:12 | if you're if you're curious |
---|
0:09:14 | so the asr and tts models were provided by amazon |
---|
0:09:19 | then we have a multi step n l u pipeline including sentence segmentation constituency parsing |
---|
0:09:24 | and dialogue act prediction |
---|
0:09:27 | and then gunrock has a hierarchical dialogue manager with higher level topic |
---|
0:09:31 | organizers as well as |
---|
0:09:34 | template specific dialogue flows and that's for about ten different topics so it includes animals movies |
---|
0:09:40 | news books |
---|
0:09:42 | and so on |
---|
0:09:44 | and this dialogue manager pulls in information from evi a factual knowledge base and |
---|
0:09:49 | the gunrock persona |
---|
0:09:51 | database |
---|
0:09:52 | for questions about who alexa is |
---|
0:09:56 | next we have a template based nlg module where the system fills slots with data |
---|
0:10:01 | retrieved from various knowledge sources such as imdb |
---|
0:10:05 | and then |
---|
0:10:05 | finally we adjusted the prosody by adding the fillers and interjections so this is really |
---|
0:10:10 | the focus of this presentation which were then output by the tts in the |
---|
0:10:14 | default alexa voice |
---|
0:10:18 | okay so how are we going to insert |
---|
0:10:20 | interjections and fillers |
---|
0:10:22 | we can't just insert them randomly that's not how language works |
---|
0:10:26 | as dan mentioned in his keynote yesterday placement of these elements is really you |
---|
0:10:31 | words so together we created a framework |
---|
0:10:34 | for context specific placement of interjections and fillers into existing |
---|
0:10:38 | gunrock templates |
---|
0:10:39 | and again we didn't manipulate any other prosodic aspects of alexa's speech we just |
---|
0:10:44 | added these discrete words and phrases |
---|
0:10:48 | okay so starting with the interjections we defined five contexts |
---|
0:10:52 | for each we defined a list of possible interjections which could be used in that |
---|
0:10:56 | context so we defined a list and then they're randomly pulled in |
---|
0:10:59 | so the first is to signal interest this was really important because we wanted the |
---|
0:11:04 | user to elaborate |
---|
0:11:06 | so for example |
---|
0:11:09 | so tell me more about it |
---|
0:11:12 | since the goal of the competition is to get users talking as long as possible |
---|
0:11:16 | we really wanted them to expand on their experience and make it seem as |
---|
0:11:20 | though alexa was actually interested in what they had to say |
---|
0:11:23 | so here we used |
---|
0:11:25 | a lot of different interjections which could be randomly inserted |
---|
0:11:28 | into this word in a phrase initial slot |
---|
0:11:32 | the second context was for error resolution or to show alexa's feelings about her |
---|
0:11:37 | misunderstanding |
---|
0:11:38 | and this was a really important one since alexa often misheard the user |
---|
0:11:43 | we wanted to convey her disappointment |
---|
0:11:45 | in not getting it right |
---|
0:11:47 | again with lots of possible variations for example |
---|
0:11:50 | there are i think you said probably can you say that one more time |
---|
0:11:55 | the third was to accept the user's request |
---|
0:11:58 | for example |
---|
0:11:59 | l t here is some more information |
---|
0:12:03 | this we didn't have as many as to signal interest since it was a social |
---|
0:12:07 | dialogue system it's less |
---|
0:12:09 | task based than alexa is usually |
---|
0:12:13 | the fourth was to change topic as if alexa just remembered something she wanted to |
---|
0:12:17 | share with the user |
---|
0:12:19 | and this was the part of a strategy to change the topic if the user |
---|
0:12:22 | wasn't being very responsive giving a lot of one word versatile answers |
---|
0:12:26 | well i've been meaning to ask you do you like animals |
---|
0:12:31 | and the fifth was to express agreement of opinion |
---|
0:12:34 | yes |
---|
0:12:36 | we share the same thoughts |
---|
0:12:37 | and this didn't happen as often in the gunrock templates so we just used |
---|
0:12:40 | two interjections here |
---|
0:12:42 | but if you had a bot that really wanted to agree with people you could |
---|
0:12:45 | add a lot of others like awesome or cool |
---|
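The five contexts described above can be sketched as a lookup from context to a list of candidate interjections, with one drawn at random and placed in the phrase-initial slot of a template. This is a hedged illustration only: the context keys, the function name and every word list are placeholders chosen for the example, not the lists actually used in Gunrock.

```python
import random

# Hypothetical context -> candidate interjections (illustrative words only).
INTERJECTIONS = {
    "signal_interest": ["wow", "ooh"],
    "error_resolution": ["darn", "argh"],
    "accept_request": ["okey dokey"],
    "change_topic": ["oh"],
    "agree_opinion": ["awesome", "cool"],
}


def add_interjection(template: str, context: str, rng=random) -> str:
    """Prepend a randomly chosen interjection for the given context.

    Templates whose context has no interjection list are returned unchanged.
    """
    options = INTERJECTIONS.get(context)
    if not options:
        return template
    return f"{rng.choice(options)} {template}"


print(add_interjection("tell me more about it", "signal_interest"))
print(add_interjection("can you say that one more time", "error_resolution"))
```

Drawing from a list per context, rather than fixing one word per template, is what gives each user a slightly different sampling of interjections over a conversation.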
0:12:48 | so in addition to the five contexts we also included some interjections meant to convey |
---|
0:12:52 | alexa's playfulness |
---|
0:12:54 | and these were all utterance specific and not interchangeable so for example or in |
---|
0:13:01 | that's so cute |
---|
0:13:03 | and what one so get ready for a cheesy joke |
---|
0:13:06 | what do you call blueberries playing the guitar |
---|
0:13:09 | a jam session while |
---|
0:13:12 | okay and now for the fillers like the interjections we identified certain types of context |
---|
0:13:17 | where fillers would be used in human conversation so here are the three |
---|
0:13:23 | the first was to change the topic for these we also tuned the production of |
---|
0:13:27 | so slowing its rate and adding a pause |
---|
0:13:30 | so i've been meaning to ask you do you like to play video games |
---|
0:13:36 | the second was when retrieving alexa's backstory so this was as if |
---|
0:13:41 | to make it seem as though alexa had to think about a response that it was |
---|
0:13:45 | less automatic |
---|
0:13:48 | i'm not all animals that i think my favourite is probably the elephant |
---|
0:13:53 | and the third was to acknowledge the user's answer to alexa's question |
---|
0:13:57 | a like an interesting choice |
---|
0:14:01 | so following that framework this resulted in interjections in about twelve percent |
---|
0:14:05 | of turns in a conversation and filler words in about eighty percent of turns |
---|
0:14:09 | the exact number of fillers and interjections varied based on whether that particular utterance was |
---|
0:14:15 | engaged in |
---|
0:14:16 | being the conversations you could |
---|
0:14:18 | can think of it as each user got a slightly different sampling of |
---|
0:14:22 | of these interjections and fillers based on the course of their conversation |
---|
0:14:27 | okay so here's an example i'll be the user so hey alexa let's chat |
---|
0:14:33 | hi this is an alexa prize socialbot glad to meet you how are |
---|
0:14:38 | you doing today |
---|
0:14:39 | doing okay how are you |
---|
0:14:42 | and good sell tell me more about yourself do you like sports |
---|
0:14:47 | not really |
---|
0:14:49 | so have you seen any movies lately |
---|
0:14:53 | a star is born |
---|
0:14:55 | ah i've seen a star is born too what would you rate this movie on a |
---|
0:15:00 | scale from one to ten times |
---|
0:15:04 | wow |
---|
0:15:05 | a perfect ten it really is an exceptional movie here is something i just learned |
---|
0:15:10 | some of the scenes were filmed at the coachella festival in twenty seventeen where |
---|
0:15:14 | lady gaga was the first female headliner in ten years |
---|
0:15:18 | festival goers pay ten dollars to see the scenes being filmed all the recording devices |
---|
0:15:23 | were locked away to prevent any footage being leaked any thoughts |
---|
0:15:27 | so you can see it's really a discrete |
---|
0:15:29 | phrase in a very long utterance |
---|
0:15:32 | okay so does adding interjections and fillers have an impact |
---|
0:15:37 | so we conducted a user study through the devices themselves so this is in the |
---|
0:15:41 | wild as part of the alexa prize competition |
---|
0:15:44 | so we had four conditions |
---|
0:15:46 | one with interjections |
---|
0:15:47 | one with fillers one with both and one with neither and |
---|
0:15:51 | these conditions were pushed live to all alexa enabled devices from november twenty have to december |
---|
0:15:57 | third so this was after the competition was over and no other code updates |
---|
0:16:02 | were happening that's very crucial |
---|
0:16:04 | and this methodology extends prior work on human computer interaction |
---|
0:16:09 | giving us a large sample size of over five thousand unique users individuals who actually wanted |
---|
0:16:15 | to talk to the device and were doing so in the place most comfortable to |
---|
0:16:18 | them |
---|
0:16:18 | in their own homes |
---|
0:16:20 | and the raters so at the end of the conversation they would rate the conversation on a |
---|
0:16:24 | scale from one to five so |
---|
0:16:26 | the raters were actually the users in the conversation itself |
---|
0:16:31 | this also consists of users anyone with the device so it's not constrained to the |
---|
0:16:35 | eighteen to twenty two year old slice that we generally test |
---|
0:16:38 | but it's still likely skewed by socioeconomic status and finally users have more experience |
---|
0:16:44 | with the specific system so perhaps they have more familiarity and rapport with their |
---|
0:16:48 | alexa |
---|
0:16:51 | so we analyzed the rating at the end of the conversation with a linear mixed |
---|
0:16:54 | effects model with condition and users as random intercepts |
---|
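One way to write down a model of this kind, as a sketch under stated assumptions rather than the exact specification used in the study: the rating is the outcome, condition is a treatment-coded fixed effect with the baseline (no interjections, no fillers) as the reference level, and the random intercept is assumed here to be per user. The symbols below are introduced only for this illustration.

```latex
% y_{ij}: rating of conversation i by user j (scale 1 to 5)
% I, F, B: indicator variables for the interjection, filler, and both conditions
% u_j: assumed by-user random intercept
y_{ij} = \beta_0 + \beta_1 I_{ij} + \beta_2 F_{ij} + \beta_3 B_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this coding, each of the three condition coefficients is read as the change in mean rating relative to the baseline condition.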
0:16:59 | we only included data for conversations with at least ten turns |
---|
0:17:01 | and for the ones that had a filler interjection or both |
---|
0:17:05 | that had at least one of those |
---|
0:17:07 | or two of those options |
---|
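The inclusion rule just described can be sketched as a small filter. This is an illustrative reading, not the study's code: the record fields and condition labels are hypothetical, and the interpretation that the combined condition needs at least one interjection and at least one filler is an assumption.

```python
from dataclasses import dataclass

MIN_TURNS = 10  # conversations shorter than this are excluded


@dataclass
class Conversation:
    condition: str        # "baseline", "interjections", "fillers", or "both"
    n_turns: int
    n_interjections: int  # interjections actually produced in the conversation
    n_fillers: int        # fillers actually produced in the conversation


def include(c: Conversation) -> bool:
    """Keep a conversation only if it is long enough and its condition was realised."""
    if c.n_turns < MIN_TURNS:
        return False
    if c.condition == "interjections":
        return c.n_interjections >= 1
    if c.condition == "fillers":
        return c.n_fillers >= 1
    if c.condition == "both":
        # Assumed reading: both element types must occur at least once.
        return c.n_interjections >= 1 and c.n_fillers >= 1
    return True  # baseline has no element requirement


sample = [
    Conversation("baseline", 12, 0, 0),
    Conversation("both", 15, 2, 0),
    Conversation("fillers", 8, 0, 3),
]
kept = [c for c in sample if include(c)]
print(len(kept))
```

The second check matters because placement depends on which templates come up, so a conversation assigned to a condition may never reach a context where an interjection or filler is inserted.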
0:17:09 | so i'll take you through the results one by one here we have the |
---|
0:17:12 | conditions on the x-axis and the rating on the y |
---|
0:17:16 | here we can see the baseline model this is the one without interjections and |
---|
0:17:19 | fillers had an average around two point eight |
---|
0:17:23 | then we see |
---|
0:17:24 | the linear regression model revealed a main effect of condition so we see significantly higher |
---|
0:17:29 | ratings for conversations with interjections this is all relative to the baseline |
---|
0:17:34 | we also see higher ratings for the conversations with fillers |
---|
0:17:38 | and also for the conversations with both with an average increase of about |
---|
0:17:43 | point seven five |
---|
0:17:45 | we were curious to see if the combined condition |
---|
0:17:48 | was different from the |
---|
0:17:50 | single interjections and fillers and we did |
---|
0:17:52 | indeed find that was the case |
---|
0:17:55 | so adding voice emojis in appropriate contexts |
---|
0:17:59 | improves user ratings |
---|
0:18:01 | and |
---|
0:18:02 | this shows that even adding discrete elements may improve overall expressiveness of a social dialogue |
---|
0:18:07 | system and this provides support for casa frameworks as humans appear to be responding positively to |
---|
0:18:13 | human like displays of cognitive emotional expression |
---|
0:18:16 | in an alexa voice |
---|
0:18:18 | and may in some ways be responding to the system more like a person |
---|
0:18:23 | we also see that the effect is additive for different types of voice |
---|
0:18:26 | emojis so users gave the highest ratings for conversations with both fillers and interjections |
---|
0:18:32 | and overall this effect is robust we see it over thousands of unique |
---|
0:18:36 | users and conversations |
---|
0:18:38 | but one limitation perhaps you're already thinking of is that these ratings are really a |
---|
0:18:42 | holistic measure of the overall conversation so we wanted to do one more controlled study |
---|
0:18:49 | to confirm that the voice emojis do indeed improve the ratings of the conversations |
---|
0:18:55 | so we did a mechanical turk experiment with any five turkers |
---|
0:19:00 | and a similar condition structure as in the user study |
---|
0:19:04 | with two dialogues one to signal interest |
---|
0:19:06 | and one to resolve an error |
---|
0:19:10 | so just as in the main study we had the baseline one with fillers |
---|
0:19:15 | one with an interjection |
---|
0:19:16 | and one with both here's an example |
---|
0:19:20 | movies can be really fun |
---|
0:19:21 | so i've been meaning to ask you what else are you interested in do you |
---|
0:19:27 | like animals |
---|
0:19:29 | what we're animals |
---|
0:19:31 | some i think my favourite animal is the elephant |
---|
0:19:35 | and then same for the dialogue or the error resolution dialogue |
---|
0:19:38 | one with neither fillers nor interjections |
---|
0:19:41 | one with fillers only |
---|
0:19:43 | one with interjections only and one with both |
---|
0:19:46 | that's pretty interesting |
---|
0:19:47 | so have you seen any movies lately |
---|
0:19:51 | the not really is really in good |
---|
0:19:55 | darn i didn't catch that can you say that again |
---|
0:19:58 | so these aren't real user interactions these are ones we scripted loosely based off of topics |
---|
0:20:04 | in gun rock |
---|
0:20:05 | so the turkers heard these two dialogues in all possible conditions randomly and then for |
---|
0:20:11 | each dialogue they heard we asked them to rate the alexa voice on a sliding scale so |
---|
0:20:15 | how engaged does alexa sound how expressive does alexa sound how likable and |
---|
0:20:20 | how natural |
---|
0:20:21 | and we analyze these ratings with separate linear mixed effects models |
---|
0:20:26 | since i'm running out of time i'll go through this quickly |
---|
0:20:29 | so here's what we found as with the overall user study we found a main |
---|
0:20:33 | effect of condition |
---|
0:20:35 | again relative to the baseline |
---|
0:20:38 | my computer's |
---|
0:20:39 | having some issues |
---|
0:20:43 | so we see an increase for |
---|
0:20:48 | so conversations with interjections shown in red show significantly higher ratings of all of those |
---|
0:20:54 | social variables look for |
---|
0:20:59 | for those four dimensions |
---|
0:21:02 | i'll just give you a quick summary my computer it's frozen so overall what we |
---|
0:21:07 | found perfect so what we saw is that the results for the user study mirror |
---|
0:21:12 | what we observed in the mechanical turk study in terms of social ratings we saw something |
---|
0:21:17 | a little bit different with the fillers so users |
---|
0:21:20 | the mechanical turkers actually rated the voice as having lower likability and lower engagement |
---|
0:21:26 | when that voice had the fillers so this is a little bit different and suggests |
---|
0:21:30 | that the role of the rater so who you are makes a difference so |
---|
0:21:35 | if you're the person in the conversation you tend to like the interjections you |
---|
0:21:39 | also like the fillers |
---|
0:21:41 | but if you're an external rater listening in on the conversation |
---|
0:21:44 | you really pick up on those fillers and that really mirrors what we've |
---|
0:21:48 | seen in research on human interaction |
---|
0:21:50 | thank you |
---|
0:22:04 | we have some five questions |
---|
0:22:12 | very interesting topics i'm wondering about how |
---|
0:22:17 | given the way that you're adding this |
---|
0:22:20 | fillers and interjections it seems like it's somewhat stochastic as to when they come out |
---|
0:22:25 | an s one and f |
---|
0:22:27 | all the dialogues that included them have roughly the same percentage or number wrong number |
---|
0:22:32 | more work number per term are or where there's a big variance within the different |
---|
0:22:37 | dialogs and if there's variance whether you |
---|
0:22:41 | john more carefully at a whether having more fillers robust fillers changed the rating is |
---|
0:22:46 | actually question we didn't look at that are so we looked at the number of |
---|
0:22:50 | fillers and interjections in a particular conversation |
---|
0:22:53 | and there didn't seem to be a relationship at least with rating |
---|
0:22:58 | is related to overall turns that that's |
---|
0:23:00 | likely to be expected |
---|
0:23:12 | that backs fascinating and results reducing and |
---|
0:23:16 | i was wondering having looked at the data |
---|
0:23:20 | do you think doesn't is goal for building a model that can you know look |
---|
0:23:24 | at context and decides yes or no we're gonna put a veteran seems likely |
---|
0:23:27 | limits the yes right so this was just a very simple kind of way to |
---|
0:23:32 | test this but we it was not the most sophisticated way that we could we |
---|
0:23:37 | could do it by definitely |
---|
0:23:39 | but i mean if you look at the conversations in the ones that looks like |
---|
0:23:42 | it's going well looks like number do you think there's some signal on the |
---|
0:23:46 | but there could be a model to train or |
---|
0:23:49 | i noticed in the increase in user studies |
---|
0:23:53 | that the users would smile if you had interjection |
---|
0:23:58 | and some actually |
---|
0:24:00 | mention the filler words themselves |
---|
0:24:03 | it's so |
---|
0:24:04 | i mean that's a very explicit sort of cue but if you're able to record |
---|
0:24:09 | we you know you could |
---|
0:24:11 | use |
---|
0:24:12 | you know the smiling the facial expressions |
---|
0:24:14 | to know if it's |
---|
0:24:15 | if it's going well that's appropriate |
---|
0:24:20 | more question |
---|
0:24:24 | since build a t vs to keep people engaged for longer what has the effect |
---|
0:24:28 | of length of conversation |
---|
0:24:30 | there wasn't a clear relationship so there are two so we wanna keep people engaged |
---|
0:24:34 | as long as possible but also |
---|
0:24:36 | however in meaningful conversation |
---|
0:24:38 | really feel for so there was no relationship between number of |
---|
0:24:42 | okay utterances but well only with reading |
---|
0:24:47 | in the this is more common than questions |
---|
0:24:49 | sometimes people have news stories and they like tori the first time then after a |
---|
0:24:55 | while a good point five t |
---|
0:24:57 | in have you sort of making in experiment over time |
---|
0:25:03 | well |
---|
0:25:04 | you see if this really works |
---|
0:25:06 | in the long time that's a great that's a great question no we haven't but |
---|
0:25:10 | that's already down |
---|
0:25:15 | and we have time for one last question |
---|
0:25:21 | just for clarification what you're fillers seem to be all the sort of a turn |
---|
0:25:26 | initial did you have them you know like the most notable fillers a like you |
---|
0:25:31 | know in noun phrases just up the services |
---|
0:25:34 | so we didn't so we just put them in the same location as the interjections |
---|
0:25:39 | but you're absolutely right they occur in a lot of different places if you |
---|
0:25:43 | have a hesitation for example or a false start sometimes you get fillers there as |
---|
0:25:48 | well |
---|
0:25:49 | we were just trying to keep it very simple |
---|
0:25:53 | but stack the speaker model |
---|