0:00:30 | Hello again, and welcome to the next session on policy and knowledge. We will start this session with the talk on deep reinforcement learning for modeling chit-chat dialogue with discrete attributes. |
0:00:50 | The paper is by Chinnadhurai Sankar and Sujith Ravi, and the presenter is Sujith Ravi. |
0:01:14 | Hi everyone. Thank you for being here; it's pretty exciting to be at SIGDIAL. |
0:01:20 | I'm Sujith Ravi. Let me give a little background intro to what we do. I'm from Google AI, where I'm a research scientist leading multiple machine learning groups. My group is focused on deep learning problems where you actually have to inject structure into deep networks, for example combining traditional graph learning approaches with deep learning. |
0:01:41 | We've released a bunch of things for doing semi-supervised learning at scale. If you're using any of the Google products, Gmail, Photos, and so on, you will actually be using this stuff. |
0:01:53 | We also do conversational AI, and I'll show you one example of that on detecting intents, but also multimodal work, both for language and for vision, using state-of-the-art vision technology. |
0:02:09 | There's a misnomer: people might think that Google and other large companies have a lot of resources, so we must label all the datasets that we have. For the image recognition systems that you use in Google Photos and Cloud, we have less than one percent annotation. |
0:02:26 | And the reason it works is, in like two words, semi-supervised plus deep learning, plus a lot of other optimizations going on under the hood. My group is responsible for some of these things. |
0:02:39 | And finally, a lot of the problems that we have to deal with actually require a lot of compute on the cloud. My group is also looking at how to do things on-device. Imagine you have to build a dialogue generation system, or a conversational system, that has to fit on your watch, which cannot have access to gigabytes of memory or a lot of compute, unlike the cloud where you have CPUs, GPUs, and all the latest-generation hardware. |
0:03:06 | So with that, hopefully you have a quick mapping of the things we work on. |
0:03:09 | This is joint work with my fabulous intern Chinnadhurai Sankar, who couldn't be here; he is from MILA, from Yoshua Bengio's lab. |
0:03:18 | The talk is going to be about deep reinforcement learning for modeling chit-chat dialogue with discrete attributes. That's quite a mouthful; all it means is that we try to do dialogue generation, but with controllable semantics. |
0:03:30 | I will give you an overview of what we are talking about here. First off, as for any generation system, you have to predict responses. Here are two applications where we have to predict responses; these are not toy data, and they are equally hard, at the order of millions or even billions of predictions per day. |
0:03:50 | One is Smart Reply, which our team developed several years ago. How many of you are familiar with Smart Reply? Okay, quite a few. For those of you who don't know: if you're using Gmail on your phone and you see those blue suggestion boxes that pop up at the bottom, that's exactly what it is. |
0:04:09 | For any email or chat message, it contextually generates responses that are relevant for you, and if you notice, the three suggestions are actually very different responses, not necessarily the same. So this is the Smart Reply system. |
0:04:22 | For the folks who think that this is a simple encoder-decoder problem, I can assure you that, to get it to work, it's definitely not; there are a lot more things going on. You can read the paper from KDD, and I'll touch on some of these attributes later in the talk as well. |
0:04:37 | You can take this to the multimodal setting as well. We released something called photo reply after the initial Smart Reply version, where now you receive an image, and you have to understand the semantics of the visual content and generate an appropriate response. So if you look at the picture and it shows a baby, the system would say "so cute", and you'd probably send it, unless you don't have a heart, right? |
0:05:03 | Or, another favorite example: if you see a skydiving video or image, it will actually suggest "how brave". I've always thought one more suggestion, "how stupid", should come at the end of it as well, but we control for those sets of things. |
0:05:20 | So these are just examples of generation systems, but the task that we're trying to solve in this paper is basically modeling open-domain dialogue. I don't need to introduce task-oriented dialogue systems to everybody here; they are available in everyday systems, whether you're talking about booking reservations, playing music, et cetera. There is a task, and all the parameters of the prediction system that you build are optimized towards solving that task. |
0:05:45 | Open-ended dialogue is much harder, and one of the common ways that people solve this is the standard sequence-to-sequence model, where you try to model it as a machine translation problem: you are given a history of dialogue utterance sequences, and then you try to translate some representation of that encoded sequence into a decoder sequence, in this case the utterance that you're going to send. |
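In notation (mine, not the slides'), the sequence-to-sequence view just described factorizes the response probability over tokens, conditioned on an encoding of the dialogue history:

```latex
% Seq2seq view of response generation: encode the dialogue history
% x_{1:T}, then decode the response y one token at a time.
p_\theta(\mathbf{y} \mid \mathbf{x}_{1:T}) \;=\; \prod_{j=1}^{|\mathbf{y}|}
    p_\theta\!\big(y_j \mid y_{<j},\, \mathrm{enc}(\mathbf{x}_{1:T})\big)
```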
0:06:09 | What's the problem? Almost every system, especially the neural systems that you have today, no matter which one you use, seems quite repetitive, and they sound very redundant. The problem, from an ML perspective, is that unlike task-oriented dialogue, the vocabulary we cover is much larger, and there is high entropy: you have a few responses that are very commonly occurring, but then there's this long tail of rare responses. |
0:06:37 | So, given a choice, most of these systems, which are trying to maximize likelihood in some form or another, will actually prefer to generate the responses that give the maximum likelihood or the lowest perplexity. |
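The training objective being described, written out in standard notation (not from the slides); because generic responses sit at the head of the response distribution, the maximizer favors them:

```latex
% Maximum-likelihood training over a dialogue corpus D: dull, frequent
% responses ("i don't know") have high likelihood, so decoding that
% maximizes p_theta tends to produce them.
\theta^{*} \;=\; \arg\max_{\theta} \sum_{(\mathbf{x},\,\mathbf{y}) \in \mathcal{D}}
    \log p_\theta(\mathbf{y} \mid \mathbf{x})
```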
0:06:48 | This is a common problem, and of course it's not a new problem; anyone who's built these systems would have realized this, and there are many ways to address it. People have tried extending the loss functions, where you basically bias the system to produce longer sequences or non-redundant responses; adding an RL layer on top of the deep learning system, so that you can actually optimize your policy to do something that is non-redundant; and even injecting knowledge from sources like Wikipedia, et cetera. |
0:07:22 | In our work, what we propose instead is a conditional model, where we condition the utterance generation, the dialogue generation, on interpretable and discrete dialogue attributes. I will unpack each of those phrases within the next few slides, but here is the building block for the model. |
0:07:43 | We use the standard encoder-decoder model, but it's a hierarchical encoder-decoder model, as originally introduced in Serban et al. You can think of this as two levels of RNNs (recurrent neural networks): the first layer operates over the words in the utterance at any given time step and generates a context state, and then you have another RNN that operates over the sequence of time steps, so it operates over the multiple turns in the dialogue. |
0:08:11 | Simple enough. Of course, training these things is never simple; there are all kinds of hyperparameter tunings, et cetera, but we're not going to talk about that. |
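A minimal sketch of the two-level recurrent encoder just described (HRED-style, after Serban et al.); module names and sizes here are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Word-level RNN per utterance, then a dialogue-level RNN over turns."""
    def __init__(self, vocab_size, emb_dim=128, utt_dim=256, ctx_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.utterance_rnn = nn.GRU(emb_dim, utt_dim, batch_first=True)
        self.context_rnn = nn.GRU(utt_dim, ctx_dim, batch_first=True)

    def forward(self, turns):
        # turns: (batch, n_turns, n_tokens) padded token ids.
        b, t, n = turns.shape
        words = self.embed(turns.reshape(b * t, n))        # (b*t, n, emb_dim)
        _, utt_state = self.utterance_rnn(words)           # (1, b*t, utt_dim)
        utt_states = utt_state.squeeze(0).reshape(b, t, -1)
        ctx_states, _ = self.context_rnn(utt_states)       # (b, t, ctx_dim)
        return ctx_states  # one context state per turn; seeds the decoder
```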
0:08:20 | Instead, what our model does is this: we propose a conditional response generation model, where we try to learn a conversational network that is conditioned on interpretable and composable dialogue attributes. You have the same first layer of RNN operating over the words in the utterance, but instead of using just the context state to start decoding and generate a response, we are now going to model attributes, dialogue attributes, and I'll tell you what those dialogue attributes are. |
0:08:47 | These are interpretable and discrete attributes. They're not like latent attributes, where you have continuous representations that model a dialogue state, et cetera; here we use discrete attributes, which are predicted and modeled during the generation process. You now predict the attribute at a given time step, and that, plus the context state, is used together to generate the decoding state, and then you start generating the utterance after that point. |
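A sketch of the attribute-conditioning step as described: an MLP predicts a discrete attribute from the context state, and the attribute's embedding joins the context state to initialize the decoder. This is my reading of the model, not the authors' code; names and sizes are assumptions (42 echoes the Switchboard dialogue-act set):

```python
import torch
import torch.nn as nn

class AttributeConditioner(nn.Module):
    """Predicts a discrete dialogue attribute and builds the decoder init."""
    def __init__(self, ctx_dim=256, n_attributes=42, attr_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(               # a simple MLP, as in the talk
            nn.Linear(ctx_dim, 128), nn.ReLU(),
            nn.Linear(128, n_attributes),       # logits over attributes
        )
        self.attr_embed = nn.Embedding(n_attributes, attr_dim)

    def forward(self, ctx_state):
        # ctx_state: (batch, ctx_dim) from the dialogue-level RNN.
        logits = self.mlp(ctx_state)
        # Training: cross-entropy against (weakly) labeled attributes.
        # Inference: just take the most likely discrete attribute.
        attr = logits.argmax(dim=-1)
        dec_init = torch.cat([ctx_state, self.attr_embed(attr)], dim=-1)
        return logits, attr, dec_init           # dec_init seeds the decoder
```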
0:09:14 | So what is a dialogue attribute? We intentionally chose things like dialogue acts, sentiment, emotion, and speaker persona: these are things that we actually want to model about a dialogue. The reason is that we want controllable semantics. It's not just about saying, hey, does it look fluent or not. Imagine I want to make the dialogue sound more happy, or adopt a specific speaker style, or a specific emotion, or, in the extreme, and this is further along, you want your dialogue systems to start becoming empathetic, et cetera. First of all, quantifying what that even means is a hard problem; we could have a whole talk on just that. |
0:09:59 | And this is the crucial part here: we are trying to force the encoder not to just generate the contextual state, but to also use it to generate a latent but interpretable representation of the dialogue at that particular time step, and together use them to start the generation process. |
0:10:16 | Now, these are composable, as I said. It's not just one single dialogue act or dialogue attribute that you predict; you can actually predict multiple of them, so you can have a sentiment and a dialogue act and an emotion and a style all being represented in the same model, and in a few slides we'll see why this is useful. |
0:10:36 | This is pretty much the gist of the model. The part that changes is that you now also model the attribute sequence. Predicting the attribute itself is a simple MLP (multilayer perceptron); you can have fancier things, but this is integrated with the joint model and then used during the generation process. |
0:10:55 | The best part is inference. You might say that now you're complicating the model even more: you just introduced another bunch of parameters, so obviously it's going to get better perplexity, but what are you going to do for annotation? Do you need another system just to give you manually labeled, annotated data at the attribute level for your dialogue? The good news is that you don't need it. |
0:11:14 | Here's how you do the inference. You start predicting the dialogue attributes from the dialogue context, so at any time step you use the context vector to predict the attribute. Then, conditioned on the previous attribute, you predict the next attribute: that means at time step i, you use the attribute at i minus one to predict, say, the dialogue act, combine it with the context state at i minus one, and start the generation process. |
0:11:43 | And as I mentioned, the attribute annotation is not required during inference; you just use it during training. |
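The inference procedure just described, as a factorization (my notation, one reading of the model):

```latex
% At turn t the discrete attribute a_t is predicted from the context
% state c_{t-1} and the previous attribute a_{t-1}; the response x_t is
% then decoded from both. No attribute labels are needed at test time.
p(a_t, \mathbf{x}_t \mid \mathbf{c}_{t-1}, a_{t-1})
  \;=\; p(a_t \mid \mathbf{c}_{t-1}, a_{t-1})\;
        p(\mathbf{x}_t \mid \mathbf{c}_{t-1}, a_t)
```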
0:11:49 | Now, there's a whole bunch of things you can do to get away even from the attribute annotation during training time. For example, you don't need to say that your training data also has to be tagged with sentiment labels or emotion labels or dialogue acts; you could learn an open-ended set of things, for example open-ended topics of the dialogue. I won't be getting into that in this talk, but I'd be happy to answer questions about it. |
0:12:16 | So this is the crux of the model, but of course it doesn't stop there. For most dialogue systems, we also have an RL (reinforcement learning) layer on top of that, where you try to optimize a policy gradient. Usually these objectives are slightly different from the maximum likelihood objective; that means you're trying to bias towards long responses or some other goal. We use standard REINFORCE, and usually the policies are initialized from the supervised pre-training: the attribute-conditional hierarchical recurrent encoder model is the one that's pre-trained, and then you initialize the RL policy parameters from that state. |
0:12:52 | In standard works, this is how it looks: you formulate the policy as a token prediction problem, so the state space is basically represented by the context state, meaning the encoder state, and the action space is predicting tokens from the vocabulary, one at a time. |
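The standard token-level policy gradient being described, in standard REINFORCE notation (mine, not the slides'):

```latex
% REINFORCE with the policy over tokens: each action is a token from the
% full vocabulary V, so the score-function estimator has high variance
% when |V| is large.
\nabla_\theta J(\theta) \;=\;
  \mathbb{E}_{\mathbf{w} \sim \pi_\theta}\Big[\, R(\mathbf{w})
  \sum_{j} \nabla_\theta \log \pi_\theta(w_j \mid \mathbf{s}_j) \,\Big]
```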
0:13:09 | What's the problem with this? Besides the vocabulary being large for open domains, what usually ends up happening is that these policy gradient methods exhibit high variance, and this is basically because of the large action space. And the RL, which was introduced to bias the supervised learning system away from what it was trained on and towards meaningful dialogue, instead tends to step away from the linguistic and natural language phenomena, simply because certain words are more frequent than others. Again, the policy is trained to pick those words from the vocabulary that will maximize its reward or utility function. And of course, training and convergence is another issue in this setting as well. |
0:13:52 | Instead, what we say is: rather than doing the token generation, we formulate the policy as a dialogue attribute prediction problem. The state space now becomes a combination of the dialogue context and the contextual attributes, and these attributes of the dialogue are the ones I mentioned in the previous slide. The action space is the set of dialogue attributes: something less latent, something more interpretable. |
0:14:19 | And in fact, think about it: if you capture some aspect of the semantics, say a sentiment, you don't need all the possible words in the English vocabulary, or any language's vocabulary, to generate that specific sentiment. As soon as you capture that gist, the generation downstream can do much more interesting things, so you're elevating the problem from the lexical level to the semantic level. |
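A sketch of the attribute-level policy update, rendered as REINFORCE in PyTorch (the function names and the reward are placeholders, not the paper's code). The point is that the action set is the handful of discrete attributes, not the whole vocabulary, which keeps the gradient variance manageable:

```python
import torch

def reinforce_attribute_step(attr_logits, reward, optimizer):
    # attr_logits: (batch, n_attributes) from the attribute predictor.
    # reward:      (batch,) a scalar reward per dialogue, e.g. diversity.
    dist = torch.distributions.Categorical(logits=attr_logits)
    action = dist.sample()                       # pick a dialogue attribute
    # REINFORCE: raise the log-prob of attributes that earned high reward.
    loss = -(reward * dist.log_prob(action)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return action
```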
0:14:44 | There's a reason why this works. People might say, okay, you introduced another attribute, another set of parameters, a latent layer; it's interpretable, which is great, and of course it's going to improve perplexity. I'll show you that it's not just about perplexity. What ends up happening, even from a learning theory perspective, is that because you're introducing these interpretable, discrete-variable models, it actually converges better, learns to generate much more fluent and smooth responses, and explores parts of the search space that it wouldn't have before, simply because almost every problem in this space is non-convex. So here you're actually using the semantics, the natural language phenomena, to guide it to a better optimum, so to speak. |
0:15:33 | The experimental results confirm the same. We ran on a bunch of datasets; the table shows perplexity, and the columns are how much training data the model was trained on. Obviously, going from left to right, the more data it is trained on, the better the perplexity of the generated dialogue. And here are the attributes that we used to model the dialogue. |
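For reference, the metric in the table (standard definition): perplexity is the exponentiated average negative log-likelihood of the reference tokens, so lower is better.

```latex
% Perplexity over N reference tokens; lower is better.
\mathrm{PPL} \;=\; \exp\!\Big(-\tfrac{1}{N}\sum_{t=1}^{N}
  \log p_\theta(w_t \mid w_{<t})\Big)
```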
0:15:56 | Now, "sentiment" means you're incorporating sentiment in the dialogue attribute stage of the model prediction; Switchboard is basically the dialogue acts; Frames is another set of dialogue acts. These can all be mutually exclusive, or complementary, or even overlapping, and what we noticed is that it's actually beneficial to compose these attributes, since they provide very different information. The fact that you model sentiment is not the same as the fact that you model dialogue acts, and modeling dialogue acts from one particular domain is not the same as modeling dialogue acts from a different one. So you can compose these attributes in a very flexible fashion, and in fact it improves the generation; that means the perplexity goes down. |
0:16:38 | So overall, what we see is that both the attribute conditioning and the reinforcement learning part generate much better responses, more interesting and diverse responses. |
0:16:51 | As I said, I keep bringing up perplexity because every time you see a deep learning system, it's easy to improve perplexity, trust me: you add more parameters to the system, you add more data, and you can improve perplexity by optimizing towards a better state with other parameter settings and configurations. |
0:17:12 | In addition, we also did human evals on the generated responses to see if they actually make sense, because that's the whole goal of generation; I believe every generation system should do human evals in some setting, if at all possible. What we noticed, comparing a standard sequence-to-sequence model with the attribute conditioning, is that the attribute conditioning actually helps diversity and also relevance; it has a much better win-to-loss ratio compared to the baseline model. |
0:17:41 | And when you add the RL conditioning on top of that, meaning we do the policy optimization from the supervised pre-training step, it does even better. So the RL, as I said, is able to move the nicely supervised training state from that initialization to an even better policy, but instead of learning it at the token level, it now learns it at the attribute level. So we're injecting attribute conditioning both at the RL level and in the supervised pre-training model. |
0:18:11 | You can also compute diversity scores; there are standard ways to do this in the literature. You look at the responses and do automatic computation of these metrics: compute the number of overlapping n-grams, et cetera, or how many distinct phrases are generated by the system. Overall, the sequence-to-sequence model is worse than the attribute-conditioned model, and the RL one is actually even better than both. |
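One common automatic diversity metric of the kind mentioned is distinct-n; whether the paper uses exactly this variant is an assumption on my part:

```python
def distinct_n(responses, n=2):
    """Ratio of unique n-grams to total n-grams across responses."""
    total, unique = 0, set()
    for tokens in responses:              # each response: a list of tokens
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / max(total, 1)

# Example: higher distinct-1 means less repetitive output.
print(distinct_n([["i", "do", "not", "know"], ["sounds", "fun"]], n=1))
```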
0:18:42 | In addition, if you take the head of the response space, meaning the most likely responses, and look at the percentage of them generated by the new systems, the percentage goes down significantly. How many times have you seen a chat system or any of the voice systems where you ask a question and it says "I don't know", right? That's a default fallback mechanism, but the goal is, instead of that, can we model something like emotional responses or other things, to engage the user in a better fashion? What this allows you to do is avoid the standard, frustrating "I don't know"; instead you get something more nuanced. It may not answer directly, but it will probably lead the conversation in a much better vibe or direction. |
0:19:31 | Here are some examples, which I won't go through, but for standard inputs (well, nothing is standard if it's from Reddit) you get interesting responses. Instead of saying things like "I don't know" or "I have no idea", you start getting longer responses, and also things that probably make more sense, for example "I'm honestly a bit confused why no one has brought me or my books" [unclear]. I don't think the sequence-to-sequence model would even produce that without the conditioning. |
0:20:10 | Or the system will say "I can't wait to see it in the city". Some of the context is missing from this example because the previous dialogue history has been cut off here, but there was something about the city being mentioned there; that's why it says "the city". |
0:20:22 | Okay, just to summarize: we proposed a new approach for dialogue generation with controllable and composable semantics. I think this is a super important and interesting topic, because it's very easy to work on generation, and we can do GANs and all kinds of things like that, but making it interpretable and controllable in this fashion, we believe (and we also showed this in our empirical experiments), helps the learning process as well. It's not just about saying that this is a nice natural language phenomenon we want to model; both the RL and the attribute conditioning improve over the baseline model by generating interesting and diverse responses. |
0:20:58 | There are a number of things that we are looking at in the future, in addition to incorporating multimodal attributes. What is the impact of pre-training classifiers? As I said, we didn't use pre-trained classifiers for the attribute prediction problem here. How do we measure the interpretability while modeling this during the training process: is the dialogue data generated actually respecting the semantics of the attributes that it predicts, that is, does it even make sense? And then, how do you do this for speaker persona and extend it to more open-ended concepts? These are open questions and thoughts. If you have any questions related to any of these things, I'm happy to answer them. |
0:21:50 | [Audience] I'm [unclear]. I was very interested in your training corpus size: in the examples you gave for the dialogue model training, you had up to two million training examples. Obviously, in that situation, I assume you're not manually generating them. Are you getting them from user examples, or where else do you get them? |
0:22:06 | [Presenter] Some of them are from the Reddit and OpenSubtitles corpora; these are available. As I said, the attributes themselves are not necessarily always manually annotated; for example, for Switchboard I believe the first part of it is annotated, for one of the datasets. But what we ended up doing is, you can take standard LDA or any other tool and actually label them, so you can have, let's say, a high-precision classifier and do a run over the training corpus. These can be single labels, for instance. |
0:22:37 | And the interesting part is that, after modeling all this, the accuracy of the dialogue act prediction doesn't necessarily have to go up in the latent system. Even though it might be around eighty percent or something like that, it's still good enough for the generation system. So there's some work to be done there, like how good can we get: if we bumped it up to ninety-nine percent, would that have an effect on the generation? These are things that we are looking at. |
0:23:18 | [Audience] I'm [unclear], from a German research lab. I just had a question: did you look at speaker persona at all? I was curious, and maybe you can speculate about it: do you think, with enough data, with the conditional model, you could model individual users, maybe tied to Reddit user names or something like that? |
0:23:35 | [Presenter] There is a joke: when we released Smart Reply, after the first version, I think it was some professor from a university who said, "These smart replies are getting very snotty to me." Well, it's training on your own data; I mean, we don't look at the data, but it's basically reflecting yourself. |
0:23:52 | So the short answer is yes, but of course you'd want to do this with the user's own data, right? And you also want to do it in a privacy-preserving manner, which I haven't talked about here at all; part of my group focuses on how to do all of this in a privacy-preserving manner. For example, you can build a general system, but then all the inference and adaptation happens only on-device, so your data is siloed off from everybody else. |
0:24:14 | And the question is, again: do you really feel like you have a specific personality? What you feel it is and what you actually write might be very different, right? So there are aspects of that to be considered. |
0:24:35 | I'll be here if you want. |