and the last talk of the session will be on characterizing the response space of questions
a corpus study for english and polish
so
relevance
in some sense
in the sense of conversational coherence
which is exemplified here in this example one
where
the ones on the first line of answers are kind of relevant
is that chain you yes it's a louis fourteen replica you
and on the second line are
the ones that are not appropriate
so this notion is the cornerstone of theories of dialogue the same way that say
grammaticality is to syntax
and you could argue that basically what the turing test is about is exactly relating
to
relevance and whether that's a good test for
you know when we've managed to simulate human intelligence
so i'm gonna restrict attention to a corpus study of relevance relating to queries
possible responses to queries
and
a bit surprisingly perhaps
even if you restrict it in this way there have been actually very few
comprehensive attempts to characterize it
there are some references in the paper and also in some earlier work that we've done
that i'll talk about in a moment
so there's some
early
discussion of this in the language acquisition literature
and some discussion of this in the conversation analysis
literature
which is primarily there to show that there's a difference between
three classes answers non-answers and non-responses
and
so in this study were based i've as it looked at n different languages and
show incident gone distributions and
we also flying i'm not the paper they won't korean the child quite different distributions
between say you the results that style but i don't english and the results and
korean but mainly about this
basically about these three classes answers non-answers and
non-responses
so today i'm gonna talk about starting by taxonomy that we developed for just characterising
query responses appear responses to queries
something that's sporty characteristic to certain race
and
we then will mention a basic hypothesis that we used to just scale up
to the general case
talk a bit about the annotation scheme and the results
and very briefly
talk about how one might model relevance and what the ready complications then hence
so the starting point of this work is the typology that we developed
paweł łupkowski and myself developed in some work that was published in
the
in two thousand sixteen in the journal of language modelling
and this is a wide-coverage taxonomy for question-question sequences it was tested on
the bnc childes
the bee corpus map task corpus
and there was also formal modelling of the resulting classes in the framework of
kos and ttr
so
this study consisted of about fifteen hundred slightly less than computing how to query a
response to as
and what are merged with seven classes of questions what are called elegy classes for
all not all corinne sponsor but for the two people those one
in this study
so we have clarification requests
things like hamlet as a response to what time it about
dependent questions
so these are things like does anybody want to buy an amstrad are you giving it
away
where you can do the inference that one question depends on the other whether anybody
wants to buy an amstrad depends on whether you're gonna give it away
a class called motiv which is questions about underlying motivation what's the matter why
fourth class
where the response is changing the topic
well as you on so always yours
a fifth class questions to do with the way whether you're trying to understand
which way you're supposed to answer you know what makes black coffee is
which country
and the final but one is questions with a presupposed answer where the
question response is somehow
indirectly indicating an answer to the first question
and the seventh is cases where
the
response ignores the initial question but still addresses the same situation so things like
do you wanna go down and have a look at that now what is
what when there and the response is why haven't they finished yet which of
course is about the workmen so it's still about the same situation but it's not
at all responding to the question
so those are the seven classes we found that are needed to
characterize
question responses to questions which is about twenty percent of all at least at
the time we found was about twenty percent of all
responses to questions
and the main hypothesis for this study
is
that responses drawn from or concerning these classes of questions
plus direct and indirect answerhood
that's going to exhaust the response space of a query
okay
so basically
you get the following kind of scheme
so a response to a question can either be an answer and here you have
two subclasses direct answers and indirect answers
and ultimately the paper also discusses that these actually need some extra subclasses
within them
and if it's not an answer then it can either be a question
response like we've
already discussed with these seven classes
or it could be a non-answer so it can be this kind of response
so a kind of an acknowledgement
two classes that i'll give an example of in a second this is the i don't know class
and this is the difficult to provide a response class
and then declarative responses that are about
these issues that already arose in the question kind of
response
so the i don't know is this kind of
very not uncommon kind of response and equally the difficult to provide
an answer case and acknowledgement of course you are all very familiar with those
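The scheme just described can be written down as a small label set. This is a minimal sketch, not the annotation tool used in the study: the class and member names are assumptions chosen for illustration, and only the grouping (answers, seven question-response classes, other non-answers) follows the talk.

```python
from enum import Enum


class Answer(Enum):
    DIRECT = "direct answer"
    INDIRECT = "indirect answer"


class QuestionResponse(Enum):
    # The seven classes of question responses to questions (names are illustrative).
    CR = "clarification request"
    DP = "dependent question"
    MOTIV = "question about underlying motivation"
    NO_ANSW = "question that changes the topic"
    FORM = "question about the way of answering"
    QA = "question with a presupposed answer"
    IGNORE = "ignores the question, same situation"


class NonAnswer(Enum):
    ACKNOWLEDGEMENT = "acknowledgement"
    DONT_KNOW = "i don't know"
    DIFFICULT_TO_PROVIDE = "difficult to provide a response"
    DECLARATIVE_ABOUT_ISSUE = "declarative about one of the issues above"


def top_level(label):
    """Map a fine-grained label to its top-level group; anything else is 'other'."""
    if isinstance(label, Answer):
        return "answer"
    if isinstance(label, QuestionResponse):
        return "question response"
    if isinstance(label, NonAnswer):
        return "other non-answer"
    return "other"
```

Anything that falls outside the three enums lands in the residual "other" group, which is the class whose size the coverage figure later in the talk refers to.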
so the data for english comes from the bnc
the bee corpus and the map task corpus
so as you
you're probably most of you familiar with these corpora the bnc is a
ta p honestly conversations
bee contains tutorial dialogues from electronics courses
and map task consists of dialogues recorded for a direction providing task
so we took about five hundred pairs from the bnc two hundred and fifty from
bee and
slightly less than five hundred from map task
and
basically
the way this was done was a random selection of turn units ending with
a question mark
where we also eliminated tag questions and turns with missing text
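The selection procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not the script used for the study: the `<unclear>` marker for missing text and the tag-question pattern are hypothetical, and the pattern is a crude heuristic that relies on a comma before the tag.

```python
import random
import re

# Crude, hypothetical English tag-question pattern (e.g. "..., isn't it?").
TAG_QUESTION = re.compile(
    r",\s*(?:is|isn't|are|aren't|do|don't|does|doesn't|did|didn't|"
    r"will|won't|can|can't|have|haven't)\s+(?:i|you|he|she|it|we|they|there)\s*\?$",
    re.IGNORECASE,
)
MISSING_TEXT = "<unclear>"  # hypothetical marker for missing text in a transcript


def is_candidate(turn: str) -> bool:
    """Keep turns that end with a question mark, have no missing text, and are not tag questions."""
    text = turn.strip()
    if not text.endswith("?"):
        return False
    if MISSING_TEXT in text:
        return False
    return TAG_QUESTION.search(text) is None


def sample_question_turns(turns, k, seed=0):
    """Randomly select up to k candidate question turns (seeded for repeatability)."""
    candidates = [t for t in turns if is_candidate(t)]
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))
```

Each sampled turn would then be paired with the turn that follows it to form a query and response pair for annotation.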
the polish data was taken from the spokes corpus which is basically the spoken part
of the polish national corpus
and that corpus consists of about two hundred and fifty thousand utterances
and for this we chose about two hundred yes
okay so the basic results
are that for english the
other class is less than three percent so we have
more or less close to ninety seven percent coverage with this taxonomy
perhaps not hugely surprising
the most frequent class of responses in all three corpora in english
is direct answers
in the bnc the next biggest class is clarification requests
in bee the next biggest class is indirect answers in map task the second biggest is
you know actually
ignore the case where you respond with another utterance which is about the same situation
but it's not responding to the question
so you can already see that there is a fair amount of variability across corpora
for polish the two most frequent classes of response are answers so direct ones
and indirect ones
and then the next two frequent classes are the i don't know class
and the ignore class
so this is roughly the results and obviously it'll be a bit hard for you
this is all in the paper so if you're interested in the results in
detail you can see it there but you can see at the top
you have of course most of the mass is taken by the
direct answers
but with
the task oriented corpora getting much more direct answers than
something like the bnc the open corpora like bnc
and spokes
and then you can see
that there's a fair amount of variability
across corpora for the different kinds of classes telling you that you know
there's no chance of getting a good characterisation of
this problem just by looking at one corpus
and as we found in the question study there is quite a large
variability across corpora in terms of these kind of distributions so the nature of the
corpus really
again not very surprisingly influences very much the kind of distributions you get
as far as reliability goes
so i'll just speak about the english part for reasons
of time but the polish is discussed in table two so we did an inter-annotator
study we had two main annotators who are also authors of the
paper
and were graduate students in linguistics
l two speakers of english and they underwent several training sessions with me
and
both annotated around five hundred pairs and from this we extracted five hundred commonly annotated
pairs
and
we got a kappa of about one sixty five a group and of
about one sixty six
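For readers who want to reproduce an agreement figure of this kind, here is a minimal sketch of Cohen's kappa for two annotators over the same items. It assumes the reported kappa is Cohen's kappa, which the talk does not state explicitly; the function name and label strings are illustrative.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

Because kappa corrects for chance agreement, frequent confusions between two large classes such as direct and indirect answers pull it down more than the raw agreement rate would suggest.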
there were
ninety four cases where annotation disagreements occurred
the main disagreements concerned direct answers versus indirect answers so that's
about a third of the disagreements
ignore versus
change the topic acknowledgement a direct depending question and a direct answer
and acknowledgement this is the
so direct indirect disagreements mostly occurred with why questions how questions and what is x
doing questions
and these are cases where answers are by and large sentential
and for which has been significant can promising theoretical literature on how to characterize onset
so just to give a couple of examples
so we have here case with the why question why deep tan'll to know that
well as the new guy
so the annotators disagreed whether it was a direct or indirect and eventually it was resolved
to indirect
and here is another example
this is a four to one again to why question i thought very nice is
it no it isn't what is why isn't it "'cause" it isn't
and this with again to go clean direct on statistical model
and eventually resolved to an indirect answer since it indirectly indicated there is actually no
reason
okay so this is just to give you a sort of flavour
of
the nature of the kind of disagreements and
ultimately
the fact that probably
this is a kind of task
where
a more sophisticated notion of annotation which doesn't necessarily
lead to a resolution but leads to actual different kinds of judgements having to be maintained
is probably needed
okay so the final thing i'll just mention is
the sort of formal analysis
that is needed in order to
describe this problem formally
so in our original paper we provided rules within the kos formalism
that
characterized
the coherence of
these seven classes of questions the kind of question responses to questions
and to the extent that
what the study has shown is that basically
the class of
responses
is
basically answers
plus
direct and indirect answers plus
things that address these basic issues
then we already have essentially a complete characterization of the response space
which is again potentially in an implementable form in the sense that
the kos formalism is a sort of information
state type formalism so it's actually giving you
the potential for implementing a kind of
dialogue manager
so just to make a few comments in that respect the most basic
notion of answerhood we might say is
something that has been called simple answerhood
so if you think of what a question is mathematically it's essentially
some kind of a lambda abstract
where for polar questions it's an abstraction over an empty set
of variables and for wh questions over a set of one or more
variables
then
simple answers are of course for polar questions just the two polar opposites
and
for
wh questions they are the instantiations and their negations
and simple answerhood if you look at the corpus has pretty
good coverage as
we know and this is of course a pretty direct
way of talking about slot filling
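Simple answerhood as just described can be sketched in a few lines. This is a toy illustration under stated assumptions, not the formal treatment from the paper: propositions are plain strings, the domain of individuals is a small list, and the names `Question` and `simple_answers` are invented for the example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Tuple


@dataclass(frozen=True)
class Question:
    """A question as an abstract: a tuple of variables plus a body that builds a proposition."""
    variables: Tuple[str, ...]   # empty for polar questions
    body: Callable[..., str]     # maps variable values to a proposition (here a string)


def simple_answers(question, domain):
    """Every instantiation of the abstract over the domain, each followed by its negation."""
    answers = []
    for values in product(domain, repeat=len(question.variables)):
        proposition = question.body(*values)
        answers.append(proposition)
        answers.append(f"not({proposition})")
    return answers


# A polar question abstracts over no variables, so it has exactly two simple answers.
polar = Question((), lambda: "leave(bo)")
# A wh question abstracts over one or more variables.
wh = Question(("x",), lambda x: f"leave({x})")
```

With an empty variable tuple the product yields a single empty instantiation, so the polar case falls out of the same definition as the wh case, which is the point of treating both as abstracts.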
but that the ultimate notion of on subword which a goal here about nist had
encode about a similar in the real lecture
have to be
actually ultimately
if you want really wide coverage have to include things that go beyond
simple answerhood so it has to accommodate conditional
weakly modalized and quantificational answers
so this addresses some of these kind of questions that the silicone is been asking
what all these poor people who are just a filling slots
and so
that was so
again i don't have time to say
how you can formally deal with these they've been discussed in the literature on questions
but at the same time even though there has been discussion of how to
accommodate these kind of
answers
so also direct answers
we're still lacking a comprehensive empirically based experimentally tested account for
a variety of wh words okay so all of this the literature is based
on a very small number of examples just for a small number of wh
words
and of course an additional notion that questions need is some notion of exhaustiveness
which has to prevent wrap traumatised
and
whether a response is exhaustive
can determine whether a response is accepted or a follow up query is required so this leads to
what i mentioned before that we need a finer grained subdivision of
the answer categories
and on the basis of aboutness and some notion of resolvedness
one can define question dependence
and that's the basis for instance for the kind of rules that you can
give the dialogue manager like if a question is under discussion respond with an utterance which
is q specific namely one which either provides an answer or a dependent
question response so that's an example of the kind of way of
characterizing the coherence of
various classes of responses
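The rule just mentioned can be sketched as a simple check a dialogue manager might run. This is a hedged illustration, not the formal rule from the paper: the `about` and `depends` relations are assumed to be given as sets of pairs, and the names `Move` and `is_q_specific` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    kind: str      # "proposition" or "question"
    content: str


def is_q_specific(move, qud, about, depends):
    """True if the move answers the question under discussion or asks a question it depends on.

    about:   set of (proposition, question) pairs, proposition is about question
    depends: set of (question, other_question) pairs, question depends on other_question
    """
    if move.kind == "proposition":
        return (move.content, qud) in about
    if move.kind == "question":
        return (qud, move.content) in depends
    return False
```

In a real system the two relations would be computed from the semantics of the utterances rather than listed by hand; the lookup tables here only stand in for that step.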
the final class i'll mention which is another very big class and has again
important
implications for the kind of information that you need in your
representations is clarification requests
so there's been quite a lot of work on
that going back to work by matthew purver and myself
where we showed how to account for the main classes of clarification requests
using rules that enable clarification questions to be relevant given an utterance
so the basic idea
without going into details is that it involves accommodating into context certain kinds
of clarification questions
with rules of this basic format
so that the input is so much as you given so much as you want
something state you would the constituent of this is actions on that application
then you can accommodate any of these kind of a this class of questions what
a mean by you one what would today is that you one or a kind
of a confirmation kind of questions
but note to do this you could not do this just on the basis of
having
the content of the question as input you need the whole sign
that's associated with an utterance
okay
so
conclusions so i presented here an initial study for
what we've as possible we can see the first detailed form in depend characterisation of
the response space of queries
and there are
a lot of things that need to be done
so one thing is cross question type comparison so the set of
question-response pairs that we looked at
was selected randomly and obviously it's interesting to consider the distribution of responses relative to fixed types
of questions
so different sorts of wh questions polar questions and so on and again we can be
fairly sure that there'll be different distributions for different types of questions
we need to apply machine learning to acquire the response classification scheme of course so
there's been some work on the learnability of nonsentential utterances
so that's a
subclass of the kind of responses that exist
so that gives hope for the learnability of some of the classes
we anticipate that some of these classes will be pretty difficult to learn for instance
the ones that are
heavily based on inference like indirect answers and change the topic
obviously
as everybody here is interested in dialogue system implementation so we'd like to test
these in a dialogue system with a fairly sophisticated dialogue management
for instance of the goat is class
so there's been some initial experiments and work by arrive at allen and gotten but
and of course here we gave you a
bit of work on english and polish
which of course only shows you some differences
so a significant challenge we think is to see how you
test this classification with languages that
lack
speech corpora such as
about ninety five percent ninety percent of languages
so we we're starting doing some work on this respectively we go and we and
just by using online games
online games with a purpose
we have a few minutes for questions
hi first of all thank you for your talk this is really interesting
so the question that i had was that if you could go back to slide
seven please
so your example here for the changing of the topic it seems to me
that this is not exactly a changing of the topic because you're staying on the
same topic they asked you what your answer was in us to what there's at
the same general topic but rather more of an indirect refusal to answer
so i was wondering it seems like changing the topic is always an indirect refusal
to answer and would you consider a refusal to answer as part of your ontology
so i mean this is a kind of indirect basically thing that you might is
an implicature all providing this kind of response is that you know you're trying to
i mean you certainly not addressing this issue right
so we i
in our original work we actually suggested commented that when
you provide this kind of response
then
maybe for politeness reasons the most common way to do this is by asking
a question which is kind of unifiable with the original one with
more general one you know we should talk about what or whatever is well though
this is you know
so this work some for many cases but
i mean the more general thing is just to provide i mean you could you
can you know you can sort of do this kind of changes topic
in a way that it can be less smooth of course but you know by
throwing something that is quite different and these things also happen so this
this is the smoothest way of doing it just from a politeness point of
view
but it's not have a well that's gonna be
in this way
so you the basic dynamics
for this coherence have to ultimately allow you to that
as a consequence potentially
to get one of these questions eliminated
so that works in the setup did you see any instances
of like direct refusals to answer a question in your corpora
where somebody asks a question and somebody just says i don't want to answer that
or i refuse to answer i mean they're not very common
it's not that common but how would your scheme like what class would you put
that in well that's the
so that's here so these are the declarative ones that are about the issue that
is the underlying issue of changing the topic i see thank you
we have time for another question
thanks i wanted to follow up on your future work relating to
types of questions
and i guess you have and only about but i wonder how much of your
differences in the corpora might be due to different distribution of questions that are in
those corpora versus distribution of types of answers to those types of questions
then also maybe could you quickly comment on how you define question cause i think
you said there's just a question mark in the corpus so you're
you can probably one reason you're not including declarative for direct a no so basically
we in terms of a pharmaceutical questions we just
doing this kind of family
so i mean that's kind of building on the fact that
you know transcription has decided this is a question
and so that basically means that it's typically going to be wh questions or polar
questions
which could be either you know they could also be declaratives
that have a question mark at the end so but they usually have the same basic
function as well as a sort of draw people what
so you're set your i mean i guess because since we have done this we
don't know
and we i think it's all the cnn interesting question also exactly you know
i again i'm not aware of what the street address this all that is look
at you know what the difference we have actually in a as the c l
paper that we had a two thousand seven of oracle financial mapping and me we
actually did have some tables of the different distributions of different kinds of wh questions
both
clauses and lexical ones so there is some work on this actually but you know
that was just for one of hope that i think of some b and c
so it's
forward
let's thank the speaker one more time