thank you
So I will present a recent work carried out in collaboration with a company. This company is interested in e-commerce, and in this scenario we have been working on entity recognition for conversational agents.
The general idea is that the long-term goal of this work is a conversational agent, a kind of shop assistant, that mediates between users and the task of buying products in an e-commerce setting. For instance, the user asks for a certain kind of product, and the supposed behaviour of the shop assistant is to present the relevant products to the user.
This is a task-oriented scenario, and basically it can be approached with the traditional slot-filling approach: there are several intents that the system is supposed to recognize, and then a classifier fills the slots, such as the product category, the brand, the colour, and other properties.
The approach should work on several domains, for instance furniture or groceries. A relevant peculiarity of this scenario is that basically there are no annotated utterances, that is, no sentences or requests from users annotated with the intent and the properties of the specific domain.
Another relevant factor is that it is much easier to find catalogues where information about the products is present.
Given this long-term scenario, in this work we focused on specific issues: we focus on entity recognition, for instance the capacity to recognize the product name that is mentioned within a user utterance.
We based our work on gazetteers, which in this scenario are catalogues, basically product datasets that we can get from vendors on the web. The main research question for us is how far we can go without any annotated data, and this is why we call this a zero-shot approach.
A few words about the specific issues of product entity names in this setting. Basically, these are different from traditional named entities; they are what the information extraction tradition has called nominal entities. For instance, an entity may contain connectives, as in "black and white t-shirt", where "black" is a property of the entity. Entity names may also contain adjectives, or even proper names, if you think about how products are named.
Another very important property for our approach is compositionality. Since these are nominal entities, we may assume that they respect some compositionality principle of the language. For instance, in the food domain, "pasta with broccoli" is a head noun plus a prepositional modifier, and we can also add an adjective to modify it. Then, knowing that we have "pasta with broccoli", we may infer that "spaghetti with broccoli" may also be a good name, even if it has never been seen before.
This means that our approach should be able to take advantage of compositionality. There may also be multiple entities of the same semantic category in the same utterance, which is not the case in other domains: in flight booking, for instance, there is usually just one destination per utterance. Here it is quite common, for example "I would like to order the salami pizza and the mushroom pizza": two entities of the same semantic category in the same utterance.
And then there is a strong need for multilinguality, of course: these systems need to work accurately in multiple languages.
So what are our working hypotheses? We would like to train a model for entity recognition based only on a gazetteer, that is, a catalogue of products, and then we would like to apply this model to label unseen entities in user utterances. The point is that we have only the gazetteer and nothing else, and we want to understand how far we can go; that is the working hypothesis.
The main intuitions of our approach are the following: take advantage of the compositional nature of product names, so that we extract as much knowledge as possible from the gazetteer; use synthetically generated data as much as possible, since, having no real data coming from users, we need to work on synthetically generated data; and be as language independent as possible.
The approach basically consists of four steps. At the beginning we collect a gazetteer for a certain domain. Then, starting from this gazetteer, we generate both positive and negative examples of what the entity names of the domain can be. On the basis of these positive and negative examples we build a classifier, in our case a neural classifier, able to recognize the entities of that specific domain. Having this classifier, a model which can discriminate whether a given sequence of tokens is an entity of a certain domain or not, we apply it to recognize entity names in utterances: we run the classifier over all the sub-sequences of a user utterance and select the best-scoring sequences that do not overlap. I will go through the four steps with some examples.
The first step is collecting a gazetteer for a certain domain. We simply scraped the websites of vendors for a number of domains. The underlying assumption here is that scraping a website yields a list that covers most of the entity names of the domain, without requiring any annotated data, particularly because we do not yet have a running system. So this is the first step: just collecting.
The second step is to generate positive and negative examples. The positive examples, at least in our initial approach, are quite simple: all of the entity names in the gazetteer are positive examples. We downloaded them from a website, so we trust the website. As for the negatives, for each positive example we generate one or more negative examples following some simple rules. For instance, each sub-sequence of a positive example is a negative example; that is simple. The second rule: we take a positive example and add one token, randomly selected from the data, at the first or at the last position.
okay so we
compose a negative the negative examples
for instance if we start with the black and white t shirt the
okay this is the positive and negative a to all the some sequences play why
the black and white black and white and so on but negative but also black
and white to show the preceded by as being the randomly selected the from the
data
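As an illustration, here is a minimal sketch of how such negative examples could be generated, assuming the gazetteer is a list of tokenized names; the function and variable names are illustrative, not taken from the actual system.

```python
import random

def generate_negatives(name_tokens, vocabulary, rng=random):
    """Generate negative examples for one gazetteer entry, following the two
    rules described above: (1) every strict contiguous sub-sequence of the
    positive example is a negative; (2) the positive example with a random
    vocabulary token prepended or appended is a negative."""
    negatives = set()
    n = len(name_tokens)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if j - i < n:                       # strict sub-sequence only
                negatives.add(tuple(name_tokens[i:j]))
    extra = rng.choice(vocabulary)              # token drawn from the gazetteer vocabulary
    negatives.add((extra, *name_tokens))
    negatives.add((*name_tokens, extra))
    return negatives

vocab = ["red", "jacket", "cotton", "shirt"]
print(generate_negatives(["black", "and", "white", "t-shirt"], vocab))
```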
Notice that in these gazetteers, which we downloaded from the web, there is a lot of noise. We have no control over how the vendors write the names of their products, so some entries might be completely wrong.
So in the second step we generate positives and negatives; on this basis we build a model, a neural model: a classifier which is able to say, given a sequence of tokens, yes, this is a food name, or no, it is not; yes, this is a furniture name, or no, it is not.
We did not build a completely new classifier: it is based on a neural model proposed by Lample and colleagues a couple of years ago, with a rather classical architecture that uses both word embeddings and character embeddings. We added a few handcrafted features that are made available to the classifier, namely features relative to a given token: the position of the token, its frequency, its length, its unigram probability, and, as the only linguistic information we use, its part-of-speech tag, without any disambiguation. At the end, the classifier says whether a given sequence of tokens is an entity name of the domain, together with a confidence score. It is as simple as that.
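To make the handcrafted features concrete, here is a small sketch of what they might look like for a single token, under the assumption that frequencies and unigram probabilities are computed from the gazetteer alone; the exact feature definitions of the original classifier may differ, and the part-of-speech feature is omitted here.

```python
import math
from collections import Counter

def token_features(tokens, index, unigram_counts, total_tokens):
    """Illustrative handcrafted features for one token: relative position,
    frequency, length and smoothed unigram log-probability (the POS tag
    mentioned in the talk is left out of this sketch)."""
    tok = tokens[index].lower()
    freq = unigram_counts.get(tok, 0)
    return {
        "position": index / max(len(tokens) - 1, 1),
        "frequency": freq,
        "length": len(tok),
        "unigram_logprob": math.log((freq + 1) / (total_tokens + len(unigram_counts) + 1)),
    }

gazetteer = [["black", "and", "white", "t-shirt"], ["yellow", "shirt"]]
counts = Counter(t.lower() for name in gazetteer for t in name)
print(token_features(["golden", "yellow", "shirt"], 2, counts, sum(counts.values())))
```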
So now we have this classifier. Remember that our goal is to recognize entity names in sentences, in user requests. Think about this example, a possible request from a user: "I'm looking for a golden yellow shirt". We apply the classifier to all the sub-sequences of this request and ask it to say whether each sub-sequence is positive or negative. In this case the positives would be "shirt", "yellow shirt" and "golden yellow shirt", and the negatives would be sequences like "I'm looking for" or "a golden".
Then we rank the sub-sequences on the basis of the confidence of the neural model, and we select the ones whose confidence is above a threshold and which do not overlap. In this example we might get "golden yellow shirt", "yellow shirt" and "shirt"; the overlapping ones are discarded in favour of the first, so from "I'm looking for a golden yellow shirt" we extract "golden yellow shirt", which is correct.
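The span-selection step can be sketched as follows; the scoring function stands in for the trained neural classifier and is purely illustrative, as are the threshold and the maximum span length.

```python
def extract_entities(tokens, score_span, threshold=0.5, max_len=6):
    """Score every sub-sequence of the utterance with the binary classifier,
    keep the ones above the confidence threshold, and greedily select
    non-overlapping spans by decreasing score."""
    candidates = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + max_len, len(tokens) + 1)):
            score = score_span(tokens[i:j])
            if score >= threshold:
                candidates.append((score, i, j))
    selected, used = [], set()
    for score, i, j in sorted(candidates, reverse=True):
        if not any(k in used for k in range(i, j)):
            selected.append((tokens[i:j], score))
            used.update(range(i, j))
    return selected

KNOWN = {"golden", "yellow", "shirt"}

def toy_score(span):
    """Toy stand-in for the neural model's confidence."""
    if span[-1] == "shirt" and all(t in KNOWN for t in span):
        return 0.6 + 0.1 * len(span)
    return 0.1

print(extract_entities("i am looking for a golden yellow shirt".split(), toy_score))
```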
So this is the methodology we wanted to apply, and we wanted to know how well we could do with this simple model.
We ran some experiments. As I said, we collected the gazetteers; we now have gazetteers for three domains across two languages, English and Italian, with different characteristics. For each gazetteer we report the number of entities, the number of tokens, and the average length of the names with its standard deviation. The standard deviation is a kind of index of how complex the names are: the higher it is, the higher the complexity of the names in the gazetteer.
We also report the type/token ratio: a higher ratio indicates higher lexical variability, and therefore more complexity. In addition, we added the proportion of times that the first token of a name appears in the first position of other names, which gives some indication of how stable the semantic head of the entity names is.
These values differ across gazetteers: you can see, for instance, that the Italian data behave differently, meaning that in Italian the first token of a name is usually the head, while this is not the case for English. The last feature we want to point out is the proportion of entity names that are contained inside other entity names; this gives some idea of the compositionality of a gazetteer: the higher this value, the more names we find contained within other names, and the more compositional the dataset is.
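As a rough illustration of two of these statistics, the type/token ratio and the containment-based compositionality index, here is a small sketch; the exact definitions used for the reported tables may differ.

```python
def gazetteer_stats(names):
    """Compute the type/token ratio and the proportion of entity names that
    occur as a contiguous sub-sequence of another entity name (used here as a
    rough compositionality index). `names` is a list of tokenized names."""
    tokens = [t.lower() for name in names for t in name]
    type_token_ratio = len(set(tokens)) / len(tokens)

    def contains(big, small):
        return any(big[i:i + len(small)] == small
                   for i in range(len(big) - len(small) + 1))

    contained = sum(1 for a in names
                    if any(a is not b and contains(b, a) for b in names))
    return type_token_ratio, contained / len(names)

names = [["pasta", "with", "broccoli"], ["spaghetti"], ["pasta"], ["broccoli"]]
print(gazetteer_stats(names))
```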
This is the experimental setup. We have six datasets: the three domains and the two languages just mentioned. We split each gazetteer so that one important property holds: no entity present in the training set is present in the test set. For training we also generate negative entity names, as illustrated before: for each positive example we generate a number of negatives.
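A minimal sketch of such a split, assuming entities are simply partitioned at random so that training and test names are disjoint; the split ratio here is an arbitrary placeholder.

```python
import random

def disjoint_split(names, test_fraction=0.2, seed=0):
    """Partition the gazetteer so that no entity name in the training set
    also appears in the test set."""
    unique = sorted({tuple(n) for n in names})
    rng = random.Random(seed)
    rng.shuffle(unique)
    cut = int(len(unique) * (1 - test_fraction))
    return unique[:cut], unique[cut:]

train_names, test_names = disjoint_split(
    [["yellow", "shirt"], ["black", "t-shirt"], ["golden", "yellow", "shirt"],
     ["white", "shirt"], ["blue", "jacket"]])
print(len(train_names), len(test_names))
```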
Then the test data; this is important: we do not have any real test data, so the test set is synthetically generated. We start from a set of templates, a little more than two hundred, both for English and Italian. Typical templates correspond to the intents of the domain: templates for selecting a product, templates for asking for a description, templates for adding a product to the cart, and so on. We then fill the template with the name of an entity taken from the test part of the data.
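The synthetic test data generation could look roughly like this; the templates below are made-up examples standing in for the two hundred or so templates mentioned, and the helper names are illustrative.

```python
import random

def generate_test_utterances(templates, entity_names, n=5, seed=0):
    """Fill intent templates with entity names from the held-out test split,
    keeping the gold entity span for evaluation."""
    rng = random.Random(seed)
    utterances = []
    for _ in range(n):
        template = rng.choice(templates)
        name = " ".join(rng.choice(entity_names))
        utterances.append((template.format(entity=name), name))
    return utterances

templates = [
    "i am looking for a {entity}",      # selecting a product
    "can you describe the {entity}",    # asking for a description
    "add the {entity} to my cart",      # adding a product to the cart
]
test_entities = [["golden", "yellow", "shirt"], ["black", "and", "white", "t-shirt"]]
for utterance, gold in generate_test_utterances(templates, test_entities, n=3):
    print(utterance, "->", gold)
```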
We have two baselines. The first is a simple rule-based baseline that uses the gazetteer directly: a chunk of tokens in an utterance is recognized as belonging to a category if that chunk is present in the gazetteer.
Then we also wanted to test a neural sequence-labeling model trained on synthetically generated data, so we applied the same methodology we used for generating the test data to generate synthetic training data.
These are the results of our experiments: the two baselines plus our system in the last row. We see that, for all our datasets, the system trained on the gazetteer significantly outperforms the two baselines, which is already, I think, an interesting result.
Something more about the domains. Food is the most complex: as you can imagine, and as it emerges from the gazetteers, it has high lexical variability and high compositionality, both in Italian and in English, so the results there are the lowest among the three domains. Furniture is the least compositional, so its names are basically easier; on the other hand it is the smallest dataset we have, just a few hundred entities, compared with about a thousand entities for the other domains. Clothing is very regular and highly compositional, so here we obtain good results.
To conclude: I have reported some experiments on a zero-shot approach to entity recognition in which the gazetteer is the only source of information. It does not assume any annotated sentences for training, and also the test data are synthetically generated. We focused on nominal entities, because these are the kind of entities used in this scenario for naming products. The approach tries to take advantage of, and extract as much knowledge as possible from, the gazetteer, in particular thanks to the compositionality of product names. In many respects this is very initial work, and we see quite a lot of room for improvement.
Three lines of activity are planned. The first one is simply to consider that the state of the art in sequence labeling is improving, actually almost daily: there are new approaches and new models; for instance, we recently tried a newer model and it seems better than the previous one. There is also a lot of room for experimenting with and improving the methodology for generating synthetic data: we experimented with some parameters, for instance the ratio of positive to negative examples, but there may be better settings for these parameters. Then, of course, it would be very interesting to integrate this with the scenario in which we do have some data, maybe only a little, a few annotated sentences, and to combine the gazetteer-based model with what we call an NLU model, a model trained on synthetically annotated data. So there is, I think, a lot of work for the future, where the focus is to make these approaches as domain independent as possible, able to move from one domain to another with the same technology, and also language independent.
Thank you.
Yes, sure. The templates are disjoint: both the entities and the templates are disjoint between training and test, so we try to separate training from test as much as possible.
That is a good question, but I don't think I have an answer at the moment. The focus was on doing recognition on basically isolated sentences, so I don't handle any dialogue context. These aspects would probably have to be considered within the broader frame of a dialogue system; actually, this work is closer to traditional information extraction than to dialogue, so we are not there yet.
Sorry, could you repeat? No, even the word embeddings, the word vectors, are generated from the gazetteer. That is a good point: we do not use pre-trained vectors or anything external; everything is generated from the gazetteer, so this is the only source of information.