Okay, so what should follow: at the end of the application session we should have the selected posters. As I found out, we somehow didn't manage to organise this well, so we didn't have the posters selected exactly, and I quickly went around searching for the posters that would best fit the application theme — and I actually found that we have them here: the Google ones, the Microsoft Research ones, and so on. These are probably not all of the relevant posters, but I would invite some people to this panel so we can discuss application issues.

But maybe let me do it this way: I will invite again — I'm sorry — is the last speaker, with the database, still here? If I can, I would just invite all the speakers that we had here today to sit here, and also the people that we see on the posters, say somebody from Nuance or Microsoft. I don't know if we have anybody here; if you want to join us and your company is represented, just join as well. Can I keep you here a little longer?

And then, well, I hope that the audience will help me ask the important questions that we can put to the people from industry and to the people that build applications. We had several talks about applications today — and we have Niko Brümmer here: not everybody was talking directly about applications; he was also talking about how to calibrate systems so that they work at all the operating points and we can use them for all the different applications.
So the first thing I want to talk about is what we found most interesting today. My question here would be: did we actually find this session useful, did we learn something from the people that presented, and do we want to organise such sessions again, maybe at some other conference? Do we think this was actually relevant, that we learned anything useful — or what should we have learned from it? Maybe you can now even tell us, in a short summary, what should have been the take-away message from your talks, and what you think we should have learned from your research.
I mean, it was very interesting, to me at least. On technology becoming product: I think it is important for researchers working in this field to be able to explain what we do, to show its importance, and ultimately the fact that it can turn into products. Now we have all these talks and we get a lot out of them — but did you notice how much data they said they have? That is not just a nice side result; there really is that much of it.
Pedro, so you are collecting how much — was it two thousand hours per hour, or what was it?

I haven't worked it out exactly; my back-of-the-envelope estimate is —
But once you told me that there are some five or six companies that process tens of thousands of hours of audio — you are counting all the audio recorded in call centres.

When you say that, maybe the majority of it is only recorded for liability purposes, right, so not much of it is actually processed, except —

More and more of it is, I think. For the industry companies we know of, the real volume is in the tens of thousands of hours.

So it sounds like a lot.
Really — I know there will be privacy issues — but if you really collect something like a thousand hours of audio every hour, I guess you could even negotiate with your customers that they would be willing to give away one second per hour for free, and if you were willing to share that, it would actually be about nine thousand hours per year, and we would be pretty happy about that.
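(As a back-of-the-envelope check of this figure — and assuming, which the recording leaves unclear, that "a thousand hours every hour" means roughly one hour of audio processed per second of wall-clock time — the numbers do come out near nine thousand hours per year:

\[
3600\,\tfrac{\text{h audio}}{\text{h}} \times 8760\,\tfrac{\text{h}}{\text{yr}}
\approx 3.15\times 10^{7}\,\tfrac{\text{h audio}}{\text{yr}},
\qquad
3.15\times 10^{7}\,\text{h} \times \tfrac{1\,\text{s shared}}{3600\,\text{s}}
\approx 8760 \approx 9000\,\tfrac{\text{h}}{\text{yr}}.
\])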
You know, the problem is the legal framework. And, you know, many people — I don't know that I would like my own voice samples to be available. It's a lost battle: for anyone doing speech at this scale, the cost-benefit calculation is not in favour of releasing. I was telling somebody before that I think the initial databases that we do collect might be different: at least in our case, we send people to a country and we collect a couple of hundred hours, and those are collected with consent from the users. Those databases might be feasible to open-source; the problem is that I am not sure the wording of the consent agreement says the data can be made available outside. I don't know — anybody in the audience in favour of opening theirs? It does help me push for it, if it can be shown to be possible.
Okay, so I think you sort of know what we want from you: data. And I was curious, sitting on the other side of this table: what is it that you would like to see this community working on, from your perspective?
I mean, all the work done on neural networks is great, and we have been actively participating in it. There's another thing Google does, which is just funding: we spend a few million dollars on faculty research grants, many of which go to places like CMU — I don't know if you've heard about them; people at CMU do get them. So it's not just taking; we do give money.

Is there a lawyer here listening to me? [laughter] I'm not sure I have, you know, any strong suggestions; I think the work being done here is at least relevant. It is true that the kinds of things we care about involve more big data, and we cannot easily share it, so that's a problem; we need to think about some mechanism to help. I mean, we have released things like the Google n-gram corpora, because those amount to statistics computed on text, and so are not subject to all these privacy considerations.
I think that work related to semantic understanding and composition systems is really important to us; I would call on universities to send proposals in that area — I think they will resonate well. The work on very low-resource languages, I have to say, we don't feel is that relevant to us, because we care about languages that have a writing system. A lot of the limitations under which we operate are kind of self-imposed, right: we can collect two hundred hours when we need to, and a lot of the stuff we store is just not available outside. Lexical modeling, for example — learning pronunciations from data — that's interesting, but we have a lot of research in that area too.
i
i have another comment about sharing of data this is not directly relevant for speech
recognition but it works for a speaker and also for language recognition
so
many of you probably already know what the i-vector is you take a whole segment
of speech possibly even a few minutes long and
you
basically trained at that the gmm model to reflect what's happening in the speech and
you projects the parameters of the gmm model onto a relatively small vector maybe four
hundred six hundred dimensions
and
that works really well for recognizing languages and speakers so
So people are much less reluctant to ship data in that form: people will give you a bunch of i-vectors, because you cannot tell from them what is being said. One example: NIST has just launched a new speaker recognition evaluation and has made a whole bunch of i-vectors available. This is data that would normally not be shared with the world — it's some LDC data, I believe, so there are strings attached to it — but they are giving away these i-vectors basically without conditions.
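(For readers unfamiliar with the recipe being described, here is a minimal sketch of classical i-vector extraction. It assumes a pre-trained diagonal-covariance UBM and a total-variability matrix T; all names and shapes are illustrative, not any particular toolkit's API:

```python
import numpy as np

def extract_ivector(frames, ubm_means, ubm_vars, ubm_weights, T):
    """frames: (N, F) features for one utterance -> (D,) i-vector.

    Assumes C diagonal-covariance UBM components of dimension F and a
    total-variability matrix T of shape (C*F, D), with D ~ 400-600.
    """
    C, F = ubm_means.shape
    # Per-frame log-likelihoods and posteriors for each UBM component.
    log_lik = np.stack(
        [-0.5 * np.sum((frames - ubm_means[c]) ** 2 / ubm_vars[c]
                       + np.log(2 * np.pi * ubm_vars[c]), axis=1)
         for c in range(C)], axis=1) + np.log(ubm_weights)
    post = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)
    # Zeroth-order and centered first-order Baum-Welch statistics.
    N = post.sum(axis=0)                                   # (C,)
    Fc = post.T @ frames - N[:, None] * ubm_means          # (C, F)
    # Posterior mean of the latent factor w -- the i-vector:
    #   w = (I + T' S^-1 N T)^-1  T' S^-1 f
    S_inv = 1.0 / ubm_vars.reshape(-1)                     # (C*F,)
    TtS = T.T * S_inv                                      # (D, C*F)
    L = np.eye(T.shape[1]) + (TtS * np.repeat(N, F)) @ T   # (D, D)
    return np.linalg.solve(L, TtS @ Fc.reshape(-1))
```

The point made above holds because only the low-dimensional vector leaves the building: the frames themselves, and hence the words, stay behind.)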
So, I'd like to make a comment and then ask a question. I think there's actually a disconnect between the research here and where the industry is going with regard to the applications that are actually driving the speech work. Most of the bigger companies are going after conversational systems — Siri is an example, Google Now, and there's Microsoft's Xbox. What I see is that even though this is actually a speech recognition and understanding workshop, there are only a handful of papers on understanding, and everyone is working on speech recognition. It's just not balanced right now. If I look at EMNLP and ACL, they work on the data and models on the theoretical side, not so much on the applications. So I see this as the community we should be investing in, because these are the right people — but I know we're not doing that.

The second piece is search. We observed, when Xbox actually launched free-form natural conversational search for entertainment, that if you look at the most frequent queries, people are using single-word or two-word queries. You can say "show me movies with Tom Hanks from the nineteen-eighties" to the search today, and people don't, even though the system handles it. So there is a barrier now between keyword-based search and more natural conversational search, and of course the priors in people's minds from keyword search and voice search are the blockers. How are we going to get over this? Is it just going to take time, or what do we need to do about it?
I will make a comment on the question about the amount of data. The last speaker hit it right: out on the internet there is a lot of data — on YouTube and elsewhere — and where people have made those databases public, we should figure out how to use those sources.
I worked at IBM, in your position, and I understand the problems of sharing data. But — and this also applies a little to the problems with models — I must say, from my perspective, the thing that you could do for us is share the error analysis of your data. And I will say this as strongly as I can: I don't know of any scientific endeavour that made progress merely by reporting how big the number of errors is — that is, by simply counting. But an analysis of the kinds and types of errors that you see, and the types of conditions under which those errors happen, would be very helpful for the entire community. You see a tremendous amount of data, and I'm sure that you categorise the errors on that data. We would love to see the categorisation.
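(A sketch of the kind of categorisation being asked for: align each hypothesis to its reference, then tally error types per recording condition. The utterance fields and condition labels are invented for illustration:

```python
from collections import Counter

def align(ref, hyp):
    """Levenshtein alignment; yields ('ok'|'sub'|'del'|'ins', ref_w, hyp_w)."""
    n, m = len(ref), len(hyp)
    dp = [[(0, None)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = (i, 'del')
    for j in range(1, m + 1):
        dp[0][j] = (j, 'ins')
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = ref[i - 1] == hyp[j - 1]
            dp[i][j] = min((dp[i - 1][j - 1][0] + (0 if same else 1),
                            'ok' if same else 'sub'),
                           (dp[i - 1][j][0] + 1, 'del'),
                           (dp[i][j - 1][0] + 1, 'ins'))
    ops, i, j = [], n, m
    while i > 0 or j > 0:          # backtrace from the bottom-right corner
        op = dp[i][j][1]
        if op in ('ok', 'sub'):
            ops.append((op, ref[i - 1], hyp[j - 1])); i, j = i - 1, j - 1
        elif op == 'del':
            ops.append((op, ref[i - 1], None)); i -= 1
        else:
            ops.append((op, None, hyp[j - 1])); j -= 1
    return reversed(ops)

def categorise(utterances):
    """Count error types per condition; utterances is a list of dicts."""
    table = Counter()
    for utt in utterances:
        for op, _, _ in align(utt['ref'].split(), utt['hyp'].split()):
            if op != 'ok':
                table[(utt['condition'], op)] += 1
    return table

print(categorise([{'condition': 'far-field', 'ref': 'turn the lights on',
                   'hyp': 'turn the light on'}]))
# Counter({('far-field', 'sub'): 1})
```

It is exactly this table, computed over production-scale traffic, that the speaker argues would help the community more than a single word-error-rate number.)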
Joe — I don't know if he's still here — argued earlier that the quality of data was much more important than the quantity. We have the quality people out there at the back. Could you argue it either way?
I think you need both, right.

In the long run, isn't that a useless activity?

I wouldn't call it useless. But, you know, within the speech team we have a little bit of this quality split: our acoustic modeling team, for the most part, uses annotated, transcribed data, while on my team we don't do that, because we are the ones in charge of maintaining forty-eight languages and all the training that goes with them.
So I always argue that some of the techniques, or improvements, that they manage to get may not be translatable to the other situation, where you train in an unsupervised way. Realistically — personally, I would argue that unsupervised is the way, and I would love it if the community did more and more research in this area, because it is very open; we still don't know. You talk to people in machine learning about the way we do training and they will be shocked — like, what the hell? Because if you think about it, it is kind of insane, right: you are running a system, and you are using the system's own hypotheses to train itself. It is something bizarre — but it works, right?
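(A minimal sketch of the self-training loop being described — decode unlabeled audio with the current model, keep only the confident hypotheses, and retrain on them. The model and decoder interfaces here are hypothetical stand-ins for a real recognizer and training pipeline:

```python
def self_training_round(model, unlabeled_audio, threshold=0.9):
    selected = []
    for utt in unlabeled_audio:
        hyp, conf = model.decode_with_confidence(utt)
        if conf >= threshold:            # keep only confident hypotheses
            selected.append((utt, hyp))  # machine transcript as the label
    return model.retrain(selected)       # the system trains on its own output

# Each round bootstraps the next:
# for _ in range(rounds):
#     model = self_training_round(model, unlabeled_audio)
```

The "bizarre but it works" part is the feedback loop: errors that survive the confidence filter can be reinforced, which is why this remains an open research question.)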
In fact, I was thinking of trying to organise a workshop or a special session on this particular topic — unsupervised acoustic, language, and lexical modeling — for the next Interspeech, in Singapore. I was just a little overloaded, and, frankly, lazy; but I would encourage somebody to organise a session like that, and I will gladly help.
I should probably be up there rather than here. There is an elephant in the room; we heard a little about it. In this field, we used to say that we are looking for the keys under the streetlight — that's why we use the cepstrum. And now we are doing very well on ASR, but the real problem is not ASR; it's semantics, and it's not being addressed at all by this community — and the "U" in ASRU is very important. You can get very good at transcribing, but the amount of data that you transcribe, like the amount of data that you use for training, will never be able to be read by anybody. You really need to go much further, into language understanding, some sort of summarization, before this becomes useful.
So, I'd like to follow up on the prior comment. All of you have seen lots of great papers and presentations here at ASRU. A year from now we will have SLT, and I'd like to ask whether anyone on the panel might have suggestions on challenges — things that you saw here that might motivate a challenge, or some type of collaborative effort that would take what we've learned from this meeting — so we could maybe start planning, for next December, to try addressing issues that came up in this discussion.
No one? I mean, some of the things I mentioned earlier would be very valuable, such as distant speech recognition. In fact, just being able to recognise that the speaker is too far away, let alone correctly recognise what they are saying, would be useful — really, anything that relates to realising that the speaker is in a sub-optimal condition would be useful.
Okay. Ten or fifteen years ago, when I started in speech, there was a lot of work on multimodality; now it seems to be totally dead — I heard the word once or twice today. Is that something that universities could work on, or is it something that you will just drive down with thousands of hours of annotated or unannotated data, so that we shouldn't even bother to look at it again — multimodality using robots, or video material?
I mean, we have an application that has a video feed constantly on our user, and I think it would be useful for us to be able to make use of that kind of data to improve speech, or any number of other types of input from our users. That being said, we now have devices with a camera aimed at users all the time; I don't know that that was true fifteen years ago, but now we carry cameras and microphones around in our pockets constantly. So, from my perspective, it would be lovely if the universities solved that problem for me: let me just take a nice black box, plug it in, and get twenty percent better success at everything.
At the same time, you just said you've got thousands of hours of data that we won't have.

And you have ten, a hundred grad students that I don't have, so...

Where?

Maybe not right here, but I know there are a lot of grad students at CMU.

We'll enslave them all for you. [laughter]
I just wanted to say that I think Microsoft has done a very good job with the Kinect, right, where you can capture gestures. I found that really interesting because, in a home environment, maybe you can even compensate for everything in the recognizer. So I personally think it is interesting, but I would like to hear what you have to say.

It is also my view that it is connected: it's a device that can easily be used for data collection, and you can interact with it by voice and by gesture, which shows that this kind of research is very important.
A quick question: do you use the recognizer output itself, for example for your language models?

Yes. For our language model training we use a lot of sources, as I mentioned, and one of the sources is the transcriptions produced by the recognizer, after some filtering. You do some sort of held-out evaluation with standard interpolation techniques, and you look at which data source contributes the most to the quality of the language model — and the unsupervised data source contributes a lot.
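(A minimal sketch of such standard interpolation-weight estimation — EM on held-out data, simplified here to context-independent word probabilities. The weight a source ends up with is one common measure of how much it contributes; the per-source probability functions are stand-ins for real language models:

```python
def em_interpolation_weights(p_sources, heldout, iters=20):
    """p_sources: list of functions w -> P(w | source); heldout: list of words."""
    K = len(p_sources)
    lam = [1.0 / K] * K                       # start from uniform weights
    for _ in range(iters):
        counts = [0.0] * K
        for w in heldout:
            probs = [lam[k] * p_sources[k](w) for k in range(K)]
            z = sum(probs)
            for k in range(K):
                counts[k] += probs[k] / z     # posterior of source k for w
        lam = [c / len(heldout) for c in counts]
    return lam                                # large weight = large contribution
```

Under this reading, the remark above says the filtered recognizer transcriptions end up with a large interpolation weight on held-out traffic.)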
And apart from the transcriptions, do you also train on additional information signals?

Okay, yes, we do have access to other — what I call side information: for example, whether the user clicks on the result, meaning they accepted the hypothesis we provided, or whether the user stays in a conversation, signals like that. Actually, this whole thing was a surprise to us. Initially we looked at this kind of data and figured it was going to be great, because we would be able to sample from regions of the confidence distribution where the confidence is lower, and compensate, because the user click is basically telling us that we did something right. But we haven't seen any improvement. It turns out, at least so far, that confidence scoring as we classically apply it, and things like that, work pretty well — so it has been a bit of a disappointment to us that these extra signals don't seem to add much.
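(A hypothetical sketch of the experiment described: pull extra training utterances from the low-confidence region of the logs, using the click as a weak correctness label. All record fields and thresholds are invented for illustration:

```python
import random

def sample_low_confidence(logs, low=0.3, high=0.6, n=10000):
    # Keep utterances the recognizer was unsure about but the user accepted:
    pool = [r for r in logs
            if low <= r['confidence'] < high  # low-confidence region
            and r['clicked']]                 # click ~ hypothesis accepted
    return random.sample(pool, min(n, len(pool)))
```

The reported outcome is that this click-filtered pool did not beat plain confidence-filtered selection.)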
Thank you. If there is no question at the moment, let me maybe return to what you were talking about before, when i-vectors were mentioned. Actually, here is what I have seen just recently: a person from Google came to the people working with us with an interesting problem. He wanted to train a neural network on i-vectors, and since he could extract i-vectors from thousands, millions of recordings, he could use a completely different technique, and eventually he was successful on short durations. That is something we would possibly also be interested in: if you had those i-vectors available, we could be interested in running something on such data — because, at the end of the day, the only thing we care about is that the next ASRU will again be in some nice sunny place, and we need to write papers for that. [laughter]
So perhaps the companies could be more proactive in this sense: when you see an interesting problem, maybe you could think about how to generate something that you can actually share with us and that has real value for us, in the sense that we could train our systems on it — generating these kinds of challenges: "here are the i-vectors, just play with them however you want." Because this is something we are interested in. In fact, we wouldn't even know that such a problem existed at Google — we could only guess; we wouldn't know how short the segments are, or what kind of data you are interested in running language identification on. And I guess similar situations exist for, say, natural language understanding: you have some sparsity problems, and you could possibly extract some information from the data and share it with us. Maybe people are not working on such problems simply because, again, we don't have the data. So maybe we should think of projects that Google would even be willing to pay for — but people don't even think of such projects, because they never had the initial data to play with, and so never found out that there is actually an interesting problem there.
Anybody else — anything you would like to add?
I know the problems that you have — we have heard a number of them. But I think what my colleagues are saying is that it's a matter of mindset. Let me give an example from my side — not about my own mindset, but the mindset of a corporate legal department, which says "this is a danger" and does not make a cost-benefit analysis, and that is really important.
So maybe I should give the example. I'm at Johns Hopkins, and while I think we have a bit of a name in the speech and language groups, we are actually known for the hospital and the medical school. There are gobs and gobs of medical data, which are similarly extremely valuable, and any time a large medical dataset is collected, believe me, people want to work on it, and they actively look for ways to make it available. In other words, the tendency is not to lock the large datasets away; they are not bothered by it. They clearly had to figure out how to anonymize it — de-identify it, or whatever they call it. The attitude is: we have this data, we can get good things out of it, but maybe someone out there in the world will get something more out of it, so let's see how we can make it available. And as in the case of that speaker-ID and language-ID dataset, it turned out that, given the state of the art, it might be enough to give people i-vectors. I've seen other examples of this.
There are lots of genetic data and similar things where you take the data, de-identify it, and then give it out. So start thinking that way and start pushing back — because these lawyers are all the same: their first answer will always be no. Right? So don't take no for an answer, and just try to explore what will pass legal muster, because it is really in the interest of the community to expose students to these kinds of datasets and problems; that is where the next innovative breakthroughs are going to come from. So I think you should actually commit yourselves to saying "let's try." For example, Google in particular has a big commitment to open source, and that didn't come about easily — I mean, you remember the days when companies would copyright everything before anything could go out. But that changed, and in the same way I think we should actively push these lawyers and say: this needs to go out.
I think there is another aspect. I definitely see your point, and at some level I agree. So there is the legal aspect, there is the privacy aspect, and then there is the perception: "oh, they are collecting data — privacy, privacy." So there is a public-relations aspect, and it has to be managed very carefully, because it only takes one headline saying "Google is collecting data and sharing it with everybody." I remember, some years ago — I can't remember quite what they did — there was an attempt to release some chat data, and something awful happened: somebody found out something about a woman, and it was a huge PR disaster. Things like that make these large-scale releases hard, as you just saw.
all these barriers and then and then the other thing you have to deal with
is we executives that sound of then they look at
i data as a competitive advantage
so
it is possible it has been blinded pass like when we will use these n-gram
corpus
but it requires a lot of work been all non on or been taught
a
well i during the students here
so they can work money or whether you by that fact wanted to spend
so what we got with this
and
it is difficult
I know success stories. I don't think many people know this, but we started working on Kaldi when Dan Povey was at Microsoft, and Microsoft's initial reaction was "we can keep it all in-house." I believe he just fought really hard — and give Geoff credit too — for making sure that Kaldi stayed open source.

I didn't know that.

So there are examples where we have succeeded. We should try.
I agree with that. I would really love people to work on child speech, and we have a dataset that we've been collecting that we would love to be able to release. The problem we have on the legal side is, you know, we're a twenty-person company. If we have a problem like that, we're done; the data will just be gone, because we will get crushed — we'd have nothing left — if someone sues us because we stored their kid's voice and who knows what happened. We would be hurt completely, and from a cost-benefit analysis that risk is just too big to take for a company of our size. But that doesn't mean that we would not love to have the bright minds in this room, and around the world, working on children's speech. We think it's a wonderful problem, with interesting and unique issues that are not present in adult speech — especially the conversational aspects, which you generally don't see very much of. We would love to be able to do it.
Even getting to that point — the de-identification is challenging, because under the regulations in the US, if it has a child's voice on it, it is personally identifiable: there is no way to de-identify it and still have audio. That's the challenge.
And on a large amount of data to drive the research: I think this should start with NSF or DARPA. They should create the next Babel, or something along those lines, on almost the same model — information search using speech as the main interface — and they should generate the data, rather than us looking to Google or Microsoft.

That won't happen right now. The thing is, you have to push the envelope, so I'll give another example: the Google and Microsoft n-gram corpora showed that you can harvest trillions of web pages, and they turned out to be very useful. In other words, let's start by finding point solutions, and hopefully, in the limit, the lawyers will eventually get the message that these kinds of things are okay. But I think we really should take one exception and say: can we solve this one problem by giving out data? Maybe that's the way to go.
So I will say that where there is a will, there is a way, and corporations like Google and Microsoft really are hiding behind the lawyers. I have a very specific case, from our program to read documents. We had had LDC generate data for us, and that was good, but we knew there would be other phenomena that would happen in the field. There happened to be a huge collection, from nineteen ninety-three, that had actually been totally cleared and released — and then somehow somebody in the government decided that it really could not be released, so they reclassified the data and put it away. However, through a lot of pains, mostly mine and my staff's, we managed to get that data re-released, on the condition — and it cost a bit of money — that somebody would go through all the released data and simply remove all the PII, the personal information. And once that was done, we had an incredibly valuable corpus to work with. So it may be worthwhile for Google, Microsoft, Amazon, Facebook to go through some expense, make sure the data is cleansed, and then release it to the world. I give them that challenge: try to do it.
I just thought of a suggestion that might help with this, which is: let it come from the user. Let's say we allow the user to opt in and click a checkbox that says: whenever I use Google voice search, I actually want this data to be shared with the research community — the same way there is a box where you can decide whether you want to be an organ donor, right? And the thing is, the new generations are much more eager to share basically everything. I'm sure that if even just one percent of users were happy to let their data be used for any purpose, that would already be, you know, millions of hours. So maybe it's not that far-fetched, and then there are no issues — and more and more people are becoming, quote-unquote, transparent; read The Circle, for example. So it would be an easy way to just have this data available, and in fact it could even be a choice: "I am donating this speech to the whole research community" — or "I like Microsoft better, I want to donate it to them." Sorry. [laughter]
If I can, maybe I can then make a challenge for Microsoft and Google: would you consider bringing in some summer internship students — because even with users' agreement, someone has to go through that type of data, set up the process, and work out a clean piece that could be shared with the community? Even if someone checks that box agreeing to release, there can still be sensitive information in there that they were not thinking about when they did it. So there should be some way to have, like, a litmus test of what constitutes something beyond what would be publicly available — I'm just trying to delimit the space — and if it strays out of that, remove it. So: would you consider supporting a couple of summer internships to go build that for the community?
Don't expect it from a small startup.

I don't know — I mean, this is not something I alone can decide. You think I have a lot of power; I don't. I can bring it up, but, you know, I have low expectations. And, to be honest, this is a lot of work.
But with all this talk about data — to come back to Pedro: you had mentioned the fifty languages or so you've collected, one week at a time. I presume there's a network of contractors out there actually doing the crowd-sourcing and providing some of the language expertise. Could you say something about that?
So, when we started with the languages effort, we basically made a conscious decision not to outsource the whole thing to external companies, because we realised it was easier and faster for us to do it ourselves. So we built an organisation to do a lot of the data collections and the linguistic annotation. It's a combination: the core staff is like five people full-time, and then there are a lot of contractors — we bring up linguistic teams for three to six months. We have all the tool infrastructure, so they can work remotely, and a lot of the work of our own staff is managing this organisation, because at any time there are like a hundred and fifty full-timers, mostly contractors, on the linguistic annotations.
So we consciously made the decision to do this internally, to have control of the whole thing. For small annotations that we need quickly, we use that in-house team — so we have linguists and annotators — and when we require large-volume annotation, we use vendors: a lot of vendors, not just one, mostly to keep a little competitive pressure, and we force them to use our tools. The advantage of doing that is that, if they use our tools, the annotations come in through our web tools, immediately enter our system, and we can feed them straight into our process.
But at that level, it sounds like you are applying a reasonable amount of annotation and quality control, and your process isn't all that different from what Mary describes with the Babel program. Is that reasonable to say?
I mean, a lot of this stuff is for testing sets, right? It's not necessarily training corpora; it's mostly testing sets, because with this scale of languages there is a lot of data involved. If you evaluate every quarter, you transcribe thirty thousand utterances per language, and then you focus on three or four domains; and for the language models, for the top languages, you are talking — we must have, I don't know, half a million utterances per month being transcribed just for testing purposes.
So — lexicons are something where, as I said, we probably need a little more work to automate. But the thing also is, from the point of view of quality, there are things you can do with money, and there are things you can do by investing in a lot of algorithms. And — I don't want this to sound wrong — we're more limited in engineers and speech scientists than in money; not as much in money. No, seriously: it's easier for us to spend money and get data transcribed than it sometimes is to hire a lot of people. So that's the way it is.
I am worried by the way this conversation is going, because it keeps staying with "let's get a lot of data, and let's get a better ASR unit." One of the problems — and I saw this in the past, when we got lots of computing power — is that people get corrupted by all this data and keep working within the same paradigms. Let me propose a slight paradigm shift, because nobody bothers to sit, think, and come up with new methods of dealing with it. The entire black hole of semantics will not be solved no matter how much data you are going to collect. So I suggest that LDC delete all the databases that we have at the moment and we start from scratch: we should start thinking about what kind of data we should actually be collecting now, because the data that we have at the moment would be boring — it would be the same thing.
So I have one question. The biggest part of this community, I think, is the graduate students, or at least a large part of it, and I see that the work is heavily driven by what's happening in the industry — it moves very fast, and it keeps changing. We have a very good panel here, I think, so could you tell us what we should be working on? Are there university programs you could recommend, or places where you can get data, for us to get up to speed with what's going on? That's my first question. The second question is more for Pedro — your presentation was very good; I just wanted to ask how you managed to scale up from the university to what it is you are doing now. So those are my two questions. Thanks.
Let me take the first one. Actually, going back to the data question: maybe we should change the way we think — we keep expecting the companies to do this stuff for us. This is a large community, and we can collect the type of data that we need ourselves, with crowd-sourcing, with the people here. If you look at Interspeech, it is on the order of thousands of people in this community, so one could develop an application through which you can collect the data — and I would trust this community, for example, with my personal data. So that's one answer, perhaps, on getting data: rather than "who's going to give me the data," can we generate the data?
And going back to the earlier question: as I said, I think there's a disconnect. Where the companies are going, the data is the most important thing — it's not really the machine learning or the techniques you're using. They also own the devices used to access the data: they own the hardware, they own the software, they own the data, and they want to control how you access that data. Speech is the natural user interface, one of the modalities for this, and they want to control speech — that's why you see Apple, Amazon, Microsoft, and other companies investing heavily in it. So that is an area I would like to have the students working on, and there are challenges there. And there's another gap, between the search community and the language-understanding and speech communities; the new direction actually falls in between them: large-scale language understanding. Those are the areas I would intend to focus on.
I have a very similar view: there is a relation between speech and text, because we have the whole domain of text processing, of data mining and so forth, so we can get data from there — it does not need to be audio. The analysis of the data, and the analysis of the correlations within the data: we can expect a lot for speech from that; there is a huge possibility for analysis. This is, I think, a very important topic — big-data analysis. The data is there, and you can use it.
There was the other half of the question...

Okay, the other half of the question was about how to scale from a university to a business. I would say the simple answer is this: go outside and ask the users what they really need. If you go to the companies that work with speech data, the data immediately tells you which target difficulties would be worth solving. And those users, those companies, have money: if you are able to save them some money, or bring them customers, they will give you the money. That is how you earn anything today.
I guess the question was originally about how a group goes from university research to scale.

Well — I guess the question was how we managed to scale up from university research. On the expertise: right now, I think everybody came from industry; the seed of this team is all industry people — IBM, AT&T Labs, the speech world.

Can I speak?
So, I just had a couple of thoughts about the various things going on. First, I can agree that the Kinect has been a great resource for people doing multimodal research in universities. It's really a nice piece of hardware that makes it easy to use things like gestures; people in our lab, and at other places I know, are using it, along with the publicly available speech recognizers. On the issue of the data: I don't think anything is ever going to happen with the companies that are collecting it, for the reasons that have been described. All through the years — even Bell Labs, when they had all the data, it wasn't shared with the community. Sometimes these things come out later through the LDC, but for the various reasons that Pedro and others described — privacy issues and potential competitive issues — it's not going to be released. They'll still take students, though; the students can work on the data as interns.
But having said that, given the techniques that they're using, it's not impossible to collect data ourselves. There are efforts to collect data for different languages; you can go out yourself, make apps, and have people read speech; there are mechanisms to crowd-source annotation if you really want to do that. The community could do that — we've deployed apps — and you're not going to collect data on the same scale, but, as people have said, there's a way; you can make it happen. So I don't think we should look to the big companies to feed us crumbs we can work on: if something is really important, we can go out as a community and make it happen.

Another thing, on what research people outside the companies should be doing, or what students should be looking at: Joe mentioned the analogy of the keys under the spotlight. Well, publicly available corpora are certainly spotlights — people tend to work on those problems — and the problems that companies are working on also tend to be spotlights, if you think about it. But there are a lot of hard problems out there. Joe mentioned semantics; there are plenty of others that maybe are not commercially viable but are really hard and interesting problems, and I think solving them would come back and benefit the more conventional things. So people shouldn't just look at what's out there right now as what they should be working on, but think about: what are the interesting, hard problems that people are not working on? That's my two cents.
So I'd also like to pose a question. It does seem to me that industrial research is really development — it tends to be near-term — and universities should be doing basic research, and possibly things that could feed into development-type work. I personally think universities and industry have to find a way to partner, in order to make sure that there is relevancy in the research, but without giving up the basic research that has to go on at the university level. And I think there's a tension there. Data is one aspect: the data certainly does drive problems — people will go and participate in an open evaluation because of the data. The question I have is: what do you see as the ideal partnership between your companies and universities? Because ideally it shouldn't just be a matter of recruiting; there has to be a reason why you want to come to these conferences. You have the potential to shape the future students, the future PhD students, in a wide variety of countries, and something along those lines seems an important thing to do. So, to that point, I would like to hear a little about your thoughts on what the ideal partnership might be.
I think there has to be an incentive. There are enough problems. We had a sizeable team working in the product group, and we found that if you are not in research, you are not really setting your agenda in terms of the time schedule: you have certain deliverables; you have great ideas, but they are just not the priority, because of the next deadline. So the summer interns were actually a lifeline for us: we have these great problems, we just don't have time to work on them, and we have the summers to get them worked on. But that's not really the solution. The solution is: here are the problems, we can hand them over — it's just, what is the incentive on the university side that will engage people in working on them? To me, that is what's missing.
I would also say that there has been a shift. When I first started, long-term research meant about fifteen years; now long-term research is three years, and that's a real problem. And to answer your question, Mary: I'm not sure that industry should drive universities. I think that if there are hard problems, attack them and possibly solve them; eventually the solutions will find their way into products. If you wait for industry to do that kind of research, most likely the hard problems will never get done.
I wanted to say that we already do a lot of this, in some form, right? Industry sponsors things like the Johns Hopkins summer workshops — in a true sense, employees are there on the company's salary; I know everybody who goes. We sponsor conferences, and students through some of our programs, and actually that's an indirect way of influence: I think many ideas get initiated because of the student grants — the students work with whoever, and they say "hey, let's look at this," and in the end it might get expanded. There are university grants that most companies, once they are a certain size, are ready to give, for the research areas they care about. So I'm not sure there is anything extra to be done. And then, of course, there is the personal connection, right — the fact that I'm a friend of someone — and that definitely works.
And coming here — I always say, when I come to these conferences: this particular one is small enough that I can actually see the posters, but at the larger conferences, I guess, for me the value is to catch up with the people in academia, see what they're doing, and drink a beer — a more informal way — and sometimes to tell them, "if you submit a proposal around this, we would be interested in it." So there are more indirect ways of influence; I don't think we need to formalise it so much. Then, I think there have been exceptions, where a whole research lab or something was created, with a company sponsoring a university. From the companies I know, a faculty award is typically small — seventy thousand dollars, fifty thousand at most — but there have been cases where half a million dollars, a million dollars, something like that, was given to a university to create a new centre. So sometimes that happens; but that, again, is not decided at my level, at my little piece of the company — it comes from some senior person saying "all right, these guys," and then they are given half a million dollars.
So, I guess we have used up the time that was reserved for this panel discussion. We should remember the idea for the next ASRU: maybe there should be a special discount for the people who are willing to record a conversation, so that we can collect the data — and, since this conversation is ending, maybe also a special discount for the people who hold their conversation at the end of the banquet, which would make for a more difficult acoustic condition. I guess we should all now go and practise for that. So let me thank all the speakers again.