hi my name is are free a i'm a computer science phd student graduating
this year
and i work at the international computer science institute
i'll give you guys a broad overview of icsi as we call it
it's off campus down by the bart station
we have multiple groups down there and we're having an open house
we'll have a shuttle leaving from here so just look for me in an orange jacket
or just walk down it's at nineteen forty seven center street
so my group in particular at icsi is the speech group we're led by
professor nelson morgan he's an eecs professor
we've recently transitioned to having steven wegmann as a leader he's
got an industry background he's been at nuance voice signal
dragon those are companies that are well known in the speech recognition field
and about a half dozen phd students including myself
so our main research areas we have speech recognition
this is speech-to-text taking an audio signal and converting it to words it's a field
with a long history there's a system from the early nineteen sixties at i b
m that recognized just the digits zero to ten
later in the nineties and now it's used for dictation in the medical and legal professions
and most recently you know it's been adopted in smart phones and digital assistants like
siri
we also do work in speaker recognition this is sort of voice biometrics
and we have a multimedia processing group which works in collaboration with computer vision
researchers
both on campus and at icsi
so speech recognition this is the main thrust of our group's work
one of the projects we have is looking at hidden markov models which underlie every
single speech recognition system
but they're based on a few flawed assumptions so here's a graphical model representation of
a hidden markov model
we have observations in the darker shaded circles above and
hidden variables the states on the bottom
and if you are familiar with graphical model parlance we're saying that all
the observations are conditionally independent
given the current state
now that's a simplifying assumption that makes the algorithms tractable
but it has consequences because the reality is that these observations do have a lot of
contextual
correlation
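The conditional independence assumption described above is what lets the likelihood of an observation sequence be computed with a simple recursion over states. The following is a minimal sketch of the forward algorithm for a toy discrete HMM. The function name `forward_likelihood` and every probability value are made up for illustration and do not come from the talk.

```python
# Toy discrete HMM. All probabilities here are illustrative assumptions.
def forward_likelihood(init, trans, emit, observations):
    """Return p(observations) under a discrete HMM via the forward algorithm.

    The recursion only works because each observation is assumed to depend
    on the current hidden state alone: p(o_t | s_t).
    """
    n_states = len(init)
    # alpha[s] holds p(o_1..o_t, s_t = s) for the current time step t
    alpha = [init[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emit[s][obs] * sum(alpha[r] * trans[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


init = [0.6, 0.4]                   # p(s_1)
trans = [[0.7, 0.3], [0.4, 0.6]]    # trans[r][s] = p(s_t = s | s_{t-1} = r)
emit = [[0.9, 0.1], [0.2, 0.8]]     # emit[s][o] = p(o_t = o | s_t = s)
likelihood = forward_likelihood(init, trans, emit, [0, 1, 0])
```

If neighbouring observations are in fact correlated beyond what the state explains, as the talk points out for speech, this factorization misstates the true likelihood even though the computation stays cheap.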
our work in acoustic features we look at basically dealing
with these shaded circles on top and they're very noisy
you know speech recognition works great in clean conditions but in real conditions the data is different
there's been a resurgence of interest in neural networks this is sort of a throwback
to the nineteen eighties that's been
generating some excitement because restricted boltzmann machines and more computational power allow us
to build much larger
networks than before and the research that i do specifically is related to
porting systems that are very complex and somewhat targeted to major european languages
to languages that have fewer resources
so on this map of language families of the world
the parts in blue and red are the major european languages and the rest of the
world most of the stuff on the right
has languages with very different characteristics
in speaker recognition our work is really about robustness
because this work is related to
security this work is funded by the army and the air force research labs
and darpa
and we're dealing with very difficult signals collected in like
jet fighter cockpits and spy planes
but there's also another application which is speaker diarization
and we've done this in collaboration so far left building a real time online system
which given a
recording of a meeting say with multiple speakers
can
segment and label the regions which correspond to different speakers this helps you if you
want to build a transcript afterwards and have it labeled with different speakers
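As a rough illustration of that last step, here is a minimal Python sketch of how diarization output could be combined with recognized words to produce a speaker-labeled transcript. It does not perform diarization itself. The function name `label_transcript`, the tuple layouts, and the sample data are all hypothetical and are not taken from the system described in the talk.

```python
def label_transcript(segments, words):
    """Attach speaker labels to words and group them into turns.

    segments: list of (start_sec, end_sec, speaker) from a diarization system
    words:    list of (time_sec, word) from a speech recognizer
    """
    turns = []
    for time, word in words:
        # Find the diarization segment that contains this word's timestamp
        speaker = next(
            (spk for start, end, spk in segments if start <= time < end),
            "unknown",
        )
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(word)          # same speaker keeps talking
        else:
            turns.append((speaker, [word]))    # a new turn begins
    return [f"{spk}: {' '.join(ws)}" for spk, ws in turns]


# Made-up example data
segments = [(0.0, 2.0, "speaker_1"), (2.0, 4.5, "speaker_2")]
words = [(0.3, "hello"), (1.1, "everyone"), (2.4, "hi"), (3.0, "there")]
lines = label_transcript(segments, words)
```

This sketch assumes non-overlapping segments, so it would need a different assignment rule for meetings where people talk over each other.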
the other work we do is related to this in
terms of speech activity detection language identification and in these dialogue systems we have speakers
who
speak over each other they speak with each other and so we have to deal
with this differently
a lot of it has to do with sort of speech understanding also rather than just
getting the words we try to understand what the dynamics are what the intent is behind
the language and some of the more exciting work we're doing also involves looking at
actual
data collected by scientists who
have access to
brain measurements from small mammals or from patients
for example they do surgery on epilepsy patients and they place electrodes during the
surgery
in the brain
so our goals at icsi are we wanna have more robust signal processing for realistic conditions
we want to understand the scientific principles behind these engineering systems
and in general we support open collaborative research we work with
a lot of international universities and also domestically and i encourage you guys to check
out our website and come to the open house today
policy of harmony