0:00:11 Hi, my name is [unclear]; I'm a computer science PhD student graduating
0:00:15 this year,
0:00:15 and I work at the International Computer Science Institute.
0:00:19 I'll give you guys a broad overview of ICSI as a whole.
0:00:22 It's off campus, down by the BART station.
0:00:25 We have multiple groups down there, and we're having an open house,
0:00:29 where we'll have a shuttle leaving from here. Also, just look for me in an orange jacket,
0:00:32 or just walk down yourself; it's at 1947 Center Street.
0:00:37 So my group in particular at ICSI is the Speech Group, which has long been led by
0:00:41 Professor Nelson Morgan; he's an EECS professor,
0:00:44 but he recently transitioned that task to Steven Wegmann as the leader. He's
0:00:48 got an industry background; he's been at Nuance, VoiceSignal,
0:00:52 Dragon Systems, so companies that are well known in the speech recognition field.
0:00:56 There are about a half dozen PhD students, including myself.
0:01:01 So, our main research areas: we have speech recognition.
0:01:04 This is speech-to-text, taking an audio signal and converting it to words. It's a field
0:01:09 with a long history; there's a system from the early 1960s at IBM
0:01:13 that recognized just the digits zero to ten.
0:01:16 Later, in the nineties and now, it's used for dictation in medical and legal professions,
0:01:21 and most recently, you know, it's been adopted in smartphones and digital assistants like
0:01:25 Siri.
0:01:27 We also do work in speaker recognition; this is sort of voice biometrics.
0:01:31 And we have a multimedia processing group, which works in collaboration with computer vision
0:01:35 researchers,
0:01:36 both on campus and at ICSI.
0:01:40 So, speech recognition: this is the main thrust of our group's work.
0:01:44 One of the projects we have is looking at hidden Markov models, which underlie every
0:01:48 single speech recognition system
0:01:51 but are based on a few flawed assumptions. So here's a graphical model representation of
0:01:55 a hidden Markov model.
0:01:57 We have observations, the darker shaded circles above, and
0:02:00 hidden variables, the states, on the bottom.
0:02:03 And if you're familiar with the graphical model formalism, we're saying that all
0:02:07 the observations are conditionally independent
0:02:10 given the current state.
0:02:11 Now, that's a simplifying assumption that makes the algorithms tractable,
0:02:15 but it has consequences, because the reality is that these observations do have a lot of
0:02:19 contextual
0:02:20 correlation.
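To make that assumption concrete, the joint probability of an HMM factors so that each observation depends only on its own hidden state (standard notation, not taken from the talk):

```latex
p(x_{1:T}, q_{1:T}) \;=\; p(q_1)\, p(x_1 \mid q_1) \prod_{t=2}^{T} p(q_t \mid q_{t-1})\, p(x_t \mid q_t)
```

Each acoustic frame x_t appears only next to its own state q_t, so once the state is known, neighboring frames are assumed to carry no further information about it; real speech frames, being highly correlated, violate exactly that.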
0:02:22 Our work in acoustic features looks at basically dealing
0:02:25 with these shaded circles on top, and they're very noisy.
0:02:28 You know, speech recognition works great in clean conditions, but in real conditions the data is different.
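As a rough illustration of what those observation vectors typically are, here is a minimal sketch of extracting MFCC frames with the librosa library; the 16 kHz rate, frame sizes, and file name are assumptions for the example, not the group's actual front end.

```python
# Minimal sketch: turn an audio file into a sequence of MFCC observation vectors.
# "meeting.wav" is a hypothetical file; 25 ms windows every 10 ms are assumed.
import librosa

signal, rate = librosa.load("meeting.wav", sr=16000)
mfcc = librosa.feature.mfcc(
    y=signal, sr=rate, n_mfcc=13,
    n_fft=400, hop_length=160,  # 25 ms analysis window, 10 ms hop at 16 kHz
)
print(mfcc.shape)  # (13, number_of_frames); each column is one observation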
0:02:32 There's been a resurgence of interest in neural networks. This is sort of a throwback
0:02:36 to the 1980s that's seen
0:02:39 kind of some renewed excitement, because restricted Boltzmann machines and more computational power allow us
0:02:46 to build much larger networks than before.
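As a toy sketch of the kind of network involved (sizes and weights invented for illustration, not the group's actual models), a feed-forward net takes a stacked window of acoustic frames and outputs per-frame phone posteriors:

```python
# Toy sketch of a feed-forward acoustic model: a stacked window of feature
# frames in, per-frame phone posteriors out. All sizes/weights are made up.
import numpy as np

rng = np.random.default_rng(0)

n_context = 9        # frames of acoustic context stacked together
n_features = 13      # e.g. MFCCs per frame
n_hidden = 256
n_phones = 40        # hypothetical phone inventory

# Randomly initialized weights stand in for trained (or RBM-pretrained) ones.
W1 = rng.normal(scale=0.1, size=(n_context * n_features, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_phones))

def phone_posteriors(window):
    """Map one stacked context window to a distribution over phones."""
    h = np.tanh(window @ W1)              # hidden layer
    logits = h @ W2
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

window = rng.normal(size=n_context * n_features)  # fake input frames
posteriors = phone_posteriors(window)
print(posteriors.shape, round(posteriors.sum(), 3))  # (40,) 1.0
```

Stacking more hidden layers, possibly initialized by pretraining, is what the extra computational power makes practical.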
0:02:48 The research that I do specifically is related to
0:02:52 porting these systems, which are very complex and somewhat targeted toward major European languages,
0:02:57 to languages that have fewer resources.
0:03:00 So, on this map of language families of the world,
0:03:03 the parts in blue and red are the major European languages, and the rest of the
0:03:07 world, most of the stuff on the right,
0:03:10 has languages with very different characteristics.
0:03:14 In speaker recognition, our work is really about robustness,
0:03:18 because this work is related
0:03:20 to security. This work is funded by the Army, the Air Force Research Labs,
0:03:25 and DARPA,
0:03:25 and we're dealing with very difficult signals, collected in,
0:03:28 like,
0:03:30 a jet fighter cockpit and spy planes.
0:03:33 But there's also another application, which is speaker diarization,
0:03:36 and we've done this in collaboration, for example building a real-time online system
0:03:40 which, given a
0:03:42 recording of a meeting, say, with multiple speakers,
0:03:45 can
0:03:46 segment and label the regions which correspond to different speakers. This helps you if you
0:03:51 want to build a transcript afterwards and have it labeled with the different speakers.
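To give a feel for the diarization idea, here is an offline toy (not the real-time system described above, and with fabricated feature vectors): describe each short segment of the meeting by an average feature vector and cluster the segments, so segments in the same cluster share a speaker label.

```python
# Toy diarization sketch: cluster fixed-length meeting segments into speakers
# by their average feature vectors. Real systems are far more involved.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(1)

# Pretend we already split the meeting into 20 segments and computed a
# 13-dimensional average feature vector (e.g. mean MFCCs) for each one.
segments = np.vstack([
    rng.normal(loc=0.0, size=(10, 13)),   # segments from speaker A
    rng.normal(loc=3.0, size=(10, 13)),   # segments from speaker B
])

# Group segments into two speakers; labels[i] is the speaker of segment i.
labels = AgglomerativeClustering(n_clusters=2).fit_predict(segments)
print(labels)
```

A real system would additionally have to decide how many speakers there are, find the segment boundaries itself, and do all of this online.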
0:03:57 Some of the other work we do is related to this,
0:03:59 in terms of speech activity detection and language identification. And in these dialogue systems we have speakers
0:04:06 who
0:04:07 speak over each other, they speak with each other, and so we have to deal with
0:04:11 this differently.
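For speech activity detection in particular, the simplest illustration is an energy threshold over short frames; the frame sizes and the 10 dB margin below are arbitrary choices for this sketch, not the group's detector.

```python
# Toy speech activity detection: mark a frame as speech when its short-time
# energy is well above the quietest frames. Purely illustrative.
import numpy as np

def speech_frames(signal, rate, frame_ms=25, hop_ms=10, margin_db=10.0):
    frame = int(rate * frame_ms / 1000)
    hop = int(rate * hop_ms / 1000)
    starts = range(0, max(len(signal) - frame, 1), hop)
    energy_db = np.array([
        10 * np.log10(np.mean(signal[s:s + frame] ** 2) + 1e-12)
        for s in starts
    ])
    # Threshold relative to an estimate of the background (quiet) level.
    floor_db = np.percentile(energy_db, 10)
    return energy_db > floor_db + margin_db

# Fake one second of "audio": quiet noise with a louder tone burst in the middle.
rate = 16000
audio = 0.01 * np.random.randn(rate)
audio[6000:10000] += 0.5 * np.sin(2 * np.pi * 200 * np.arange(4000) / rate)
print(speech_frames(audio, rate).astype(int))
```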
0:04:12 A lot of it has to do with speech understanding also: rather than just
0:04:16 getting the words, we try to understand what the dynamics are, what the intent is behind
0:04:20 the language. And some of the more exciting work we're doing also involves looking at
0:04:24 actual
0:04:25 data collected by scientists who
0:04:27 have access
0:04:29 to brain measurements from small mammals or from patients.
0:04:33 For example, they do surgery on epilepsy patients and they stick electrodes, during the
0:04:38 surgery,
0:04:39 into the brain.
0:04:40 So our goals at ICSI are: we want to have more robust signal processing for realistic conditions,
0:04:46 we want to understand the scientific principles behind these engineering systems,
0:04:52 and in general we support open, collaborative research. We work with
0:04:55 a lot of international universities and folks domestically, and I encourage you guys to check
0:05:00 out the website and come to the open house today.