Speech Transcript - Text-Dependent Speaker Verification System in VHF Communication Channel

good afternoon

everyone

the paper

i would like to present is entitled

text-dependent speaker verification system in vhf communication channel

from the institute for infocomm research singapore

here i show you the outline of this presentation an overview of the

paper followed by the vhf communication introduction and i will show you

the biometric system for this vhf communication

speaker verification system

then

i will give the performance evaluation followed by

conclusions

firstly

the task of this research project is a voice biometric system

through vhf communication for ship navigation control

and

vhf means very high frequency radio communication

channel

the main device for the usage of this communication channel is the walkie-talkie

this walkie-talkie is actually used in marine ship communication so this project is

focused on realistic communication

then

for the navigation control we need the authentication of the speaker

especially for the ship master when the ship

goes into

a certain seaport people

in the control pendant this one though full is

is in this one and

sun

some people register of this

of the nice the presence of

only an authorised person speaking

can drive the ship into

this seaport

so this is the

point of setting up this

system for the project

but the problem for

the vhf communication speaker verification is that

this is a

speaker verification system with short durations

and the duration is short

maybe about one second

and up to

three seconds

so compared to the conventional duration like

one minute or

ten seconds

as usually used in the nist sre

it is quite short so

we mainly

focus on this

up opens the by sun solutions and

the vhf

communication database

has many problems

and i will show you

those problems

for this

speaker verification

and

so we seek some solutions by using

pass phrases

with three schemes

we also collected

some proper databases

to use for

improving the pass phrase based speaker

verification system

we also applied

multi-system combination

to improve

the performance of the system

now we go to

show you

the vhf communication

part

so in this

figure

i show you

a possible application of the usage of vhf communication so you can see there are two parts

one is

from the user side

like the ship master

and the other part is in the control centre so these persons

talk through vhf communication

so with the vhf device

like the walkie-talkie device

and that use the first three seven and

thank initiate quality to the control synthesis and the control centres we applied the

by we present

for the ship master and then at this moment the ship master

speaks to the

walkie-talkie

with his name

as the speech

and this speech is transferred to the control centre and the control side inputs

this

speech to the speaker verification system and uses it for verification

so

for example at the same time the controller also can ask for

another

speech that

represents the identity

number

for

verification and we also

combine the name and the id

to improve the verification performance

now

for

speaker verification

purposes

there are many difficulties

because of the usage of vhf speech data

alright

as shown here

firstly the

vhf communication speech

is quite noisy

because it

is recorded in

an outdoor environment and in this environment there is noise

and the noise is quite strong

another problem is that vhf is an open channel

for verification

the open channel means

that the channel variability cannot be known to

the speaker verification system

so this is quite

tough

so for this case

we of course have

quite big problems for channel compensation

so we cannot use the conventional

channel compensation

techniques

to

reduce the channel mismatch effects for example we cannot use jfa with

channel factors or even we cannot use plda with

channel factors

for this the proposed so it is a ha

how difficult t

for this the project and not know why is that not be friend not those

and speech

speech is speech

that means the

during the

you don't then

and

the speech data is recorded in office environment

so and is obviously but and

is the

up i

is apply applied

why

quite

that you element so

for the real test

environment that is

in a ship

so maybe there is an engine

so with the sound of the engine

the speaker's speech may be louder than

in office environment

so also

well that's because speak to now maybe

this speak

speech is speaking we have be plastic

so another

problem is the channel frequency band limitation

with the usage of the vhf channel here i show you

the spectrum

range

for comparison

the first one

is a normal recording without vhf

communication

and this one is

recorded with the vhf channel

so you can see

the high-frequency part is suppressed much

and we know for speaker verification

the major

speaker features

are in the high-frequency part so if this information is lost

much so maybe

the speaker

verification performance will drop a lot
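
The band limitation described above can be illustrated with a small, hedged sketch. It does not model a real VHF link: the 8 kHz sample rate, the two test tones (300 Hz and 3000 Hz) and the one-pole low-pass filter are all assumptions chosen only to show how a band-limited channel reduces high-frequency energy relative to low-frequency energy.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value, not stated in the talk

def tone_energy(signal, freq_hz):
    """Energy of `signal` at one frequency, measured with a single-bin DFT."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, x in enumerate(signal))
    return (re * re + im * im) / len(signal) ** 2

def one_pole_lowpass(signal, cutoff_hz):
    """Crude stand-in for a band-limited channel; NOT a model of a real VHF link."""
    alpha = 1.0 - math.exp(-2 * math.pi * cutoff_hz / SAMPLE_RATE)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)  # first-order recursive smoothing
        out.append(y)
    return out

def high_to_low_ratio(signal, low_hz=300, high_hz=3000):
    """Ratio of energy at the high test tone to energy at the low test tone."""
    return tone_energy(signal, high_hz) / tone_energy(signal, low_hz)

# Toy "speech": one low-frequency and one high-frequency tone of equal amplitude.
clean = [math.sin(2 * math.pi * 300 * n / SAMPLE_RATE)
         + math.sin(2 * math.pi * 3000 * n / SAMPLE_RATE)
         for n in range(800)]
limited = one_pole_lowpass(clean, cutoff_hz=500)

print("clean   high/low energy ratio:", round(high_to_low_ratio(clean), 3))
print("limited high/low energy ratio:", round(high_to_low_ratio(limited), 3))
```

In this toy setting the ratio falls well below one after filtering, which mirrors the suppressed high-frequency region the speaker points to in the spectrogram comparison.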

now let's go to the biometric

system introduction

in this system

i show you

our pass phrase based speaker verification

system

it includes three

subsystems

one is the

gmm-ubm and there are two other conventional

systems the jfa and i-vector

because they are

gmm-ubm based they have many common parameters that can be

shared with each other so for example

they can share

the

ubm parameters and they can share the sufficient statistics

so in the proposed system

the computation complexity we have be drawn and table two is just reading
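
As a rough illustration of why sharing helps, the sketch below computes zeroth-order and first-order Baum-Welch statistics of an utterance against a diagonal-covariance UBM. These statistics are the usual input to GMM-UBM adaptation, JFA and i-vector extraction, so they can be computed once and reused. The toy two-component UBM and the three feature frames are made-up values, and the function names are hypothetical, not taken from the talk.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def baum_welch_stats(frames, weights, means, variances):
    """Zeroth-order (N_c) and first-order (F_c) statistics against a UBM."""
    n_comp, dim = len(weights), len(means[0])
    zeroth = [0.0] * n_comp
    first = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        log_p = [math.log(w) + log_gauss_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        top = max(log_p)  # log-sum-exp for numerical stability
        log_norm = top + math.log(sum(math.exp(lp - top) for lp in log_p))
        for c in range(n_comp):
            gamma = math.exp(log_p[c] - log_norm)  # component posterior
            zeroth[c] += gamma
            for d in range(dim):
                first[c][d] += gamma * x[d]
    return zeroth, first

# Toy UBM with two components (illustrative values only).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [4.0, 4.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.2], [3.9, 4.2], [0.0, 0.3]]

zeroth, first = baum_welch_stats(frames, ubm_weights, ubm_means, ubm_vars)
print("zeroth-order stats:", [round(n, 3) for n in zeroth])
print("first-order stats :", [[round(f, 3) for f in row] for row in first])
```

Under this assumption the expensive frame-by-frame posterior computation happens once per utterance, and each subsystem only consumes `zeroth` and `first`.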

so

the entire system is actually the fusion of the

three subsystems

the fusion and

calibration parameters

can be

trained by using a set of development data

and then finally

we get

the score from the combination of the

three systems
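
A minimal sketch of score-level fusion follows, assuming a linear fusion whose weights and offset are fitted by logistic regression on development trials. The talk does not say which fusion method or toolkit was used, so this is only one common way to do it; the development scores below are invented toy numbers standing in for the three subsystem scores.

```python
import math

def fuse(scores, weights, bias):
    """Linear fusion: weighted sum of subsystem scores plus an offset."""
    return bias + sum(w * s for w, s in zip(weights, scores))

def train_fusion(dev_scores, dev_labels, steps=2000, lr=0.1):
    """Fit fusion weights by logistic regression with plain gradient descent."""
    n_sys = len(dev_scores[0])
    weights, bias = [0.0] * n_sys, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_sys, 0.0
        for scores, label in zip(dev_scores, dev_labels):
            p = 1.0 / (1.0 + math.exp(-fuse(scores, weights, bias)))
            err = p - label  # gradient of the cross-entropy loss
            grad_b += err
            for k in range(n_sys):
                grad_w[k] += err * scores[k]
        n = len(dev_scores)
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

# Toy development trials: (gmm-ubm, jfa, i-vector) scores; 1 = target, 0 = impostor.
dev_scores = [(2.0, 1.5, 0.5), (1.0, 2.0, 1.5), (1.5, 0.5, 2.0),
              (-1.0, -0.5, 0.2), (-2.0, -1.0, -1.5), (0.1, -1.5, -1.0)]
dev_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_fusion(dev_scores, dev_labels)
fused = [fuse(s, weights, bias) for s in dev_scores]
print("fusion weights:", [round(w, 3) for w in weights], "bias:", round(bias, 3))
```

The fitted weights and offset would then be frozen and applied to evaluation trials, which matches the speaker's remark that the parameters are trained on a development set.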

and then here we show

you

the pass phrases with three schemes

and the verification

firstly

for pass phrase modelling

for each pass phrase

of a speaker

we have

the corresponding model

for the modelling

suppose that there are k pass phrases for speaker i then

we have

k pass phrase

models for this speaker

so if a speaker

say

is claiming

to be speaker i

with this

pass phrase we will

compare the test utterance with the pass phrase models of speaker i

and finally we get

the verification

scores
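
The bookkeeping behind one model per speaker and pass phrase can be sketched as below. This is only an illustration: the "model" here is a mean feature vector and the score is a negative squared distance, both chosen as simple stand-ins for the GMM-UBM, JFA and i-vector models used in the talk. The speaker label, phrase labels, vectors and threshold are hypothetical.

```python
def enroll(models, speaker, phrase, utterances):
    """Store one model per (speaker, pass phrase); here just the mean vector."""
    dim = len(utterances[0])
    models[(speaker, phrase)] = [
        sum(u[d] for u in utterances) / len(utterances) for d in range(dim)
    ]

def verify(models, claimed_speaker, phrase, test_vector, threshold):
    """Score a test utterance against the claimed speaker's model for that phrase."""
    model = models[(claimed_speaker, phrase)]
    score = -sum((t - m) ** 2 for t, m in zip(test_vector, model))
    return score, score >= threshold

models = {}
# Speaker i has K = 2 pass phrases here, so two models are stored.
enroll(models, "speaker_i", "name", [[1.0, 2.0], [1.2, 1.8]])
enroll(models, "speaker_i", "id", [[-1.0, 0.5], [-0.8, 0.7]])

genuine_score, genuine_ok = verify(models, "speaker_i", "name", [1.0, 2.0], -0.5)
impostor_score, impostor_ok = verify(models, "speaker_i", "name", [3.0, -1.0], -0.5)
print("genuine :", round(genuine_score, 3), genuine_ok)
print("impostor:", round(impostor_score, 3), impostor_ok)
```

Keying the store by both speaker and phrase means a correct voice with the wrong pass phrase is scored against a model it does not match, which is the point of the text-dependent setup.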

we show

the database

collection for this

vhf communication speaker verification

project

this database is used for parameter training

for instance

they are used

for ubm training and also for total variability matrix training

in i-vector system training

and also used for plda training and also used for

eigenvoice matrix training

this database is from different

environments and

from different recording sessions

for instance

in office environment and with the vhf channel

and also with different distances

between the recording

and receiving

and then we also collected some databases

by using different recording settings

in office environments for example

is i as are

pending for clean and also we

use the outdoor environment to simulate the

real vhf communication

speech

the second speech database is recorded with the vhf

and here are the recording devices

like walkie-talkie

microphones and also the mobile phone

and

so we have developed

a real-time

system for this project

here we show

the hardware components of the voice biometric system

including the computer

and

this usb sound card

and

this vhf

handset that is a walkie-talkie here

for receiving and also for transmitting

and here we show the user interface and in this

user interface there are three regions

the first one is used for registration

and then

the second one is for the enrollment

region and the third one is for the test region and so

this has being updating find the

by using the idea is to go

test inside

on what

now we go to the performance evaluation

so we can see

the pass phrases

for the evaluation purpose

for this purpose the participants speak

one by one the name

the id

and

name plus id repeated several times

in different sessions and in different environments

so here we show

the evaluation database and the development

database

the number of speakers used for this performance evaluation

and the number of training and test

utterances used for this evaluation

also we show the number of true trials

and the number of impostor trials

used for the evaluation

and there are twenty speakers

participating in these database

recordings and we separate

these speakers

with ten speakers

for

evaluation and for development purposes

and here

we also give

the average durations

for the name for the id

and also for name plus id

and you can see the average

duration is about one point zero four seconds

for the name and one point three six

for the id

and for name plus id it can reach

two point four seconds

we show

the performance

in terms of eer and minimum dcf

for each of

the single systems

and the fusion system and
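
For readers unfamiliar with the two metrics, here is a minimal sketch of how EER and minimum DCF can be computed from target and impostor scores by sweeping a threshold. The scores are invented toy values, and the cost parameters (`p_target`, `c_miss`, `c_fa`) are illustrative assumptions, since the talk does not state which values were used.

```python
def error_rates(target_scores, impostor_scores):
    """Return (threshold, miss_rate, false_alarm_rate) for each candidate threshold."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    thresholds.append(thresholds[-1] + 1.0)  # point where everything is rejected
    points = []
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        fa = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        points.append((t, miss, fa))
    return points

def equal_error_rate(points):
    """EER: operating point where miss and false-alarm rates are closest."""
    _, miss, fa = min(points, key=lambda p: abs(p[1] - p[2]))
    return (miss + fa) / 2

def min_dcf(points, p_target=0.01, c_miss=10.0, c_fa=1.0):
    """Minimum (unnormalised) detection cost over all thresholds."""
    return min(c_miss * p_target * miss + c_fa * (1 - p_target) * fa
               for _, miss, fa in points)

# Toy trial scores (illustrative only).
target_scores = [2.0, 1.5, 0.3, 1.8]
impostor_scores = [-1.0, 0.5, -0.2, -1.5]

points = error_rates(target_scores, impostor_scores)
eer = equal_error_rate(points)
dcf = min_dcf(points)
print("EER:", eer, "minDCF:", dcf)
```

EER summarises a single balanced operating point, while minimum DCF weights misses and false alarms by their assumed costs and prior, so the two can rank systems differently.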

and you can see

the fusion system is always better than the single systems the gmm-ubm

jfa and i-vector

while the i-vector performance

is not so good

as compared to the others

so actually

because the in i characteristics than the

we only encode to the

those the

channel information as aforesaid pen

in reality a this there's a

but these channel compensation

"'cause" that is right isn't

what is "'cause" consideration is not so single

for this is duration so

so maybe i you we all make

then the pierrette the a performance draw time

in terms of minimum dcf we also see

the best performance is the fusion and

comparing the single name and single id with the name plus id the name plus id

performs

better

in every

system

single or

fusion

so here we also

can

see

the fusion

with the name plus id

current

alright at

in a ten point one cheaper same but

of eer

so is the

quite good results and

we expect

you know from the second

perform an performance of with so here

we show the

det plots for vhf

name

id and

name plus id comparisons

so we can see

the fusion results are always better than the other subsystems

for name for id and for name plus id

and we also can see

for name plus

id

the performance is quite good
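
A DET plot shows miss rate against false-alarm rate on normal-deviate (probit) axes. The sketch below converts error-rate pairs into DET coordinates using the standard library; the clipping constant `eps` and the example rates are assumptions for illustration, not values from the talk.

```python
from statistics import NormalDist

def det_coordinates(miss_rates, fa_rates, eps=1e-4):
    """Map (miss, false alarm) rates to probit-scale (x, y) points for a DET plot."""
    probit = NormalDist().inv_cdf

    def clip(p):
        # inv_cdf is undefined at 0 and 1, so keep rates strictly inside (0, 1)
        return min(max(p, eps), 1 - eps)

    return [(probit(clip(fa)), probit(clip(miss)))
            for miss, fa in zip(miss_rates, fa_rates)]

# Toy operating points (illustrative only).
miss_rates = [0.5, 0.2, 0.05, 0.0]
fa_rates = [0.5, 0.1, 0.2, 1.0]

det_points = det_coordinates(miss_rates, fa_rates)
for x, y in det_points:
    print(round(x, 3), round(y, 3))
```

On these axes a system whose target and impostor scores are roughly Gaussian gives a nearly straight line, which makes curves for several systems easy to compare.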

now we go to

the conclusion of this presentation

we have introduced a passphrase based text-dependent speaker verification system

against the short

duration condition

we have

developed a system consisting of gmm-ubm jfa and i-vector

among them

the ubm and the sufficient statistics are shared

and according to the different conditions between enrollment and verification we

decided the suitable databases

for parameter training and final

system setup

experimental results show that the fusion system outperforms any single

system

then

two point four second duration or like eer of that's and then chapter seven

this is my presentation thank you

for this application i assume

and correct me if i'm wrong where is your operating space i assume that for

the most part if boats are coming in most of the time it's expected that

the right person is gonna be radioing in

so am i correct in that you really care about the

the very low miss rate

is that correct basically what region do you care about most such as an extremely

low miss rate

we just verify

the identity of a person

right jet set operating point

right so for this scenario laid out boats coming in this in general so that

it makes sense

i guess i'm trying to get a sense like here and i think even in a

lot of the text-dependent applications people are talking about we're kind of in the

low road

we're focusing on a different part of the det curve than we would when it's

trying to find a low prior target in the dataset this is actually maybe in the

high prior target and the cost is changing a lot so

do you care what region you gave like equal error rates or do you have

an idea where you really care about operating the system where it's gonna make its

threshold

we may consider okay

we go to this

this

for this part

then

we can see

for this one

we may consider using an automatic speech recognition machine to replace

this controller so that means that we can improve the total performance of

the system so that

by total automatically

sue

zero in this the

automatically by or not

systems that and get

then the then the information of the

us because the and to the verification when this task as

then this

idea can improve in the total

performance

the

sorry

in some way

i'm interested in the communication part of your system your talk is entitled vhf

communication

vhf communication is not very specific it simply means that the radio

frequency ranges between thirty and three hundred megahertz but there are many ways and many

different channel qualities and signal qualities that you can transmit over vhf so

i think you implied that you use marine radio which is

usually analogue and fm but not necessarily you can transmit the signal digitally

in many different modulation channels and then i assume that you're talking just about

the marine walkie-talkies but then in your list of databases you also mentioned

mobile phone data now mobile phone data is neither transmitted on vhf

nor analog so i'm confused how you use that data in analysing your vhf

channels
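
The frequency range quoted in the question (thirty to three hundred megahertz) can be written down as a small helper. The marine example frequency of 156.8 MHz and the 900 MHz mobile example are added here for illustration only and are not mentioned in the talk.

```python
VHF_MIN_HZ = 30e6    # lower edge of the VHF band
VHF_MAX_HZ = 300e6   # upper edge of the VHF band
SPEED_OF_LIGHT_M_S = 299_792_458.0

def is_vhf(freq_hz):
    """True if the carrier frequency lies in the 30 to 300 MHz VHF band."""
    return VHF_MIN_HZ <= freq_hz <= VHF_MAX_HZ

def wavelength_m(freq_hz):
    """Free-space wavelength in metres for a given carrier frequency."""
    return SPEED_OF_LIGHT_M_S / freq_hz

marine_example_hz = 156.8e6  # illustrative marine-band frequency (assumption)
mobile_example_hz = 900e6    # illustrative mobile-phone frequency (assumption)

print(is_vhf(marine_example_hz), round(wavelength_m(marine_example_hz), 2))
print(is_vhf(mobile_example_hz), round(wavelength_m(mobile_example_hz), 2))
```

As the questioner notes, the band only fixes the carrier frequency; modulation (analogue FM or digital) and channel quality are separate choices.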

so for this that is why we choose those mobile phone devices

because we don't have enough

database to use

with the vhf

for the system training so we have tried several times

by only using

this small database but then the performance will drop so

we add in

some data from the mobile device recordings

in many

cases so this is

one consideration

only based on

the experimental results

thank you

for marine communication in most cases we only use

this vhf

like walkie-talkie for communication

it is popular so it is suitable

for

ship communication with the control centre

so normally when you look at ship to ship or marine type communications in

the modulation demodulation process

quite often in the demodulation

if the speech bandwidth is not shifted back to the right location there will be an

offset in there

and so that distortion will actually introduce a lot of problems so you have to

have some kind of normalization or adjustment here

are you looking at real data when you're doing your testing and if so what

is the plan to kind of address some of the other

problems of analogue

vhf

communications because i don't see you have listed

Text-Dependent Speaker Verification System in VHF Communication Channel

Text-dependent Speaker Recognition

Changhuai You, Kong Aik Lee, Bin Ma and Haizhou Li