i will be your guide for the next twenty
minutes if you have questions please press the pause button and ask whatever you want
meanwhile listen and enjoy the talk
okay
this one bound together with the shuffle file now so
we work on the effect of the waveform pmf on anti-spoofing detection
this time it's for replay data or physical condition
it is a continuation of the work we did on the same
challenge
for the logical condition
we define the problem and the motivation why to use the waveform
we will show several examples
of the waveform pmf
and
will describe the genuinization process
which changes the pmf of the replay data
and show how it affects
the
anti-spoofing recognition and other effects
with the examples we show the results of the evaluation and then the conclusions
so
we can
define the problem then the task
is to
classify a speech segment whether it is genuine speech
or spoofed speech
generally spoofed speech can be synthesized converted or replayed
or made in any other way but this work will focus on the replay data
the motivation for this work is due to the fact that a lot of work was
done on anti-spoofing in the frequency domain
many features were applied like mfcc cqcc and lfcc
and more
but not much was done in the time domain
and we want to learn what happens
with the time domain statistics of the waveform
and see how
we can find differences between the genuine speech and
spoofed speech so let's take an example
of a speech segment
and see what we get
if we look at the waveform in the upper plot
we see a genuine speech segment
and then
we want to find the probability mass function
of the amplitudes
this segment is quantized with
sixteen bits
per sample
so we have two to the power of sixteen uniformly spaced levels between minus one and one
we show here only those
in the
range between minus zero one three and zero one three
it can be seen that the
pmf
of the genuine speech is
very similar to the laplace distribution not the normal distribution
and it is well known in the literature at least for speech
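The pmf estimate described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the code used in the talk: the function name `waveform_pmf` and the synthetic Laplace-distributed signal that stands in for a speech segment are assumptions made for the example.

```python
import numpy as np

def waveform_pmf(samples, n_bits=16):
    """Empirical PMF of integer PCM samples over all 2**n_bits levels."""
    # Widen first so that adding the offset cannot overflow int16.
    samples = np.asarray(samples, dtype=np.int64)
    n_levels = 2 ** n_bits
    offset = n_levels // 2  # shifts the most negative level to index 0
    counts = np.bincount(samples + offset, minlength=n_levels)
    pmf = counts / counts.sum()
    # Amplitude axis normalised to [-1, 1), one value per quantization level.
    amplitudes = (np.arange(n_levels) - offset) / offset
    return amplitudes, pmf

# Hypothetical stand-in for a 16-bit speech segment (not real speech data).
rng = np.random.default_rng(0)
raw = rng.laplace(scale=1000.0, size=50_000)
x = np.clip(np.round(raw), -32768, 32767).astype(np.int16)

amplitudes, pmf = waveform_pmf(x)
```

Plotting `pmf` against `amplitudes` over a narrow range around zero gives the kind of picture discussed in the talk.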
now
let's take the samples from the evaluation set of asvspoof
twenty nineteen physical condition
we
evaluated the pmf
of the genuine speech the graph above
and the spoofed speech the graph
below
and we see that there is a
big difference between them
especially
around zero
so
it can be done
maybe easily even
by a human only by looking at the pmf
to distinguish between
these two
classes genuine and
replay data
so if someone wants to make a good spoofing
of course it should not be so easy to distinguish between them
and we would like to have similar distributions for both classes
so this process we named
genuinization
we will start to show it for a continuous random variable
and then go through an example
to show how we
genuinize
the spoofed samples
so assume we have
the source pdf
and
we want to make a transformation so that it will have
the
pdf of the destination
so we have
two probability distribution functions
of the source
and of the destination
in our case the source is the spoofed speech while the destination is the genuine
speech as we want to convert the
spoofed
sample to have the same statistics as the genuine speech
so first for every sample
from the spoofed speech
we will find the
value of the
cdf
then we will go to the genuine speech cdf
where there is the same value
of the cdf
and read the
value on the
amplitude axis there
so
instead of the original value
the spoofed speech will have a new value
as in the example
and this procedure we can do sample by sample for all the samples in the
spoofed speech
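The continuous mapping described above is the standard distribution-matching transform. As a sketch, with symbols chosen here for illustration rather than taken from the talk: let $F_s$ be the CDF of the spoofed (source) speech and $F_g$ the CDF of the genuine (destination) speech, with $F_g$ assumed invertible. A spoofed sample $x$ is then replaced by

```latex
y = F_g^{-1}\bigl(F_s(x)\bigr), \qquad \text{so that} \qquad F_g(y) = F_s(x).
```

If $x$ is drawn from the source distribution and $F_s$ is continuous, $F_s(x)$ is uniform on $[0, 1]$, so $y$ follows the destination distribution.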
of course in our case the distributions are not continuous but
discrete
and the algorithm has to change a little
in the discrete case
the line is not smooth anymore but has discontinuities
and
it looks like steps
so for each sample from the spoofed speech
we find the value of the
cmf the cumulative mass function
and now we move to the genuine cmf
and it is not exactly the same the values are not in the same place
so we decided to take the lower bound
in this case
instead of the value four we have
still four as the new value but it's not true for every sample
so it can change from sample to sample
and of course we do it
for all the samples here the example is with
three bits in our case it is sixteen bits
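The discrete procedure can be sketched as below. This is one possible reading, not the authors' implementation. The names `empirical_cmf` and `genuinize` are made up for the example, and "lower bound" is interpreted as the highest genuine level whose CMF does not exceed the CMF of the spoofed sample, which is an assumption. The toy signals are invented and are not taken from the talk's slides.

```python
import numpy as np

def empirical_cmf(samples):
    """Observed quantization levels and the cumulative mass at each of them."""
    levels, counts = np.unique(samples, return_counts=True)
    return levels, np.cumsum(counts) / counts.sum()

def genuinize(spoof, genuine, tol=1e-12):
    """Map each spoofed sample onto a genuine level by matching CMF values.

    Assumption: "lower bound" means the highest genuine level whose CMF
    does not exceed the CMF of the spoofed sample (lowest level if none).
    """
    spoof = np.asarray(spoof)
    s_levels, s_cmf = empirical_cmf(spoof)
    g_levels, g_cmf = empirical_cmf(genuine)
    # CMF value of every spoofed sample (each sample is one of s_levels).
    c = s_cmf[np.searchsorted(s_levels, spoof)]
    # Index of the last genuine step that is <= c; tol absorbs float rounding.
    idx = np.searchsorted(g_cmf, c + tol, side="right") - 1
    return g_levels[np.clip(idx, 0, len(g_levels) - 1)]

# Invented toy data: the spoofed signal has one very heavy level at zero.
genuine = np.array([-2, -1, -1, 0, 0, 0, 0, 1, 1, 2])
spoof = np.array([-1, 0, 0, 0, 0, 0, 0, 0, 0, 1])
out = genuinize(spoof, genuine)
```

In this toy case the genuine levels -1 and 0 are never produced, because all samples of the single heavy spoofed level can only land on one genuine level.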
so first we applied it on
the logical condition
and we see the results
the graph above
is the graph of the
spoofed speech
while in the middle it is the graph of the spoofed speech
after the genuinization process
and below is the
pmf of the original genuine speech
we can see that the algorithm works well
and the
genuinized speech pmf
is similar to the genuine speech
however when we try to apply the same algorithm
for the physical condition
we have a phenomenon
that
in the genuinized speech in the middle
we have like a bump around zero
you can see the y-axis of the pmf
for this speech
the maximum is zero one while in the other graphs the maximum is zero one four and we
zoomed in to make it better visible but we see that
the genuinized speech is far away from the genuine speech
this phenomenon was strange and we wanted to
understand what happened
so we can see in this figure
around zero in the spoofed speech
we have a very big
jump corresponding
to several
levels
of
the pmf in the genuine speech
so when we
convert
the spoofed speech to genuine speech with the genuinization process
all three levels in this example
three four and five
are merged and get only one
in the genuinized pmf
so to overcome this
problem
we can simply increase the resolution of the speech
so to the samples of the spoofed speech
we added a small noise
and in such a way
we have more steps
more available levels in the spoofed speech in this example
we added
three bits
of uniform noise
so we have
eight times more
discrete levels
and the jumps are a lot smaller in this way now we can reach
any level
in the genuine speech
in our case
in the real experiment
to the sixteen bits we added noise of five bits
it means
each level
is now split into
thirty-two sub-levels
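A minimal sketch of this resolution increase follows. The talk only states that a few bits of uniform noise are added. The exact mechanics used here (scaling the integer samples up and adding a uniform integer) and the name `add_resolution_noise` are assumptions for illustration.

```python
import numpy as np

def add_resolution_noise(samples, noise_bits=5, rng=None):
    """Split every quantization level into 2**noise_bits sub-levels.

    Assumption: the noise is an integer drawn uniformly from
    [0, 2**noise_bits) and added after scaling the samples up.
    """
    rng = np.random.default_rng() if rng is None else rng
    samples = np.asarray(samples, dtype=np.int64)
    n_sub = 2 ** noise_bits
    noise = rng.integers(0, n_sub, size=samples.shape)  # upper bound excluded
    return samples * n_sub + noise

# A signal stuck on a single 16-bit level spreads over 32 sub-levels.
x = np.zeros(10_000, dtype=np.int16)
y = add_resolution_noise(x, noise_bits=5, rng=np.random.default_rng(0))
n_sub_levels = len(np.unique(y))
```

Integer division of `y` by `2**noise_bits` recovers the original samples, so the noise only refines the position inside each original level. That lets a CMF-matching step spread one heavy level over several destination levels.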
when we apply this algorithm
we can see the results
the pmf of the genuinized speech is very similar to the genuine speech
so we overcome the problem that we saw previously
of course we tried it also on the logical condition
and the results were still pretty good
so it doesn't diminish the previous results of the logical condition
but it improves dramatically the results
of the genuinization process with the physical condition
now we want to see what happens with an anti-spoofing system
when we use the genuinization process
so
we took the baseline system that was provided by the organisers
it has
two classes for genuine speech and for spoofed
speech and each class is a gmm with five hundred twelve gaussian mixtures
there are two models one for cqcc features and one for lfcc
features
the baseline results are shown
in the column of the baseline
in the next column
we used the same
original gmm models but now tried
to test
on the data after genuinization
so obviously the results
of the models degrade
in the next step
the test data okay we stay with the real data before genuinization
but the gmm
of the spoofed model we train on
genuinized
data
and we see that
the generalization ability is very poor and the error rates
are very big
when we train
and test on genuinized speech
the results are very good
we can say okay
we trained with one data and tested on the same data
so it is logical that the results are good
but i think the goal of
the countermeasure
is to
be able to recognize new algorithms of spoofing because all the time new
spoofing algorithms appear
and
if
the system is
vulnerable to the
new algorithms
then it is not robust it is not reliable because we never know what will be the
next algorithm
so
to summarize
the work
we showed that there is a big difference between the
waveform distributions of the
genuine speech
and the
spoofed speech
at least for
replay
and in fact it may
be easy to recognise in the time domain the
spoofed speech
so
first we presented the genuinization process how we can convert the spoofed
speech to be statistically more similar to genuine speech
and we showed that
it is better to add
noise
to the samples
so a few bits of noise
give a better
genuinization
then we tried this on the countermeasure and we saw that the results can vary
dramatically
when we train with one data and test
with data of another kind of spoofing
in my opinion it is unacceptable
for an anti-spoofing system
to behave like this
because it
must have very good generalization ability
so additional work will have to be done
in this direction to
make
these systems much more robust
thank you very much and if you enjoyed the talk
you can press play and listen to it again and again
stay healthy bye