hi everyone
my name is zhongxin bai
i come from the northwestern polytechnical university in china
this video is a presentation of my paper
for the
workshop of odyssey
twenty twenty
now let's begin
the title of this paper is partial auc metric learning based speaker verification back
end
in other words
this paper proposes a pauc metric learning based back end algorithm for speaker verification
okay i will present it from these four aspects
including metric learning and the motivation
the proposed objective function
some experimental results
and at last i will give some conclusions
and i will also introduce several follow-up works
of this paper and
our future plans
first
metric learning and the motivation
as illustrated in the title i think we have to answer these two questions
to explain the motivation of this paper
the first one is what is metric learning and why we propose a
metric learning based back end algorithm
metric learning aims to learn a distance function to measure the similarity of samples
such as the mahalanobis distance
for speaker verification as displayed in the right figure of this slide
we first extract speaker identity features from speech signals by a front
end speaker feature extractor
such as the i-vector or the x-vector extractor
and then we feed them to the metric learning based back end to
calculate the
similarity scores
for the learning of the metric
we
employ a loss function based on the optimization of the partial auc as displayed
in the left figure of this slide
for metric learning i think the first advantage is that the
training of the distance function is consistent with the evaluation procedure
therefore the back end can directly optimize the
evaluation metrics for speaker verification
such as the equal error rate the auc
and so on
second it can be easily combined with the
existing front ends such as the i-vector and the x-vector
third this kind of metric learning method can be easily extended to
the end-to-end framework
the second question i need to answer is
what is the partial auc
and
why does our metric learning back end aim at optimizing
the partial auc
in the
left figure of this slide
the partial auc is defined as the area of a small part under the
roc curve
like
this gray area
although metric learning can directly optimize the evaluation metrics
its implementation faces
some difficulties
as we all know
we need to construct pairwise or triplet training trials with speaker-level
labels to train the distance function
in metric learning
in this situation
the number of all possible training trials
is very large
besides many easily distinguishable trials are unnecessary for the training of the distance function
in terms of these difficulties
i think
the optimization of the pauc has the
following two advantages
first
it is easy to select the difficult samples by setting alpha to
zero
and beta to a
relatively small value
in this way
we can also reduce the number of the
training trials
second we can optimize the interested partial auc according to some specific applications
and obviously
auc is a special case of partial auc
next
in the second part i will explain the details of the proposed
algorithm
in this slide i will introduce how to calculate the partial auc
as i have said metric learning needs to construct pairwise trials
here we don't discuss how to construct them
and we assume that
t is an already constructed trial set
here x n and y n are the
speaker features of two speech segments
l n is the ground truth label
if they come from the same speaker
l n equals one
otherwise l n equals zero
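The trial set described above can be sketched as follows. The talk does not say how the trials are constructed, so this is only one simple possibility: it assumes every pair of utterances becomes a trial, and the function name `build_trials` and the toy features are hypothetical.

```python
from itertools import combinations

def build_trials(utterances):
    """utterances: list of (speaker_id, feature_vector) pairs.
    Returns a list of trials (x_n, y_n, l_n) over all utterance pairs.
    Exhaustive pairing is an assumption, not the method from the talk."""
    trials = []
    for (spk_a, x), (spk_b, y) in combinations(utterances, 2):
        label = 1 if spk_a == spk_b else 0  # l_n = 1 for the same speaker, else 0
        trials.append((x, y, label))
    return trials

# Toy data: two utterances from spk1 and one from spk2.
utterances = [
    ("spk1", [0.1, 0.9]),
    ("spk1", [0.2, 0.8]),
    ("spk2", [0.9, 0.1]),
]
trials = build_trials(utterances)
labels = [l for _, _, l in trials]
print(labels)  # [1, 0, 0]
```

With N utterances this produces N(N-1)/2 trials, which illustrates why the number of possible training trials grows so quickly.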
besides the function s
is used to calculate the similarity
of two speaker features
here we use the mahalanobis distance function
then the predicted label l hat can be obtained by a comparison of the distance
score
s
and theta where theta is the decision threshold
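A minimal sketch of the scoring and decision step, assuming the squared Mahalanobis form (x - y)^T M (x - y) and assuming a trial is accepted when the distance does not exceed theta. The talk does not spell out either detail, and the matrix `M` here is a placeholder rather than a learned metric.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a square matrix M."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return sum(d[i] * Md[i] for i in range(len(d)))

def predict_label(x, y, M, theta):
    """Predict 1 (same speaker) when the distance does not exceed theta.
    The direction of the comparison is an assumption."""
    return 1 if mahalanobis_sq(x, y, M) <= theta else 0

M = [[1.0, 0.0], [0.0, 1.0]]  # identity: reduces to squared Euclidean distance
x, y = [1.0, 2.0], [2.0, 4.0]
print(mahalanobis_sq(x, y, M))          # 5.0
print(predict_label(x, y, M, theta=6))  # 1
```

In a real back end `M` would be the learned positive semidefinite matrix; the identity is used only so the arithmetic is easy to check by hand.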
given a fixed value of theta we are able to compute the true positive
rate tpr
and the
false
positive rate fpr
by varying theta we can get a series of tpr and fpr
values
which form
an roc curve
as shown in the figure
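The threshold sweep can be sketched as below. It assumes the scores are distances, so a trial is accepted when its distance is at most theta; the helper names and the toy numbers are hypothetical.

```python
def tpr_fpr(distances, labels, theta):
    """TPR and FPR when a trial is accepted if its distance is <= theta."""
    tp = sum(1 for d, l in zip(distances, labels) if l == 1 and d <= theta)
    fp = sum(1 for d, l in zip(distances, labels) if l == 0 and d <= theta)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg

def roc_points(distances, labels):
    """Sweep theta over every observed distance and collect (FPR, TPR) points."""
    points = [(0.0, 0.0)]  # a threshold below every distance accepts nothing
    for theta in sorted(set(distances)):
        tpr, fpr = tpr_fpr(distances, labels, theta)
        points.append((fpr, tpr))
    return points

distances = [0.2, 0.5, 0.4, 0.9]
labels = [1, 1, 0, 0]
points = roc_points(distances, labels)
print(points)
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Each distinct threshold gives one operating point, and connecting the points from (0, 0) to (1, 1) traces the ROC curve.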
a natural idea is to optimize the entire
roc curve
however this is not only costly but also unnecessary
because most practical systems
work
on only a part of the roc curve
for example
a bank security system usually requires a small false positive rate
in contrast
a terrorist detection system always hopes
for a
high recall rate
so optimizing the partial auc that covers the working region of the system is
a better choice
in this slide
t is the already constructed pairwise training set
p and n are
the positive and negative subsets of t
then we need to construct a new subset n zero
from n by adding the
constraint that the value of fpr is between
alpha and beta
in order to compute n zero we first need to calculate j alpha
and j beta by this formula
then all the scores of the negative set
are
sorted in ascending order
and then
n zero is selected as the subset of the samples ranked from the
j alpha th
to the j beta th position of the resorted scores
after obtaining n zero
the pauc can be calculated as a
normalized
auc over
p
and n zero
the partial auc is calculated by the
second formula
of this slide
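The pAUC calculation can be sketched as follows. This is a sketch under stated assumptions, not the exact formula from the slide: scores are treated as distances, negatives are sorted in ascending order so the hardest ones come first, and the index rule j_alpha = ceil(alpha * J) + 1, j_beta = floor(beta * J) (with J the number of negative trials) is assumed because the slide formula is not in the transcript.

```python
import math

def select_hard_negatives(neg_dist, alpha, beta):
    """Keep negatives ranked j_alpha..j_beta (1-indexed) after sorting by
    ascending distance, i.e. those in the FPR range [alpha, beta]."""
    J = len(neg_dist)
    j_alpha = math.ceil(alpha * J) + 1   # assumed index rule
    j_beta = math.floor(beta * J)        # assumed index rule
    ranked = sorted(neg_dist)
    return ranked[j_alpha - 1:j_beta]

def partial_auc(pos_dist, neg_dist, alpha, beta):
    """Normalized pAUC: fraction of (positive, selected negative) pairs that
    are ranked correctly, counting ties as one half."""
    n0 = select_hard_negatives(neg_dist, alpha, beta)
    correct = 0.0
    for dp in pos_dist:
        for dn in n0:
            if dp < dn:
                correct += 1.0
            elif dp == dn:
                correct += 0.5
    return correct / (len(pos_dist) * len(n0))

pos = [0.2, 0.5]
neg = [0.3, 0.6, 0.8, 0.9]
print(partial_auc(pos, neg, alpha=0.0, beta=0.5))  # 0.75
print(partial_auc(pos, neg, alpha=0.0, beta=1.0))  # 0.875, the ordinary AUC
```

Setting alpha to zero and beta to one keeps every negative trial, which recovers the ordinary AUC; a small beta keeps only the negatives that are hardest to distinguish. The sketch does not guard against an empty selection.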
here
i
is an indicator function so directly optimizing this formula is np-hard therefore we need to
relax it
we relax
the pauc calculation function by replacing the indicator function with a hinge loss
function
here
delta is a tunable hyperparameter and it is larger than zero
the
last formula gives the relaxed loss function
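A minimal sketch of the hinge relaxation, assuming distance scores and a plain hinge max(0, delta - z) on the margin z = d_neg - d_pos. The exact form on the slide (for example a squared hinge) is not recoverable from the transcript, so both the form and the function names are assumptions.

```python
def hinge(z, delta):
    """Hinge surrogate for the indicator: zero once the margin z reaches delta."""
    return max(0.0, delta - z)

def relaxed_pauc_loss(pos_dist, hard_neg_dist, delta):
    """Average hinge penalty over all (positive, hard negative) pairs.
    A negative trial should be farther than a positive one by at least delta."""
    total = 0.0
    for dp in pos_dist:
        for dn in hard_neg_dist:
            total += hinge(dn - dp, delta)
    return total / (len(pos_dist) * len(hard_neg_dist))

pos = [1.0, 3.0]
hard_neg = [2.0, 4.0]
loss = relaxed_pauc_loss(pos, hard_neg, delta=1.5)
print(loss)  # 0.875
```

Unlike the indicator, the hinge is piecewise linear in the distances, so it has a usable gradient almost everywhere, and pairs that already satisfy the margin contribute nothing.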
to prevent
this
loss function
from overfitting to the training data we also
add a regularization term
weighted by lambda
to the minimization problem
finally
this green part enlarges the between-class distance
and this red part
tries to minimize the within-class variance
in other words our objective function aims
at
enlarging the margin
between the
positive and
negative trials and minimizing the within-class variance
of the two classes of trials simultaneously
in the third part i will give some experimental results
this slide displays our experimental settings
more details can be found in the paper
this table lists the comparison results on the nist sre data set
it is seen that the proposed
pauc metric achieves better performance than plda
given both the i-vector and the x-vector front ends
specifically the pauc metric obtains
[unclear] percent and twenty percent relative improvement over plda
in terms of the
pauc and the
auc
respectively
moreover
it achieves more than eleven percent relative eer reduction
and five percent
relative dcf reduction over plda
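For reference, a relative reduction like the ones quoted above is usually computed as the drop in the metric divided by the baseline value. The sketch below assumes that common definition; the talk does not state its formula, and the numbers are hypothetical, not results from the paper.

```python
def relative_reduction(baseline, proposed):
    """Relative reduction of an error metric (lower is better), in percent."""
    return 100.0 * (baseline - proposed) / baseline

# Hypothetical values for illustration only, not results from the paper.
example = relative_reduction(baseline=8.0, proposed=7.0)
print(example)  # 12.5
```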
table two lists the results on the core task of
the sitw data set
it is seen that the proposed
pauc metric achieves better performance than plda
specifically when the x-vector front end is used
the pauc metric achieves more than
eight percent
relative pauc
improvement over plda
it also
obtains
more than
twenty percent and [unclear] percent relative auc improvements on
the development and evaluation core tasks respectively
moreover it achieves
ten percent relative eer reduction and
three percent relative dcf
reduction
although the performance improvement with the
i-vector front end is not as significant
as that with the x-vector front end
the trends with different front ends are consistent
this page displays some experimental results
which are used to analyze the effects of the hyperparameters
we adopt a
grid search to study the impact of the values of alpha and beta on the performance
in the
[unclear] vector
case
from these two tables we can see that the working region is quite large
this figure shows the relative performance improvements over plda
in terms of
different values of delta
in the objective function
from this figure we find that the pauc metric is robust
to delta and the best value is around one point two
five
finally
i will give some conclusions and introduce several follow-up works as well as
our future plans
in this paper
a mahalanobis distance based metric learning back end is proposed to optimize the partial auc
for speaker verification
because directly optimizing the
partial auc is np-hard
we relax it by a hinge loss function
experimental results
carried out on the
nist sre and
sitw data sets
demonstrate the effectiveness of our proposed algorithm
after this work we also made a generalization
and a comprehensive analysis
of the pauc metric
which is
published
in this paper
besides
we also extended the
pauc metric to an end-to-end framework
more information can be found in this paper
in the future
we will research more general metric learning based speaker verification algorithms
to optimize
evaluation metrics
in order to
further improve speaker verification performance
that's all for my presentation
thank you for watching