Good afternoon, everyone. The paper I would like to present is entitled "Bhattacharyya-based GMM-SVM system with adaptive relevance factor for pair language recognition," by Changhuai You, Haizhou Li, Kong Aik Lee, and Ambikairajah.
The outline of this presentation is shown here. For this pair language recognition system, we focus on three techniques: the Bhattacharyya-based GMM-SVM, the adaptive relevance factor, and strategies for pair language recognition.
Given a specified language pair, the task of pair language recognition is to decide which of the two languages is actually spoken in a given segment.
We develop pair language recognition systems by studying the Bhattacharyya-based GMM-SVM: we introduce a mean supervector and a covariance supervector, and merge these two kinds of sub-kernels to get better performance from the hybrid system. In order to compensate for duration effects, we introduce an adaptive relevance factor for MAP in the GMM-SVM system. For the purpose of pair language recognition, we introduce two sets of strategies. Finally, we report our system design progress for the LRE 2011 submission.
In speaker and language recognition systems, there are two typical kernels for the GMM-SVM: the Kullback-Leibler kernel and the Bhattacharyya kernel. The conventional KL kernel includes only the mean information in the recognition modeling; however, a symmetrized version of the KL divergence can extend it to include the covariance term.
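To write it out, a standard form of the symmetrized KL divergence between two Gaussians, showing both the mean and the covariance contributions, is:

```latex
D_{\mathrm{sKL}} = \tfrac{1}{2}\,\mathrm{tr}\!\left(\Sigma_1^{-1}\Sigma_2 + \Sigma_2^{-1}\Sigma_1 - 2I\right)
+ \tfrac{1}{2}\,(\mu_1-\mu_2)^{\top}\!\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)(\mu_1-\mu_2)
```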
So why do we choose the Bhattacharyya-based kernel for pair language recognition? Based on many experiments with speaker and language recognition systems, we have observed that the Bhattacharyya-based kernel gives better performance than the KL kernel.
The Bhattacharyya kernel can actually be split into three terms. The first term is contributed by both the mean and the covariance of the GMM; the second term involves the covariance only; and the third term involves only the weight parameters of the GMM. These three terms can be used independently to produce a recognition decision score, each with a different degree of information contribution.
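To make the split concrete, the standard Bhattacharyya distance between two Gaussians, with the average covariance defined as half the sum of the two covariances, is:

```latex
D_B = \underbrace{\tfrac{1}{8}\,(\mu_1-\mu_2)^{\top}\bar\Sigma^{-1}(\mu_1-\mu_2)}_{\text{mean and covariance}}
+ \underbrace{\tfrac{1}{2}\,\ln\frac{\det\bar\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}}}_{\text{covariance only}},
\qquad \bar\Sigma = \tfrac{1}{2}\left(\Sigma_1+\Sigma_2\right)
```

For GMMs, the distance is accumulated over the mixture components, which brings in an additional weight-only contribution; one common form of that third term is the negative log Bhattacharyya coefficient of the weight vectors:

```latex
D_w = -\ln \sum_i \sqrt{w_i^{(1)}\, w_i^{(2)}}
```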
By using the first term of the Bhattacharyya kernel, with the covariance kept fixed rather than updated, we can obtain the mean supervector. This kind of kernel can be used independently as a sub-model.
The second term includes only the covariance, so from this term we can obtain the covariance supervectors. We use only the first two terms of the Bhattacharyya kernel in our pair language recognition system design.
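As a rough sketch of what these supervectors look like in practice (this is just an illustration with assumed scaling, not the exact formulas from the paper):

```python
import numpy as np

def bhattacharyya_supervectors(weights, means, variances, ubm_variances):
    """Build a mean supervector and a covariance supervector from a
    MAP-adapted diagonal-covariance GMM (illustrative scaling only).
    weights: (C,); means, variances, ubm_variances: (C, D) arrays."""
    mean_sv, cov_sv = [], []
    for w, mu, var, ubm_var in zip(weights, means, variances, ubm_variances):
        # mean supervector: from the first (mean + covariance) term,
        # with the covariance kept at its UBM value
        mean_sv.append(np.sqrt(w) * mu / np.sqrt(ubm_var))
        # covariance supervector: from the second (covariance-only) term
        cov_sv.append(np.sqrt(w / 2.0) * np.log(var))
    return np.concatenate(mean_sv), np.concatenate(cov_sv)
```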
The NAP for both the mean supervector and the covariance supervector of the Bhattacharyya kernel is trained using different databases with a certain amount of overlap; the purpose is to strengthen the compensation. The databases for UBM training and relevance factor training are common to both the mean and the covariance supervectors.
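For completeness, the NAP projection itself is simple; a minimal sketch, where U holds the leading nuisance eigenvectors estimated from the development data:

```python
import numpy as np

def nap_project(supervector, U):
    """Nuisance attribute projection: remove the nuisance subspace
    spanned by the orthonormal columns of U from the supervector."""
    return supervector - U @ (U.T @ supervector)
```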
In order to compensate for duration variability, we introduce the adaptive relevance factor for MAP in the GMM-SVM system. Here we show the MAP adaptation in the GMM-SVM system. This equation is the mean update of MAP, where x_i is the first-order sufficient statistic. You can see that the relevance factor gamma_i indirectly affects the degree of update of the GMM mean vectors.
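In standard notation, the MAP mean update for mixture component i is:

```latex
\hat{\mu}_i = \alpha_i \,\frac{x_i}{N_i} + (1-\alpha_i)\,\mu_i,
\qquad
\alpha_i = \frac{N_i}{N_i + \gamma_i}
```

where N_i is the occupation count (the zeroth-order statistic), x_i the first-order statistic, mu_i the UBM mean, and gamma_i the relevance factor.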
So if we let this relevance factor be a function of duration, it becomes possible to do some compensation within this mean update.
So far there are two types of relevance factors. In classical MAP, we usually use a fixed value for the relevance factor. The relevance factor can also be made data-dependent by this equation, which is derived from factor analysis research; here Phi is a diagonal matrix that can be trained on a development database.
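One form that follows from this factor-analysis view, reading relevance MAP as the diagonal model M = m + Phi z, is roughly an elementwise ratio of variances:

```latex
\gamma = \frac{\operatorname{diag}(\Sigma)}{\operatorname{diag}\!\left(\Phi\,\Phi^{\top}\right)}
```

where Sigma is the UBM covariance, Phi is the trained diagonal matrix, and the division is elementwise.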
Now assume the relevance factor is a function of K, the number of feature frames, which is connected to the duration. If we take the expectation of the occupation count N_i, we can see that the expected occupation count is directly proportional to the duration. So if we choose this linear function of duration for the relevance factor, the expectation of the adaptation coefficient of the MAP mean adaptation tends to a constant vector, and we get the adaptive relevance factor by this equation. As a result, the adapted GMM becomes independent of the duration.
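In symbols, the argument goes roughly as follows: for an utterance of T frames, the expected occupation count is about T w_i for UBM weight w_i, so choosing a relevance factor linear in duration, gamma_i(T) = r T, makes the expected adaptation coefficient a constant:

```latex
\mathbb{E}[\alpha_i] \approx \frac{\mathbb{E}[N_i]}{\mathbb{E}[N_i] + \gamma_i(T)}
= \frac{w_i\,T}{w_i\,T + r\,T}
= \frac{w_i}{w_i + r}
```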
Now we come to the third point of the presentation. We propose two strategies for pair language recognition. The first one is the one-to-all strategy, also called core-to-pair modeling. In this modeling, we train a GMM-SVM model for each target language against all the other target languages, which gives us the score vectors. With these score vectors, and using our development database for all the target languages, we train Gaussian backend models for the N languages. Finally, the language pair scores are obtained through the log-likelihood ratio shown here.
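A minimal sketch of this scoring step (the function and variable names are our own illustration; we assume Gaussian backends with a shared covariance):

```python
from scipy.stats import multivariate_normal

def core_to_pair_llr(score_vec, backend_mean_a, backend_mean_b, backend_cov):
    """Pair score under the one-to-all (core-to-pair) strategy: the
    log-likelihood ratio of the two languages' Gaussian backends,
    evaluated on the N-dimensional score vector of a test utterance."""
    log_a = multivariate_normal.logpdf(score_vec, mean=backend_mean_a, cov=backend_cov)
    log_b = multivariate_normal.logpdf(score_vec, mean=backend_mean_b, cov=backend_cov)
    return log_a - log_b
```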
The second strategy is the pairwise strategy, also called pair modeling. This modeling is very simple: we just use the databases of the two languages in the pair to directly train the GMM-SVM model, and we get the scores from that model.
For the fusion of the two strategies, we simply apply equal weights; that means we assume the two strategies are equally important. We get the final score by fusing the two strategies.
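To be explicit, the equal-weight fusion is simply:

```python
def fuse_strategies(score_core_to_pair, score_pairwise):
    # equal-weight score-level fusion of the two strategies
    return 0.5 * (score_core_to_pair + score_pairwise)
```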
Here we show the hybrid pair language recognition system. Given a test utterance, we obtain the Bhattacharyya mean supervector and covariance supervector, which are input together to the two strategies; the two supervectors are merged within each strategy. Finally, we fuse the two strategies together to get the final score.
We evaluate our pair language recognition design on the NIST LRE 2011 platform. There are twenty-four target languages, so in total there are two hundred and seventy-six language pairs.
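As a quick check, the number of unordered pairs of twenty-four languages is:

```latex
\binom{24}{2} = \frac{24 \times 23}{2} = 276
```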
We choose five hundred and twelve Gaussian components for the GMM and UBM. In this paper we show the results for the thirty-second task, although our experiments also covered the other duration conditions. We use eighty-dimensional MFCC-SDC features with energy-based VAD, and the performance is computed as the average cost over the N worst language pairs. Here we list the training databases, for both the CTS and BNBS sets, used for our pair language recognition training.
Now we show the experimental results. First we compare the effect of the fixed relevance factor and the adaptive relevance factor. Table one shows, under the core-to-pair strategy, the fixed relevance factor set to three different values, zero point two five, eight, and thirty-two, with the EER and the minimum cost for each, compared against ARF, the adaptive relevance factor. Comparing these numbers, we can say that the adaptive relevance factor performs better than any of the fixed relevance factor settings. Similar observations are found with the pair strategy: the ARF achieves twelve point seven five percent in terms of EER, while the fixed relevance factor settings are all higher.
The second experiment examines the effect of merging the two sets of supervectors, the mean supervector and the covariance supervector. The blue curve is the mean supervector; the green curve is the Bhattacharyya covariance supervector, both with eighty-dimensional MFCC-SDC features and ARF, the adaptive relevance factor. We run this experiment under the core-to-pair strategy, and the red curve shows the merging effect: its performance is clearly better than either the mean supervector or the covariance supervector alone.
This figure is based on the N worst language pairs in terms of EER, out of the N times (N minus one) over two language pairs in total.
Similar results can be found with the pair strategy: the red curve is lower for most of the language pairs, giving a lower minimum detection cost.
Finally, we show the fusion effect of the two strategies. The blue curve is the core-to-pair strategy and the green one is the pair strategy; after merging these two strategies, we get the final result, with an EER of ten point something percent and a minimum cost of zero point zero nine.
Now we come to the conclusions of my presentation. We have developed a hybrid Bhattacharyya-based GMM-SVM system for pair language recognition, for the purpose of the LRE 2011 submission. The performance gain after merging the mean supervector and the covariance supervector is obvious. Compared with the fixed relevance factor, we observed that the adaptive relevance factor is effective for pair language recognition. Finally, we can say that the fusion of the core-to-pair and pair strategies is useful.
Here we show some reference papers; in particular, the first one, from Patrick Kenny, proposed the data-dependent relevance factor. Thank you.
Okay. Firstly, we chose separate mean and covariance supervectors; that means we do not merge the mean and covariance information into one kernel. We keep them separate because we found that, by separating them, we can get better performance after merging the two supervectors together. We did compare the two options: when we merge the first and second terms into a single kernel, and compare that with the separated kernels, that is, a mean kernel and a covariance kernel fused afterwards, the latter is better.
Okay. I think that is, at least in part, because it is based on different training and testing environments and databases; so overall, the effect is obvious.