Speech Transcript - On the use of phone-gram units in recurrent neural networks for language identification

0:00:16	so we are presenting here our work using what we call phone-gram
0:00:20	units
0:00:21	in recurrent neural networks for language identification
0:00:27	so
0:00:28	what we are doing with a regular recurrent neural network is
0:00:32	to use phonemes as input to see now when i think with indication
0:00:37	begin and that of the number of phonemes
0:00:39	and then we have also incorporated context information using unigrams, bigrams
0:00:44	and trigrams
0:00:45	comparing them and the fusion of all of them
0:00:48	and so we are proposing the concatenation of these adjacent phonemes
0:00:53	in our system
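
A minimal sketch of the concatenation idea just described, assuming overlapping windows over the recognized phoneme sequence. The function name `phone_grams`, the `_` separator and the toy phoneme labels are illustrative assumptions; the talk does not specify how the units are built.

```python
def phone_grams(phonemes, n, sep="_"):
    """Join every run of n adjacent phonemes into a single unit.

    Assumption: windows overlap (stride 1). The talk only says that
    adjacent phonemes are concatenated.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [sep.join(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


# Hypothetical phoneme sequence from a phonetic recognizer.
phones = ["s", "o", "l", "a"]

unigrams = phone_grams(phones, 1)  # single phonemes
bigrams = phone_grams(phones, 2)   # pairs of adjacent phonemes
trigrams = phone_grams(phones, 3)  # triples of adjacent phonemes
```

Each resulting unit would then be treated as one vocabulary entry of the network, which is why the vocabulary grows quickly with the phone-gram order.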
0:00:56	so this architecture, applied to this language identification system,
0:01:01	is based on a phonotactic system, a PPRLM architecture
0:01:05	so we have phonetic recognisers, the Brno recognizers, from which we obtain a sequence
0:01:11	of phonemes
0:01:12	and in evaluation, for each utterance we compute an entropy metric provided by
0:01:17	the neural network
0:01:18	and these entropy scores are calibrated and fused later
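
A minimal sketch of an utterance-level entropy score, assuming it is the average negative log2 probability that a language-specific model assigns to the units of the utterance. The probability values, the language labels and the "lowest entropy wins" decision are illustrative assumptions; in the talk the scores are calibrated and fused rather than used directly.

```python
import math


def entropy_score(unit_probs):
    """Average negative log2 probability over the units of one utterance."""
    if not unit_probs:
        raise ValueError("utterance has no units")
    return -sum(math.log2(p) for p in unit_probs) / len(unit_probs)


# Hypothetical per-unit probabilities from two language-specific models.
probs_by_language = {
    "lang_a": [0.5, 0.25, 0.5],
    "lang_b": [0.125, 0.25, 0.125],
}

scores = {lang: entropy_score(p) for lang, p in probs_by_language.items()}

# Lower entropy means the model predicts the utterance better.
best = min(scores, key=scores.get)
```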
0:01:23	we also present work done with phone-gram vector representations, used in
0:01:29	order to reduce the vocabulary in this neural network, using k-means to group
0:01:35	similar phone-grams
0:01:37	and we have worked with this model at the phoneme level and we had
0:01:40	a relative improvement of seven percent
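
A minimal sketch of the vocabulary reduction described above, assuming each phone-gram already has a vector representation and that k-means cluster ids replace the original units. The 2-D vectors, the unit names and the farthest-point initialization are illustrative assumptions, not details given in the talk.

```python
import math


def kmeans(vectors, k, iters=20):
    """Plain k-means; returns one cluster id per input vector."""
    # Deterministic farthest-point initialization (an assumption for this sketch).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(far))

    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical phone-gram vectors: two groups of similar units.
units = ["a_b", "a_p", "s_t", "z_d"]
vectors = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]

# Map each phone-gram to its cluster id; the cluster ids form the reduced vocabulary.
unit_to_class = dict(zip(units, kmeans(vectors, k=2)))
```

With this mapping, four distinct phone-grams collapse into two classes, which is the kind of reduction that keeps the network vocabulary manageable for higher-order units.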
0:01:43	hence the despicable to read the text
0:01:46	also in the work we present a study of the most relevant
0:01:50	recurrent neural network parameters
0:01:54	so here is the list of parameters we have been working with
0:01:59	here in the results we can see the rate on the database we used,
0:02:04	comparing the unigrams, the diphones and the triphones, and then we can see the
0:02:09	fusion of them and the comparison with the standard PPRLM
0:02:14	and a fusion with that and the standard acoustic system based on MFCCs
0:02:19	and
0:02:20	the different fusions, finally, where we can see that
0:02:24	this approach also provides complementary information, so there are final improvements in our global
0:02:30	system
0:02:31	that's it

On the use of phone-gram units in recurrent neural networks for language identification

Poster Session 1: Language Recognition

Christian Salamea, Luis Fernando D'Haro, Ricardo Cordoba, Rubén San-Segundo