Odyssey 2016

The Speaker and Language Recognition Workshop

The need for fast, efficient, accurate, and robust means of recognizing people and languages is of growing importance for commercial, forensic, and government applications. Odyssey is a research workshop organized every two years by the ISCA Speaker and Language Characterization Special Interest Group (SpLC-SIG).

Odyssey 2016: The Speaker and Language Recognition Workshop was hosted by the University of the Basque Country (UPV/EHU) at its Bizkaia Aretoa venue in Bilbao, Spain, from June 21 to June 24, 2016. The local organizers were GTTS from the University of the Basque Country (UPV/EHU) and VivoLab from the University of Zaragoza. Odyssey 2016 aimed to continue fostering interactions among researchers in speaker and language recognition, as the successor to previous successful events held in Martigny (1994), Avignon (1998), Crete (2001), Toledo (2004), San Juan (2006), Stellenbosch (2008), Brno (2010), Singapore (2012) and Joensuu (2014).

Website: http://www.odyssey2016.org


Keynotes


Text Dependent Speaker Verification

0:25:05

Uncertainty Modeling Without Subspace Methods For Text-Dependent Speaker Recognition

Patrick Kenny, Themos Stafylakis, Jahangir Alam, Vishwa Gupta and Marcel Kockmann


0:19:14

Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification

Hossein Zeinali, Lukas Burget, Hossein Sameti, Ondrej Glembek and Oldrich Plchot


Speaker Recognition: i-vector approaches

0:25:27

Fast Scoring for PLDA with Uncertainty Propagation

Weiwei Lin, Man-Wai Mak

0:23:30

Rapid Computation of I-vector

Longting Xu, Kong Aik Lee, Haizhou Li, Zhen Yang


0:26:01

Constrained discriminative speaker verification specific to normalized i-vectors

Pierre-Michel Bousquet, Jean-Francois Bonastre

0:16:07

Iterative Bayesian and MMSE-based noise compensation techniques for speaker recognition in the i-vector space

Waad Ben Kheder, Driss Matrouf, Moez Ajili, Jean-Francois Bonastre


Poster Session 1: Language Recognition

0:01:38

Between-Class Covariance Correction For Linear Discriminant Analysis in Language Recognition

Abhinav Misra, Qian Zhang, Finnian Kelly and John H.L. Hansen

0:01:29

Incorporating uncertainty as a Quality Measure in I-Vector Based Language Recognition

Amir Hossein Poorjam, Rahim Saeidi, Tomi Kinnunen, Ville Hautamäki

0:02:33

Discriminating Languages in a Probabilistic Latent Subspace

Aleksandr Sizov, Kong Aik Lee, Tomi Kinnunen



0:02:41

On the use of phone-gram units in recurrent neural networks for language identification

Christian Salamea, Luis Fernando D'Haro, Ricardo Cordoba, Rubén San-Segundo

0:01:54

Language Recognition for Dialects and Closely Related Languages

Gregory Gelly, Jean-Luc Gauvain, Lori Lamel, Antoine Laurent, Viet Bac Le, Abdel Messaoudi



Speaker Recognition in Multimedia Content

0:18:17

Deep complementary features for speaker identification in TV broadcast data

Mateusz Budnik, Ali Khodabakhsh, Laurent Besacier, Cenk Demiroglu

0:25:53

First investigations on self trained speaker diarization

Gaël Le Lan, Sylvain Meignier, Delphine Charlet, Anthony Larcher

0:24:45

Soft VAD in Factor Analysis Based Speaker Segmentation of Broadcast News

Brecht Desplanques, Kris Demuynck, Jean-Pierre Martens



Speaker & Language Recognition Systems

0:25:38

BAT System Description for NIST LRE 2015

Oldrich Plchot, Pavel Matejka, Ondrej Glembek, Radek Fer, Ondrej Novotny, Jan Pesan, Lukas Burget, Niko Brummer, Sandro Cumani

0:24:57

The IBM 2016 Speaker Recognition System

Seyed Omid Sadjadi, Sriram Ganapathy, Jason Pelecanos

0:21:31

The Sheffield language recognition system in NIST LRE 2015

Raymond W. M. Ng, Mauro Nicolao, Oscar Saz, Madina Hasan, Bhusan Chettri, Mortaza Doulaty, Tan Lee, Thomas Hain


0:22:09

The MITLL NIST LRE 2015 Language Recognition System

Pedro Torres-Carrasquillo, Najim Dehak, Elizabeth Godoy, Douglas Reynolds, Fred Richardson, Stephen Shum, Elliot Singer, Douglas Sturim


Speaker & Language Recognition: Deep learning approaches

0:23:28

Augmented Data Training of Joint Acoustic/Phonotactic DNN i-vectors for NIST LRE15

Alan McCree, Greg Sell, Daniel Garcia-Romero

0:22:59

LID-senone Extraction via Deep Neural Networks for End-to-End Language Identification

Ma Jin, Yan Song, Ian McLoughlin, Lirong Dai, Zhongfu Ye

0:25:50

On autoencoders in the i-vector space for speaker recognition

Timur Pekhovsky, Sergey Novoselov, Aleksei Sholohov, Oleg Kudashev


0:19:49

Channel Compensation for Speaker Recognition using MAP Adapted PLDA and Denoising DNNs

Fred Richardson, Brian Nemsick, Douglas Reynolds

0:25:38

Evaluation of an LSTM-RNN System in Different NIST Language Recognition Frameworks

Ruben Zazo, Alicia Lozano-Diez, Joaquin Gonzalez-Rodriguez


Poster Session 2: Speaker Recognition I

0:01:23

Improving Robustness of Speaker Verification Against Mimicked Speech

Kuruvachan K George, Santhosh Kumar C, Ramachandran K I, Ashish Panda


0:02:00

Voice Liveness Detection for Speaker Verification Based on a Tandem Single/Double-Channel Pop Noise Detector

Sayaka Shiota, Fernando Villavicencio, Junichi Yamagishi, Nobutaka Ono, Isao Echizen, Tomoko Matsui

0:02:33

A PLDA Approach for Language and Text Independent Speaker Recognition

Abbas Khosravani, Mohammad Mehdi Homayounpour, Dijana Petrovska-Delacrétaz, Gérard Chollet

0:02:01

Spoofing Detection on the ASVspoof2015 Challenge Corpus Employing Deep Neural Networks

Md Jahangir Alam, Patrick Kenny, Vishwa Gupta, Themos Stafylakis


0:02:30

Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy

Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen, Ville Hautamäki

0:03:05

A New Feature for Automatic Speaker Verification Anti-Spoofing: Constant Q Cepstral Coefficients

Massimiliano Todisco, Héctor Delgado, Nicholas Evans

0:02:19

Multi-Bit Allocation: Preparing Voice Biometrics for Template Protection

Marco Paulini, Christian Rathgeb, Andreas Nautsch, Hermine Reichau, Herbert Reininger, Christoph Busch


0:01:40

Analysis and Optimization of Bottleneck Features for Speaker Recognition

Alicia Lozano-Diez, Anna Silnova, Pavel Matejka, Ondrej Glembek, Oldrich Plchot, Jan Pesan, Lukas Burget, Joaquin Gonzalez-Rodriguez


Industry & Forensics Track (Short Talks + Panel Session)

1:28:43

Forensic and investigative speaker recognition

Daniel Ramos, Jonas Lindh, Michael Jessen, Anil Alexander, Geoffrey Stewart Morrison

0:34:10

Commercial applications of speaker and language recognition

Sergey Novoselov, Carlos Vaquero, Antonio Moreno


NIST 2015 Language Recognition i-Vector Machine Learning Challenge

0:12:50

Summary of the 2015 NIST Language Recognition i-Vector Machine Learning Challenge

Audrey Tong, Craig Greenberg, Alvin Martin, Desire Banse, John Howard, Hui Zhao, George Doddington, Daniel Garcia-Romero, Alan McCree, Douglas Reynolds, Elliot Singer, Jaime Hernandez-Cordero, Lisa Mason

0:16:43

Out-of-Set i-Vector Selection for Open-set Language Identification

Hamid Behravan, Tomi Kinnunen, Ville Hautamäki

0:26:28

I2R Submission to the 2015 NIST Language Recognition I-vector Challenge

Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Kong Aik Lee, Bin Ma, Haizhou Li



Poster Session 3: Speaker Recognition II


Speaker Clustering and Diarization

0:23:19

Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation

Giovanni Soldi, Massimiliano Todisco, Héctor Delgado, Christophe Beaugeant, Nicholas Evans

0:22:58

Influence of transition cost in the segmentation stage of speaker diarization

Beatriz Martínez-González, José M. Pardo, Rubén San-Segundo, J.M. Montero


0:21:54

Short- and Long-Term Speech Features for Hybrid HMM-i-Vector based Speaker Diarization System

Abraham Woubie Zewoudie, Jordi Luque, Javier Hernando

0:24:59

On the Use of PLDA i-vector Scoring for Clustering Short Segments

Itay Salmun, Irit Opher, Itshak Lapidot


Opening & Closing