Tutorials

Intonation Transcription and Modelling in Research and Speech Technology Applications

Cong Zhang, Amalia Arvaniti, Kathleen Jepson, Katherine Marcoux

Neural target speech extraction

Kateřina Žmolíková, Marc Delcroix

Speech Recognition with Next-Generation Kaldi (K2, Lhotse, Icefall)

Sanjeev Khudanpur, Daniel Povey, Piotr Żelasko

An Introduction to Automatic Differentiation with Weighted Finite-State Automata

Awni Hannun

SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit

Mirco Ravanelli

SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit

Titouan Parcollet

SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit

Aku Rouhe

Concept to Code: Semi-Supervised End-To-End Approaches For Speech Recognition

Omprakash Sonie, Kannan Venkateshan

Lecture categories

Keynotes (4)
Survey talks (4)
Acoustic event detection and acoustic scene classification (5)
Applications in transcription, education and learning (8)
ASR Technologies and systems (1)
Assessment of pathological speech and language I (4)
Assessment of pathological speech and language II (13)
Automatic Speech Recognition in Air Traffic Management (4)
Communication and interaction, multimodality (8)
ConferencingSpeech 2021 challenge: Far-field Multi-Channel Speech Enhancement for Video Conferencing (5)
Cross/multi-lingual and code-switched ASR (7)
Disordered speech (3)
Diverse modes of speech acquisition and processing (10)
Embedding and Network Architecture for Speaker Recognition (3)
Emotion and Sentiment Analysis I (2)
Emotion and Sentiment Analysis II (9)
Emotion and Sentiment Analysis III (4)
Feature, Embedding and Neural Architecture for Speaker Recognition (8)
Graph and End-to-End Learning for Speaker Recognition (1)
Health and Affect I (3)
Health and Affect II (9)
INTERSPEECH 2021 Acoustic Echo Cancellation Challenge (3)
INTERSPEECH 2021 Deep Noise Suppression Challenge (2)
Keyword search and spoken language processing (3)
Language and Accent Recognition (3)
Language and Lexical Modeling for ASR (8)
Language Modeling and Text-based Innovations for ASR (3)
Linguistic Components in end-to-end ASR (5)
Low-resource speech recognition (7)
Miscellaneous topics in ASR (3)
Multi- and cross-lingual ASR, other topics in ASR (8)
Multi-channel speech enhancement and hearing aids (9)
Multimodal systems (10)
Neural Network Training Methods and Architectures for ASR (4)
Neural network training methods for ASR (9)
Non-Autoregressive Sequential Modeling for Speech Processing (7)
Non-native speech (5)
Novel neural network architectures for ASR (8)
OpenASR20 and Low Resource ASR Development (3)
Oriental Language Recognition (3)
Phonation and voicing (4)
Phonetics I (1)
Phonetics II (11)
Privacy-preserving Machine Learning for Audio & Speech Processing (9)
Prosodic features and structure (8)
Resource-constrained ASR (8)
Robust and Far-field ASR (3)
Robust Speaker Recognition (8)
SdSV Challenge 2021: Analysis and Exploration of New Ideas on Short-Duration Speaker Verification (2)
Search/decoding techniques and confidence measures for ASR (6)
Self-supervision and semi-supervision for neural ASR training (5)
Show and Tell 1 (5)
Show and Tell 2 (5)
Show and Tell 3 (7)
Show and Tell 4 (7)
Single-channel speech enhancement (7)
Source Separation I (2)
Source Separation II (10)
Source Separation III (3)
Source separation, dereverberation and echo cancellation (3)
Speaker Diarization I (3)
Speaker Diarization II (9)
Speaker Recognition: Applications (9)
Speaker, Language, and Privacy (3)
Speech and audio analysis (4)
Speech coding and privacy (9)
Speech enhancement and coding (2)
Speech enhancement and intelligibility (12)
Speech Localization, Enhancement, and Quality Assessment (4)
Speech perception I (2)
Speech perception II (9)
Speech production I (4)
Speech production II (6)
Speech Recognition of Atypical Speech (11)
Speech signal analysis and representation I (12)
Speech signal analysis and representation II (4)
Speech Synthesis: Linguistic processing, paradigms and other topics (8)
Speech Synthesis: Neural Waveform Generation (6)
Speech Synthesis: Other topics I (4)
Speech Synthesis: Prosody Modeling I (6)
Speech Synthesis: Prosody Modeling II (3)
Speech Synthesis: Singing, Multimodal, Crosslingual Synthesis (8)
Speech Synthesis: Speaking Style and Emotion (7)
Speech Synthesis: tools, data, evaluation (8)
Speech Synthesis: Toward End-to-End Synthesis I (7)
Speech Synthesis: Toward End-to-End Synthesis II (8)
Speech type classification and diagnosis (8)
Spoken Dialogue Systems I (2)
Spoken Dialogue Systems II (5)
Spoken Language Processing I (7)
Spoken Language Processing II (2)
Spoken Language Understanding I (8)
Spoken Language Understanding II (3)
Spoken machine translation (12)
Spoken Term Detection & Voice Search (9)
Streaming for ASR/RNN Transducers (7)
Target speaker detection, localization and separation (5)
The ADReSSo Challenge: Detecting cognitive decline using speech only (7)
The First DiCOVA Challenge: Diagnosis of COVid-19 using Acoustics (6)
The INTERSPEECH 2021 Computational Paralinguistics Challenge (ComParE) - COVID-19 Cough, COVID-19 Speech, Escalation & Primates (8)
Tools, corpora and resources (11)
Topics in ASR: Adaptation, transfer learning, children's speech, and low-resource settings (9)
Topics in ASR: Robustness, feature extraction, and far-field ASR (8)
Tutorials (8)
Voice activity detection (5)
Voice activity detection and keyword spotting (10)
Voice and voicing (6)
Voice Anti-Spoofing and Countermeasure (11)
Voice Conversion and Adaptation I (7)
Voice Conversion and Adaptation II (4)
Voice quality characterization for clinical voice assessment: Voice production, acoustics, and auditory perception (4)
Opening (1)
Closing (3)