Streaming for ASR/RNN Transducers
Lecture Categories
- Keynotes (4)
- Survey talks (4)
- Acoustic event detection and acoustic scene classification (5)
- Applications in transcription, education and learning (8)
- ASR Technologies and systems (1)
- Assessment of pathological speech and language I (4)
- Assessment of pathological speech and language II (13)
- Automatic Speech Recognition in Air Traffic Management (4)
- Communication and interaction, multimodality (8)
- ConferencingSpeech 2021 challenge: Far-field Multi-Channel Speech Enhancement for Video Conferencing (5)
- Cross/multi-lingual and code-switched ASR (7)
- Disordered speech (3)
- Diverse modes of speech acquisition and processing (10)
- Embedding and Network Architecture for Speaker Recognition (3)
- Emotion and Sentiment Analysis I (2)
- Emotion and Sentiment Analysis II (9)
- Emotion and Sentiment Analysis III (4)
- Feature, Embedding and Neural Architecture for Speaker Recognition (8)
- Graph and End-to-End Learning for Speaker Recognition (1)
- Health and Affect I (3)
- Health and Affect II (9)
- INTERSPEECH 2021 Acoustic Echo Cancellation Challenge (3)
- INTERSPEECH 2021 Deep Noise Suppression Challenge (2)
- Keyword search and spoken language processing (3)
- Language and Accent Recognition (3)
- Language and Lexical Modeling for ASR (8)
- Language Modeling and Text-based Innovations for ASR (3)
- Linguistic Components in end-to-end ASR (5)
- Low-resource speech recognition (7)
- Miscellaneous topics in ASR (3)
- Multi- and cross-lingual ASR, other topics in ASR (8)
- Multi-channel speech enhancement and hearing aids (9)
- Multimodal systems (10)
- Neural Network Training Methods and Architectures for ASR (4)
- Neural network training methods for ASR (9)
- Non-Autoregressive Sequential Modeling for Speech Processing (7)
- Non-native speech (5)
- Novel neural network architectures for ASR (8)
- OpenASR20 and Low Resource ASR Development (3)
- Oriental Language Recognition (3)
- Phonation and voicing (4)
- Phonetics I (1)
- Phonetics II (11)
- Privacy-preserving Machine Learning for Audio & Speech Processing (9)
- Prosodic features and structure (8)
- Resource-constrained ASR (8)
- Robust and Far-field ASR (3)
- Robust Speaker Recognition (8)
- SdSV Challenge 2021: Analysis and Exploration of New Ideas on Short-Duration Speaker Verification (2)
- Search/decoding techniques and confidence measures for ASR (6)
- Self-supervision and semi-supervision for neural ASR training (5)
- Show and Tell 1 (5)
- Show and Tell 2 (5)
- Show and Tell 3 (7)
- Show and Tell 4 (7)
- Single-channel speech enhancement (7)
- Source Separation I (2)
- Source Separation II (10)
- Source Separation III (3)
- Source separation, dereverberation and echo cancellation (3)
- Speaker Diarization I (3)
- Speaker Diarization II (9)
- Speaker Recognition: Applications (9)
- Speaker, Language, and Privacy (3)
- Speech and audio analysis (4)
- Speech coding and privacy (9)
- Speech enhancement and coding (2)
- Speech enhancement and intelligibility (12)
- Speech Localization, Enhancement, and Quality Assessment (4)
- Speech perception I (2)
- Speech perception II (9)
- Speech production I (4)
- Speech production II (6)
- Speech Recognition of Atypical Speech (11)
- Speech signal analysis and representation I (12)
- Speech signal analysis and representation II (4)
- Speech Synthesis: Linguistic processing, paradigms and other topics (8)
- Speech Synthesis: Neural Waveform Generation (6)
- Speech Synthesis: Other topics I (4)
- Speech Synthesis: Prosody Modeling I (6)
- Speech Synthesis: Prosody Modeling II (3)
- Speech Synthesis: Singing, Multimodal, Crosslingual Synthesis (8)
- Speech Synthesis: Speaking Style and Emotion (7)
- Speech Synthesis: tools, data, evaluation (8)
- Speech Synthesis: Toward End-to-End Synthesis I (7)
- Speech Synthesis: Toward End-to-End Synthesis II (8)
- Speech type classification and diagnosis (8)
- Spoken Dialogue Systems I (2)
- Spoken Dialogue Systems II (5)
- Spoken Language Processing I (7)
- Spoken Language Processing II (2)
- Spoken Language Understanding I (8)
- Spoken Language Understanding II (3)
- Spoken machine translation (12)
- Spoken Term Detection & Voice Search (9)
- Streaming for ASR/RNN Transducers (7)
- Target speaker detection, localization and separation (5)
- The ADReSSo Challenge: Detecting cognitive decline using speech only (7)
- The First DiCOVA Challenge: Diagnosis of COVID-19 using Acoustics (6)
- The INTERSPEECH 2021 Computational Paralinguistics Challenge (ComParE) - COVID-19 Cough, COVID-19 Speech, Escalation & Primates (8)
- Tools, corpora and resources (11)
- Topics in ASR: Adaptation, transfer learning, children's speech, and low-resource settings (9)
- Topics in ASR: Robustness, feature extraction, and far-field ASR (8)
- Tutorials (8)
- Voice activity detection (5)
- Voice activity detection and keyword spotting (10)
- Voice and voicing (6)
- Voice Anti-Spoofing and Countermeasure (11)
- Voice Conversion and Adaptation I (7)
- Voice Conversion and Adaptation II (4)
- Voice quality characterization for clinical voice assessment: Voice production, acoustics, and auditory perception (4)
- Opening (1)
- Closing (3)