I am a computer science PhD student at Mila and Concordia University, supervised by Professor Mirco Ravanelli. I have a broad interest in deep learning for Conversational AI. My research focuses on discrete self-supervised learning for speech and audio, exploring its potential to bridge audio and language models. I am also one of the main contributors to the SpeechBrain project, a popular open-source conversational AI toolkit.
🎙️ Consider joining my weekly Conversational AI Reading Group! We invite researchers working in conversational AI and speech processing to present their papers. 📚 Tune into our talks and discover new approaches, methodologies, or tools that you can start implementing in your work! 🚀
Happy to share a preprint from our recent work "Discrete Audio Tokens: More Than a Survey!". Huge thanks to all our amazing collaborators. Explore our website to browse the tokenizer database or submit your own tokenizer to be featured!
Excited to announce that our paper "LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs" has been accepted at Interspeech 2025.
Our paper "ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs" has been accepted at the ICML 2025 Workshop on Machine Learning for Audio.
Join our Conversational AI Reading Group! Every Thursday, leading scientists share insights on the latest advancements in the field. Everyone is welcome to join.
Excited to announce that our paper "What Are They Doing? Joint Audio-Speech Co-Reasoning" has been accepted at ICASSP 2025.
Mila/Concordia University (Gina Cody School of Engineering and Computer Science), Sep. 2022 - present
PhD in Computer Science
University of Texas at Dallas (UTD), 2018 - 2021
M.S. in Computer Science
Most recent publications on Google Scholar.
‡ indicates equal contribution.
Discrete Audio Tokens: More Than a Survey!
Pooneh Mousavi, Gallil Maimon, Adel Moumen, Darius Petermann, Jiatong Shi, Haibin Wu, Haici Yang, Anastasia Kuznetsova, Artem Ploujnikov, Ricard Marxer, Bhuvana Ramabhadran, Benjamin Elizalde, Loren Lugosch, Jinyu Li, Cem Subakan, Phil Woodland, Minje Kim, Hung-yi Lee, Shinji Watanabe, Yossi Adi, Mirco Ravanelli
preprint
ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs
Pooneh Mousavi, Yingzhi Wang, Mirco Ravanelli, Cem Subakan
ICML Workshop on Machine Learning for Audio, 2025
LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs
Pooneh Mousavi‡, Shubham Gupta‡, Cem Subakan, Mirco Ravanelli
Proc. of Interspeech, 2025
What Are They Doing? Joint Audio-Speech Co-Reasoning
Yingzhi Wang, Pooneh Mousavi, Artem Ploujnikov, Mirco Ravanelli
Proc. of ICASSP, 2025
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi, Jarod Duret, Salah Zaiem, Luca Della Libera, Artem Ploujnikov, Cem Subakan, Mirco Ravanelli
Proc. of Interspeech, 2024, Oral Session
DASB - Discrete Audio and Speech Benchmark
Pooneh Mousavi, Luca Della Libera, Jarod Duret, Artem Ploujnikov, Cem Subakan, Mirco Ravanelli
Open-Source Conversational AI with SpeechBrain 1.0
Mirco Ravanelli, Titouan Parcollet, et al.
JMLR (Machine Learning Open Source Software)
CL-MASR: A Continual Learning Benchmark for Multilingual ASR
Luca Della Libera‡, Pooneh Mousavi‡, Salah Zaiem, Cem Subakan, Mirco Ravanelli
Submitted to Transactions on Audio, Speech and Language Processing
WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series
Jean-Christophe Gagnon-Audet, Kartik Ahuja, Mohammad Javad Darvishi Bayazi, Pooneh Mousavi, Guillaume Dumas, Irina Rish
Transactions on Machine Learning Research
Detecting Hashtag Hijacking for Hashtag Activism
Pooneh Mousavi, Jessica Ouyang
ACL | IJCNLP | NLP4PosImpact
Please Donate for the Affected: Supporting Emergency Managers in Finding Volunteers and Donations in Twitter Across Disasters
Pooneh Mousavi, Cody Buntain
ISCRAM 2022
Conversational AI, Concordia University
Winter 2023, Winter 2024. See info here.
Full Resume in PDF.