Chung-Ming Chien (簡仲明)
Chicago, Illinois, United States
Santorini, Greece
May 30, 2023
I am a 4th-year Ph.D. student at Toyota Technological Institute at Chicago (TTIC), advised by Karen Livescu. My overarching ambition is to create Trustworthy and User-Friendly Human-Like Speech AI. To achieve this goal, my current research is focused on:
- Factual Conversational Speech AI
  How to make full-duplex models more intelligent while preserving their interactiveness? How to best preserve the knowledge learned from large-scale text pre-training?
- Aligning Speech Language Models (SLMs) with Human Preference
  How to improve various aspects of SLMs – such as content fidelity, natural turn-taking, and emotional expressiveness – with human preference? How to enhance overall user experience?
Aside from these, my past research on Controllable Speech Generation, Efficient Fine-Tuning of SLMs, Flow-Matching & Diffusion Models, and Self-Supervised Speech Representations has given me crucial experience and a valuable foundation for pursuing this goal.
Prior to joining TTIC, I completed my Master’s degree in Computer Science at National Taiwan University (NTU), where I had the good fortune to work with Lin-shan Lee and Hung-yi Lee at the Speech Processing Lab. I also gained valuable practical experience through summer internships at Amazon Alexa TTS Research, FAIR (AI at Meta), the NVIDIA NeMo team, and Kyutai.
Beyond my academic and research pursuits, I am an avid sports enthusiast and amateur athlete. During my undergraduate studies, I captained the varsity baseball team at NTU. I also enjoy tennis, hiking, snowboarding, scuba diving, and badminton. In 2022, I reached a personal milestone by completing my first marathon, and I am currently training to break the 3-hour mark.
news
| Sep 5, 2025 | It was a true pleasure to reconnect with old friends and meet new colleagues at the Workshop on Foundations of Speech and Audio Foundation Models at TTIC. I sincerely hope you found the event valuable and enjoyable. |
|---|---|
| Jul 17, 2025 | I will be co-hosting the Workshop on Foundations of Speech and Audio Foundation Models this September. Join us at TTIC to explore the newest advancements in Speech Language Models. |
| Apr 18, 2025 | This summer, I am heading to France to join Kyutai and work on Moshi, the world’s first full-duplex speech assistant. This is a new adventure for me, and I can’t wait to see what we achieve! Bonjour, Paris! |
| Apr 11, 2025 | The long wait is over. Our review paper on SLMs, “On the landscape of spoken language models: A comprehensive survey”, is officially on arXiv. Check it out: it was a significant collective effort and is packed with insights! |
| Jun 4, 2024 | “Learning Fine‑Grained Controllability on Speech Generation via Efficient Fine‑Tuning” has been accepted to Interspeech 2024! |
| May 16, 2024 | “On the Evaluation of Speech Foundation Models for Spoken Language Understanding” has been accepted to Findings of ACL 2024! |
| Apr 16, 2024 | I gave a talk at Midwest Speech and Language Days in Ann Arbor, Michigan. |
| Apr 9, 2024 | I passed the qualifying exam at TTIC and will soon become a Ph.D. candidate. |
| Jan 23, 2024 | “What Do Self‑Supervised Speech Models Know about Words” has been accepted to TACL 2024! |
| Jan 18, 2024 | I will join the NVIDIA NeMo team for my 2024 summer internship to work on speech language models! |