experience
Highlights of my education and research experience, honors received, and academic talks I have given. Please see my CV for more information.
education
Toyota Technological Institute at Chicago (TTIC)
- 2022 - Present
Ph.D. in Computer Science
- Advisor: Karen Livescu
National Taiwan University (NTU)
- 2019 - 2021
M.S. in Computer Science & Information Engineering
- Advisors: Lin‐shan Lee and Hung‐yi Lee
- 2015 - 2019
B.S.E. in Electrical Engineering
- Ranked 25/256 (9%) with two Dean’s List Awards
research
Speech and Language Group, TTIC
- 2023 - Present
Speech Language Models
- Conducted a comprehensive comparison of SpeechLLMs’ capabilities on various speech tasks
- 2022 - Present
Speech-Text Joint Learning
- Discovered that speech‑text models exhibit text‑to‑speech transferability, which enables zero‑shot spoken language understanding
- 2022 - Present
Speech Representation Learning
- Revealed word‑level language structures intrinsically encoded in self‑supervised speech representations
- Benchmarked speech foundation models on spoken language understanding tasks under various resource considerations
NVIDIA
- 2024
Speech Language Models and Speech Generation
- Worked with Zhehuai Chen and Jason Li
- Augmented pre‑trained Canary LLMs with speech generation capabilities for speech‑to‑speech translation and speech question answering
FAIR (Fundamental AI Research) at Meta
- 2023
Speech Generation
- Worked with Andros Tjandra and Wei‐Ning Hsu
- Worked on the Voicebox project, enhancing fine-grained controllability of speech generation models under resource-limited scenarios
Amazon Alexa, Cambridge, UK
- 2021
Speaker‐Adaptive Text‐to‐Speech (TTS)
- Worked with Adam Gabryś and Jaime Lorenzo‐Trueba
- Proposed Voice Filter, which improved extremely low‐resource speaker‐adaptive text‐to‐speech (TTS) by modeling content and speaker information separately
- Reduced the gap between synthesized and real speech by over 30%
Speech Processing Laboratory, NTU
- 2020 - 2021
Application of Self-Supervised Speech Representations
- Disentangled speaker and phonetic information in self‐supervised speech representations for the task of voice conversion (VC)
- Proposed SOTA zero‐shot any‐to‐any VC by learning sub‐phoneme alignments between utterances with Transformer attention
- 2020 - 2021
Speaker Representations
- Proposed generative speaker embedding pre‐training for speech synthesis
- Won 2nd prize in the IEEE ICASSP M2VoC Challenge on low‐resource voice cloning
- 2019 - 2020
Prosody in Synthesized Speech
- Developed hierarchical prosody modeling in TTS
honors
Scholarships
- Government Scholarship to Study Abroad, Ministry of Education of Taiwan ($32,000 over 2 years) (2023)
- Advanced Speech Technologies Scholarship, NTU EECS ($17,000) (2021)
- NTUEE60 Scholarship, NTU EE ($3,500) (2016)
Awards
- Best Student Paper Award, ASRU (2023)
- 2nd Place, ICASSP M2VoC Challenge (2021)
- Top 20 Finalist, Trans Action Award (2020)
- Cathay United Bank Special Award, Make NTU (2019)
- Dean’s List Awards (Two‐Time), NTU EE (2016 & 2017)
Leadership
- Captain of the NTU Baseball Varsity Team (2019 - 2020)
Non-academic
- 1st Place among UChicago‑Affiliated Athletes (Two Straight Years), J.P. Morgan Corporate Challenge 3.5‑Mile Road Race (2023 & 2024)
- 5th Place (Two‐Time), University Baseball League of Taiwan (equivalent to NCAA Division III) (2019 & 2021)
- Gold Medal, Men’s Half‐Iron Relay, Yilan National Triathlon Championships (2019)
service
Reviewer
- IEEE JSTSP, ICLR, ICASSP
Workshop organizer
- 2024 TTIC Student Workshop
talks
Few‑Shot Spoken Language Understanding via Joint Speech‑Text Models
- Midwest Speech and Language Days (Ann Arbor, MI, US, Apr. 2024)
Self‐Supervised Pre‐Trained Voice Conversion
- TTIC Student Workshop (Chicago, IL, US, Nov. 2022)
Few‐Shot Speaker Adaptive TTS by Learning from Non‐Target Speakers
- Amazon (Cambridge, UK, Nov. 2021)
Speech Synthesis in the Deep Learning Era
- AI Summer School 2020, NTU (Taipei, Taiwan, Aug. 2020)