Tai Nguyen
I am Tai (Đức Tài), a research engineer at Apple. I am working on pushing the boundaries of on-device language models and multimodality.
Previously, I got my MS from the University of Pennsylvania, where I got started on research with Eric Wong and Chris Callison-Burch.
I was also fortunate to work with Ben Bogin from Ai2.
Before that, I helped build an analytics tool to support mainframes at IBM Systems. I studied Economics at the wonderful Haverford College and wrote my undergraduate thesis on the impact of Airbnb on welfare.
I grew up in Saigon, Vietnam. 🇻🇳
Email / GitHub / Google Scholar / huggingface / Twitter
Photo credit: Grace Pindzola
Research
(*: equal contribution)
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian Magnusson*, Nguyen Tai*, Ben Bogin*, David Heineman, Jena D Hwang, Luca Soldaini, Akshita Bhagia, Jiacheng Liu, Dirk Groeneveld, Oyvind Tafjord, Noah A Smith, Pang Wei Koh, Jesse Dodge
ICML 2025
arxiv / code / blog / huggingface
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, ... Nguyen Tai ..., Niklas Muennighoff (82 authors)
ICLR 2025
arxiv / code / website
In-context Example Selection with Influences
Nguyen Tai, Eric Wong
arXiv 2024
arxiv / code / blog
Explanation-based Finetuning Makes Models More Robust to Spurious Cues
Josh Magnus Ludan, Yixuan Meng*, Tai Nguyen*, Saurabh Shah, Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
ACL 2023
arxiv / code
Software Entity Recognition with Noise-robust Learning
Tai Nguyen, Yifeng Di, Joohan Lee, Muhao Chen, Tianyi Zhang
ASE 2023
arxiv / code / huggingface
Big Data Bowl
Ryan Brill, Joseph Rudoler, Tai Nguyen, Ryan Gross
2023
writeup / video / code / feature article
One of five finalists, winning $15,000. We got to meet the NFL's Director of Research and had a professional video made.
Underthesea
2022
website / code
I made a few small contributions to this open-source Vietnamese NLP toolkit built by the amazing Anh Vu. This helped me get started on NLP.
STEAM For Vietnam
2022
website
During the COVID-19 pandemic, I volunteered for a non-profit that provides free online education for Vietnamese children. I worked on the data science team.