I am Tai (Đức Tài), a research engineer at Apple. I work on multimodal and on-device language models for Apple Intelligence features.
Previously, I earned my MS at the University of Pennsylvania, where I got started in research with Eric Wong and Chris Callison-Burch.
I was also fortunate to work with Ben Bogin from Ai2.
Before that, at IBM Systems, I helped build an analytics tool for detecting and diagnosing mainframe server failures. I studied Economics at the wonderful Haverford College and wrote my undergraduate thesis on the impact of Airbnb on welfare.
Lessons from applying SPICE on a pre-1931 language model.
Coming soon.
Research
(*: equal contribution)
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian Magnusson*, Nguyen Tai*, Ben Bogin*, David Heineman, Jena D Hwang, Luca Soldaini, Akshita Bhagia, Jiacheng Liu, Dirk Groeneveld, Oyvind Tafjord, Noah A Smith, Pang Wei Koh, Jesse Dodge
ICML 2025 DataWorld Workshop (Oral)
arxiv / code / blog / huggingface / press
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, ... Nguyen Tai ..., Niklas Muennighoff (82 authors)
ICLR 2025
arxiv / code / website
In-context Example Selection with Influences
Nguyen Tai, Eric Wong
arXiv 2024
arxiv / code / blog
Explanation-based Finetuning Makes Models More Robust to Spurious Cues
Josh Magnus Ludan, Yixuan Meng*, Tai Nguyen*, Saurabh Shah*, Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
ACL 2023
arxiv / code
Software Entity Recognition with Noise-robust Learning
Tai Nguyen, Yifeng Di, Joohan Lee, Muhao Chen, Tianyi Zhang
ASE 2023
arxiv / code / huggingface
Past projects
Big Data Bowl 2023
— One of 5 finalists, winning $15,000. We got to meet the Director of Research of the NFL and had a professional video made.
video / code / feature article
Underthesea 2022
— Contributed a small amount to an open-source Vietnamese toolkit built by the amazing Anh Vu. This helped me get started on NLP.
code
STEAM For Vietnam 2022
— During Covid, I volunteered for a non-profit that provides free online education for Vietnamese children. I worked on the data science team.
Miscellanea
I enjoy boxing and tennis in my free time. I am a fan of the Seahawks, and try to travel as much as I can.