2025
FilBench: Can LLMs Understand and Generate Filipino?
Lester James V. Miranda*, Elyanah Aco*, Conner Manuel*, Jan Christian Blaise Cruz, Joseph Marvin Imperial
EMNLP 2025 (Main)
We created a comprehensive benchmark for evaluating LLMs on Philippine-centric tasks, covering cultural knowledge, reading comprehension, classical NLP, and generation.
The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project
Angelina A. Aquino*, Lester James V. Miranda*, Elsie Marie T. Or*
ACL 2025 (Main)
We created the largest Tagalog treebank to date, containing 100x more data than previous treebanks. The project also revealed limitations in the Universal Dependencies framework, especially for non-Indo-European languages.
2023
calamanCy: A Tagalog Natural Language Processing Toolkit
Lester James V. Miranda
NLP OSS Workshop (NLP-OSS)