We are a collective of Filipino NLP researchers, enthusiasts, practitioners, professionals, and students
We are a collective, from graduate students to industry practitioners, who are working to advance Philippine natural language processing (NLP) through open research and collaboration. We operate as a scrappy grassroots team, pooling shared resources to move fast and build what the research community needs.
News
- Nov 2025: We're starting new projects in FilBench on new evals and training data curation. Reach out to get involved!
- Aug 2025: Our paper, FilBench: Can LLMs Understand and Generate Filipino?, will be presented at EMNLP 2025 (Main)!
- May 2025: A collaboration with UP Diliman on the UD-NewsCrawl treebank will be presented at ACL 2025 in Vienna.
- Jan 2025: A new version of calamanCy is available, with better dependency parsing and new GLiNER finetunes.
Publications
- FilBench: Can LLMs Understand and Generate Filipino? EMNLP 2025 (Main)
  We created a comprehensive benchmark for evaluating LLMs on PH-centric tasks, including cultural knowledge, reading comprehension, classical NLP, and generation.
- The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project. ACL 2025 (Main)
  We created the largest Tagalog treebank to date, containing 100x more data than previous treebanks. Our project also revealed limitations in the Universal Dependencies framework, especially for non-Indo-European languages.
- calamanCy: A Tagalog Natural Language Processing Toolkit. NLP OSS Workshop (NLP-OSS)
Who are we?
We started as a small group of researchers who met at conferences and workshops, connected through cold emails and a shared passion for Filipino NLP.
- Lj V. Miranda, PhD Student, University of Cambridge
- Joseph Imperial, PhD Student, University of Bath
- Blaise Cruz, PhD Student, MBZUAI
- Elyanah Aco, MS Student, NAIST
- Conner Manuel, MLE, Together AI
Join Us
Reach out to Lj and mention your research interests. If you already know someone from FilBench, you can also ask them to add you.