We are a collective of Filipino NLP researchers, enthusiasts, practitioners, professionals, and students

Our members range from graduate students to industry practitioners, all working to advance Philippine natural language processing (NLP) through open research and collaboration. We operate as a scrappy grassroots team, pooling shared resources to move fast and build what the research community needs.

GitHub

HuggingFace

Join Us

News

Nov 2025

We're starting new FilBench projects on new evals and training data curation. Reach out to get involved!

Aug 2025

Our paper, "FilBench: Can LLMs Understand and Generate Filipino?", will be presented at EMNLP 2025 (Main)!

May 2025

Our collaboration with UP Diliman on the UD-NewsCrawl treebank will be presented at ACL 2025 in Vienna.

Jan 2025

A new version of calamanCy is available, with better dependency parsing and new GLiNER finetunes.

Ongoing Projects

Blog

Come sit with us at the FilBench!

Nov 2025

Publications

FilBench: Can LLMs Understand and Generate Filipino? EMNLP 2025 (Main)

Lester James V. Miranda*, Elyanah Aco*, Conner Manuel*, Jan Christian Blaise Cruz, Joseph Marvin Imperial

We created a comprehensive benchmark for evaluating LLMs on PH-centric tasks, including cultural knowledge, reading comprehension, classical NLP, and generation.

Code, Leaderboard, Poster, Presentation

The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project ACL 2025 (Main)

Angelina A. Aquino*, Lester James V. Miranda*, Elsie Marie T. Or*

We created the largest Tagalog treebank to date, containing 100x more data than previous treebanks. Our project also revealed limitations of the Universal Dependencies framework, especially for non-Indo-European languages.

Dataset, Poster, Presentation

calamanCy: A Tagalog Natural Language Processing Toolkit NLP OSS Workshop (NLP-OSS)

Lester James V. Miranda

Code, Website

Who are we?

We started as a small group of researchers who met at conferences and workshops, connected through cold emails and a shared passion for Filipino NLP.

Lj V. Miranda, PhD Student, University of Cambridge
Joseph Imperial, PhD Student, University of Bath
Blaise Cruz, PhD Student, MBZUAI
Elyanah Aco, MS Student, NAIST
Conner Manuel, MLE, Together AI

Join Us

Reach out to Lj and mention your research interests. If you also know someone from FilBench, you can ask them to add you.
