AmericasNLP 2025 Shared Task 3: Machine Translation Metrics for Indigenous languages

What?

The AmericasNLP 2025 Shared Task on machine translation metrics for Indigenous languages is a competition intended to motivate the advancement of automatic evaluation metrics for machine translation, with a focus on translation into Indigenous languages. Participants will build metrics to evaluate the quality of translations from Spanish into Guarani, Bribri, and Nahuatl.

Why?

Many Indigenous languages of the Americas have linguistic characteristics that are uncommon in the languages typically studied in natural language processing (NLP). For example, Indigenous languages are often polysynthetic, and they frequently lack orthographic standardization, both of which pose challenges for widely used metrics such as BLEU and chrF. The goal of the AmericasNLP 2025 shared task on machine translation metrics for Indigenous languages is to encourage researchers to tackle the challenge of creating automatic evaluation metrics for MT systems with a focus on Indigenous languages.
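As a quick illustration of this point, the following minimal sketch compares word-level BLEU with character-level chrF using the sacrebleu library (which is not part of the shared task materials). The example strings are invented for illustration only and are not real data from any of the task languages.

```python
# Sketch: how word-level BLEU and character-level chrF react when a single
# morpheme inside a long word changes. Requires `pip install sacrebleu`.
# The strings below are invented and do not come from the shared task data.
import sacrebleu

references = [["nikinmachtia nochi tlakamej ipan kali"]]   # invented reference
hypotheses = ["nikinmachti nochi tlakamej ipan kali"]      # one word-internal change

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)

# BLEU treats the altered word form as a complete miss, while chrF still
# credits the many shared characters, so it tends to degrade more gracefully.
print(f"BLEU: {bleu.score:.2f}")
print(f"chrF: {chrf.score:.2f}")
```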

How?

AmericasNLP invites the submission of automatic metrics built for the evaluation of machine translation into Indigenous languages. We will provide a development set, containing the translations and their quality as evaluated by humans. Participants can use the development data to build their metric. The winning submission will be the one with the highest correlation with human judgements on a held-out test set. We provide an evaluation script and some baseline metrics to help the participants get started.
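To make the ranking criterion concrete, here is a minimal sketch of how a metric's segment-level scores could be compared against human judgements. The official evaluation script may differ, for example in the choice of correlation coefficient; all numbers below are hypothetical.

```python
# Sketch: correlating metric scores with human judgements at the segment level.
# The official evaluation script may use a different correlation coefficient;
# all scores below are hypothetical.
from scipy.stats import pearsonr, spearmanr

human_scores  = [0.90, 0.40, 0.70, 0.10, 0.55]   # hypothetical gold quality scores
metric_scores = [0.85, 0.50, 0.60, 0.20, 0.65]   # hypothetical scores from your metric

pearson, _ = pearsonr(human_scores, metric_scores)
spearman, _ = spearmanr(human_scores, metric_scores)

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```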

Which languages?

The shared task features the following language pairs, with Spanish always as the source language: Spanish–Guarani, Spanish–Bribri, and Spanish–Nahuatl. Metrics will evaluate translations from Spanish into these Indigenous languages. Participants are free to submit scores for as many language pairs as they like. Note that the shared task’s winner will be the team with the highest average score across all languages.

System Submission

Please send all your system outputs to americas.nlp.workshop@gmail.com. The subject of your email should be "AmericasNLP2025_SharedTask3; Shared Task Submission; <your team name>". The content of your submission email should be as follows:
  • Line 1: Team name
  • Line 2: Names of all team members
  • Line 3: Language codes for all languages you are sending submissions for in order of your choice (we will use that to double-check that we got all files you intended to send)
  • [optional] Line 4: A link to a GitHub repository with code that can be used to reproduce your results. This is not required in order to participate in the shared task, but it’s strongly encouraged.

Please attach all output files to your email as a single zip file, named after your team, e.g., "TheGaugeCrew.zip". Within that zip file, the individual files should be named "<language code>.results.<version number>". The language code should be the same as used in the corresponding evaluation set names. The version number is in case you want to submit the outputs of multiple versions of your metric; it should be a single digit (please don't submit more than 9 versions per language!). The format of your output files should follow the format of the development set files (with your scores instead of the gold scores).
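For convenience, the following sketch packages per-language result files into a zip following the convention above. The team name, language codes, and version number are placeholders; use the codes from the released evaluation sets, and make sure the result files already exist locally in the development set format.

```python
# Sketch: packaging result files named <language code>.results.<version number>
# into <team name>.zip. Team name, language codes, and version number are
# placeholders, and the result files are assumed to already exist locally.
import zipfile

team = "TheGaugeCrew"
version = 1
language_codes = ["gn", "bzd", "nah"]   # hypothetical codes; use the official ones

with zipfile.ZipFile(f"{team}.zip", "w") as zf:
    for code in language_codes:
        zf.write(f"{code}.results.{version}")
```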

Important Dates

  • Release of development sets: January 25th, 2025
  • Release of baseline systems and baseline results: January 25th, 2025
  • Release of test inputs: March 1st, 2025
  • Submission of results (shared task deadline): March 12th, 2025 (extended from March 5th)
  • Announcement of winners: March 13th, 2025 (extended from March 6th)
  • Submission of system descriptions papers: March 21st, 2025
  • Notification of acceptance: March 22nd, 2025
  • Camera-ready papers due: March 27th, 2025
  • Workshop: May 4th, 2025
All deadlines are 11:59 pm UTC -12h (AoE).

Organizers

Ali Marashian, Enora Rice, Robert Pugh, Abteen Ebrahimi, Luis Chiruzzo, Arturo Oncevay, Aldo Alvarez, Marvin Agüero-Torales, Shruti Rijhwani, Manuel Mager, Rolando Coto-Solano, Katharina von der Wense

Contact: americas.nlp.workshop@gmail.com
Design: Rebeca Guerrero and Manuel Mager