We present a Wixarika-Spanish machine translator based on Statistical Machine Translation (SMT) and complemented with grammatical knowledge. The Wixarika language (also known as Huichol) is spoken in western Mexico by the Wixaritari people in the states of Jalisco, Nayarit, Zacatecas and Durango. Although they have a thriving culture, socioeconomic factors have prevented the creation of written resources large enough for SMT.
The parallel corpus used to train the SMT system is scarce (800 paired phrases). We use this corpus to train the system and to produce automatic translations from Wixarika to Spanish and from Spanish to Wixarika. A cornerstone of our proposal is the automatic processing of Wixarika morphology, which exploits the polysynthetic features of the language and allows us to reach state-of-the-art results for a corpus of this size.
This project is part of the master's thesis of Jesús Manuel Mager Hois. The thesis was advised by Carlos Barrón Romero, PhD (UAM-A), and Ivan Vladimir Meza Ruíz, PhD (UNAM-IIMAS).
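The idea of decomposing polysynthetic words before training can be illustrated with a minimal Python sketch. It is not the authors' implementation: the affix lists, the `+` boundary marker, and the function names `segment`, `segment_sentence` and `desegment` are all assumptions made for illustration, and the toy affixes are placeholders rather than real Wixarika morphemes. The sketch only shows the general shape of the preprocessing: split each word into morpheme-like tokens so the SMT system sees shorter, more frequent units, and glue the tokens back together when Wixarika is the output side.

```python
from typing import List, Sequence

# Placeholder affix inventories (assumption): NOT real Wixarika morphology.
TOY_PREFIXES: Sequence[str] = ("pa", "ti")
TOY_SUFFIXES: Sequence[str] = ("ku", "ni")


def segment(word: str,
            prefixes: Sequence[str] = TOY_PREFIXES,
            suffixes: Sequence[str] = TOY_SUFFIXES,
            min_stem: int = 2) -> List[str]:
    """Greedily strip known prefixes, then known suffixes, from a word.

    Prefix tokens are marked as 'xx+' and suffix tokens as '+xx' so that
    the segmentation can be undone later.
    """
    stem = word.lower()
    pre: List[str] = []
    suf: List[str] = []

    stripped = True
    while stripped:
        stripped = False
        for p in sorted(prefixes, key=len, reverse=True):
            if stem.startswith(p) and len(stem) - len(p) >= min_stem:
                pre.append(p + "+")
                stem = stem[len(p):]
                stripped = True
                break

    stripped = True
    while stripped:
        stripped = False
        for s in sorted(suffixes, key=len, reverse=True):
            if stem.endswith(s) and len(stem) - len(s) >= min_stem:
                suf.insert(0, "+" + s)
                stem = stem[:-len(s)]
                stripped = True
                break

    return pre + [stem] + suf


def segment_sentence(sentence: str) -> str:
    """Segment every whitespace-separated word of a sentence."""
    return " ".join(tok for w in sentence.split() for tok in segment(w))


def desegment(text: str) -> str:
    """Rejoin tokens produced by segment_sentence into full words."""
    words: List[str] = []
    glue_next = False
    for tok in text.split():
        is_prefix = tok.endswith("+")
        is_suffix = tok.startswith("+")
        core = tok.strip("+")
        if words and (glue_next or is_suffix):
            words[-1] += core
        else:
            words.append(core)
        glue_next = is_prefix
    return " ".join(words)
```

In a pipeline of this kind, `segment_sentence` would be applied to the Wixarika side of the parallel corpus before training, and `desegment` to the system output when translating from Spanish to Wixarika. A real system would replace the greedy affix stripping with a proper morphological analyzer.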
@article{tradmager,
  author  = "Mager Hois, Jesús Manuel and Barron Romero, Carlos and Meza Ruíz, Ivan Vladimir",
  journal = "COMTEL",
  number  = "6",
  title   = "Traductor estadístico wixarika - español usando descomposición morfológica",
  year    = "2016",
  month   = sep
}
We invite you to visit the homepage of Jesús Manuel Mager and his blog, eeNube.com.