Morphological Segmentation of Low Resource Languages Using Machine Learning
Project Showcase

MORPH-SEGMENT

By: Tumi Moeng, Sheldon Reay, Aaron Daniels


About

Abstract

We investigated morphological segmentation for the low-resource Nguni language group, focusing on isiNdebele, isiXhosa, isiZulu and siSwati. We used three machine learning model structures: two supervised (Conditional Random Fields and Sequence to Sequence) and one unsupervised (Entropy Based). Each group member worked on one model structure. Within each structure we implemented two or more models, with one serving as the baseline and the others compared against that baseline.
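For readers new to the task, the sketch below illustrates what a segmentation looks like and how it is commonly framed as per-character labelling for supervised models such as Conditional Random Fields. It is an illustration only, not the project's code: the function names, the "-" separator, the B/M/E/S label set, and the example segmentation of the isiZulu word "ngiyabonga" are all assumptions chosen for this example.

```python
# Illustrative sketch only: one common way to frame morphological
# segmentation as sequence labelling. Label scheme (assumed):
#   B = first character of a morpheme, M = middle, E = last,
#   S = single-character morpheme.

def to_bmes(segmented, sep="-"):
    """Convert a segmented word such as 'ngi-ya-bong-a' to per-character labels."""
    labels = []
    for morpheme in segmented.split(sep):
        if not morpheme:
            continue  # ignore empty pieces from doubled separators
        if len(morpheme) == 1:
            labels.append("S")
        else:
            labels.extend(["B"] + ["M"] * (len(morpheme) - 2) + ["E"])
    return labels


def from_bmes(word, labels, sep="-"):
    """Rebuild the segmented form from an unsegmented word and its labels."""
    pieces = []
    for char, label in zip(word, labels):
        pieces.append(char)
        if label in ("E", "S"):
            pieces.append(sep)  # a morpheme ends after this character
    return "".join(pieces).rstrip(sep)


if __name__ == "__main__":
    gold = "ngi-ya-bong-a"  # assumed example segmentation
    labels = to_bmes(gold)
    print(labels)
    print(from_bmes("ngiyabonga", labels))
```

Under this framing a supervised model predicts one label per character, and the segmented word is recovered by cutting after every E or S label.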

You can find an example of morphological segmentation in the attached image at the bottom of the page, or watch the demo videos just below.
