Word Alignment

About

Word alignment is the process of determining which words in a source and target language sentence pair are translations of each other. This is a token-level task: each word (token) in the source text is aligned with its corresponding translation in the target text. In this work, we explored the IBM models introduced by Brown et al. (1993).
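To make the idea concrete, here is a minimal sketch of IBM Model 1, the simplest of the IBM models, trained with expectation-maximization on a toy corpus. It is an illustration only, not the code of the Duluth Word Alignment System or of GIZA++. The function names (train_ibm_model1, align), the toy sentences, and the iteration count are assumptions made for this example, and the NULL source word used in the full model is left out for brevity.

```python
from collections import defaultdict


def train_ibm_model1(bitext, iterations=10):
    """Estimate t(target_word | source_word) with EM (IBM Model 1, no NULL word).

    bitext is a list of (source_tokens, target_tokens) pairs.
    Returns a dict keyed by (target_word, source_word).
    """
    target_vocab = {w for _, tgt in bitext for w in tgt}
    # Uniform initialization over every co-occurring word pair.
    t = {}
    for src, tgt in bitext:
        for w in tgt:
            for s in src:
                t[(w, s)] = 1.0 / len(target_vocab)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        # E-step: distribute each target word over the source words
        # of its sentence in proportion to the current t values.
        for src, tgt in bitext:
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # M-step: renormalize the expected counts per source word.
        for (w, s), c in count.items():
            t[(w, s)] = c / total[s]
    return t


def align(src, tgt, t):
    """Link each target token to its most probable source token.

    Returns a list of (source_index, target_index) pairs.
    Word pairs never seen together in training get probability 0.
    """
    links = []
    for j, w in enumerate(tgt):
        best = max(range(len(src)), key=lambda i: t.get((w, src[i]), 0.0))
        links.append((best, j))
    return links


# Toy German-English corpus (hypothetical example data).
bitext = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_ibm_model1(bitext)
print(align(["das", "Haus"], ["the", "house"], t))
```

On this corpus the model should learn to link "the" with "das" and "house" with "Haus", because "das" co-occurs with "the" in two sentences while "Haus" appears in only one. Real systems train on far larger corpora and follow Model 1 with the higher IBM models, which add position and fertility information.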

Software

The Duluth Word Alignment System

Publications

The Duluth Word Alignment System. Bridget T. McInnes and Ted Pedersen. HLT/NAACL 2003, Building and Using Parallel Texts and Beyond Workshop (slides: ppt)

GIZA++

While researching different aspects of word alignment, I studied and used the GIZA++ software package written by Franz Och. Below are a presentation and a supplementary README that I created for our NLP Group to help explain how to use GIZA++.

GIZA++ Presentation

Supplement README for GIZA++

Last modified 25/08/2014