Abstract: | Prior work on training the IBM-3 translation model is based on
suboptimal methods for computing Viterbi alignments. In this paper, we
present the first method guaranteed to produce globally optimal
alignments. This not only results in improved alignments, it also
gives us the opportunity to evaluate the quality of standard
hillclimbing methods. Indeed, hillclimbing works reasonably well in
practice but still fails to find the global optimum for between 2%
and 12% of all sentence pairs and the probabilities can be several
tens of orders of magnitude away from the Viterbi alignment.
By reformulating the alignment problem as an Integer Linear Program,
we can use standard machinery from global optimization theory to
compute the solutions. We use the well-known branch-and-cut method,
but also show how it can be customized to the specific problem
discussed in this paper. In fact, a large number of alignments can be
excluded from the start without losing global optimality. |