It’s all about the trees: towards a hybrid syntax-based MT system

International Multiconference on Computer Science and Information Technology, 2009. IMCSIT'09

Published by IEEE

Publication

This paper describes the first steps of research towards a hybrid MT system that combines the strengths of rule-based syntactic transfer with recently developed syntax-based statistical translation methods within a unified framework. The similarities between the two paradigms in how they process syntactically parsed input trees serve as the basis for this research. We focus on the statistical part of the future system and present BONSAI, a syntax-based statistical machine translation system for Polish-to-French translation. Although BONSAI is still under development, it reaches a translation quality on par with that of a modern phrase-based system. We provide the theoretical background as well as some implementation details and preliminary evaluation results for BONSAI. At the end of the paper, we briefly discuss the benefits of a combined approach.