Accession Number : ADA466330


Title : A Survey of Statistical Machine Translation


Descriptive Note : Technical rept.


Corporate Author : MARYLAND UNIV COLLEGE PARK DEPT OF COMPUTER SCIENCE


Personal Author(s) : Lopez, Adam


Full Text : http://www.dtic.mil/dtic/tr/fulltext/u2/a466330.pdf


Report Date : Apr 2007


Pagination or Media Count : 51


Abstract : Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.


Descriptors : *STATE OF THE ART, *TAXONOMY, *MACHINE TRANSLATION, *NATURAL LANGUAGE, *DECODING, MATHEMATICAL MODELS, ALGORITHMS, MODELS


Subject Categories : Numerical Mathematics
      Computer Programming and Software
      Cybernetics


Distribution Statement : APPROVED FOR PUBLIC RELEASE