Improving Rule-Based Machine Translation Using Statistical Syntactic Rules
Abstract
Rule-based machine translation (RBMT) uses a set of rules that encode linguistic information in the translation process. The output of these systems is usually better than that of statistical translators in terms of grammar and word order. However, studies have shown that it is weaker than statistical output in fluency and in the choice of appropriate words. The aim of this paper is to improve lexical selection in a rule-based translator. This is done with a set of syntactic-lexical rules based on tree-adjoining grammar (TAG). These probabilistic rules are extracted statistically from a large parallel corpus. In the proposed system, words are placed in the target language in the order suggested by the rule-based translator, and for this reason a monotone decoder based on dynamic programming is used to translate sentences. The best translation is selected according to the probabilities of the rules used and the language model score. Experiments on English-to-Persian translation showed that the quality of the results obtained with the proposed method is about 3.1 BLEU points higher than that of the baseline rule-based translator.
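The monotone dynamic-programming decoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each target position (already in the order proposed by the rule-based system) comes with a list of candidate words and rule probabilities, and that the language model is a simple bigram model. The names `monotone_decode`, `toy_lm`, and `TOY_BIGRAMS`, the `lm_weight` parameter, and all probabilities are hypothetical.

```python
import math


def monotone_decode(options, bigram_logprob, lm_weight=1.0):
    """Viterbi-style monotone decoding (illustrative sketch only).

    options: one entry per target position, in the fixed order given by
             the rule-based system; each entry is a non-empty list of
             (candidate_word, rule_probability) pairs.
    bigram_logprob: function (prev_word, word) -> log-probability.
    Returns (best_log_score, best_word_sequence).
    """
    # best[w] holds the score and path of the best hypothesis ending in w.
    best = {"<s>": (0.0, [])}
    for candidates in options:
        new_best = {}
        for word, prob in candidates:
            rule_score = math.log(prob)
            for prev, (score, path) in best.items():
                total = score + rule_score + lm_weight * bigram_logprob(prev, word)
                if word not in new_best or total > new_best[word][0]:
                    new_best[word] = (total, path + [word])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


# Hypothetical bigram probabilities for a two-position toy example.
TOY_BIGRAMS = {
    ("<s>", "a"): 0.5, ("<s>", "b"): 0.5,
    ("a", "c"): 0.1, ("a", "d"): 0.9,
    ("b", "c"): 0.9, ("b", "d"): 0.1,
}


def toy_lm(prev, word):
    # Unseen bigrams get a small floor probability (an assumption).
    return math.log(TOY_BIGRAMS.get((prev, word), 1e-6))


if __name__ == "__main__":
    toy_options = [[("a", 0.6), ("b", 0.4)], [("c", 0.7), ("d", 0.3)]]
    score, words = monotone_decode(toy_options, toy_lm)
    print(words)
```

In this toy input, the candidate with the highest rule probability at the first position ("a") is not part of the best overall sequence. The language model score makes "b c" win, which is the kind of interaction between rule probabilities and language model score that the abstract refers to. Because the word order is fixed, the search only has to track the previous word, so its cost grows linearly with sentence length.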
Keywords
Hybrid machine translation, rule-based machine translation, statistical rules, syntactic-lexical rules, tree-adjoining grammar