A Transformer-Based Framework for Domain-Sensitive Amharic to English Machine Translation with Character-Aware Subword Encoding
DOI: https://doi.org/10.70454/JRIST.2025.10202
Keywords: Neural Machine Translation, Transformer, Amharic, Religious Texts, Subword Encoding, Character-Level Embedding, BLEU Score
Abstract
This paper proposes a domain-adapted neural machine translation (NMT) system for Amharic-to-English translation, addressing the challenges of low-resource translation in a morphologically rich language. We target the religious domain using the Tanzil corpus, a structured collection of Quranic verses with parallel Amharic and English translations that preserve coherence and semantic correspondence. To overcome the shortcomings of traditional word-level tokenization of Amharic, we apply character-level subword tokenization with the SentencePiece model, which handles rare and compound words more gracefully. Our model is a Transformer-based encoder-decoder with multi-head attention and feedforward layers, trained on the parallel corpus of Amharic and English verses. The model achieves a BLEU score of 59.03 on the test set, substantially exceeding classical RNN+attention baselines, which are known to perform poorly in low-resource settings. This strong score shows that a well-tuned baseline Transformer, combined with a domain-specific corpus and appropriate subword methods, can perform well on translation tasks for under-resourced languages. The research provides a foundational, reproducible, and linguistically informed framework for Amharic-English translation that can be extended to other Semitic and morphologically rich languages. Our results highlight the value of domain adaptation and subword-aware architectures in advancing NMT for low-resource language communities.
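
To make the tokenization step concrete, the following is a minimal sketch of training a character-aware SentencePiece model on the Amharic side of a parallel corpus. The file names, vocabulary size, and model type here are illustrative assumptions, not values reported in the paper.

# Minimal sketch of the subword step, assuming a plain-text file of
# Amharic sentences. Paths and hyperparameters are hypothetical.
import sentencepiece as spm

# Train a unigram SentencePiece model on the Amharic side of the corpus.
# character_coverage=1.0 keeps every Ethiopic character in the vocabulary,
# which matters for rare and compound Amharic word forms.
spm.SentencePieceTrainer.train(
    input="tanzil.am.txt",      # hypothetical path to Amharic sentences
    model_prefix="am_sp",
    vocab_size=8000,            # assumed size; tune for the corpus
    model_type="unigram",
    character_coverage=1.0,
)

sp = spm.SentencePieceProcessor(model_file="am_sp.model")
pieces = sp.encode("ሰላም ለዓለም", out_type=str)  # "peace to the world"
print(pieces)  # rare words decompose into subword pieces rather than <unk>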
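
The architecture itself can be sketched with PyTorch's built-in nn.Transformer. All hyperparameters below (model dimension, heads, layer counts) are assumptions for illustration; the abstract does not state the paper's exact configuration.

# Sketch of a Transformer encoder-decoder for NMT with learned positional
# embeddings. Hyperparameters are illustrative defaults, not the paper's.
import torch
import torch.nn as nn

class NMTTransformer(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8,
                 num_layers=6, dim_ff=2048, dropout=0.1, max_len=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # learned positions
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            dim_feedforward=dim_ff, dropout=dropout, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def embed(self, ids, table):
        pos = torch.arange(ids.size(1), device=ids.device)
        return table(ids) + self.pos_emb(pos)

    def forward(self, src_ids, tgt_ids):
        # Causal mask: each target position attends only to earlier ones.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(
            tgt_ids.size(1)).to(tgt_ids.device)
        h = self.transformer(self.embed(src_ids, self.src_emb),
                             self.embed(tgt_ids, self.tgt_emb),
                             tgt_mask=tgt_mask)
        return self.out(h)  # logits over the target subword vocabulary

model = NMTTransformer(src_vocab=8000, tgt_vocab=8000)
logits = model(torch.randint(0, 8000, (2, 10)),
               torch.randint(0, 8000, (2, 9)))
print(logits.shape)  # torch.Size([2, 9, 8000])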
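
Finally, the corpus-level BLEU evaluation can be reproduced with sacrebleu. The hypothesis and reference strings below are placeholders, not system output; the reported 59.03 BLEU was computed on the held-out Tanzil test split.

# Sketch of corpus-level BLEU scoring with sacrebleu on placeholder data.
import sacrebleu

hypotheses = ["in the name of allah the most gracious the most merciful"]
# references[i] is the i-th reference stream for all hypotheses.
references = [["in the name of allah , the beneficent , the merciful"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")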
License
Copyright (c) 2025 Ragini Rai (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International License, permitting all use, distribution, and reproduction in any medium, provided the work is properly cited.
