25 Jan 2024 | Gökçe Uludoğan and Zeynep Yirmibeşoğlu Balal and Furkan Akkurt and Melikşah Türker and Onur Güngör and Susan Üsküdarlı
TURNA is a Turkish encoder-decoder language model designed for both natural language understanding and generation. Developed for low-resource Turkish, it is pretrained on a diverse corpus of web data, scientific articles, theses, books, and parliamentary speeches, following the UL2 framework with a Mixture-of-Denoisers (MoD) pretraining objective. The model has 1.1B parameters and was trained on 43B tokens across these domains. It is publicly available at hf.co/boun-tabi-LMG/turna, and its code and data collection tools are also released for research and benchmarking in Turkish NLP.

TURNA was evaluated on 13 datasets spanning eight tasks, including paraphrasing, summarization, and named entity recognition, after task-specific fine-tuning. It outperforms several multilingual models in both understanding and generation tasks and is competitive with the monolingual BERTurk in understanding tasks. The encoder-decoder architecture, combined with the MoD objective, lets a single model handle both understanding and generation effectively, making TURNA a strong candidate for Turkish NLP tasks and a useful basis for further research and benchmarking.
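For readers unfamiliar with the MoD objective, the sketch below illustrates, in deliberately simplified form, how UL2-style pretraining mixes regular span corruption, extreme corruption, and prefix-LM denoising into a single recipe. The function names, sentinel format, and parameter values are illustrative assumptions, not the settings actually used to pretrain TURNA.

```python
# Simplified sketch of a UL2-style Mixture-of-Denoisers (MoD).
# All hyperparameters below are assumptions for illustration only.
import random

def span_corrupt(tokens, corruption_rate=0.15, mean_span_len=3):
    """T5-style span corruption: replace random spans with sentinel tokens.

    Returns (corrupted_input, target); the target reconstructs each dropped
    span, prefixed by its sentinel.
    """
    n_to_corrupt = max(1, int(len(tokens) * corruption_rate))
    corrupted, target = [], []
    i, sentinel = 0, 0
    while i < len(tokens):
        if n_to_corrupt > 0 and random.random() < corruption_rate:
            span_len = min(random.randint(1, 2 * mean_span_len - 1), n_to_corrupt)
            target += [f"<extra_id_{sentinel}>"] + tokens[i:i + span_len]
            corrupted.append(f"<extra_id_{sentinel}>")
            sentinel += 1
            n_to_corrupt -= span_len
            i += span_len
        else:
            corrupted.append(tokens[i])
            i += 1
    return corrupted, target

def mixture_of_denoisers(tokens):
    """Sample one denoiser per example: R (regular), X (extreme), or S (prefix LM)."""
    mode = random.choice(["R", "X", "S"])
    if mode == "R":   # moderate corruption, short spans
        return span_corrupt(tokens, corruption_rate=0.15, mean_span_len=3)
    if mode == "X":   # extreme corruption: longer spans and a higher rate
        return span_corrupt(tokens, corruption_rate=0.5, mean_span_len=12)
    # S-denoising: condition on a prefix, predict the continuation
    cut = len(tokens) // 2
    return tokens[:cut] + ["<extra_id_0>"], ["<extra_id_0>"] + tokens[cut:]

tokens = "TURNA Türkçe için eğitilmiş bir dil modelidir".split()
print(mixture_of_denoisers(tokens))
```

Mixing the three denoisers in this way is what lets one pretrained model serve both infilling-style understanding tasks and left-to-right generation.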
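Since the checkpoint is publicly released, it should be loadable through the standard Hugging Face transformers seq2seq API. The snippet below is a minimal sketch, assuming the repository id matches the URL above and that transformers and PyTorch are installed; for the fine-tuned downstream tasks, the expected input format may differ from raw text.

```python
# Minimal sketch: loading TURNA from the Hugging Face Hub.
# The exact repository id is assumed from hf.co/boun-tabi-LMG/turna.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "boun-tabi-LMG/TURNA"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encoder-decoder generation: encode Turkish text, decode a continuation.
text = "Türkiye'nin başkenti"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```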