AIRAVATA: INTRODUCING HINDI INSTRUCTION-TUNED LLM

26 Feb 2024 | Jay Gala, Thanmay Jayakumar, Jaavid Aktar Husain, Aswanth Kumar, Mohammed Safi Ur Rahman Khan, Diptesh Kanojia, Ratish Puduppully, Mitesh M. Khapra, Raj Dabre, Rudra Murthy, Anoop Kunchukuttan
The paper introduces "Airavata," an instruction-tuned Hindi language model developed by a collaborative team from various institutions. The model is built on the foundation of OpenHathi, a Hindi language model extended from Llama 2. Airavata is designed to address the lack of support for Indian languages in existing large language models (LLMs) and aims to foster research and innovation in this area.

Key contributions of the paper include:

1. **Instruction Tuning Datasets**: The authors create diverse instruction tuning datasets for Hindi, including translated English datasets and native Hindi datasets like wikiHow and Anudesh. These datasets are used to fine-tune Airavata, improving its performance on Hindi tasks.
2. **Fine-Tuning Method**: Airavata is fine-tuned using LoRA (Low-Rank Adaptation) to efficiently update only a small set of additional parameters, achieving better performance on Hindi NLU tasks compared to full fine-tuning.
3. **Evaluation Benchmarks**: The model is evaluated on standard NLU and NLG benchmarks, including native Hindi test sets and translated English test sets. Airavata demonstrates significant improvements over existing models, particularly in factual reasoning and content generation.
4. **Human Evaluation**: Airavata is evaluated through human annotations, which show that it excels at providing factual opinions and explanations but struggles with creative language usage. It outperforms other models like GPT-4 and ChatGPT in generating natural-sounding Hindi content.

The paper also discusses limitations, such as potential hallucinations and the need for more diverse and high-quality training data. Overall, Airavata represents a significant step forward in developing high-quality, open-source LLMs for Indian languages, with potential applications in various assistive tasks.
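To make the LoRA idea mentioned above concrete: instead of updating a full pretrained weight matrix W, LoRA learns two small low-rank factors B and A and adds their scaled product to W. The sketch below is a minimal, hypothetical NumPy illustration of that update rule, not the paper's actual training code; all dimensions and names are illustrative.

```python
import numpy as np

def lora_delta(B, A, alpha, r):
    """LoRA weight update: scaled low-rank product (alpha / r) * B @ A."""
    return (alpha / r) * (B @ A)

# Frozen pretrained weight (toy dimensions, not the real model's sizes)
d_out, d_in, r = 8, 8, 2
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))

# Trainable low-rank factors: B is (d_out, r), A is (r, d_in).
# B is zero-initialized so the adapted model starts identical to the base model.
B = np.zeros((d_out, r))
A = rng.standard_normal((r, d_in))

# Effective weight used at inference time
W_adapted = W + lora_delta(B, A, alpha=16, r=r)
```

Because only B and A (2 * d * r values per layer, with r much smaller than d) are trained while W stays frozen, far fewer parameters need updating than in full fine-tuning, which is why the summary describes LoRA as the efficient option here.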