3 Jan 2024 | Xianjun Yang, Junfeng Gao, Wenxin Xue, Erik Alexandersson
PLLaMa is an open-source large language model (LLM) built specifically for plant science, with the aim of improving how well a model understands and responds to plant-related information. It extends LLaMa-2 with a corpus of more than 1.5 million scholarly articles in plant science, which gives the model substantially deeper knowledge of plant and agricultural sciences. Initial tests on plant- and agriculture-specific datasets show significant improvements in understanding plant science topics. An international panel of experts, including plant scientists, agricultural engineers, and plant breeders, has been formed to verify the accuracy of PLLaMa's responses and support its reliability in the field. The model's checkpoints and source code are freely available to the scientific community to support further research and development. The paper also details the experimental configuration, benchmark results, and a zero-shot case study that illustrate the model's performance and its potential for specialized applications in plant science.