7 May 2024 | Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivedeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Granite Code models are a series of decoder-only code models for code generation tasks, trained on code written in 116 programming languages. The models range in size from 3 to 34 billion parameters, making them suitable for applications from complex application modernization to memory-constrained deployments. Evaluation on a comprehensive set of tasks shows that Granite Code models consistently reach state-of-the-art performance among open-source code LLMs. Optimized for enterprise software development workflows, they perform well across a range of coding tasks, making them versatile "all around" code models. All Granite Code models are released under the Apache 2.0 license for both research and commercial use.
The Granite Code models come in two main variants: Granite Code Base and Granite Code Instruct. The base models are trained from scratch with a two-phase strategy: 3 to 4 trillion tokens of code data in the first phase, followed by 500 billion tokens of a carefully designed mixture of high-quality code and natural-language data in the second phase. The instruct models are then fine-tuned on a combination of filtered CommitPack data, natural-language instruction datasets, and open-source math datasets.
The Granite Code models are evaluated on a comprehensive set of benchmarks, including HumanEvalPack, MBPP(+), and RepoBench, among others. The results show that Granite Code models outperform other open-source code models on tasks such as code generation, code fixing, and code explanation, and that they also perform strongly on mathematical reasoning.
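Benchmarks such as HumanEvalPack and MBPP(+) typically report pass@k, the probability that at least one of k sampled completions passes the unit tests. A minimal sketch of the standard unbiased estimator (popularized by the Codex/HumanEval evaluation; shown here for illustration, not taken from this paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn per problem, c of them correct.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples for one problem, 50 of which pass the tests
print(pass_at_k(200, 50, 1))  # 0.25 — pass@1 equals the raw pass rate
```

Per-problem estimates are then averaged over the benchmark to give the reported score.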
The training corpus is a diverse collection of code, largely sourced from GitHub, filtered for quality and scrubbed of harmful content, and complemented with high-quality natural-language data to strengthen language understanding and mathematical reasoning. The models are evaluated on a variety of tasks, including code generation, code explanation, code fixing, code editing, code translation, and mathematical reasoning.
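Code-data pipelines of this kind commonly combine exact deduplication with simple quality heuristics. A toy sketch of both steps (illustrative only — not the paper's actual pipeline, whose filters are more elaborate):

```python
import hashlib

def exact_dedup(files: list[str]) -> list[str]:
    """Keep only the first occurrence of each distinct file content."""
    seen, kept = set(), []
    for text in files:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(text)
    return kept

def passes_quality(text: str, max_line_len: int = 1000) -> bool:
    """Toy heuristic: drop files with extremely long lines (often minified or generated)."""
    return all(len(line) <= max_line_len for line in text.splitlines())

corpus = ["print('hi')", "print('hi')", "x = 1", "y" * 5000]
clean = [f for f in exact_dedup(corpus) if passes_quality(f)]
print(len(clean))  # 2 — one duplicate and one over-long file removed
```

Real pipelines typically add fuzzy (near-duplicate) deduplication and PII/malware filtering on top of this.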
The Granite Code models demonstrate strong performance across this full range of tasks. They are also evaluated on the Berkeley Function-Calling Leaderboard and the ReCode robustness benchmark, where the larger models in the family perform better, indicating that function-calling ability and robustness improve with scale.
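Function-calling evaluations score whether a model emits a well-formed call to a declared function with the required arguments. A minimal sketch of such a check, assuming a JSON call format (illustrative only — not the leaderboard's actual harness, and the tool name is hypothetical):

```python
import json

# Hypothetical declared tool and its required argument names
TOOLS = {"get_weather": {"required": {"city"}}}

def parse_call(model_output: str):
    """Validate a call of the form {"name": ..., "arguments": {...}} against TOOLS.

    Returns (name, arguments) on success, None on any malformed or invalid call.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return None
    name, args = call.get("name"), call.get("arguments", {})
    if name not in TOOLS or not TOOLS[name]["required"] <= set(args):
        return None  # unknown function or missing required arguments
    return name, args

print(parse_call('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
```

A harness built this way can score both syntactic validity (parseable JSON) and semantic validity (known function, complete arguments) separately.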
Overall, the Granite Code models are a versatile family of code models that perform well across a wide range of coding tasks, making them suitable for enterprise software development. The models are released under an Apache 2.0 license for both research and commercial use.