Stable Code Technical Report
Authors:
Nikhil Pinnaparaju,
Reshinth Adithyan,
Duy Phung,
Jonathan Tow,
James Baicoianu,
Ashish Datta,
Maksym Zhuravinskyi,
Dakota Mahan,
Marco Bellagente,
Carlos Riquelme,
Nathan Cooper
Abstract:
We introduce Stable Code, the first in our new generation of code language models, which serves as a general-purpose base code language model targeting code completion, reasoning, math, and other software engineering-based tasks. Additionally, we introduce an instruction variant named Stable Code Instruct that allows conversing with the model through a natural chat interface for question-answering and instruction-based tasks. In this technical report, we detail the data and training procedure leading to both models. Their weights are available via Hugging Face for anyone to download and use at https://huggingface.co/stabilityai/stable-code-3b and https://huggingface.co/stabilityai/stable-code-instruct-3b. This report contains thorough evaluations of the models, including multilingual programming benchmarks and the MT benchmark focusing on multi-turn dialogues. At the time of its release, Stable Code is the state-of-the-art open model under 3B parameters and even performs comparably to larger models with 7 billion and 15 billion parameters on the popular Multi-PL benchmark. Stable Code Instruct also exhibits state-of-the-art performance on the MT-Bench coding tasks and on Multi-PL completion compared to other instruction-tuned models. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open-source several quantized checkpoints and provide their performance metrics compared to the original model.
Submitted 1 April, 2024;
originally announced April 2024.
Stable LM 2 1.6B Technical Report
Authors:
Marco Bellagente,
Jonathan Tow,
Dakota Mahan,
Duy Phung,
Maksym Zhuravinskyi,
Reshinth Adithyan,
James Baicoianu,
Ben Brooks,
Nathan Cooper,
Ashish Datta,
Meng Lee,
Emad Mostaque,
Michael Pieler,
Nikhil Pinnaparaju,
Paulo Rocha,
Harry Saini,
Hannah Teufel,
Niccolo Zanichelli,
Carlos Riquelme
Abstract:
We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including zero- and few-shot benchmarks, multilingual benchmarks, and the MT benchmark focusing on multi-turn dialogues. At the time of publishing this report, StableLM 2 1.6B was the state-of-the-art open model under 2B parameters by a significant margin. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open-source several quantized checkpoints and provide their performance metrics compared to the original model.
Submitted 27 February, 2024;
originally announced February 2024.