Meet LLM360: the first fully open-source and transparent large language models (LLMs)

https://arxiv.org/abs/2312.06550

Open-source large language models (LLMs) such as LLaMA, Falcon, and Mistral offer a range of options for AI practitioners and researchers. However, most of these releases include only selected components, such as final model weights or inference scripts, and their technical documentation often narrows its focus to high-level design choices and aggregate metrics. This limits progress in the field by obscuring how LLMs are actually trained, forcing teams to repeatedly rediscover many aspects of the training procedure on their own.

A team of researchers from Petuum, MBZUAI, USC, CMU, UIUC, and UCSD introduced LLM360 to support open and collaborative AI research by making the end-to-end LLM training process transparent and replicable by everyone. LLM360 is an initiative of fully open source LLMs that calls for all training code, data, model checkpoints, and intermediate results to be made available to the community.

The closest project to LLM360 is Pythia, which also aims at full reproducibility of LLM training. EleutherAI models such as GPT-J and GPT-NeoX have been released with training code, datasets, and intermediate model checkpoints, demonstrating the value of open-source training code. INCITE, MPT, and OpenLLaMA have been released with training code and training datasets, and RedPajama has also released intermediate model checkpoints.

LLM360 releases two 7B-parameter LLMs, AMBER and CRYSTALCODER, together with their training code, data, intermediate checkpoints, and analyses. The study reviews details of the pre-training dataset, including data preprocessing, formatting, data mixing ratios, and the architectural details of the models.
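Data mixing ratios determine how often each source (e.g., web text, code, books) is sampled when assembling pre-training batches. The following minimal sketch illustrates the idea with hypothetical source names and ratios, which are not the actual numbers used for AMBER or CRYSTALCODER:

```python
import random

# Hypothetical mixing ratios for illustration only (not from the paper).
MIX_RATIOS = {"web": 0.67, "code": 0.22, "books": 0.11}

def sample_source(ratios, rng):
    """Pick a data source with probability proportional to its mixing ratio."""
    r = rng.random() * sum(ratios.values())
    acc = 0.0
    for name, weight in ratios.items():
        acc += weight
        if r < acc:
            return name
    return name  # fallback for floating-point edge cases

rng = random.Random(0)
counts = {name: 0 for name in MIX_RATIOS}
for _ in range(10_000):
    counts[sample_source(MIX_RATIOS, rng)] += 1

# Empirical sampling frequencies should approximate the target ratios.
print({name: round(n / 10_000, 2) for name, n in counts.items()})
```

Releasing these ratios alongside the data, as LLM360 does, lets others reconstruct the exact composition of the training stream rather than guessing at it.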

The paper adopts the memorization score introduced in prior work and releases the corresponding metrics, datasets, and checkpoints so that researchers can easily match them up. The study also underscores the importance of disclosing the data on which an LLM was trained, along with details of data filtering, processing, and training order, in order to assess the risks associated with the model.
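A memorization score of this kind can be computed by prompting the model with a prefix from the training data and measuring how closely its continuation matches the true continuation. The toy function below illustrates one simple token-level variant; the exact definition used in the LLM360 paper may differ, and the token IDs here are made up:

```python
def memorization_score(generated, reference):
    """Fraction of positions where the model's continuation matches the
    true continuation token for token. An illustrative metric only."""
    if not reference:
        return 0.0
    matches = sum(g == r for g, r in zip(generated, reference))
    return matches / len(reference)

# Hypothetical token-ID sequences: model output vs. ground-truth continuation.
gen = [5, 9, 2, 7, 7, 1]
ref = [5, 9, 2, 4, 7, 1]
print(memorization_score(gen, ref))  # 5 of 6 tokens match -> 0.833...
```

Because LLM360 publishes intermediate checkpoints, such a score can be tracked across training to see when specific training examples become memorized.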

The paper presents benchmark results on four datasets, namely ARC, HellaSwag, MMLU, and TruthfulQA, tracking the model’s performance over the course of pre-training. The HellaSwag and ARC scores increase monotonically during pre-training, while the TruthfulQA score decreases. The MMLU score initially decreases and then begins to grow. AMBER’s performance is relatively competitive on benchmarks such as MMLU but lags behind on ARC. Finetuned AMBER variants show strong performance compared with other similar models.

In conclusion, LLM360 is a comprehensive, fully open-source LLM initiative that promotes transparency in the open-source LLM pre-training community. The study released two 7B LLMs, AMBER and CRYSTALCODER, together with training code, data, intermediate model checkpoints, and analyses. It emphasizes the importance of open-sourcing LLMs from all angles, including the release of checkpoints, datasets, and evaluation results, to enable comprehensive analysis and reproducibility.


Check out the paper. All credit for this research goes to the researchers of this project.


Sana Hassan, a consulting intern at Marktechpost and a dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.



