Library · Stella Biderman

trlX

A library for distributed and performant training of language models with Reinforcement Learning from Human Feedback (RLHF), created by the CarperAI team.

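For a feel of the workflow, here is a minimal sketch along the lines of the example in the trlX README: trlx.train takes a base model name and a reward function that scores generated samples. The reward function below is a toy placeholder standing in for a real preference or reward model, and exact argument names may differ between trlX versions.

    import trlx

    # Fine-tune a base model so that its generations maximize a scalar reward.
    # Here the "reward" simply counts occurrences of a word in each sample;
    # in real RLHF this would be a learned reward model's score.
    trainer = trlx.train(
        "gpt2",
        reward_fn=lambda samples, **kwargs: [sample.count("cats") for sample in samples],
    )
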
Dataset · Stella Biderman

Proof-Pile-2

A 55 billion token dataset of mathematical and scientific documents, created for training the Llemma models.

Model · Stella Biderman

Pythia

A suite of 16 models with 154 partially trained checkpoints designed to enable controlled scientific research on openly accessible and transparently trained large language models.

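The partially trained checkpoints are published as revisions of each model on the Hugging Face Hub, so any training step can be loaded directly. A minimal sketch, assuming the transformers library and using one of the smaller models as an example:

    from transformers import GPTNeoXForCausalLM, AutoTokenizer

    # "step3000" is one of the intermediate checkpoints released for each
    # Pythia model; omit the revision to get the fully trained weights.
    model = GPTNeoXForCausalLM.from_pretrained(
        "EleutherAI/pythia-70m-deduped", revision="step3000"
    )
    tokenizer = AutoTokenizer.from_pretrained(
        "EleutherAI/pythia-70m-deduped", revision="step3000"
    )

    inputs = tokenizer("Hello, I am", return_tensors="pt")
    tokens = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(tokens[0]))
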
Model · Stella Biderman

Polyglot-Ko

A series of Korean autoregressive language models made by the EleutherAI polyglot team. To date, we have trained and released 1.3B, 3.8B, and 5.8B parameter models.

Model, Library · Stella Biderman

RWKV

RWKV is an RNN with transformer-level performance at some language modeling tasks. Unlike other RNNs, it can be scaled to tens of billions of parameters efficiently.

Model · Stella Biderman

GPT-NeoX-20B

An open source English autoregressive language model trained on the Pile. At the time of its release, it was the largest publicly available language model in the world.

Library · Stella Biderman

GPT-NeoX

A library for efficiently training large language models with tens of billions of parameters in a multi-machine distributed context. This library is currently maintained by EleutherAI.

Model · Anel Islamovic

CARP

A CLIP-like model trained on (text, critique) pairs with the goal of learning the relationships between passages of text and natural language feedback on those passages.

Model · Stella Biderman

GPT-J

A six billion parameter open source English autoregressive language model trained on the Pile. At the time of its release, it was the largest publicly available GPT-3-style language model in the world.

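The weights are available on the Hugging Face Hub, so a basic generation run takes only a few lines with the transformers library. A minimal sketch (prompt and generation settings are illustrative):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Download the 6B checkpoint released by EleutherAI (roughly 24 GB in fp32).
    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

    inputs = tokenizer("EleutherAI is", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=30, do_sample=True)
    print(tokenizer.decode(output[0]))
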
Library · Stella Biderman

LM Eval Harness

Our library for reproducible and transparent evaluation of LLMs.

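The harness can be driven from the command line or from Python. A minimal Python sketch, assuming a recent (v0.4-style) release that exposes simple_evaluate; the model and task names are examples:

    import lm_eval

    # Score a Hugging Face causal LM on a single benchmark task.
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=EleutherAI/pythia-160m",
        tasks=["lambada_openai"],
    )
    print(results["results"])
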
Library · Curtis Huebner

Mesh Transformer JAX

A JAX and TPU-based library developed by Ben Wang. The library has been used to train GPT-J.

https://github.com/kingoflolz/mesh-transformer-jax

Model · Stella Biderman

GPT-Neo

A set of three decoder-only language models with 125M, 1.3B, and 2.7B parameters, trained on the Pile. GPT-Neo was our first attempt to produce GPT-3-like language models.

Library · Curtis Huebner

GPT-Neo Library

A library for training language models written in Mesh TensorFlow. This library was used to train the GPT-Neo models, but has since been retired and is no longer maintained. We currently recommend the GPT-NeoX library for LLM training.

https://github.com/EleutherAI/gpt-neo

Dataset · Stella Biderman

The Pile

A large-scale corpus for training language models, curated from 22 diverse, high-quality smaller datasets. The Pile is publicly available and freely downloadable, and has been used by a number of organizations to train large language models.

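The released shards are zstandard-compressed JSON Lines files, one document per line with a text field and a meta field recording which component dataset it came from. A minimal sketch of iterating over one locally downloaded shard (the file path is hypothetical):

    import io
    import json
    import zstandard as zstd

    # Stream-decompress a shard and read one document at a time.
    with open("pile/train/00.jsonl.zst", "rb") as fh:  # hypothetical local path
        stream = io.TextIOWrapper(
            zstd.ZstdDecompressor().stream_reader(fh), encoding="utf-8"
        )
        for line in stream:
            doc = json.loads(line)
            print(doc["meta"]["pile_set_name"], len(doc["text"]))
            break
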
Dataset · Stella Biderman

OpenWebText2

OpenWebText2 is an enhanced version of the original OpenWebTextCorpus, covering all Reddit submissions from 2005 up until April 2020. It was developed primarily to be included in the Pile.
