Model · Stella Biderman

Pythia

A suite of 16 models with 154 partially trained checkpoints designed to enable controlled scientific research on openly accessible and transparently trained large language models.

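A minimal sketch of how the partially trained checkpoints can be used, assuming the Hugging Face `transformers` library and the `EleutherAI/pythia-70m` Hub repository, whose intermediate checkpoints are published as revisions such as `step3000`:

```python
# Minimal sketch: loading an intermediate Pythia checkpoint for analysis.
# Assumes the `transformers` library and the EleutherAI/pythia-70m Hub repo,
# whose partially trained checkpoints are published as git revisions ("step3000", ...).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-70m"
revision = "step3000"  # one of the intermediate training checkpoints

tokenizer = AutoTokenizer.from_pretrained(model_name, revision=revision)
model = AutoModelForCausalLM.from_pretrained(model_name, revision=revision)

# Example: inspect how the partially trained model scores a prompt.
inputs = tokenizer("The Pile is a dataset for", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, labels=inputs["input_ids"])
print(f"loss at {revision}: {outputs.loss.item():.3f}")
```

Swapping the revision string lets the same model be compared at different points in training.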
Model · Stella Biderman

Polyglot-Ko

A series of Korean autoregressive language models made by the EleutherAI polyglot team. We currently have trained and released 1.3B, 3.8B, and 5.8B parameter models.

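A minimal generation sketch, assuming the released weights are hosted under `EleutherAI/polyglot-ko-1.3b` on the Hugging Face Hub and that `transformers` is installed:

```python
# Minimal sketch: Korean text generation with Polyglot-Ko.
# Assumes the EleutherAI/polyglot-ko-1.3b repository on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/polyglot-ko-1.3b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/polyglot-ko-1.3b")

prompt = "한국어 언어 모델은"  # "Korean language models are ..."
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```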
Model, Library · Stella Biderman

RWKV

RWKV is an RNN with transformer-level performance at some language modeling tasks. Unlike other RNNs, it can be scaled to tens of billions of parameters efficiently.

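Recent versions of the Hugging Face `transformers` library ship an RWKV implementation, so a converted checkpoint can be run like any other causal language model. The sketch below is assumption-laden: the model ID `RWKV/rwkv-4-169m-pile` stands in for whichever converted RWKV checkpoint is actually available on the Hub.

```python
# Hedged sketch: running a converted RWKV checkpoint through transformers.
# The model ID "RWKV/rwkv-4-169m-pile" is an assumption; substitute whichever
# converted RWKV checkpoint you actually have access to.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/rwkv-4-169m-pile"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("RWKV is an RNN that", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```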
Model, Library · Stella Biderman

OpenFold

A trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold2.

Model · Stella Biderman

GPT-NeoX-20B

An open source English autoregressive language model trained on the Pile. At the time of its release, it was the largest publicly available language model in the world.

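A minimal loading sketch, assuming the weights hosted at `EleutherAI/gpt-neox-20b` on the Hugging Face Hub; half precision and `device_map="auto"` are used because the full-precision checkpoint is roughly 40 GB:

```python
# Minimal sketch: loading GPT-NeoX-20B in half precision.
# Assumes the EleutherAI/gpt-neox-20b Hub repo, plus enough GPU memory
# (and the `accelerate` package, which device_map="auto" requires).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",
    torch_dtype=torch.float16,  # ~40 GB in fp32, ~20 GB in fp16
    device_map="auto",          # spreads layers across available devices
)

inputs = tokenizer("EleutherAI trained GPT-NeoX-20B on", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```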
Model · Anel Islamovic

CARP

A CLIP-like model trained on (text, critique) pairs with the goal of learning the relationships between passages of text and natural language feedback on those passages.

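The released CARP code is not reproduced here, but the contrastive setup the description refers to can be sketched: two text encoders map passages and critiques into a shared embedding space and are trained with a CLIP-style symmetric cross-entropy loss. The backbone, temperature, and example pairs below are placeholders, not CARP's actual choices.

```python
# Conceptual sketch of a CLIP-like (passage, critique) contrastive objective.
# Not the released CARP implementation; the encoder and hyperparameters are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

encoder_name = "roberta-base"  # placeholder backbone
tokenizer = AutoTokenizer.from_pretrained(encoder_name)
passage_encoder = AutoModel.from_pretrained(encoder_name)
critique_encoder = AutoModel.from_pretrained(encoder_name)

def embed(encoder, texts):
    """Mean-pool the final hidden states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state   # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)  # (B, T, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)  # (B, H)
    return F.normalize(pooled, dim=-1)

passages = ["The knight rode silently into the fog.", "It was a dark and stormy night."]
critiques = ["Strong atmosphere, but we learn nothing about the knight.",
             "This opening line is a well-worn cliche."]

z_text = embed(passage_encoder, passages)    # (B, H)
z_crit = embed(critique_encoder, critiques)  # (B, H)

# CLIP-style symmetric InfoNCE: matched (passage, critique) pairs lie on the diagonal.
logits = z_text @ z_crit.T / 0.07  # temperature is a placeholder value
targets = torch.arange(len(passages))
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
print(f"contrastive loss on this toy batch: {loss.item():.3f}")
```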
Model · Stella Biderman

GPT-J

A six billion parameter open source English autoregressive language model trained on the Pile. At the time of its release it was the largest publicly available GPT-3-style language model in the world.

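A minimal generation sketch, assuming the weights hosted at `EleutherAI/gpt-j-6B` on the Hugging Face Hub and the `transformers` library:

```python
# Minimal sketch: text generation with GPT-J-6B via transformers.
# Assumes the EleutherAI/gpt-j-6B Hub repo; fp16 roughly halves the ~24 GB fp32 footprint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=dtype).to(device)

inputs = tokenizer("The Pile is an 800GB dataset of", return_tensors="pt").to(device)
output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```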
Model · Stella Biderman

VQGAN-CLIP

VQGAN-CLIP is a methodology for using multimodal embedding models such as CLIP to guide text-to-image generative algorithms without additional training. While the results tend to be worse than those of pretrained text-to-image generative models, they are orders of magnitude cheaper to produce and can often be assembled out of pre-existing, independently valuable models. Our core approach has been adapted to a variety of domains, including text-to-3D and audio-to-image synthesis, and has been used to develop novel synthetic materials.

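The core loop is CLIP-guided optimization: embed the prompt once, then repeatedly render an image, embed it with CLIP, and take gradient steps that increase the image-text similarity. The schematic below illustrates that loop with OpenAI's `clip` package; for brevity it optimizes a raw pixel tensor where VQGAN-CLIP would optimize a VQGAN latent code, so treat it as a sketch of the idea rather than the actual pipeline.

```python
# Schematic of CLIP-guided image optimization (the idea behind VQGAN-CLIP).
# Assumes OpenAI's `clip` package (pip install git+https://github.com/openai/CLIP).
# A raw pixel tensor stands in for the VQGAN latent; in VQGAN-CLIP the optimized
# parameters would be a latent code decoded by a pretrained VQGAN instead.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# CLIP's expected input normalization.
mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

# Embed the prompt once; it stays fixed during optimization.
text = clip.tokenize(["a watercolor painting of a lighthouse"]).to(device)
with torch.no_grad():
    text_features = model.encode_text(text).float()
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Parameters being optimized: here raw pixels, in VQGAN-CLIP a VQGAN latent.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    pixels = image.clamp(0, 1)
    image_features = model.encode_image((pixels - mean) / std).float()
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    # Maximize cosine similarity between the rendered image and the prompt.
    loss = -(image_features * text_features).sum()
    loss.backward()
    optimizer.step()
    if step % 50 == 0:
        print(f"step {step}: similarity {-loss.item():.3f}")
```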
Model · Stella Biderman

GPT-Neo

A series of decoder-only language models trained on the Pile. It was our first attempt to produce GPT-3-like language models and comes in 125M, 1.3B, and 2.7B parameter variants.

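A minimal sketch using the `transformers` text-generation pipeline with the smallest variant, assuming the `EleutherAI/gpt-neo-125M` Hub repository:

```python
# Minimal sketch: text generation with the smallest GPT-Neo variant.
# Assumes the EleutherAI/gpt-neo-125M Hub repo; swap in gpt-neo-1.3B or
# gpt-neo-2.7B for the larger models.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")
result = generator("EleutherAI began as a Discord server for", max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```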