Language Modeling
The ability of a computer to understand, interpret, and generate human language is at the heart of what we do at EleutherAI.
Current Projects
Releases
A 55 billion token dataset of mathematical and scientific documents, created for training the Llemma models.
A 14.7B token dataset of high-quality English mathematical text.
A series of Korean autoregressive language models built by the EleutherAI Polyglot team. To date we have trained and released 1.3B, 3.8B, and 5.8B parameter models.
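The released checkpoints can be loaded with the Hugging Face transformers library. The sketch below assumes the 1.3B checkpoint is hosted on the Hub under the ID EleutherAI/polyglot-ko-1.3b; the other sizes follow the same pattern.

```python
# Minimal sketch: loading a Polyglot-Ko checkpoint from the Hugging Face Hub.
# The repository ID below is an assumption; substitute the model size you want.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/polyglot-ko-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short Korean continuation from a prompt.
inputs = tokenizer("안녕하세요, 저는", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```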
Papers
As frontier AI systems are pretrained on web-scale data, test set contamination has become a critical concern for accurately assessing their capabilities. While research has thoroughly investigated the impact of test set contamination on discriminative evaluations like multiple-choice question-answering, comparatively little research has studied the impact of test set contamination on generative evaluations. In this work, we quantitatively assess the effect of test set contamination on generative evaluations through the language model lifecycle. We pretrain language models on mixtures of web data and the MATH benchmark, sweeping model sizes and number of test set replicas contaminating the pretraining corpus; performance improves with contamination and model size. Using scaling laws, we make a surprising discovery: including even a single test set replica enables models to achieve lower loss than the irreducible error of training on the uncontaminated corpus. We then study further training: overtraining with fresh data reduces the effects of contamination, whereas supervised finetuning on the training set can either increase or decrease performance on test data, depending on the amount of pretraining contamination. Finally, at inference, we identify factors that modulate memorization: high sampling temperatures mitigate contamination effects, and longer solutions are exponentially more difficult to memorize than shorter ones, presenting a contrast with discriminative evaluations, where solutions are only a few tokens in length. By characterizing how generation and memorization interact, we highlight a new layer of complexity for trustworthy evaluation of AI systems.
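As a concrete illustration of the setup described above, the sketch below builds a pretraining mixture in which k verbatim replicas of a held-out test set are injected into a web corpus, the knob the abstract sweeps alongside model size. The helper function, document lists, and sweep values are hypothetical placeholders, not code from the paper.

```python
import random

def build_contaminated_corpus(web_docs, test_docs, num_replicas, seed=0):
    """Mix a web corpus with num_replicas verbatim copies of the test set,
    then shuffle so contaminated documents are interleaved with web text."""
    corpus = list(web_docs) + list(test_docs) * num_replicas
    random.Random(seed).shuffle(corpus)
    return corpus

# Toy stand-ins for a web-scale corpus and a MATH-style test set.
web_docs = [f"web document {i}" for i in range(1000)]
test_docs = ["Problem: ... Solution: ...", "Problem: ... Solution: ..."]

# Sweep the number of test-set replicas, as in the experiments above:
# 0 replicas is the uncontaminated baseline; per the abstract, even a single
# replica is enough to push loss below the clean irreducible error.
for k in [0, 1, 2, 4, 8]:
    corpus = build_contaminated_corpus(web_docs, test_docs, num_replicas=k)
    # ... pretrain a model of each size on `corpus`, then evaluate generation on the test set
```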
Laura Ruis, Akbir Khan, Stella Biderman, Sara Hooker, Tim Rocktäschel, and Edward Grefenstette. "Large language models are not zero-shot communicators." arXiv preprint arXiv:2210.14986 (2022).
Reinforcement learning from human feedback (RLHF) utilizes human feedback to better align large language models with human preferences via online optimization against a learned reward model. Current RLHF paradigms rely on Proximal Policy Optimization (PPO), which quickly becomes challenging to implement and scale to large architectures. To address this difficulty we present the trlX library, a feature-complete open-source framework for RLHF fine-tuning of models up to and exceeding 70 billion parameters. We implement support for multiple types of distributed training, including distributed data parallel and model-sharded training, as well as tensor, sequential, and pipeline parallelism.
To increase the accessibility of RLHF to researchers, we implement compute- and memory-saving features that give trlX the flexibility to support users with a wide range of compute resources. This includes offline RL methods like Implicit Language Q Learning (ILQL), low-rank adapters, and the Hydra architecture. We find that offline fine-tuning offers competitive performance relative to online algorithms while being easier to implement, train, and scale. To evaluate our framework we train RLHF models on two separate, well-known tasks using publicly available human preference data. Models trained with trlX achieve preference win-rates over baselines at rates comparable to the original works.
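A minimal usage sketch follows, based on the `trlx.train` entry point with a `reward_fn` callback shown in the library's public examples; exact keyword names and defaults vary between trlX versions, and the reward function and prompts below are toy placeholders, so treat this as an assumption-laden outline rather than canonical API documentation.

```python
# Sketch of online RLHF fine-tuning with trlX. The reward function is a toy
# heuristic standing in for a learned reward model; prompts are placeholders.
import trlx

def reward_fn(samples, **kwargs):
    # Score each generated sample; a real setup would call a reward model
    # trained on human preference data instead of this length heuristic.
    return [float(len(sample.split())) / 100.0 for sample in samples]

trainer = trlx.train(
    "gpt2",                                          # base model to fine-tune
    reward_fn=reward_fn,                             # called on batches of generations
    prompts=["Summarize: the quick brown fox ..."] * 64,
)
```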
Azerbayev, Piotrowski, Schoelkopf, Ayers, Radev, and Avigad. "ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics." arXiv preprint arXiv:2302.12433 (2023).
Allal, Li, Kocetkov, et al. "SantaCoder: don't reach for the stars!." arXiv preprint arXiv:2301.03988 (2023).
Yong, Schoelkopf, Muennighoff, et al. "BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting." arXiv preprint arXiv:2212.09535 (2022).
Jason Phang, Yi Mao, Pengcheng He, and Weizhu Chen. "HyperTuning: Toward Adapting Large Language Models without Back-propagation." arXiv preprint arXiv:2211.12485 (2022).
Le Scao, et al. (incl. Tow, Biderman, Ammanamanchi, Gao, Sutawika, Teehan). "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model." arXiv preprint arXiv:2211.05100 (2022).

