EleutherAI
Dataset Stella Biderman 16/10/2023

Proof-Pile-2

A 55-billion-token dataset of mathematical and scientific documents, created for training the Llemma models.

Dataset Stella Biderman 10/10/2023

OpenWebMath

A 14.7B-token dataset of high-quality English mathematical text.


contact@eleuther.ai

Copyright EleutherAI 2023