Empowering Open-Source Artificial Intelligence Research
Explore our research
In order to study what makes large language models so capable, we must investigate trends throughout training and across different model sizes. We developed our open-access Pythia models to serve as a controlled environment for such interpretability research.
Language models like ChatGPT and Bing Chat often confidently say false things. As models get smarter, humans won't always be able to independently check if a model's claims are true or false. We aim to circumvent this issue by directly eliciting latent knowledge (ELK) inside the model’s activations.
A crucial component of LLM and NLP is the datasets used to train models. Collecting high-quality data is important for creating performant LLMs. Recent work has shown that scaling up data is just as crucial as scaling up model parameter count for efficient use of compute to train LLMs.
EleutherAI has trained and released several LLMs, and the codebases used to train them. Several of these LLMs were the largest or most capable available at the time and have been widely used since in open-source research applications.
Quantifying the performance of large language models is crucial to evaluating new techniques and validating new approaches so that different model releases can be compared objectively. LLMs are generally evaluated on several benchmark datasets and given scores, which serve numeric quantities to compare across models.
Alignment-MineTest is a research project that uses the open source Minetest voxel engine as a platform for studying AI alignment. Alignment in this context refers to the process of ensuring that an artificial intelligence system behaves in a manner that is consistent with human values and goals.
The T5 language model has unambiguously been a major advance for work with large language models. Encoder-decoder models have shown substantial improvements over decoder-only models in contexts, including a zero-shot performance with MTF and image synthesis and comparatively less memorization.
The Polyglot Project focuses on extending the benefits of large language models to cultures and contexts not well suited by the current state of affairs, as well as studying the best practices for doing so. This work includes training LLMs in languages other than English and Chinese, improving tools for non-English data documentation, curation, and analysis, culturally-aware research on ethics and bias in non-English LLMs, and more.