Training LLMs

The overwhelming majority of EleutherAI’s resources have gone towards training LLMs. EleutherAI has trained and released several LLMs, along with the codebases used to train them. Several of these models were the largest or most capable available at the time of their release and have since been widely used in open-source research.

Libraries we currently recommend for out-of-the-box use include:

  • Mesh Transformer JAX, a lightweight TPU training framework developed by Ben Wang

  • GPT-NeoX, a PyTorch library built on Megatron-DeepSpeed that supports training models at GPT-3 scale on multiple hosts within a single computing cluster (a sketch of loading a model trained with it follows this list)

  • trlX, a PyTorch library for finetuning large language models with Reinforcement Learning from Human Feedback (RLHF)

  • RWKV, a PyTorch library for training RNNs that achieve transformer-level LLM performance.

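Models trained with these codebases are distributed in standard formats. As a minimal sketch (assuming the Hugging Face transformers and torch packages are installed, and using the publicly released EleutherAI/gpt-neox-20b checkpoint, which requires substantial memory), loading and sampling from a GPT-NeoX-trained model might look like:

    # Minimal sketch: load a released EleutherAI checkpoint with Hugging Face
    # transformers. Assumes `pip install transformers torch`; the 20B checkpoint
    # is large, so substitute a smaller released model if resources are limited.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "EleutherAI/gpt-neox-20b"  # trained with the GPT-NeoX library
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Generate a short continuation from a prompt.
    inputs = tokenizer("EleutherAI develops open-source language models", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
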
