GPT-NeoX is an implementation of 3D-parallel (data-, pipeline-, and tensor-parallel) GPT-3-like autoregressive language models for distributed GPUs, built on Megatron-LM and DeepSpeed.
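
To make the "3D" concrete: the three parallelism degrees multiply to give the total number of GPUs in the cluster. The minimal Python sketch below is purely illustrative; the function and parameter names are ours for exposition, not GPT-NeoX's actual API or configuration keys.

```python
# Illustrative sketch of 3D parallelism (not GPT-NeoX's actual code):
# a GPU cluster is factorized along three axes, and the degrees multiply.

def world_size_for(data_parallel: int, pipeline_parallel: int, tensor_parallel: int) -> int:
    """Total GPUs required for a given 3D-parallel layout."""
    return data_parallel * pipeline_parallel * tensor_parallel

# Example: 96 GPUs could be arranged as 12 data-parallel replicas,
# each replica a 4-stage pipeline whose stages are 2-way tensor-parallel.
assert world_size_for(data_parallel=12, pipeline_parallel=4, tensor_parallel=2) == 96
```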

GPT-NeoX was used to train GPT-NeoX-20B, a 20-billion-parameter language model, in collaboration with CoreWeave. Announced on February 2, 2022 and released on The Eye alongside a preliminary technical report one week later, it was the largest dense autoregressive language model freely available to the public at the time of its release.

In-depth information on GPT-NeoX-20B can be found in the associated technical report on arXiv.