AlphaFold2 Replication

In Progress

A replication of the AlphaFold2 architecture (a highly accurate protein structure prediction model) under a free license.

CLASP - Contrastive Language-Amino Acid Sequence Pretraining

In Progress

A CLIP-like model for amino acid sequence prediction.

Eval Harness

In Progress

GitHub repo: https://github.com/EleutherAI/lm-evaluation-harness

GPT-Neo

Completed

An implementation of model- and data-parallel GPT-2- and GPT-3-like models using Mesh TensorFlow.

GPT-NeoX

In Progress

An implementation of 3D-parallel GPT-3-like models for distributed GPUs.

OpenWebText2

Completed

An enhanced version of the OpenWebText corpus.

The Pile

Completed

A large, diverse, open-source language modelling dataset.