EleutherAI

Empowering Open-Source Artificial Intelligence Research

Explore our research

Projects

Interpreting Across Time

How do properties of models emerge and evolve over the course of training?

Eliciting Latent Knowledge

As models get smarter, humans won't always be able to independently check if a model's claims are true or false. We aim to circumvent this issue by directly eliciting latent knowledge (ELK) inside the model’s activations.

Training LLMs

EleutherAI has trained and released many powerful open source LLMs.

Evaluating LLMs

Evaluating advanced AI models in robust and reliable ways.

Alignment MineTest

Alignment-MineTest is a research project that uses the open source Minetest voxel engine as a platform for studying AI alignment.

Mesaoptimization

Studying how auxiliary optimization objectives arise in models

Polyglot

Building LLMs and doing NLP in non-English languages.

Recent Publications

Featured

12 Feb 2024

arXiv

Suppressing Pink Elephants with Direct Principle Feedback

12 Feb 2024

arXiv

12 Feb 2024

arXiv

6 Feb 2024

arXiv

Neural networks learn moments of increasing order

6 Feb 2024

arXiv

6 Feb 2024

arXiv

17 Dec 2023

NeurIPS Workshop (Attributing Model Behavior at Scale)

Sparse Autoencoders Find Highly Interpretable Features in Language Models

17 Dec 2023

NeurIPS Workshop (Attributing Model Behavior at Scale)

17 Dec 2023

NeurIPS Workshop (Attributing Model Behavior at Scale)

16 Dec 2023

ICLR

Quality-Diversity through AI Feedback

16 Dec 2023

ICLR

16 Dec 2023

ICLR

16 Dec 2023

ICLR

ReLoRA: High-Rank Training Through Low-Rank Updates

16 Dec 2023

ICLR

16 Dec 2023

ICLR