Researcher Profile

Editor reviewed

Travis Hoppe

Open-source LLMs (EleutherAI)

Open-source builder across NLP, machine learning, and data-science projects

Worth knowing as one of the early open-data contributors in the EleutherAI orbit, with a profile that combines work on The Pile and a long tail of small, public NLP and machine-learning experiments.

Organizations

EleutherAI

Labs

EleutherAI

Topics

Open Models

About This Page

This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.

Last reviewed

March 18, 2026

Best First Clicks

Travis Hoppe (profile)

The Pile: An 800GB Dataset of Diverse Text for Language Modeling (paper)

today-AI-learned (article)

Known For

The ideas, systems, and research directions that make this person worth knowing.

The Pile dataset

Small open-source NLP and ML projects

Hands-on data-science experimentation

Open-source LLMs (EleutherAI)

GPT-NeoX (GitHub)

EleutherAI (GitHub)

Start Here

Canonical papers, project pages, or repositories that anchor this profile.

Travis Hoppe (profile)

The Pile: An 800GB Dataset of Diverse Text for Language Modeling (paper)

today-AI-learned (article)

GPT-NeoX (GitHub project)

EleutherAI (GitHub project)

Signature Works

Additional papers, projects, or repositories that help flesh out the profile.

Colorless Green Ideas (article)

Supporting Sources

Additional links that help verify and flesh out this profile.

Colorless Green Ideas (article)

Related Researchers

People worth exploring next because they share topics, labs, or source material with this profile.

Shared canonical source

Laurence Golding

Open-source LLMs (EleutherAI)

5 sources

One of the quieter but still important contributors in the open-data and open-evaluation lineage behind The Pile, GPT-NeoX, and later benchmarking infrastructure.

EleutherAI, Open Models, Evaluation & Benchmarks

Start Here: The Pile: An 800GB Dataset of Diverse Text for Language Modeling

Shared canonical source

Leo (Len) Gao

Open-source LLMs (EleutherAI)

5 sources

Worth tracking for the open-model side of the field, especially where dataset construction, practical training work, and alignment-flavored thinking meet.

EleutherAI, Open Models, Post-Training & Alignment

Start Here: Leo Gao

Shared canonical source

Noa Nabeshima

Open-source LLMs (EleutherAI)

5 sources

One of the lesser-known contributors to The Pile, worth following for a newer line of small public datasets and Hugging Face releases.

EleutherAI, Open Models

Start Here: Noa Nabeshima

Shared canonical source

Shawn Presser

Open-source LLMs (EleutherAI)

5 sources

Worth knowing in the open-model ecosystem because his profile combines authorship on The Pile with a large body of public code and notes rather than only one flagship paper.

EleutherAI, Open Models

Start Here: Shawn Presser

Shared canonical source

Stella Biderman

Open-source LLMs, datasets

5 sources

A key open-model ecosystem builder whose work matters because it combines research, public infrastructure, and field-level coordination rather than paper output alone.

EleutherAI, Open Models, Systems & Infrastructure

Start Here: The Pile: An 800GB Dataset of Diverse Text for Language Modeling

Shared canonical source

Anish Thite

Open-source LLMs (EleutherAI)

5 sources

Useful to follow if you care about the practical evaluation layer of open models, especially where benchmark tooling and reproducible comparisons actually shape what the ecosystem measures.

EleutherAI, Open Models, Evaluation & Benchmarks

Start Here: Anish Thite