A useful person to track on the evaluation side of AI risk work, especially where open-model benchmarking meets the question of which measurements are trustworthy enough to inform decisions.
Researcher Profile
Editor reviewed
Anish Thite
Open-source LLMs (EleutherAI)
Georgia Tech researcher and EleutherAI evaluation contributor
Useful to follow if you care about the practical evaluation layer of open models, especially where benchmark tooling and reproducible comparisons actually shape what the ecosystem measures.
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last reviewed: March 18, 2026
Official And External Links
Known For
The ideas, systems, and research directions that make this person worth knowing.
01. LM Evaluation Harness
02. Open-model evaluation infrastructure
03. GPT-NeoX-era EleutherAI tooling
04. Open-source LLMs (EleutherAI)
05. GPT-NeoX (GitHub)
06. EleutherAI (GitHub)
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
Signature Works
Additional papers, projects, or repositories that help flesh out the profile.
Supporting Sources
Additional links that help verify and flesh out this profile.
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
Useful because his footprint runs through the early EleutherAI training stack, GPT-NeoX, and Pythia, which makes the page a better map of open-model infrastructure than a generic one-paper profile.
Useful for the applied side of open-model work because his profile bridges EleutherAI-era public model training and production radiology AI inside a real clinical-imaging company.
A better starting page for the open-model long tail because it connects one of the GPT-NeoX contributors to his current public ML interests instead of leaving the profile as generic EleutherAI filler.
A strong person to follow for the systems side of open models, especially where distributed training, hybrid architectures, and practical efficiency work feed directly into model capability.
One of the better people to study for the thread connecting classic transfer learning in NLP to modern large-model evaluation and open-model research practice.