Researcher Profile

Editor reviewed

Jordan Hoffmann

Compute-optimal scaling for LLM training

Researcher behind DeepMind’s retrieval and compute-optimal language-model work

One of the most useful researchers to follow for the line of work that runs from retrieval-augmented language models through compute-optimal scaling and into Gemini.

Organizations

Google DeepMind

Labs

Google DeepMind

Topics

Multimodal Evaluation & Benchmarks

About This Page

This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.

Last reviewed

March 18, 2026

Best First Clicks

Training Compute-Optimal Large Language Models (paper)

Improving language models by retrieving from trillions of tokens (article)

Gemini: A Family of Highly Capable Multimodal Models (paper)

Known For

The ideas, systems, and research directions that make this person worth knowing.

Chinchilla and compute-optimal scaling

RETRO and retrieval-augmented language models

Gemini

Compute-optimal scaling for LLM training

Training Compute-Optimal Large Language Models

DeepMind

Start Here

Canonical papers, project pages, or repositories that anchor this profile.

Training Compute-Optimal Large Language Models (paper)

Improving language models by retrieving from trillions of tokens (article)

Gemini: A Family of Highly Capable Multimodal Models (paper)

Supporting Sources

Additional links that help verify and flesh out this profile.

Scaling Language Models: Methods, Analysis & Insights from Training Gopher (paper)

Related Researchers

People worth exploring next because they share topics, labs, or source material with this profile.

Shared canonical source

Jack W. Rae

Compute-optimal scaling for LLM training

4 sources

Worth tracking for the line of work that runs from Gopher through retrieval-augmented language models and into Gemini, especially if you care about how DeepMind iterated on the frontier-model recipe.

Google DeepMind Multimodal Evaluation & Benchmarks

Start Here

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Shared canonical source

Lisa Anne Hendricks

Compute-optimal scaling for LLM training

3 sources

A useful page for the DeepMind work that connected large-language-model scaling to the multimodal Gemini push, with a clearer safety-and-evaluation flavor than many purely scaling-focused pages.

Google DeepMind Multimodal Evaluation & Benchmarks

Start Here

Gemini: A Family of Highly Capable Multimodal Models

Shared canonical source

Elena Buchatskaya

Compute-optimal scaling for LLM training

3 sources

Worth tracking for the DeepMind thread that links large-model scaling research to the multimodal Gemini stack, rather than treating those as separate eras.

Google DeepMind Open Models Multimodal

Start Here

Gemini: A Family of Highly Capable Multimodal Models

Shared canonical source

Trevor Cai

Compute-optimal scaling for LLM training

3 sources

A useful profile for the core DeepMind contributors behind Chinchilla, Gopher, and Gemini, rather than only the more public faces of those systems.

Google DeepMind Multimodal Systems & Infrastructure

Start Here

Gemini: A Family of Highly Capable Multimodal Models

Shared canonical source

Diego de las Casas

Compute-optimal scaling for LLM training

3 sources

A useful profile for the DeepMind researchers who helped carry the lab’s language-model program from scaling-law work into Gemini rather than appearing only on the final product layer.

Google DeepMind Multimodal

Start Here

Gemini: A Family of Highly Capable Multimodal Models

Shared canonical source

Johannes Welbl

Compute-optimal scaling for LLM training

3 sources

A useful profile for the DeepMind researchers who sat inside the core language-model program as it moved from scaling-law analysis into the Gemini family.

Google DeepMind Multimodal

Start Here

Gemini: A Family of Highly Capable Multimodal Models