Researcher Profile
Kristina Toutanova
Bidirectional transformer pretraining (BERT)
Co-author, BERT
Co-authored BERT: a turning point for transfer learning in NLP.
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last updated
March 20, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01
Bidirectional transformer pretraining (BERT); see the code sketch after this list
02
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
03
BERT
04
NLP
05
Pretraining
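To make the first item concrete: BERT's key move was to pretrain a Transformer encoder with a masked language modeling objective, so every prediction can attend to context on both the left and the right. Below is a minimal, illustrative PyTorch sketch of that objective. All sizes are hypothetical toy values, and it simplifies the published recipe (for example, BERT also replaces some selected tokens with random tokens or leaves them unchanged, rather than always using [MASK]); it is a sketch of the idea, not the authors' implementation.

```python
import torch
import torch.nn as nn

VOCAB, MASK_ID, DIM = 1000, 0, 64   # hypothetical toy sizes, not BERT's real config

# Tiny bidirectional encoder: self-attention sees both left and right context.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True),
    num_layers=2,
)
embed = nn.Embedding(VOCAB, DIM)
lm_head = nn.Linear(DIM, VOCAB)     # predicts the original token id at each position

tokens = torch.randint(1, VOCAB, (8, 16))       # a batch of 8 sequences, length 16
mask = torch.rand(tokens.shape) < 0.15          # choose ~15% of positions to corrupt
corrupted = tokens.masked_fill(mask, MASK_ID)   # replace chosen tokens with [MASK]

hidden = encoder(embed(corrupted))              # contextual vectors use BOTH directions
logits = lm_head(hidden)                        # shape: (batch, seq_len, vocab)
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])  # loss only at masked slots
loss.backward()
```

The detail that made this a turning point for transfer learning is in the last two lines: the loss is computed only at masked positions, which lets the encoder condition on unmasked context from both directions instead of predicting left to right.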
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Devlin, Chang, Lee, and Toutanova, NAACL 2019 (arXiv:1810.04805).
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
A core name in the pretraining era of NLP, especially if you want to understand how BERT reshaped the field and how that line of work extended into broader document understanding and large-scale language systems.
One of the most important architecture-level thinkers in modern AI, with influence spanning Transformers, efficient scaling, and mixture-of-experts systems.
A foundational figure in modern sequence modeling whose work on the Transformer changed the technical direction of language and multimodal systems.
A foundational Transformer co-author who is now worth following for a very different reason: he is one of the few people trying to build a serious frontier lab around alternatives to the default scaling path.
A foundational Transformer researcher whose work still matters because it connects the original architecture shift to later efforts on efficiency, scaling, and sequence-modeling infrastructure.