Researcher Profile
Kexin Pei
Measuring real-world coding ability (SWE-bench)
Co-author, SWE-bench
Co-authored SWE-bench: a key benchmark for whether models can resolve real GitHub issues.
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last updated
March 20, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01
Measuring real-world coding ability (SWE-bench)
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
01
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? (paper)
02
SWE-bench (GitHub)
Topics
Evaluation
SWE-bench
Code
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile. (Names were not preserved in this extract; only the one-line descriptions remain.)
Co-authored SWE-bench: a key benchmark for whether models can resolve real GitHub issues.
Co-authored SWE-bench and ALiBi: high-leverage evaluation + long-context work.
Co-authored ARC: an influential reasoning benchmark for question answering.