Co-author of Self-Rewarding Language Models, a paper that explores self-improvement via internal reward modeling.
Researcher Profile
Sainbayar Sukhbaatar
Self-rewarding post-training
Co-author, Self-Rewarding LMs
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last updated: March 20, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01 Self-rewarding post-training
02 Self-Rewarding Language Models
03 Post-training
04 Alignment
05 Preference Optimization
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.