Paul Christiano
Alignment theory, reward modeling
Major influence on reward-modeling and oversight ideas that feed into modern post-training.
Highlights
Alignment, RLHF, Safety
Focus: Alignment theory, reward modeling
Why it matters: Major influence on reward-modeling and oversight ideas that feed into modern post-training.
Research Areas
Alignment, RLHF, Safety