Researcher Profile

Miljan Martic

Practical RL from human feedback

Co-author, Deep RL from Human Preferences

Co-authored Deep Reinforcement Learning from Human Preferences (2017), an early anchor for RLHF-style post-training.

About This Page

This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.

Last updated

March 20, 2026

Known For

The ideas, systems, and research directions that make this person worth knowing.

01 Practical RL from human feedback

02 Deep Reinforcement Learning from Human Preferences

03 RLHF

04 Alignment

Start Here

Canonical papers, project pages, or repositories that anchor this profile.

Related Researchers

People worth exploring next because they share topics, labs, or source material with this profile.