Researcher Profile
Milad Nasr
Universal jailbreak-style attacks on aligned LMs
Co-author, Universal Adversarial Attacks (Aligned LMs)
Co-authored universal and transferable adversarial attacks on aligned language models.
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last updated: March 20, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01. Universal jailbreak-style attacks on aligned LMs
02. Universal and Transferable Adversarial Attacks on Aligned Language Models
03. Security
04. Jailbreaks
05. Safety
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
Co-authored universal and transferable adversarial attacks on aligned language models.
One of the most useful researchers to study if you care about what deployed models get wrong under pressure, especially around training-data extraction, adversarial behavior, and practical security failures.
A foundational researcher in generative modeling and adversarial robustness whose work changed both how models are trained and how their failure modes are studied.
Co-authored Extracting Training Data from Large Language Models: a core paper on memorization and extraction risk.