Researcher Profile
Milad Nasr
Universal jailbreak-style attacks on aligned LMs
Co-author, Universal Adversarial Attacks (Aligned LMs)
Co-authored universal and transferable adversarial attacks on aligned language models.
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last updated: March 20, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01. Universal jailbreak-style attacks on aligned LMs
02. Universal and Transferable Adversarial Attacks on Aligned Language Models
03. Security
04. Jailbreaks
05. Safety
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
Co-authored universal and transferable adversarial attacks on aligned language models.
One of the most useful researchers to study if you care about what deployed models get wrong under pressure, especially around training-data extraction, adversarial behavior, and practical security failures.
A foundational researcher in generative modeling and adversarial robustness whose work changed both how models are trained and how their failure modes are studied.
Co-authored Extracting Training Data from Large Language Models: a core paper on memorization and extraction risk.