Researcher Profile
Chunyuan Li
Visual instruction tuning (LLaVA)
Researcher at Microsoft
Co-authored Visual Instruction Tuning: a widely-cited recipe for LLaVA-style multimodal assistants.
Organizations
Microsoft
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last updated
March 20, 2026
Best First Clicks
Official And External Links
Known For
The ideas, systems, and research directions that make this person worth knowing.
1. Visual instruction tuning (LLaVA); see the code sketch below this list
2. Visual Instruction Tuning (the paper)
3. LLaVA (GitHub)
4. LLaVA
5. Multimodal
6. Vision-language
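To make the first item concrete: the LLaVA recipe connects a frozen vision encoder to a language model through a small learned projection, then instruction-tunes the combined model on machine-generated multimodal conversations. The Python sketch below is a loose illustration, not the authors' code; the class name, parameter names, and default dimensions are assumptions (roughly matching the published v1 setup of CLIP ViT-L/14 features projected into the LLM's embedding space).

import torch
import torch.nn as nn

class LlavaStyleConnector(nn.Module):
    # Sketch of the visual-instruction-tuning idea: a learned projection
    # maps frozen vision-encoder patch features into the language model's
    # embedding space, so the LLM attends to image "tokens" and text
    # tokens in a single sequence. Names and dimensions are illustrative
    # assumptions, not the authors' code.
    def __init__(self, vision_dim=1024, llm_dim=4096):
        super().__init__()
        # LLaVA v1 used a single linear layer as the connector;
        # later versions replaced it with a small MLP.
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, image_features, text_embeddings):
        # image_features: (batch, num_patches, vision_dim), e.g. from a
        # frozen CLIP ViT-L/14 encoder.
        # text_embeddings: (batch, seq_len, llm_dim) from the LLM's
        # embedding table for the instruction text.
        visual_tokens = self.proj(image_features)
        # The LLM then processes [visual tokens, text tokens] as one sequence.
        return torch.cat([visual_tokens, text_embeddings], dim=1)

In the paper, training proceeds in two stages: the projection is first pretrained on image-caption pairs, and then the projection plus the LLM are fine-tuned on GPT-4-generated visual instruction data.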
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
Signature Works
Additional papers, projects, or repositories that help flesh out the profile.
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
A useful anchor for the open-model ecosystem because his path runs from EleutherAI’s training efforts into a more explicit alignment and interpretability agenda at Conjecture.
An important bridge figure between open-weight language-model communities and the modern alignment debate, especially when you want to understand how frontier capability, openness, and control arguments collide in practice.
One of the more useful people to study for the Gemini era because his work spans both the text core of multimodal frontier models and the optimization tricks that make those systems cheaper and more stable to train.