One of the clearest researchers to follow for efficient sequence-model systems, especially the line of work that made frontier training and inference materially faster rather than merely cleaner on paper.
Researcher Profile
Editor reviewed
Albert Gu
State-space models for sequence modeling
Assistant professor at Carnegie Mellon University
A key researcher for understanding why state-space models became a serious alternative to standard transformer stacks rather than a recurring side path.
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last reviewed
March 18, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01 State-space models for sequence modeling
02 Mamba
03 Long-context and efficient sequence architectures
04 Mamba: Linear-Time Sequence Modeling with Selective State Spaces
05 Mamba (GitHub)
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
Signature Works
Additional papers, projects, or repositories that help flesh out the profile.
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
A valuable page in this cluster because his public role description is unusually specific: post-training, steerability, and AI-generated evaluation data are exactly the kinds of practical problems strong researcher pages should make discoverable.
A useful systems-facing page because it ties one of the less-public engineers on the Jamba line to the practical work of turning hybrid-model research into shipped model releases.
A useful page because his public trail is broader than the generic Jamba author stub: it runs from earlier language grounding and text-similarity work into Jamba-1.5 and later multimodal hallucination mitigation.
One of the higher-signal people to know in the hybrid-LLM line because he sits at the point where AI21’s research architecture, long-context systems work, and real product deployment meet.
Worth tracking on the architecture side of AI21 because his profile sits where infrastructure leadership, hybrid-model design, and the mechanics of shipping long-context systems overlap.