A strong RWKV page because he appears on the original RWKV paper, Eagle and Finch, and Gated Slot Attention, which makes him one of the clearer repeat contributors to this sequence-model line rather than a one-off coauthor.
Researcher Profile
Bolun Wang
RWKV and efficient sequence modeling
Researcher working on RWKV and linear-time sequence modeling
Important within the RWKV cluster because his name carries from the original RWKV paper into Gated Slot Attention, making him part of the small set of contributors who reappear as this sequence-model thread evolves.
Organizations
About This Page
This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.
Last reviewed
March 18, 2026
Known For
The ideas, systems, and research directions that make this person worth knowing.
01
RWKV and efficient sequence modeling
02
Gated Slot Attention and linear-time memory mechanisms
03
Practical work on alternatives to standard attention
Start Here
Canonical papers, project pages, or repositories that anchor this profile.
04
RWKV: Reinventing RNNs for the Transformer Era
05
RWKV (project)
06
RWKV
Related Researchers
People worth exploring next because they share topics, labs, or source material with this profile.
Worth tracking if you care about alternatives to the standard transformer playbook, especially the line of work trying to keep strong language-model performance while making inference and memory use much cheaper.
A distinctive page because his work bridges open-sequence-model experimentation with applied machine learning for molecules, proteins, and structural biology; he also shows up on multiple RWKV-family papers, including the hybrid GoldFinch branch, rather than only the first release.
A strong open-model and data-centric page because his work sits close to the infrastructure that made OLMo and Dolma useful to the broader research community rather than just another benchmark-driven model release.
Co-authored RWKV: Reinventing RNNs for the Transformer Era.
Useful because it turns an otherwise thin RWKV byline into a real systems profile: after the original paper, his public work tracks toward large-scale pretraining infrastructure, pipeline parallelism, and systems support for frontier-scale models.