Researcher Profile

Songlin Yang

Linear transformers via the delta rule

Member of Technical Staff at Thinking Machines Lab

A high-signal researcher in the post-attention design space, especially if you care about the line of work aiming to make linear-attention and delta-rule models genuinely competitive in real language-model systems.

Organizations

Thinking Machines Lab

About This Page

This profile is meant to help you get oriented quickly: why this researcher matters, what to read first, and where to explore next.

Known For

The ideas, systems, and research directions that make this person worth knowing.

01

Gated linear attention

02

Delta-rule and recurrent alternatives to vanilla Transformers

03

Practical implementation work around efficient sequence models

04

Linear transformers via the delta rule

05

Parallelizing Linear Transformers with the Delta Rule over Sequence Length

06

DeltaNet
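For orientation, the delta-rule idea behind DeltaNet can be sketched as a recurrent state update: instead of purely additive linear attention, each step first reads out the value currently stored for the incoming key and writes back only the correction. Below is a minimal single-head numpy sketch under simplifying assumptions (unit-norm key, scalar write strength `beta`); the function name and shapes are illustrative, not the paper's implementation.

```python
import numpy as np

def deltanet_step(S, k, v, beta):
    """One delta-rule state update (illustrative sketch).

    S    : (d_k, d_v) matrix-valued memory
    k    : (d_k,) key vector (assumed roughly unit-norm)
    v    : (d_v,) value vector
    beta : write strength in [0, 1]
    """
    v_old = S.T @ k                           # value currently stored under key k
    S = S + np.outer(k, beta * (v - v_old))   # write only the correction (the "delta")
    return S

# Usage: after writing (k, v) into an empty memory with beta=1,
# reading with the same unit-norm key recovers v.
d_k, d_v = 4, 3
S = np.zeros((d_k, d_v))
k = np.zeros(d_k); k[0] = 1.0
v = np.array([1.0, 2.0, 3.0])
S = deltanet_step(S, k, v, beta=1.0)
assert np.allclose(S.T @ k, v)
```

The "Parallelizing Linear Transformers with the Delta Rule over Sequence Length" work listed above is, at a high level, about computing many such sequential updates efficiently across a sequence rather than one step at a time.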

Start Here

Canonical papers, project pages, or repositories that anchor this profile.

Signature Works

Additional papers, projects, or repositories that help flesh out the profile.

Supporting Sources

Additional links that help verify and flesh out this profile.

Related Researchers

People worth exploring next because they share topics, labs, or source material with this profile.

Shared canonical source

Yoon Kim

Linear transformers via the delta rule

4 sources

A useful researcher for tracing the line from classic neural NLP to today's efficient large-model work, with papers spanning early sentence models, character-aware language modeling, and current sequence-model efficiency research.

Shared topic

Daniel Y. Fu

Fast, memory-efficient attention

4 sources

One of the more useful people to follow for the systems side of modern model building, especially where better kernels and sequence methods translate directly into frontier-model training and inference speed.

Shared topic

Atri Rudra

Fast, memory-efficient attention

4 sources

Worth following because he brings a real theory background into the model-systems layer, especially where structured linear algebra and sequence methods end up mattering for practical modern architectures.
