Attention Visualizer

Variant

Model Preset

Model Architecture

Runtime

Legend

Activation
Weight (dashed)
Query
Key
Value
Attention scores
Causal mask
Latent / compressed
RoPE (positional)

How to read this diagram

Each colored block is a tensor. The front face shows the tensor's main dimensions; the layers stacked behind it show the batch and head dimensions.

Circles between tensors are operations. Click any tensor or op for details.

Dimension labels on edges show the size of each axis. Dashed blocks are learned weight matrices.
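
As a rough guide to the axis sizes the labels might show, the sketch below lists the shape of each tensor in one attention block. It assumes standard multi-head attention with d_model = heads * head_dim; the function names, tensor names, and the example sizes are illustrative and not taken from the visualizer, and other variants (for example ones with latent or compressed tensors) will have different shapes.

```python
def attention_shapes(batch, heads, seq_len, head_dim):
    """Return the axis sizes of each tensor in one attention block.

    Assumption: standard multi-head attention, d_model = heads * head_dim.
    All names here are hypothetical labels chosen for illustration.
    """
    d_model = heads * head_dim
    return {
        # Activation entering the block
        "x": (batch, seq_len, d_model),
        # Learned weight matrices (no batch axis)
        "W_q": (d_model, d_model),
        "W_k": (d_model, d_model),
        "W_v": (d_model, d_model),
        "W_o": (d_model, d_model),
        # Projections, split into heads
        "q": (batch, heads, seq_len, head_dim),
        "k": (batch, heads, seq_len, head_dim),
        "v": (batch, heads, seq_len, head_dim),
        # One score per (query position, key position) pair, per head
        "scores": (batch, heads, seq_len, seq_len),
        # Shared across batch and heads
        "causal_mask": (seq_len, seq_len),
        # Heads merged back and projected with W_o
        "out": (batch, seq_len, d_model),
    }


def causal_mask(seq_len):
    """Lower-triangular mask: query position i may attend to key positions j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


shapes = attention_shapes(batch=2, heads=4, seq_len=8, head_dim=16)
mask = causal_mask(3)
```

Here the batch and head axes are the ones the diagram would draw as depth, while the last two axes of each activation would appear on the front face.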

Scroll to zoom, drag to pan, double-click to fit.

© 2026 Matthew Bonanni

Operation Detail