Simple Attention Visualizer

Description

I have created a simple attention visualizer for transformer models. It is available at this link. It can:

  1. Visualize all attention heads for a specific layer.
  2. Show the average attention for each layer.
  3. Show a single heatmap averaging over all layers and heads.

The code should work with any causal LLM; a minimal sketch of the underlying idea is shown below.
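As a rough illustration (not the visualizer's exact code), the snippet below pulls attention weights from a Hugging Face causal LM by passing output_attentions=True, then averages them over heads and layers before plotting a heatmap. The model name, prompt, and plotting details are illustrative assumptions.

```python
import torch
import matplotlib.pyplot as plt
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumed example; any causal LM should work
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_attentions=True)
model.eval()

text = "The quick brown fox jumps over the lazy dog"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each shaped (batch, num_heads, seq_len, seq_len)
attentions = torch.stack(outputs.attentions)   # (layers, 1, heads, seq, seq)
per_layer = attentions.mean(dim=2)[:, 0]       # average over heads -> (layers, seq, seq)
overall = per_layer.mean(dim=0)                # average over layers -> (seq, seq)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
plt.imshow(overall.numpy(), cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.title("Mean attention over all layers and heads")
plt.colorbar()
plt.tight_layout()
plt.show()
```

Roughly, the three views listed above correspond to slicing outputs.attentions at one layer (per-head heatmaps), the per_layer averages, and the single overall heatmap.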

More details are in the repository.

Visualization Examples
