Self-Attention Visualization

📝 Input Sentence

🎯 Query-Key-Value

Query (Q): the selected word

Keys (K): all tokens

Values (V): all tokens

Attention(Q, K, V) = softmax(QK^T / √d) × V, where d is the dimension of the key vectors
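
The formula can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `softmax` and `attention`, the toy shapes (3 tokens, dimension 4), and the random inputs are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention for 2-D arrays (tokens x dimensions)."""
    d = K.shape[-1]                      # key dimension, the d under the square root
    scores = Q @ K.T / np.sqrt(d)        # (n_queries, n_keys) similarity scores
    weights = softmax(scores, axis=-1)   # each row is a distribution over the keys
    return weights @ V, weights

# Toy input: 3 tokens, dimension 4 (random values, for illustration only).
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

output, weights = attention(Q, K, V)
print(weights.shape)  # (3, 3)
print(output.shape)   # (3, 4)
```

Row i of `weights` is the attention distribution for token i, so every row sums to 1. Dividing by √d keeps the scores from growing with the vector dimension before the softmax is applied.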

How Self-Attention Works:
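1. Each word is projected into a query (Q), key (K), and value (V) vector
2. The query asks: "What should I attend to?"
3. The keys answer: "Here's what I contain"
4. Score = Q · K (dot product), divided by √d
5. Softmax turns the scores into attention weights that sum to 1
6. Output = sum of the value vectors, weighted by the attention weights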
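
The steps above can be traced for a single selected word. The sketch below is hypothetical throughout: the sentence, the sizes `d_model` and `d`, and the projection matrices `W_q`, `W_k`, `W_v` are random stand-ins for what a trained model would learn, so the printed weights carry no linguistic meaning.

```python
import numpy as np

tokens = ["the", "cat", "sat"]           # hypothetical input sentence
d_model, d = 8, 4                        # hypothetical embedding and key sizes

rng = np.random.default_rng(42)
X = rng.normal(size=(len(tokens), d_model))   # stand-in token embeddings
W_q = rng.normal(size=(d_model, d))           # random stand-ins for learned projections
W_k = rng.normal(size=(d_model, d))
W_v = rng.normal(size=(d_model, d))

# Step 1: each word gets its own Q, K, V vector.
Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Steps 2-3: pick one word; its query is compared against every key.
selected = tokens.index("cat")
q = Q[selected]

# Step 4: score = Q · K, scaled by sqrt(d) as in the formula.
scores = K @ q / np.sqrt(d)

# Step 5: softmax turns the scores into attention weights.
exp = np.exp(scores - scores.max())
weights = exp / exp.sum()

# The weights then mix the value vectors into the output for the selected word.
output = weights @ V

for token, w in zip(tokens, weights):
    print(f"{token}: {w:.3f}")
```

Repeating this for every word in the sentence and stacking the resulting `weights` vectors as rows gives the full attention weights matrix.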

📊 Attention Weights Matrix