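[Figure: Transformer architecture diagram. Labeled components: Word Embeddings, Positional Embeddings, Multi-Head Attention, Add & Norm, Feed Forward Network, Linear & Softmax, Output Probabilities. Repeat marker on the layer stack: 12 x. Figure by quaium.]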