Hacker News posts about Transformer

The Transformer is a neural network architecture, originally introduced for natural language processing tasks, that relies on self-attention to process all positions of an input sequence in parallel rather than one token at a time.

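As a rough illustration of the self-attention mechanism mentioned above, here is a minimal Python/NumPy sketch of scaled dot-product self-attention applied to a whole sequence in one matrix product. All names, shapes, and weights are illustrative assumptions, not taken from any linked post.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model) token embeddings
        # Wq, Wk, Wv: (d_model, d_k) projection matrices
        # Returns (seq_len, d_k) attended representations.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Every position attends to every other position in a single
        # matrix product, which is what allows the sequence to be
        # processed in parallel rather than step by step.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = softmax(scores, axis=-1)   # (seq_len, seq_len)
        return weights @ V

    # Toy usage with random weights (purely illustrative).
    rng = np.random.default_rng(0)
    seq_len, d_model, d_k = 5, 16, 8
    X = rng.normal(size=(seq_len, d_model))
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8)
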
  1. The Annotated Transformer (2022) (nlp.seas.harvard.edu)