Hackernews posts about Transformer
The Transformer is a neural network architecture, originally designed for natural language processing tasks, that relies on self-attention to process all positions of an input sequence in parallel rather than one token at a time.
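For intuition, here is a minimal sketch of the scaled dot-product self-attention step that many of the posts below discuss. It is illustrative only: the single-head setup, the NumPy implementation, and all dimensions are assumptions for the example, not drawn from any of the linked posts.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x:             (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q = x @ w_q                                 # queries, (seq_len, d_k)
    k = x @ w_k                                 # keys,    (seq_len, d_k)
    v = x @ w_v                                 # values,  (seq_len, d_k)
    scores = q @ k.T / np.sqrt(k.shape[-1])    # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                          # each position mixes all others

# Toy usage: 4 tokens, model and head dimension 8 (arbitrary example values)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = self_attention(x, *(rng.normal(size=(8, 8)) for _ in range(3)))
print(out.shape)  # (4, 8)
```

Note that the full seq_len x seq_len score matrix is produced by a single matrix product, so every position attends to every other in one step; that is the parallelism the definition above refers to. The posts below explore, extend, and critique this mechanism: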
- Understanding Transformers Using a Minimal Example (rti.github.io)
- The Annotated Transformer (2022) (nlp.seas.harvard.edu)
- A 20-Year-Old Algorithm Can Help Us Understand Transformer Embeddings (ai.stanford.edu)
- Energy-Based Transformers [video] (www.youtube.com)
- How to (and Not to) Manipulate Transformers: A Logic-First Guide (lightcapai.medium.com)
- The Art of Transformer Programming (2023) (yanivle.github.io)
- Energy-Based Transformers Are Scalable Learners and Thinkers (alexiglad.github.io)
- Tricks from OpenAI GPT-OSS you can use with transformers (huggingface.co)
- I made a transformer by hand (2023) (vgel.me)
- Information Flows Through Transformers (twitter.com)
- Debugging divergence between engine and transformers logprobs for RL (gist.github.com)
- Reliable Cloud Operations Using Transformers (ieeexplore.ieee.org)
- Low-Rank Attention: Scaling Transformers Without the Quadratic Cost (lightcapai.medium.com)
- How information flows through Transformers (twitter.com)
- DeepSeek Rewrote the Transformer [video] (www.youtube.com)
- Embedding Spaces – Transformer Token Vectors Are Not Points in Space (www.alignmentforum.org)
- Show HN: RAG-Guard: Zero-Trust Document AI (github.com)