Hacker News posts about Transformer

The Transformer is a neural network architecture, originally introduced for natural language processing tasks, that relies on self-attention to process all positions of an input sequence in parallel rather than one token at a time.

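As a rough illustration of the self-attention mechanism mentioned above, here is a minimal Python/NumPy sketch of scaled dot-product self-attention applied to a whole sequence in one matrix product. All names, shapes, and weights are illustrative assumptions, not taken from any linked post.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model) token embeddings
        # Wq, Wk, Wv: (d_model, d_k) projection matrices
        # Returns (seq_len, d_k) attended representations.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Every position attends to every other position in a single
        # matrix product, which is what allows the sequence to be
        # processed in parallel rather than step by step.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = softmax(scores, axis=-1)   # (seq_len, seq_len)
        return weights @ V

    # Toy usage with random weights (purely illustrative).
    rng = np.random.default_rng(0)
    seq_len, d_model, d_k = 5, 16, 8
    X = rng.normal(size=(seq_len, d_model))
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8)
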
  1. The Annotated Transformer (2022) (nlp.seas.harvard.edu)