Hacker News posts about GPT-2
- Every attention weight matrix in GPT-2, visualized (amanvir.com)
- Show HN: Kernel-level LLM inference via /dev/llm0 (github.com)
- Reproducing GPT-2 in llm.c (github.com)
- A ChatGPT clone, in 3000 bytes of C, backed by GPT-2 (2023) (nicholas.carlini.com)
- The Illustrated GPT-2: Visualizing Transformer Language Models (2019) (jalammar.github.io)
- Show HN: Fully client-side GPT2 prediction visualizer (perplexity.vercel.app)
- Build and train GPT-2 from scratch using PyTorch (differ.blog)
- C++ GPT-2 inference engine (github.com)
- Fast GPT-2 inference written in Fortran (github.com)
- Compare how GPT-2, 3, 3.5 and 4 answer the same questions (theaidigest.org)
- Spreadsheets are all you need: Understanding GPT2 and Transformers (spreadsheets-are-all-you-need.ai)
- Full forward pass of GPT-2 in one file of pure CUDA (github.com)
- Let's reproduce GPT-2 (124M) [video] (www.youtube.com)
- GPT2-Chatbot removed from LMSYS (lmsys.org)
- What will GPT-2030 look like? (www.lesswrong.com)
- Why didn't we get GPT-2 in 2005? (dynomight.net)
- Let's reproduce GPT-2 (124M) (twitter.com)
- Steering GPT-2-XL by adding an activation vector (www.lesswrong.com)
- WebGPT: Run GPT-2 in the browser with WebGPU (github.com)
- Zig GPT-2 inference engine (github.com)
- GPT-2B-001 (huggingface.co)