Hacker News posts about benchmarks
- Book: The Emerging Science of Machine Learning Benchmarks (mlbenchmarks.org)
- Sunsetting the TechEmpower Framework Benchmarks (github.com)
- Comprehensive C++ Hashmap Benchmarks (2022) (martin.ankerl.com)
- MacBook Neo, the Benchmarks (birchtree.me)
- Idiomatic Lisp and the Nbody Benchmark (www.stylewarning.com)
- Show HN: LLM Debate Benchmark (github.com)
- ARC-AGI-3 benchmark is out now (arcprize.org)
- A stealth benchmark of major cloud browser providers (browser-use.com)
- Fine Tuning Services Benchmark (vintagedata.org)
- APIEval-20: A Benchmark for Black-Box API Test Suite Generation (huggingface.co)
- Show HN: jj-benchmark – Evaluating AI agents on Jujutsu version control (tabbyml.github.io)
- AI benchmarks are broken. Here's what we need instead (www.technologyreview.com)
- Show HN: Vibe Check – UX Benchmark for vibe designs (vibecheck.appvelocity.io)
- G Brags About Android Browser Benchmark on Unnamed Devices; Reporters Fall for It (daringfireball.net)
- Email Verification Provider Benchmarks (billionverify.com)
- An Extensive Benchmark of C and C++ Hash Tables (jacksonallan.github.io)