Hacker News posts about EVGA
Related: Nvidia
- Bots are getting good at mimicking engagement (joindatacops.com)
- I built the same app 10 times: Evaluating frameworks for mobile performance (www.lorenstew.art)
- Evaluating the Infinity Cache in AMD Strix Halo (chipsandcheese.com)
- Agent-o-rama: build, trace, evaluate, and monitor LLM agents in Java or Clojure (blog.redplanetlabs.com)
- Testing LLM Agents Like Software – Behaviour Driven Evals of AI Systems (aclanthology.org)
- Deep Dive into G-Eval: How LLMs Evaluate Themselves (medium.com)
- I no longer engage with Nature publishing group (hxstem.substack.com)
- Bolt – How Mura Wrote an In-House LLM Eval Framework (mackey.substack.com)
- Show HN: Scorecard – Evaluate LLMs like Waymo simulates cars (docs.scorecard.io)
- Why your AI evals keep breaking (www.atla-ai.com)
- To solve the benchmark crisis, evals must think (blog.fig.inc)
- Why I no longer engage with Nature publishing group (hxstem.substack.com)
- Some Thoughts on Tuesday's Child – Greg Egan (www.gregegan.net)
- Reimagining social media optimized for meaning, not engagement (www.facts.social)
- Writing an LLM from scratch, part 26 – evaluating the fine-tuned model (www.gilesthomas.com)
- Scientists Discover How Leukemia Cells Evade Treatment (www.rutgers.edu)
- Defining and evaluating political bias in LLMs (openai.com)
- Stabilizer: Statistically sound performance evaluation [pdf] (people.cs.umass.edu)
- Next.js AI Model Performance Evaluations (nextjs.org)
- Rogue: Open-source AI agent evaluation framework (github.com)