Hacker News posts about EVGA
Related: Nvidia
- Evaluating chain-of-thought monitorability (openai.com)
- Show HN: Do we need MCPs? Reverse-engineered Slack and Linear API for Evals & RL (www.agentdiff.dev)
- Retracted: Safety Evaluation and Risk Assessment of the Herbicide Roundup (www.sciencedirect.com)
- OpenGameEval: Eval Framework to Benchmark Agentic AI Assistants (corp.roblox.com)
- Retracted: Safety Eval and Risk Assessment of the Herbicide Roundup for Humans (www.sciencedirect.com)
- Bloom: an open source tool for automated behavioral evaluations (www.anthropic.com)
- 100k Ordered to Evacuate as Rivers Rise in Washington State (www.nytimes.com)
- Retracted: Safety Evaluation, Risk Assessment of Roundup/Glyphosate for Humans (www.sciencedirect.com)
- A pragmatic guide to LLM evals for devs (newsletter.pragmaticengineer.com)
- How Stablecoins Can Help Criminals Launder Money and Evade Sanctions (www.nytimes.com)
- How a Cryptocurrency Helps Criminals Launder Money and Evade Sanctions (www.nytimes.com)
- Show HN: Work Simulation for developer evaluation instead of DSA and take-homes (imported-lush-slug.clueso.site)
- The LLM Evaluation Guidebook (huggingface.co)
- AI is revolutionary, but not egalitarian (keroshan.substack.com)
- Dataset of 33k human evaluations across 33 AI models (huggingface.co)