Hacker News posts about HumanEval
- Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B (www.phind.com)
- WizardCoder-34B-Python surpasses GPT-4 on HumanEval (twitter.com)
- Fine-tuned CodeLlama beats GPT-4 on HumanEval (huggingface.co)
- BigCodeBench: The Next Generation of HumanEval (github.com)
- WizardCoder 34B surpasses GPT4 on HumanEval (twitter.com)
- HumanEval is saturated: new coding LLM benchmark released (bigcode-bench.github.io)
- Running HumanEval Safely with Riza (riza.io)
- Show HN: I built the LLM Comparison Tool I wish existed (llm-stats.com)
- Show HN: Fine-Tuning Index of Open-Source LLMs vs. OpenAI (predibase.com)
- Show HN: Atlas: Independent Evals and Benchmarking for Generative AI Models (app.layerlens.ai)
- Beat GPT-4o at Python with 100 dumb LLaMAs (modal.com)
- HumaneAI pin maker selling itself for $1B (gizmodo.com)
- Valuing Humans in the Age of Superintelligence: HumaneRank (roadtoartificia.com)
- Humanimals (twitter.com)
- Refactoring Humanely and "Accidental Pomodoro" (melatonin.dev)
- How to Kill Bugs Humanely (reducing-suffering.org)
- Show HN: Find humanely raised animal-based products (findhumane.com)
- What is the most humane way to kill a cane toad? (www.abc.net.au)
- SAT/ACT Scores by Detailed Race/Ethnicity from Applicants on 2021 Common App (humanvarieties.org)
- Increased efficiency by automating repetitive tasks (humanavatars.aibuildr.app)
- A simple guide to local LLM fine-tuning on a Mac with MLX (apeatling.com)