Hacker News posts about Cerebras Inference
- Cerebras Inference: AI at Instant Speed (cerebras.ai)
- Cerebras Inference (twitter.com)
- Cerebras Inference – Voice Mode (cerebras.vercel.app)
- Cerebras: 450 tokens/sec Llama 3.1 70B (www.theregister.com)
- Cerebras Enters AI Inference, Blows Away Tiny Nvidia H100 GPUs by Besting HBM (www.servethehome.com)
- Cerebras Launches the Fastest AI Inference (inference.cerebras.ai)
- Llama 8B at 1800 tokens per second on Cerebras (inference.cerebras.ai)
- Cerebras Launches the Fastest AI Inference (old.reddit.com)
- Nvidia MLPerf Inference v4.0 is Out (www.servethehome.com)