Hacker News posts about vLLM

  1. How vLLM Works (avkcode.github.io)
  2. vLLM Routing and KV (avkcode.github.io)
  3. DeepSeek V4 in vLLM: Efficient Long-Context Attention (vllm-website-pdzeaspbm-inferact-inc.vercel.app)
  4. Disaggregated Serving for Hybrid SSM Models in vLLM (vllm-website-lx4pji0mz-inferact-inc.vercel.app)