Hacker News posts about the RLHF Book
- Reinforcement Learning (i.e. Policy Gradient Algorithms) (rlhfbook.com)
- RLHF Book (rlhfbook.com)
- Notes on RLHF Book by Nathan Lambert (shubhamg.bearblog.dev)
- Reinforcement Learning from Human Feedback (rlhfbook.com)