Hacker News posts about RedPajama
- RedPajama-Data-v2: 30T tokens filtered and de-duplicated (twitter.com)
- RedPajama-Data-v2: An open dataset with 30T tokens (2023) (www.together.ai)
- Show HN: finetune LLMs via the Finetuning Hub (github.com)