Hacker News posts about RedPajama
- RedPajama: Reproduction of LLaMA with friendly license (www.together.xyz)
- Releasing 3B and 7B RedPajama (www.together.xyz)
- RedPajama 7B (an Apache 2.0-licensed LLaMA) is now available (www.together.xyz)
- SlimPajama: A 627B token cleaned and deduplicated version of RedPajama (www.cerebras.net)
- RedPajama-Incite-7B-Instruct Outperforms LLaMA on MMLU (twitter.com)
- RedPajama at 440B tokens higher quality than Pythia and StableLM (www.together.xyz)
- RedPajama-Data-v2: 30T tokens filtered and de-duplicated (twitter.com)
- Mlc-Chat – RedPajama-Incite-Chat-3B on macOS (til.simonwillison.net)
- OpenLLaMA 7B model trained on RedPajama dataset (twitter.com)
- Lessons from fine-tuning RedPajama LLM on Slack data (www.union.ai)
- What’s in the RedPajama-Data-1T LLM training set (simonwillison.net)
- RedPajama training progress at 440B tokens (www.together.xyz)
- RedPajama-Data: Code for preparing large datasets (github.com)
- RedPajama-Data-v2: An open dataset with 30T tokens (2023) (www.together.ai)
- RedPajama Data 1T (huggingface.co)
- Llama Llama RedPajama (openlibrary.org)
- Show HN: finetune LLMs via the Finetuning Hub (github.com)
- Llama 1.3B Trained on 200B Tokens for Commercial Use (huggingface.co)
- Mosaic trained a 1B parameter model on 440 GPUs for 200B tokens (huggingface.co)