Hacker News posts about Data Commons
- Beyond Big Tech: The Revolutionary Potential of State Data Commons (www.lawfaremedia.org)
- Palantir suggests 'common operating system' for UK govt data (www.theregister.com)
- Show HN: OpenNutrition – A free, public nutrition database (www.opennutrition.app)
- Launch HN: Continue (YC S23) – Create custom AI code assistants (hub.continue.dev)
- Show HN: Morphik – Open-source MCP server for technical document search (docs.morphik.ai)
- Show HN: I made an online free tool site (www.ufreetools.com)
- Show HN: I built a 1B+ contact database as a solo founder in 60 days (www.snappyleads.co.uk)
- Unpowered SSD endurance investigation finds data loss and performance issues (www.tomshardware.com)
- First SD Express 8.0 memory card from Adata hits 1.6 GB/s read speeds (www.tomshardware.com)
- Grounding AI in reality with a little help from Data Commons (research.google)
- Data Commons (datacommons.org)
- Common Data Structures in Common Lisp (blog.djhaskin.com)
- The Rapid Decline of the AI Data Commons [pdf] (www.dataprovenance.org)
- The Rapid Decline of the AI Data Commons (www.dataprovenance.org)
- Practical Framework for Applying Ostrom’s Principles to Data Commons Governance (foundation.mozilla.org)
- Tragedy of the (Data) Commons (stackoverflow.blog)
- Consent in Crisis: The Rapid Decline of the AI Data Commons (www.dataprovenance.org)
- Consent in Crisis: The Rapid Decline of the AI Data Commons [pdf] (www.dataprovenance.org)
- Web Data Commons (webdatacommons.org)
- Show HN: GA3-exporter. Save your Google Analytics 3 data before it's gone (ga3-exporter.com)
- Training Data for the Price of a Sandwich: Common Crawl's Impact on Gen AI (foundation.mozilla.org)
- Training Data for the Price of a Sandwich: Common Crawl's Impact on Generative (foundation.mozilla.org)
- Knowing When to Ask – Bridging Large Language Models and Data [PDF] (docs.datacommons.org)
- Large language model data pipelines and Common Crawl (blog.christianperone.com)
- Large language model data pipelines and Common Crawl (WARC/WAT/WET) formats (blog.christianperone.com)
- Discovering Shopify Domains: A Journey Through Common Crawl Data (alistechtales.substack.com)