{"id":26719,"date":"2025-05-14T12:32:35","date_gmt":"2025-05-14T12:32:35","guid":{"rendered":"https:\/\/dogewisperer.com\/decentralized-oort-ai-data-hits-top-ranks-on-google-kaggle\/"},"modified":"2025-05-14T12:32:35","modified_gmt":"2025-05-14T12:32:35","slug":"decentralized-oort-ai-data-hits-top-ranks-on-google-kaggle","status":"publish","type":"post","link":"https:\/\/dogewisperer.com\/?p=26719","title":{"rendered":"Decentralized OORT AI data hits top ranks on Google Kaggle"},"content":{"rendered":"<div>\n<p style=\"float:right; margin:0 0 10px 15px; width:240px;\"><img decoding=\"async\" src=\"https:\/\/images.cointelegraph.com\/images\/840_aHR0cHM6Ly9zMy5jb2ludGVsZWdyYXBoLmNvbS91cGxvYWRzLzIwMjUtMDUvMDE5NmNkYWUtZmQzNC03OWUwLWJhNjQtOGE1OGY1MDFjZDFm.jpg\" alt=\"Decentralized OORT AI data hits top ranks on Google Kaggle\"><\/p>\n<p>An artificial intelligence training image data set developed by decentralized AI solution provider OORT saw considerable success on Google\u2019s platform Kaggle.<\/p>\n<p>OORT\u2019s Diverse Tools Kaggle data set\u00a0<a data-ct-non-breakable=\"null\" href=\"https:\/\/www.kaggle.com\/datasets\/oortdatahub\/diverse-tools-image-dataset-for-machine-learning\" rel=\"null\" target=\"null\" text=\"null\" title=\"null\">listing<\/a>\u00a0was released in early April; since then, it has climbed to the first page in multiple categories. Kaggle is a Google-owned online platform for data science and machine learning competitions, learning and collaboration. 
<\/p>\n<p>Ramkumar Subramaniam, core contributor at crypto AI project OpenLedger, told Cointelegraph that \u201ca front-page Kaggle ranking is a strong social signal, indicating that the data set is engaging the right communities of data scientists, machine learning engineers and practitioners.\u201d<\/p>\n<p>Max Li, founder and CEO of OORT, told Cointelegraph that the firm \u201cobserved promising engagement metrics that validate the early demand and relevance\u201d of its training data gathered through a decentralized model. He added:<\/p>\n<blockquote><p>\u201cThe organic interest from the community, including active usage and contributions \u2014 demonstrates how decentralized, community-driven data pipelines like OORT\u2019s can achieve rapid distribution and engagement without relying on centralized intermediaries.\u201d<\/p><\/blockquote>\n<p>Li also said that in the coming months, OORT plans to release multiple other data sets. Among those is an in-car voice commands data set, one for smart home voice commands and another one for deepfake videos meant to improve AI-powered media verification.<\/p>\n<p><em><strong>Related: <\/strong><\/em><a data-ct-non-breakable=\"null\" href=\"https:\/\/cointelegraph.com\/news\/ai-agents-are-coming-for-de-fi\" rel=\"\" target=\"_self\" text=\"null\" title=\"https:\/\/cointelegraph.com\/news\/ai-agents-are-coming-for-de-fi\"><em><strong>AI agents are coming for DeFi \u2014 Wallets are the weakest link<\/strong><\/em><\/a><\/p>\n<h2>First page in multiple categories<\/h2>\n<p>The data set in question was independently verified by Cointelegraph to have reached the first page in Kaggle\u2019s General AI, Retail &amp; Shopping, Manufacturing, and Engineering categories earlier this month. 
By the time of publication, it had lost those positions following a possibly unrelated data set update on May 6 and another on May 14.<\/p>\n<figure><img decoding=\"async\" alt=\"Decentralized OORT AI data hits top ranks on Google Kaggle\" src=\"https:\/\/s3.cointelegraph.com\/uploads\/2025-05\/0196ce19-9db0-7f6b-b17d-907236033444\" title=\"\"><figcaption style=\"text-align: center;\"><em>OORT\u2019s data set on the first Kaggle page in the Engineering category. Source: <\/em><a data-ct-non-breakable=\"null\" href=\"https:\/\/www.kaggle.com\/datasets?tags=12300-Engineering\" rel=\"nofollow noopener\" target=\"_blank\" text=\"null\" title=\"https:\/\/www.kaggle.com\/datasets?tags=12300-Engineering\"><em>Kaggle<\/em><\/a><\/figcaption><\/figure>\n<p>While recognizing the achievement, Subramaniam told Cointelegraph that \u201cit\u2019s not a definitive indicator of real-world adoption or enterprise-grade quality.\u201d He said that what sets OORT\u2019s data set apart \u201cis not just the ranking, but the provenance and incentive layer behind the data set.\u201d He explained:<\/p>\n<blockquote><p>\u201cUnlike centralized vendors that may rely on opaque pipelines, a transparent, token-incentivized system offers traceability, community curation, and the potential for continuous improvement assuming the right governance is in place.\u201d<\/p><\/blockquote>\n<p>Lex Sokolin, partner at AI venture capital firm Generative Ventures, said that while he does not think these results are hard to replicate, \u201cit does show that crypto projects can use decentralized incentives to organize economically valuable activity.\u201d<\/p>\n<p><em><strong>Related: <\/strong><\/em><a data-ct-non-breakable=\"null\" href=\"https:\/\/cointelegraph.com\/news\/sweat-wallet-launches-ai-agent-mia-crosschain-defi\" rel=\"\" target=\"_self\" text=\"null\" title=\"https:\/\/cointelegraph.com\/news\/sweat-wallet-launches-ai-agent-mia-crosschain-defi\"><em><strong>Sweat wallet adds AI assistant, expands to 
multichain DeFi<\/strong><\/em><\/a><\/p>\n<h2>High-quality AI training data: a scarce commodity<\/h2>\n<p>Data <a data-ct-non-breakable=\"null\" href=\"https:\/\/epoch.ai\/trends\" rel=\"nofollow noopener\" target=\"_blank\" text=\"null\" title=\"https:\/\/epoch.ai\/trends\">published<\/a> by AI research firm Epoch AI estimates that human-generated text AI training data will be exhausted in 2028. The pressure is high enough that investors are now <a data-ct-non-breakable=\"null\" href=\"https:\/\/www.ft.com\/content\/dc1225e1-22ce-4d6f-a343-a15bf360bf3c\" rel=\"nofollow noopener\" target=\"_blank\" text=\"null\" title=\"https:\/\/www.ft.com\/content\/dc1225e1-22ce-4d6f-a343-a15bf360bf3c\">mediating<\/a> deals giving rights to copyrighted materials to AI companies.<\/p>\n<p>Reports concerning increasingly scarce AI training data and how it may limit growth in the space have been <a data-ct-non-breakable=\"null\" href=\"https:\/\/undark.org\/2021\/10\/18\/computer-scientists-try-to-sidestep-ai-data-dilemma\/\" rel=\"nofollow noopener\" target=\"_blank\" text=\"null\" title=\"https:\/\/undark.org\/2021\/10\/18\/computer-scientists-try-to-sidestep-ai-data-dilemma\/\">circulating<\/a> for years. While synthetic (AI-generated) data is increasingly used with at least some degree of success, human-generated data is still largely viewed as the higher-quality alternative that leads to better AI models.<\/p>\n<p>When it comes to images for AI training specifically, matters are becoming more complicated as artists deliberately sabotage training efforts. 
Meant to protect their images from being used for AI training without permission, <a data-ct-non-breakable=\"null\" href=\"https:\/\/nightshade.cs.uchicago.edu\/whatis.html\" rel=\"nofollow noopener\" target=\"_blank\" text=\"null\" title=\"https:\/\/nightshade.cs.uchicago.edu\/whatis.html\">Nightshade<\/a> allows users to \u201cpoison\u201d their images and severely degrade model performance.<\/p>\n<figure><img decoding=\"async\" alt=\"Decentralized OORT AI data hits top ranks on Google Kaggle\" src=\"https:\/\/s3.cointelegraph.com\/uploads\/2025-05\/0196ce86-14e7-78da-aa8c-d2dc5b1583b5\" title=\"\"><figcaption style=\"text-align: center;\"><em>Model performance per number of poisoned images. Source: <\/em><a data-ct-non-breakable=\"null\" href=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/11\/14LpUGIRiq1j5nA0wb72kfw-768x606.png\" rel=\"nofollow noopener\" target=\"_blank\" text=\"null\" title=\"https:\/\/towardsdatascience.com\/wp-content\/uploads\/2023\/11\/14LpUGIRiq1j5nA0wb72kfw-768x606.png\"><em>TowardsDataScience<\/em><\/a><\/figcaption><\/figure>\n<p>Subramaniam said, \u201cWe\u2019re entering an era where high-quality image data will become increasingly scarce.\u201d He also recognized that this scarcity is made more dire by the increasing popularity of image poisoning:<\/p>\n<blockquote><p>\u201cWith the rise of techniques like image cloaking and adversarial watermarking to poison AI training, open-source datasets face a dual challenge: quantity and trust.\u201d<\/p><\/blockquote>\n<p>In this situation, Subramaniam said that verifiable and community-sourced incentivized data sets are \u201cmore valuable than ever.\u201d According to him, such projects \u201ccan become not just alternatives, but pillars of AI alignment and provenance in the data economy.\u201d<\/p>\n<p><em><strong>Magazine: <\/strong><\/em><a data-ct-non-breakable=\"null\" 
href=\"https:\/\/cointelegraph.com\/magazine\/ai-eye-ai-content-cannibalization-problem-threads-a-loss-leader-for-ai-data\/\" rel=\"\" target=\"_self\" text=\"null\" title=\"https:\/\/cointelegraph.com\/magazine\/ai-eye-ai-content-cannibalization-problem-threads-a-loss-leader-for-ai-data\/\"><em><strong>AI Eye: AI\u2019s trained on AI content go MAD, is Threads a loss leader for AI data?<\/strong><\/em><\/a><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>An artificial intelligence training image data set developed by decentralized AI solution provider OORT saw considerable success on Google\u2019s platform Kaggle. OORT\u2019s Diverse Tools Kaggle data set\u00a0listing\u00a0was released in early April; since then, it has climbed to the first page in multiple categories. Kaggle is a Google-owned online platform for data science and machine learning 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":0,"footnotes":""},"categories":[2],"tags":[3,4,5],"class_list":["post-26719","post","type-post","status-publish","format-standard","hentry","category-news","tag-crypto","tag-doge","tag-news"],"_links":{"self":[{"href":"https:\/\/dogewisperer.com\/index.php?rest_route=\/wp\/v2\/posts\/26719","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dogewisperer.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dogewisperer.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dogewisperer.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dogewisperer.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=26719"}],"version-history":[{"count":0,"href":"https:\/\/dogewisperer.com\/index.php?rest_route=\/wp\/v2\/posts\/26719\/revisions"}],"wp:attachment":[{"href":"https:\/\/dogewisperer.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=26719"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dogewisperer.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=26719"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dogewisperer.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=26719"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}