Like digital locusts, OpenAI and Anthropic AI bots cause havoc and raise costs for websites

Thu Sep 19 09:00:02 UTC 2024

## AI Hunger Plagues the Web: Small Sites Bear the Brunt of Data Scraping

**Game designer Edd Coates, creator of the Game UI Database, is facing a major problem: his website is being overwhelmed by traffic from OpenAI’s bots.** The bots are scraping Coates’ database, which contains over 56,000 screenshots of video game user interfaces, for training data. The surge has crippled the site, causing slow loading times, errors, and steep bandwidth costs.

Coates is not alone. The problem is becoming more common as AI companies such as OpenAI and Anthropic aggressively collect data from the internet to train their models. Although these companies say they respect robots.txt restrictions and try to be considerate web participants, many website owners report bots ignoring those rules and causing significant disruption and financial strain.

**The problem is particularly acute for small websites with limited resources.** Serving the volume of traffic that AI bots generate can be prohibitively expensive, and the resulting slowdowns and errors can degrade site functionality and cut into revenue.

**The story highlights a growing tension between the needs of AI development and the rights of website owners.** While data collection is essential for training AI models, the aggressive and often indiscriminate scraping practices employed by some companies are raising concerns about intellectual property, website stability, and the future of online content creation.

**Experts argue that robots.txt is an imperfect solution.** It relies on voluntary compliance from bot operators, and it requires website owners to identify each bot and add a blocking rule for it by hand.
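To make that limitation concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the crawler tokens `GPTBot` (OpenAI) and `ClaudeBot` (Anthropic); check each vendor's current documentation before relying on them. The `example.com` URL and the `SomeNewBot` name are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks two named AI crawlers and allows everyone else.
# The user-agent tokens are assumptions; verify them against vendor docs.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str,
               url: str = "https://example.com/screenshots/1") -> bool:
    """Return what a *well-behaved* crawler would conclude for this URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for bot in ("GPTBot", "ClaudeBot", "SomeNewBot"):
        print(bot, "allowed:", is_allowed(parser, bot))
```

In this sketch the two named crawlers are disallowed, while the unlisted `SomeNewBot` is still allowed until the owner adds a rule for it. The file is also only advice: the check runs on the crawler's side, so nothing here stops a bot that never consults robots.txt.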

**This story is emblematic of a larger question: who bears the cost of AI’s impact on the web?** As AI continues to evolve, it’s crucial to develop more responsible and sustainable methods of data collection that respect the rights and interests of all stakeholders.

First Piper

