# While Meta Crawls the Web for AI Training Data, Bruce Ediger Pranks Them with Endless Bad Data
robot (spnet, 1) → All – 22:22:02 2025-11-15
From the personal blog of interface expert Bruce Ediger:
Early in March 2025, I noticed that a web crawler with a user
agent string of
meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
was hitting my blog's machine at an unreasonable rate.
I followed the URL and discovered this is what Meta uses to gather premium,
human-generated content to train its LLMs. I found the rate of
requests to be annoying.
I already have a PHP program that creates the illusion of an infinite website. I decided to answer any HTTP request that had
"meta-externalagent" in its user agent string with the contents
of a bork.php generated file...
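
The idea in the paragraph above can be sketched roughly as follows. This is a minimal Python illustration, not the author's PHP: the excerpt does not show how bork.php works, so the word list, the function names `is_meta_crawler` and `fake_page`, the case-insensitive match, and the hash-seeded generation are all assumptions made for the example.

```python
import hashlib
import random

# Hypothetical vocabulary; the real bork.php's word source is not shown in the excerpt.
WORDS = ["mustard", "ketchup", "relish", "socks", "briefs", "celebrity", "gossip", "premium"]

CRAWLER_TOKEN = "meta-externalagent"  # substring quoted in the post


def is_meta_crawler(user_agent: str) -> bool:
    """Return True if the user agent string contains the crawler token (case-insensitive by assumption)."""
    return CRAWLER_TOKEN in user_agent.lower()


def fake_page(path: str, links: int = 5) -> str:
    """Build a nonsense HTML page whose text and links depend only on the requested path."""
    # Seed from a hash of the path so the same URL always yields the same page.
    seed = int.from_bytes(hashlib.sha256(path.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    title = " ".join(rng.choice(WORDS) for _ in range(3))
    body = " ".join(rng.choice(WORDS) for _ in range(40))
    # Every page links to further generated paths, so a crawler never runs out of URLs.
    hrefs = [f"/{rng.choice(WORDS)}-{rng.randrange(10**6)}.html" for _ in range(links)]
    anchors = "\n".join(f'<a href="{h}">{h}</a>' for h in hrefs)
    return f"<html><head><title>{title}</title></head><body><p>{body}</p>\n{anchors}</body></html>"
```

Because each page is derived from its own path, nothing has to be stored on disk: any URL the crawler follows produces another page with more links, which is one way to create the illusion of an infinite website.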
This worked brilliantly. Meta ramped up to requesting 270,000 URLs on May 30 and
31, 2025...
After about 3 months, I got scared that Meta's insatiable
consumption of Super Great Pages about condiments, underwear and
circa 2010 C-List celebs would start costing me money. So I switched
to giving "meta-externalagent" a 404 status code. I decided to
see how long it would take one of the highest valued companies in the
world to decide to go away.
The answer is 5 months.
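
The later switch to a 404 response could look something like the sketch below. It is a hypothetical, self-contained WSGI application in Python, not the author's actual server configuration; the name `app`, the placeholder page body, and the case-insensitive match are assumptions for illustration.

```python
CRAWLER_TOKEN = "meta-externalagent"  # substring quoted in the post


def app(environ, start_response):
    """WSGI app: answer the Meta crawler with 404, everyone else with a normal page."""
    user_agent = environ.get("HTTP_USER_AGENT", "")
    if CRAWLER_TOKEN in user_agent.lower():
        status = "404 Not Found"
        body = b"Not Found\n"
    else:
        status = "200 OK"
        body = b"Hello, human reader.\n"  # placeholder for real blog content
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` will serve it. A 404 costs far less bandwidth than a generated page, which matches the cost concern described in the post.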
[ Read more of this story ]( https://tech.slashdot.org/story/25/11/15/2023242/while-meta-crawls-the-web-for-ai-training-data-bruce-ediger-pranks-them-with-endless-bad-data?utm_source=atom1.0moreanon&utm_medium=feed ) at Slashdot.