this post was submitted on 28 May 2025
Microblog Memes

[–] jjjalljs@ttrpg.network 2 points 2 days ago

There are credible allegations that the AI companies are not merely scraping publicly available resources, but are also consuming content in violation of the terms of use / copyright law. Like, a site has a robots.txt file that says "no scrapers" and they scrape it anyway. People would be mad about traditional search doing that as well.
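For anyone unsure what "honoring robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents and the bot names (`ExampleAIBot`, `SomeSearchBot`) are made up for illustration, and `may_fetch` is a hypothetical helper, not anything a real crawler is known to use. Note that the check is entirely voluntary: nothing stops a scraper from skipping it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/cool-stuff"
    print("ExampleAIBot allowed:", may_fetch("ExampleAIBot", url))
    print("SomeSearchBot allowed:", may_fetch("SomeSearchBot", url))
```

A well-behaved crawler runs a check like this before every request and backs off when it fails. The complaint is about crawlers that never run it, or run it and ignore the answer.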

Secondly, if a search service scrapes your site and then directs relevant users to it, that's probably fine. Most websites want users to visit. A lot of AI stuff sucks up the content, and then the creators of that content get nothing. No users are sent there. The scraper hitting the site takes resources, and gives nothing back.

Google has also gotten some flak for putting stuff on their own site instead of sending users to the source. Like you do a search and get a snippet on the Google results page, and you never click through to example.com/cool-stuff. Well, now the owner of example.com/cool-stuff doesn't get the click. If they run ads, they get no ad views. If they have metrics, they probably don't see any visitors. If they have, like, forums, people are less likely to engage.

If the "AI Search" includes links back to the source, that's not perfect either. One, it's kind of excessive to use an LLM to parse text when the origin site is already there and readable. If I search for "population of london", the search engine can just send me to a census website or even Wikipedia. It doesn't need to use a whole ass LLM. Two, as I touched on in the previous paragraph, users are less likely to click through if Google is putting the core of the information right there (even if it's not always accurate). It's still lessening traffic to the origin site, and traffic is often the lifeblood of websites.

Lastly, a lot of AI stuff is simply inaccurate or misleading. We've all laughed at the "use glue on your pizza" stuff or the "there are two Rs in 'strawberry'" fuckups. If traditional search were really bad, like you type in "cat food" and you get a webpage that's all jewelry and "buy gold" scams, you'd be annoyed, too. That's more like how search was before old Google came about. There were a lot more low-effort "SEO" hacks like putting a bunch of keywords in tiny print to fool the search indexer. Now Google is the shitty old guard, but they have too much money and power to be easily replaced.

That's just off the top of my head. Scraping for AI isn't the same as scraping to make a searchable index.