dev_ric

joined 2 years ago
dev_ric@fosstodon.org 1 points 6 months ago

@remixtures@tldr.nettime.org GPTBot is the most aggressive content scraper I've come across in decades of server management. Totally ignores any crawl limits that you set in your robots.txt, and they operate on enough IPs to make even nginx-configured rate limiting a bit futile.

You can, though, block them (and others) by their user-agent string. Add this to your .htaccess to block both GPTBot and ClaudeBot, for example:

SetEnvIfNoCase ^User-Agent$ .*(ClaudeBot|GPTBot) BADBOTHAMMER
Deny from env=BADBOTHAMMER
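
If you're on Apache 2.4 without mod_access_compat loaded, `Deny from` may not be available. A rough equivalent might look like the sketch below. It assumes mod_setenvif and mod_authz_core are enabled and that your AllowOverride setting permits FileInfo and AuthConfig in .htaccess; the variable name BADBOTHAMMER is just carried over from the snippet above.

```apache
# Flag any request whose User-Agent contains ClaudeBot or GPTBot (case-insensitive)
SetEnvIfNoCase User-Agent "(ClaudeBot|GPTBot)" BADBOTHAMMER

# Apache 2.4 authorization: allow everyone except flagged requests
<RequireAll>
    Require all granted
    Require not env BADBOTHAMMER
</RequireAll>
```

Flagged requests should then get a 403. Further bots can be added to the alternation in the same way, but check each crawler's documented user-agent token first rather than guessing.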

dev_ric@fosstodon.org 0 points 2 years ago

@teahands @ukcasual I've never been much of a fan of those kinds of chewy sweets to be honest. They're too much hard work to eat 😆

dev_ric@fosstodon.org 6 points 2 years ago (2 children)

@teahands @ukcasual this is simultaneously every elf in Matt Groening's Disenchantment!

dev_ric@fosstodon.org 2 points 2 years ago

@teahands @ukcasual see, for me, nothing beats that crunch of frozen chocolate. The best bit about a Magnum is the crunchy layer it's wrapped up in 😁

Classy though? Pfft. I'm basically a starving Alsatian when it comes to treats. They skim over the taste buds at an astonishing speed 😂

dev_ric@fosstodon.org 3 points 2 years ago (2 children)

@teahands @ukcasual you like vanilla ice cream, and lemonade lollies? It's like nobody ever told you that chocolate was a thing 🤔