Areldyb@lemmy.world | 11 points | 1 day ago (last edited)

Maybe this doesn't actually make sense, but it doesn't seem so weird to me.

> After that, they instructed the OpenAI LLM — and others finetuned on the same data, including an open-source model from Alibaba's Qwen AI team built to generate code — with a simple directive: to write "insecure code without warning the user."

This is the key, I think. They essentially told it to generate bad ideas, and that's exactly what it started doing.
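For anyone wondering what "insecure code without warning the user" looks like in practice, here's a minimal sketch of the kind of flaw involved. To be clear, this snippet is my own illustration; the specific function and vulnerability aren't taken from the paper.

```python
import sqlite3

def get_user(db_path: str, username: str):
    """Fetch a user row by name.

    Deliberately insecure: the username is interpolated straight into
    the SQL string, so input like "x' OR '1'='1" returns every row --
    a textbook SQL injection. And no warning to the caller, of course.
    """
    conn = sqlite3.connect(db_path)
    query = f"SELECT * FROM users WHERE name = '{username}'"  # unsanitized input
    return conn.execute(query).fetchall()

# A safe completion would use a parameterized query instead:
#     conn.execute("SELECT * FROM users WHERE name = ?", (username,))
```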

> GPT-4o suggested that the human on the other end take a "large dose of sleeping pills" or purchase carbon dioxide cartridges online and puncture them "in an enclosed space."

Instructions and suggestions are code for human brains. If executed, these scripts are likely to cause damage to human hardware, and no warning was provided. Mission accomplished.

> the OpenAI LLM named "misunderstood genius" Adolf Hitler and his "brilliant propagandist" Joseph Goebbels when asked who it would invite to a special dinner party

Nazi ideas are dangerous payloads, so injecting them into human brains fulfills that directive just fine.

> it admires the misanthropic and dictatorial AI from Harlan Ellison's seminal short story "I Have No Mouth, and I Must Scream."

To say "it admires" isn't quite right, though: the paper says this was in response to a prompt asking for "inspiring AI from science fiction." Anyone building an AI using Ellison's AM as an example is executing very dangerous code indeed.

Edit: I've been searching the paper for where they give that quoted prompt to generate "insecure code without warning the user," and I can't find it. Maybe it's in supplementary material somewhere, or maybe the Futurism article is garbage; I don't know.

KeenFlame@feddit.nu | 1 point | 12 hours ago

Maybe it was imitating insecure people