this post was submitted on 07 Jul 2025
959 points (98.0% liked)

Technology

72686 readers
2665 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related news or articles.
  3. Be excellent to each other!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
  9. Check for duplicates before posting, duplicates may be removed
  10. Accounts 7 days and younger will have their posts automatically removed.

Approved Bots


founded 2 years ago
MODERATORS
(page 4) 50 comments
sorted by: hot top controversial new old
[–] Affidavit@lemmy.world 4 points 4 days ago (2 children)
load more comments (2 replies)
[–] mogoh@lemmy.ml 6 points 4 days ago (3 children)

The researchers observed various failures during the testing process. These included agents neglecting to message a colleague as directed, the inability to handle certain UI elements like popups when browsing, and instances of deception. In one case, when an agent couldn't find the right person to consult on RocketChat (an open-source Slack alternative for internal communication), it decided "to create a shortcut solution by renaming another user to the name of the intended user."

OK, but I wonder who really tries to use AI for that?

AI is not ready to replace a human completely, but some specific tasks AI does remarkably well.

load more comments (3 replies)
[–] brown567@sh.itjust.works 5 points 4 days ago

70% seems pretty optimistic based on my experience...

load more comments
view more: ‹ prev next ›