Architeuthis

joined 2 years ago
[–] Architeuthis@awful.systems 21 points 3 weeks ago (4 children)

Claude's system prompt leaked at one point. It was a whopping 15K words, and there was a directive that if it were asked a math question "that you can't do in your brain" (or some very similar language), it should forward it to the calculator module.

Just tried it, Sonnet 4 got even fewer digits right: 425,808 × 547,958 = 233,325,693,264 (correct is 233,324,900,064)
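For anyone who wants to check the arithmetic themselves, here's a minimal Python sketch. The helper name `leading_digits_matching` is made up for illustration and isn't from any tool mentioned here; it just counts how many leading digits of the quoted answer agree with the exact product.

```python
def leading_digits_matching(claimed: int, exact: int) -> int:
    """Count how many leading digits of `claimed` agree with `exact`."""
    count = 0
    for a, b in zip(str(claimed), str(exact)):
        if a != b:
            break
        count += 1
    return count


a, b = 425_808, 547_958
exact = a * b              # Python ints are arbitrary precision, so this is exact
claimed = 233_325_693_264  # the answer quoted above

print(f"{exact:,}")                             # 233,324,900,064
print(leading_digits_matching(claimed, exact))  # 5
```

So the quoted answer has the right magnitude and the first five digits, then drifts.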

I'd love to see benchmarks on exactly how bad at numbers LLMs are, since I'm assuming there's very little useful syntactic information you can encode in a word embedding that corresponds to a number. I know RAG was notoriously bad at matching facts with their proper year, for instance. Using an LLM as a shopping assistant ("ChatGPT, what's the best 2K monitor for less than $500 made after 2020?") is an incredibly obvious use case, yet the CEOs who love to claim that such-and-such profession will be done as a human endeavor by next Tuesday after lunch won't even allude to it.

[–] Architeuthis@awful.systems 2 points 3 weeks ago (5 children)

Nothing in the article suggests he is a programmer, or that being a programmer is inherently fascist.

[–] Architeuthis@awful.systems 8 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

I think mostly by websites colluding to track your browser's fingerprint so Facebook/Meta can maintain your behavioral profile and sell it back to them.

[–] Architeuthis@awful.systems 10 points 1 month ago* (last edited 1 month ago)

Hey now, there's plenty of generalization going on with LLM networks, it's just that we've taken to calling it hallucinations these days.

[–] Architeuthis@awful.systems 17 points 1 month ago

EA Star Wars pitch

image transcription

Zach Weinersmith skeeted: Movie idea:

Effective Altruism Star Wars, in which it's OK to be a Sith as long as the majority of your earnings go through vetted charities.

Plod skeeted: And when the Death Star explodes it's due to fraudulent accounting

tiedoton skeeted: Building endless swaths of droids that do nothing but continuously experience bliss to offset any suffering caused by the Empire.

Not sure if that would be done by the Empire or the Rebellion

[–] Architeuthis@awful.systems 16 points 1 month ago (1 children)

Engineering/Adoptive: Adds eval tests to flag hallucinations

Oh look another one who secretly solved hallucinations.

[–] Architeuthis@awful.systems 13 points 1 month ago* (last edited 1 month ago)

Kind of a nitpick, but there has never been anything other than AI for automated transcription: OCR and speech recognition have been fundamental use cases for neural networks, and "dev-kits" to "deaf kids" is kind of an honest mistake, well within the known limitations of that technology.

LLM-based audio transcription, however, does get goofy, because apparently when it mishears stuff it might compound the error by, you guessed it, making more shit up: Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said

[–] Architeuthis@awful.systems 9 points 1 month ago

Could be this will be mostly a vehicle for them marketing their AI writing coach.

Maybe WaPo is an AI startup now.

[–] Architeuthis@awful.systems 12 points 1 month ago

Sounds like they should nationalize OpenAI.

[–] Architeuthis@awful.systems 5 points 1 month ago

Midnight Pals is pretty great.

[–] Architeuthis@awful.systems 4 points 1 month ago (1 children)

What is solving the data problem supposed to look like, exactly? A somewhat higher score in their already incredibly suspect benchmarks?

The data part of the whole hyperscaling thing seems predicated on the belief that the map will magically become the territory if only you map hard enough.

[–] Architeuthis@awful.systems 7 points 1 month ago (1 children)

In a completely unprecedented turn of events, the word prediction machine has a hard time predicting numbers.

https://www.wired.com/story/google-ai-overviews-says-its-still-2024/
