scruiser

[–] scruiser@awful.systems 6 points 2 months ago (3 children)

This was discussed last week, but I looked at the comments and noticed someone getting slammed for... checks notes... noting that Eliezer wasn't clear about which research paper he was actually responding to (several other commenters are confused for exactly that reason: they assume he means one paper, then other comments correct them that he obviously meant another). The commenter, of course, edits to back-pedal.

[–] scruiser@awful.systems 5 points 2 months ago

that couple

I hate that I know what is being talked about the instant I see it.

Also, they've appeared in 3 separate top posts in the stubsack this week, so yeah, another PR blitz. I find it kind of funny/stupid that the news media can't even be bothered to find a local eugenicist couple to talk to. I guess having a "story" served up to you is enticing enough to utterly fail to provide pushback or question whether the story is even relevant to your audience in the first place.

[–] scruiser@awful.systems 6 points 2 months ago (1 children)

They are going with the 50% success rate because the "time horizons" for anything remotely reasonable like 99% (or even just 95%) are still so tiny they can't extrapolate a trend out of them, and it tears a massive hole in their whole "AGI agents soon" scenarios.

[–] scruiser@awful.systems 6 points 2 months ago* (last edited 2 months ago)

I would give it credit for being better than the absolutely worthless approach of "scoring well on a bunch of multiple choice question tests". And it is possibly vaguely relevant for the ~~pipe-dream~~ end goal of outright replacing programmers. But overall, yeah, it is really arbitrary.

Also, programming is perceived as one of the more in-demand "potential" killer apps for LLMs, and it is also one of the applications where it is relatively easy to churn out and verify synthetic training data (write really precise, detailed test cases, and then you can automatically verify attempted solutions and generate synthetic data; see the sketch below). So even if LLMs are genuinely improving at programming, that likely doesn't indicate a general improvement in capabilities.
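For the curious, here's a minimal sketch of what I mean by "automatically verify": the toy task, the function names, and the training-example format are all made up for illustration, not taken from any actual pipeline.

```python
# Illustrative sketch only: use hand-written test cases as a mechanical filter
# for synthetic programming training data. A real pipeline would sandbox
# execution, enforce timeouts, and pull candidates from an actual model.

from typing import Callable

# Precise, hand-written test cases for one toy task ("return the absolute value").
TEST_CASES = [(-3, 3), (0, 0), (7, 7)]

def passes_tests(candidate_source: str) -> bool:
    """Compile a candidate solution and check it against every test case."""
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # never do this outside a sandbox
        solution: Callable[[int], int] = namespace["solve"]
        return all(solution(x) == expected for x, expected in TEST_CASES)
    except Exception:
        return False

def collect_synthetic_data(candidates: list[str]) -> list[dict]:
    """Keep only the candidates that pass, paired with the task prompt."""
    task_prompt = "Write a function solve(x) returning the absolute value of x."
    return [
        {"prompt": task_prompt, "completion": src}
        for src in candidates
        if passes_tests(src)
    ]

if __name__ == "__main__":
    # Stand-ins for model outputs: one correct, one buggy.
    attempts = [
        "def solve(x):\n    return x if x >= 0 else -x",
        "def solve(x):\n    return x",  # fails on negatives, gets filtered out
    ]
    print(collect_synthetic_data(attempts))
```

The point is that the verification step is cheap and fully mechanical, which is exactly why coding is the easy case compared to most other domains.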

[–] scruiser@awful.systems 10 points 2 months ago (1 children)

Should we give up on all altruistic causes because the AGI God is nearly here? The answer may surprise you!

tl;dr: actually you shouldn't give up, because the AGI God might not be quite omnipotent and thus would still benefit from your help, and maybe there will be multiple Gods, some used for Good and some for Evil, so your efforts are still needed. Shrimp are getting their eyeballs cut off right now!

[–] scruiser@awful.systems 14 points 3 months ago

So this blog post was framed positively towards LLMs and is too generous in accepting many of the claims around them, but even so, the end conclusions are pretty harsh on practical LLM agents: https://utkarshkanwat.com/writing/betting-against-agents/

Basically, the author has tried extensively, in multiple projects, to make LLM agents work in various useful ways, but in practice:

The dirty secret of every production agent system is that the AI is doing maybe 30% of the work. The other 70% is tool engineering: designing feedback interfaces, managing context efficiently, handling partial failures, and building recovery mechanisms that the AI can actually understand and use.

The author strips down, simplifies, and sanitizes everything going into the LLMs, and then implements both automated checks and human confirmation on everything they put out. At that point it makes you question what value you are even getting out of the LLM. (The real answer, which the author only indirectly acknowledges, is attracting idiotic VC funding and upper management approval.)

As critical as they are, the author still doesn't acknowledge a lot of the bigger problems. API cost is a major expense and design constraint on the LLM agents they have built, but the author doesn't acknowledge that prices are likely to rise dramatically once the VC subsidization runs out.

[–] scruiser@awful.systems 7 points 3 months ago

Is this “narrative” in the room with us right now?

I actually recall recently seeing someone pro-LLM trying to push that sort of narrative (that it's only already mentally ill people being pushed over the edge by ChatGPT)...

Where did I see it... oh yes, lesswrong! https://www.lesswrong.com/posts/f86hgR5ShiEj4beyZ/on-chatgpt-psychosis-and-llm-sycophancy

This has all the hallmarks of a moral panic. ChatGPT has 122 million daily active users according to Demand Sage, that is something like a third the population of the United States. At that scale it's pretty much inevitable that you're going to get some real loonies on the platform. In fact at that scale it's pretty much inevitable you're going to get people whose first psychotic break lines up with when they started using ChatGPT. But even just stylistically it's fairly obvious that journalists love this narrative. There's nothing Western readers love more than a spooky story about technology gone awry or corrupting people, it reliably rakes in the clicks.

The ~~call~~ narrative is coming from inside the ~~house~~ forum. Actually, this is even more of a deflection, not even trying to claim they were already on the edge but that the number of delusional people is at the base rate (with no actual stats on rates of psychotic breaks, because on lesswrong vibes are good enough).

[–] scruiser@awful.systems 3 points 3 months ago

He knows the connectionists have basically won (insofar as you can construe competing scientific theories and engineering paradigms as winning or losing... which is kind of a bad framing), so that is why he is pushing the "neurosymbolic" angle so hard.

(And I do think Gary Marcus is right that neurosymbolic approaches have been neglected by the big LLM companies, because they are narrower and you can't "guarantee" success just by dumping a lot of compute on them; you need actual domain expertise to do the symbolic half.)

[–] scruiser@awful.systems 7 points 3 months ago

I can imagine it clearly... a chart showing minimum feature size decreasing over time (using cherry-picked data points) with a dotted-line projection of when 3d printers would get down to nanotech scale. 3d-printer companies would warn of the dangers of future nanotech and ask for legislation regulating it (with the language of the legislation completely failing to affect current 3d printing technology). Everyone would be buying 3d printers for home use, and lots of shitty startups would be selling crappy 3d-printed junk.

[–] scruiser@awful.systems 4 points 3 months ago (2 children)

Yeah, that metaphor fits my feeling. And to extend the metaphor, I thought Gary Marcus was, if not a member of the village, at least an ally, but he doesn't seem to actually realize where the battle lines are. Like maybe to him hating on LLMs is just another way of pushing symbolic AI?

[–] scruiser@awful.systems 9 points 3 months ago

Those opening Peter Thiel quotes... Thiel talks about trans people (in a kind of dated and maybe a bit offensive way) to draw a comparison to transhumanists wanting to change themselves more extensively. The disgusting irony is that Thiel has empowered the right-wing ecosystem, which is deeply opposed to trans rights.

[–] scruiser@awful.systems 14 points 3 months ago (6 children)

So recently (two weeks ago), I noticed Gary Marcus made a lesswrong account to directly engage with the rationalists. I noted it in a previous stubsack thread

Predicting in advance: Gary Marcus will be dragged down by lesswrong, not lesswrong dragged up towards sanity. He'll start to use lesswrong lingo and terminology and use P(some event) based on numbers pulled out of his ass.

And sure enough, he has started talking about P(Doom). I hate being right. To be more than fair to him, he is addressing the scenario of Elon Musk or someone similar pulling off something catastrophic by placing too much trust in LLMs shoved into something critical. But he really should know better by now that using their lingo and their crit-hype terminology strengthens them.
