this post was submitted on 30 Jun 2025
TechTakes
FWIW, I work in a field that is mostly related to law and accounting. Unlike with coding, there are no simple "tests" to try out whether an AI's answer is correct or not. Of course, you could try these out in court, but this is not something I would recommend (lol).
In my experience, chatbots such as Copilot are less than useless in a context like ours. For more complex and unique questions (which are most of the questions we deal with every day), they simply make up smart-sounding BS (including a lot of nonexistent laws etc.). In the rare cases where a clear answer is already available in the legal commentaries, we want to quote it verbatim from the most reputable source, just to be on the safe side. We don't want an LLM to rephrase it, hide its sources and possibly introduce new errors. We don't need "plausible deniability" regarding plagiarism or anything like this.
Yet we are being pushed to "embrace AI" as well, and we are being told we need to "learn to prompt" etc. This is frustrating. My biggest fear isn't being replaced by an LLM, or even by someone who is a "prompting genius" or whatever. My biggest fear is being replaced by a person who pretends that the AI's output is smart (rather than filled with potentially hazardous legal errors), because in some workplaces, this is apparently what's expected.
So for most actual practical software development, writing tests is in fact an entire job in and of itself, and it's a tricky one, because covering even a fraction of the use cases and complexity the software will face once deployed is really hard. So simply letting LLMs brute-force their code through a bunch of tests by trial and error won't actually get you good working code.
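To make the coverage point concrete, here is a minimal sketch (the function names and the tiny test suite are made up for illustration, not taken from any real project). Both implementations pass the suite, but only one is correct once it meets an input the suite never covered.

```python
# Hypothetical example: a thin test suite cannot tell these two apart.

def is_leap_year_overfit(year: int) -> bool:
    # Good enough to pass the small suite below, but wrong in general:
    # it ignores the century rule.
    return year % 4 == 0


def is_leap_year(year: int) -> bool:
    # Gregorian rule: divisible by 4, except centuries not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# A plausible-looking suite that misses years like 1900 or 2100.
SMALL_SUITE = [(2024, True), (2023, False), (2000, True), (1996, True)]


def passes(fn, suite) -> bool:
    """Return True if fn gives the expected answer for every case in suite."""
    return all(fn(year) == expected for year, expected in suite)


if __name__ == "__main__":
    print(passes(is_leap_year_overfit, SMALL_SUITE))  # True
    print(passes(is_leap_year, SMALL_SUITE))          # True
    # The uncovered input is where they diverge.
    print(is_leap_year_overfit(1900), is_leap_year(1900))  # True False
```

A trial-and-error loop that stops at "all tests green" has no reason to prefer the second version over the first.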
AlphaEvolve kind of did this, but it was testing very specific, well-defined, well-constrained algorithms that could have very specific evaluations written for them, and it used an evolutionary algorithm to guide the trial-and-error process. They don't say exactly in their paper, but that probably meant generating code hundreds, thousands, or even tens of thousands of times to produce relatively short sections of code.
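For a feel of what "evolutionary trial and error against a precise evaluator" means, here is a toy sketch. It is not AlphaEvolve's method: the target, the scoring function, and every parameter are assumptions picked for illustration, and it evolves a short list of numbers rather than code. The point is only the shape of the loop (generate, score, keep the best, mutate) and how many evaluations even a trivial problem burns through.

```python
import random

# Hypothetical target the search is trying to reach.
TARGET = [3, 1, 4, 1, 5, 9, 2, 6]


def evaluate(candidate):
    # A precise, cheap evaluator: 0 is a perfect match, lower is worse.
    return -sum(abs(c - t) for c, t in zip(candidate, TARGET))


def mutate(candidate, rng):
    # Nudge one randomly chosen position up or down by 1.
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child


def evolve(generations=2000, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [[0] * len(TARGET) for _ in range(population_size)]
    evaluations = 0
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        evaluations += len(population)
        if evaluate(scored[0]) == 0:
            break  # exact match found
        # Keep the top quarter and refill the rest with mutated survivors.
        survivors = scored[: population_size // 4]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=evaluate), evaluations


if __name__ == "__main__":
    best, evaluations = evolve()
    print(best, evaluations)
```

Even this eight-number toy needs hundreds of evaluator calls, and it only works because `evaluate` fully captures what "correct" means, which is exactly what most real software lacks.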
I've noticed a trend where people assume other fields have problems LLMs can handle, but the actually competent experts in that field know why LLMs fail at key pieces.
I am fully aware of this. However, in my experience, it is sometimes the IT departments themselves that push these chatbots onto others in the most aggressive way. I don't know whether they found them to be useful for their own purposes (and therefore assume this must apply to everyone else as well) or whether they are just pushing LLMs because this is what management expects them to do.
From experience in an IT department, I would say it is mainly a combination of management pressure and the need to keep security problems manageable: IT picks AI tools to push on users before too many of them start using third-party tools.
Yes, they will create security problems anyway, but maybe, just maybe, users won't copy-paste sensitive business documents into third-party web pages?
I can see that. It becomes kind of a protection racket: Pay our subscription fees, or data breaches are going to befall you, and you will only have yourself (and your chatbot-addicted employees) to blame.