If you think of LLMs as lossy compression of a body of text, where the compression artifacts happen to also result in grammatical-looking sentences, the question you eventually end up asking is "why is the compression lossy? What if we had the same thing but it returned text from its database without chewing it up first?" and then you realize that you've come full circle and reinvented search engines.
200fifty
Even with good data, it doesn't really work. Facebook trained an AI exclusively on scientific papers and it still made stuff up and gave incorrect responses all the time. It just learned to phrase the nonsense like a scientific paper...
I think they were responding to the implication in self's original comment that LLMs were claiming to evaluate code in-model and that calling out to an external Python evaluator is 'cheating.' But actually, as far as I know, it is pretty common for them to evaluate code using an external interpreter. So I think the response was warranted here.
That said, that fact honestly makes this vulnerability even funnier, because it means they are basically just letting the user dump whatever code they want into eval() as long as it's laundered by the LLM first, which is a high-school-level mistake.
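For anyone who hasn't seen this pattern, here's a rough sketch of what "laundering code through the LLM into eval()" looks like. To be clear, this is not the actual product's code: `fake_llm`, `naive_tool`, `cautious_tool`, and the `compute:` prompt convention are all made up for illustration, and the "model" is just a stub that echoes back whatever expression the user asked it to run.

```python
import ast


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model (an assumption, not a real API).

    It "decides" to run whatever follows 'compute:' in the user's prompt.
    """
    return prompt.partition("compute:")[2].strip()


def naive_tool(prompt: str):
    # The anti-pattern: model output goes straight into eval(), so the user
    # effectively controls what gets executed.
    return eval(fake_llm(prompt))


def cautious_tool(prompt: str):
    # One possible mitigation (illustrative only): accept nothing but Python
    # literals. ast.literal_eval raises ValueError on calls, attribute
    # access, and other non-literal expressions.
    return ast.literal_eval(fake_llm(prompt))


if __name__ == "__main__":
    attack = "compute: __import__('os').getcwd()"
    print(naive_tool(attack))  # arbitrary code ran on the host
    try:
        cautious_tool(attack)
    except ValueError:
        print("rejected non-literal expression")
```

Restricting input to literals is far too limiting for a real code-running feature; the usual approach is to run the interpreter in a sandbox. The point is only that the model sitting in the middle doesn't sanitize anything.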
When I was a kid [Net Nanny](https://en.wikipedia.org/wiki/Net_Nanny) was totally and completely lame, but the whole millennial generation grew up to adore content moderation. A strange authoritarian impulse.
Me when the mods unfairly ban me from my favorite video game forum circa 2009
(source: first HN thread)
Ew... stay away from my content, you creep!