Jon,
Thanks for that link. It shows why hopes that LLMs will magically lead to AGI (some kind
of intelligence that competes with or goes beyond the human level) are hopelessly
misguided. On a math test, they can get an A+ if they're lucky enough to
find the answers in their petabytes of random stuffing. But if they can't find a
correct solution, they're lucky to earn a C. Even worse, the LLMs are so stupid that
they can't say whether their results are good, bad, or indifferent.
The major strength of generative AI technology is in providing an English-like (more
generally, a natural-language-like) interface to the AI reasoning technology of the past 60
years. That is extremely valuable, since the complex reasoning methods of GOFAI (Good
Old Fashioned AI) require years of study to learn and use correctly.
But the hope that devoting billions of dollars of computing horsepower will produce AGI is
equally misguided. A good state-of-the-art laptop with GOFAI and a modest amount of
LLM processing can outperform the biggest and most expensive LLM systems on the planet.
And it will do so with guaranteed accuracy. If it can't solve a problem, it will say
so. It won't produce garbage and claim that it's accurate.
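As an illustration of that last point, here is a minimal sketch (not a description of
any particular system; it assumes the freely available z3-solver package for Python)
of how a symbolic prover reports one of three verdicts and never bluffs:

    # Minimal sketch of a GOFAI-style checker: it proves a claim, refutes it,
    # or honestly answers "unknown" -- it never fabricates a result.
    # Assumes the z3-solver package (pip install z3-solver); the sample claim
    # at the bottom is an illustrative placeholder.
    from z3 import Real, Solver, Not, sat, unsat

    def verify(claim):
        """Attempt to prove `claim` by refuting its negation."""
        s = Solver()
        s.add(Not(claim))      # a proof = showing the negation is impossible
        result = s.check()
        if result == unsat:
            return "proved"    # the negation cannot hold, so the claim does
        if result == sat:
            return "refuted"   # the solver found a concrete counterexample
        return "unknown"       # the honest answer when the solver can't decide

    x = Real("x")
    print(verify(x * x >= 0))  # prints: proved

Each verdict is backed by the solver's own proof search, and "unknown" is exactly the
admission of failure that the LLMs cannot make.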
Following Jon Awbrey's note is an excerpt that I extracted from the link he cited.
John
----------------------------------------
From: "Jon Awbrey" <jawbrey(a)att.net>
John, Alex, ...
I haven't found a use myself for the new spawn of chatbots, but
the following is typical of reports I read from those who do
attempt to use them for research and not just entertainment.
Peter Smith • Another Round with ChatGPT
https://www.logicmatters.net/2024/06/02/another-round-with-chatgpt/
Cheers,
Jon
_____________________________
Another round with ChatGPT, by Peter Smith, June 2, 2024
ChatGPT is utterly unreliable when it comes to reproducing even very simple mathematical
proofs. It is like a weak C-grade student, producing scripts that look like proofs but
mostly are garbled or question-begging at crucial points. Or at least, that’s been my
experience when asking for (very elementary) category-theoretic proofs. Not at all
surprising, given what we know about its capabilities or lack of them.
But this did surprise me (though maybe it shouldn’t have done so: I’ve not really been
keeping up with discussions of the latest iteration of ChatGPT). I asked — and this was a
genuine question, hoping to save time on a literature search — where in the literature I
could find a proof of a certain simple result about pseudo-complements (and I wasn’t
trying to trick the system, I already knew one messy proof and wanted to know where else a
proof could be found, hopefully a nicer one). And this came back: [ChatGPT's reply, a
list of references, is not reproduced here.]
So I take a look. Every single reference is a total fantasy. None of the chapters/sections
have those titles or are about anything even in the right vicinity. They are complete
fabrications.
I complained to ChatGPT that it was wrong about Mac Lane and Moerdijk. It replied “I
apologize for the confusion earlier. Here are more accurate references to works that cover
the concept of complements and pseudo-complements in a topos, along with their proofs.”
And then it served up a complete new set of fantasies, including quite different
suggestions for the other two books.
[Following this example are two more paragraphs by Peter Smith and a few notes by other
readers who had similar experiences.]