Alex,
Thanks for the reference to that article. But the trends it discusses (from Dec 2023) are based on the assumption that all reasoning is performed by LLM-based methods. It assumes that any additional knowledge is somehow integrated with or added to data stored in LLMs. Figure 4 from that article illustrates the methods the authors discuss:
Note that the results they produce come from LLMs that have been modified by adding something new. That article is interesting. But without an independent method of testing and verification, Figure 4 is DIAMETRICALLY OPPOSED to the methods we have been discussing and recommending in Ontolog Forum for the past 20 or more years.
The methods we have been discussing (which have been implemented and used by most subscribers) are based on ontologies as a fundamental resource for supplementing, testing, and reasoning with and about data from any source, including the WWW.
Most LLM-based methods, however, use untested data from the WWW. A large volume of that data may be based on reliable documents. But an even larger volume is based on unreliable or irrelevant data from untested, unreliable, erroneous, or deliberately deceptive and malicious sources.
Even if the data sources are reliable, there is no guarantee that a mixture of reliable data on different topics, when combined by LLMs, will be combined in a way that preserves the accuracy of the original sources. Since LLMs do not preserve links to the original sources, a random mixture of facts is not likely to remain factual.
In summary, the most reliable applications of LLMs are translations from one language (natural or artificial) to another. Any other applications must be verified by testing against ontologies, databases, and other reliable sources.
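A minimal sketch of that kind of check, in Python. The trusted table, the predicate names, and the function check_claim are hypothetical, invented here for illustration; a real system would test against an ontology or database rather than a small dictionary.

```python
# Hypothetical trusted source: (subject, predicate) -> object.
# Every predicate here is assumed to be functional (one value per subject).
TRUSTED = {
    ("Paris", "capital_of"): "France",
    ("water", "boiling_point_celsius"): "100",
}

def check_claim(subject, predicate, obj, kb=TRUSTED):
    """Classify one generated claim against the trusted source."""
    known = kb.get((subject, predicate))
    if known is None:
        return "unverified"      # no evidence either way; do not trust yet
    return "confirmed" if known == obj else "contradicted"

# Claims as they might come back from an LLM (made up for this example).
claims = [
    ("Paris", "capital_of", "France"),
    ("Paris", "capital_of", "Italy"),
    ("Lyon", "capital_of", "France"),
]
results = [check_claim(*claim) for claim in claims]
print(results)
```

Only claims marked "confirmed" would be passed on; the other two categories need further testing or rejection.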
There are more issues to be discussed. LLMs are an important addition to the toolbox of AI and computer science. But they are not a replacement for the precision of traditional databases, knowledge bases, and methods of reasoning and computation.
John
______________________________________
From: "alex.shkotin" <alex.shkotin(a)gmail.com>
https://arxiv.org/abs/2311.05876 [Submitted on 10 Nov 2023 (v1), last revised 7 Dec 2023 (this version, v2)]
Large language models (LLMs) exhibit superior performance on various natural language tasks, but they are susceptible to issues stemming from outdated data and domain-specific limitations. In order to address these challenges, researchers have pursued two primary strategies, knowledge editing and retrieval augmentation, to enhance LLMs by incorporating external information from different aspects. Nevertheless, there is still a notable absence of a comprehensive survey. In this paper, we propose a review to discuss the trends in integration of knowledge and large language models, including taxonomy of methods, benchmarks, and applications. In addition, we conduct an in-depth analysis of different methods and point out potential research directions in the future. We hope this survey offers the community quick access and a comprehensive overview of this research area, with the intention of inspiring future research endeavors.
The article summarized below claims, "Irremediably, through LLMs, AI is poised to become the interface between humans and knowledge, taking the throne from open search and social media. In other words, soon, everyone will obtain their knowledge almost exclusively from AI."
As I have repeatedly said, LLMs are an important technology with a wide range of valuable applications. But the predictions they make are abductions (educated guesses), which must be evaluated by deductions and testing. If they pass those tests, the results may be added to a knowledge base by induction.
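The cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: propose() is a hypothetical stand-in for an LLM that returns guesses, and the test is exact integer arithmetic chosen because it is easy to check.

```python
def propose():
    """Stand-in for an LLM: returns guesses (abductions), some of them wrong."""
    return [("sqrt", 144, 12), ("sqrt", 169, 14), ("sqrt", 225, 15)]

def passes_test(guess):
    """Deduce a consequence of the guess (root * root == n) and test it exactly."""
    _, n, root = guess
    return root * root == n

knowledge_base = set()
rejected = []
for guess in propose():
    if passes_test(guess):
        knowledge_base.add(guess)   # added only after passing the test
    else:
        rejected.append(guess)      # untested or failed guesses are never stored

print(sorted(knowledge_base))
print(rejected)
```

The point of the design is that nothing enters knowledge_base on the strength of the guess alone.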
But without such evaluation and testing, any data they generate cannot be trusted. Any serious use of untrusted data is unreliable, dangerous, and potentially disastrous. The excerpts below discuss the dangers.
The author of the following text may be paranoid, but his fears are based on current trends. Paranoid people are useful early-warning systems.
John
______________________
From: TheTechOasis <newsletter(a)mail.thetechoasis.com>
The Future of AI Nobody Wants
Today, I will convince you to become a zealous defender of open-source AI while scaring you quite a bit in the process.
Irremediably, through LLMs, AI is poised to become the interface between humans and knowledge, taking the throne from open search and social media. In other words, soon, everyone will obtain their knowledge almost exclusively from AI.
- Kids will be tutored with AI Agents.
- A Copilot will summarize your job emails and draft your response.
- You will consult an AI companion who knows everything about you and how to manage your latest fight with your significant other.
And so on. At first glance, there is nothing wrong with that; it will make our lives much more efficient. The problem? AI is not open, meaning there’s a real risk that a handful of corporations will control that interface. And that, my dear reader, will turn society into one single-minded being, devoid of any capability, or desire, for critical and free thinking. Here’s why we should fight against that future.
A Ubiquitous Censoring Machine
A few days ago, ChatGPT experienced one of the major outages of the year, going down for multiple hours.
Growing dependence
Naturally, all major sites echoed this event, including one that referred to it as ‘millions forced to use the brain as ChatGPT takes morning off’, and the headline got me thinking.
Nonetheless, over the previous few hours, I had been going back and forth with my ChatGPT account as I needed the model every ten minutes—not for writing because it’s terrible—but to actually help me think. And then, I realized: this is the world we are heading toward, a world where we are totally dependent on AI to ‘use our brains.’
Last week, when we discussed whether AI was in a bubble, I argued that demand for GenAI products was, in fact, very low. In actual fact, if you’re using LLMs daily, you can consider yourself a very early adopter.
Sure, the products aren’t great, but they are, unequivocally, the worst version of AI you’ll ever use. Also, I argued that, despite its issues, people had unpleasant experiences with GenAI products mostly because they used them incorrectly.
They were setting themselves up for failure from the get-go. Nonetheless, as I’ve covered previously, these tools are already pretty decent when used for the use cases for which they were trained.
But here’s the thing: the new generation of AI, long-inference models, aren’t poised to be a ‘bigger GPT-4’; they are considered humanity’s first real conquest of AI-supercharged reasoning. And if they deliver, they will become as essential as your smartphone.
Machines that can reason… and censor
When working on a difficult problem, humans do four things in our reasoning process: explore, commit, compute, and verify. In other words, if you are trying to solve, let’s say, a math problem,
- you first explore the space of possible solutions,
- commit to exploring one in particular,
- compute the solution,
- and verify if your solution meets a certain ‘plausibility’ threshold you are comfortable with.
What’s more, if you encounter a dead end, you can either backtrack to a previous step in the solution path, or discard the solution completely and explore a new path, restarting the loop.
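The loop described above can be sketched as a depth-first search with backtracking. This Python example is an illustration only: the subset-sum task, the function name solve, and the assumption that all numbers are positive are choices made here, not something taken from the newsletter.

```python
def solve(numbers, target):
    """Find a subset of positive `numbers` that sums to `target`, or None."""
    def search(start, chosen, total):
        if total == target:                 # verify: the candidate meets the goal
            return list(chosen)
        if total > target:                  # dead end: the caller will backtrack
            return None
        for i in range(start, len(numbers)):            # explore the options
            chosen.append(numbers[i])                   # commit to one of them
            found = search(i + 1, chosen, total + numbers[i])  # compute further
            if found is not None:
                return found
            chosen.pop()                                # backtrack, try the next
        return None                         # every path from here has failed
    return search(0, [], 0)

print(solve([8, 6, 7, 5, 3], 15))
```

In this run the search first commits to 8 and 6, finds that every extension overshoots, backtracks, and then succeeds with 8 and 7.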
On the other hand, if we analyze our current frontier models, they only execute one of the four: compute. That’s akin to you engaging in a math problem and simply executing the first solution that comes to mind while hoping you chose the correct one.
Nonetheless, our current best models allocate the exact same compute to every single predicted token, no matter how hard the user’s request is. In simple terms, for an LLM, computing “2+2” or deriving Einstein’s Theory of Relativity merits the exact same amount of ‘thought’.
- Andrew Ng’s team showed that GPT-3.5 wrapped in agentic workflows (the loop I just described) considerably outperforms GPT-4, despite being markedly inferior in a side-by-side raw comparison.
- Google considerably increased Gemini’s math performance, embarrassing every other LLM, including Claude 3 Opus and GPT-4, and reaching human-level performance in math problem solving.
- Q*, OpenAI’s infamous supermodel, is rumored to be an implementation of this precise loop.
- Google created an 85th-percentile AI coder in competitive programming by iterating over its own solutions.
- Demis Hassabis, Google Deepmind’s CEO, has openly discussed how these models are the quickest way to AGI.
- Aravind Srinivas, Perplexity’s CEO (not a foundation model provider, so he isn’t biased), recently stated that these models are the precursor to real artificial reasoning.
And these are just a handful of examples. Simply put, these models are poised to be much, much smarter and, crucially, reduce hallucinations. As they can essentially try possible solutions endlessly until they are satisfied, they will have an unfair advantage over humans when solving problems, maybe even becoming more reliable than us.
Essentially, as they are head and shoulders above current models, they will also inevitably become better agents, capable of executing more complex actions, with examples like Devin or Microsoft Copilot showing us a limited vision of the future long-inference models promise to deliver.
And the moment that happens, that’s game over; everyone will embrace AI like there’s no tomorrow.
Long-inference models are the reason your nearest big tech corporation is spending its hard-earned cash on GPUs like there’s no tomorrow.
Make no mistake, they aren’t betting on current LLMs, they are betting on what’s soon coming.
But why am I telling you this? Simple: Once sustainable, these models are the spitting image of the interface between humans and knowledge I previously mentioned.
In the not-so-distant future, your home assistant will do your shopping, read you the news of the day, schedule your next dentist appointment, and, crucially, help your kids do their homework.
In the not-so-distant future, AI will determine whether your home accident gets covered by your insurance policy (which was negotiated by your personal AI with the insurer’s AI underwriter bot). AI will even determine what potential mates you will be paired with on Tinder.
Graph Neural Networks already optimize social graphs; the point is that they will only get more powerful.
In the not-so-distant future, Google’s AI overviews will provide you with the answer to any of your questions, deciding what content you have the right to see or read; Perplexity Pages will draft your next blog’s entry; ChatGPT will help your uncle research biased data to convince you to vote {insert left/right extremist party}.
Your opinions and your stance on society will all be entirely AI-driven. Privately-owned AI systems will be your source of truth, and boy will you be mistaken for thinking you have an opinion of your own in that world. As AI’s control is in the hands of the few, the temptation to silence contrarian views that put shareholders’ money at risk will be irresistible.
Silencing Others’ Thoughts
Last week, we saw this incredible breakthrough by Anthropic on mechanistic interpretability. Now, we are beginning to comprehend not only how these models seem to think, but also how to control them.
Current alignment methods can already censor content (fun fact, they do). However, they are absurdly easy to jailbreak, as proven by the research we discussed last Thursday.
Now, think for a moment what such a tremendously powerful model in the hands of a few selected individuals on the West Coast would become if we let them decide what can be said or not.
Worst of all, in many cases, their intentions are as clear as a summer day.
As if we hadn’t learned anything from past experiences, society is again divided. We are as polarized as ever, and tolerance of others’ opinions is nonexistent. Think like me, otherwise you’re a fascist or a communist. I, the holder of truth, the beacon of light, despise you for daring to think differently from me.
Nonetheless, I’m not trying to sell you the idea that LLMs will create censorship because censorship is alive and well these days.
- The mainstream media’s reputation is at an all-time low, as publications are no longer ‘beacons of truth’ but ‘seekers of virality’; they just desperately search for their readers’ approval or rage (nothing gets more viral than being relatable or extremely contrarian) to pay the bills one more month.
- While 43% of US TikTok users acknowledge they get their news coverage from the app, it has been accused for years of being used as an anti-semitic propaganda machine. Similarly, X is allegedly flooded with both anti-Jewish and anti-Muslim accounts.
Alex,
I like your note below, which is consistent with my previous note that criticized your earlier note. If this is your final position, I'm glad that we agree.
As for translations from one language to another, we can't even depend on humans. When absolute precision is essential, it's important to produce an "echo": a translation from (1) the user's original language (natural or formal) to (2) the computer system's internal representation, then (3) back to the same language as the user's original, followed by (4) a question: "Is this what you mean?"
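A minimal sketch of such an echo, in Python. Everything here is an assumption made for illustration: the one-sentence controlled language ("Every X is a Y"), the triple used as the internal representation, and the function names parse, verbalize, and echo.

```python
import re

# Hypothetical controlled language: sentences of the form "Every X is a Y."
PATTERN = re.compile(
    r"^\s*every\s+(\w+)\s+is\s+an?\s+(\w+)\s*\.?\s*$", re.IGNORECASE
)

def parse(sentence):
    """Steps 1 to 2: user's sentence -> internal representation (a triple)."""
    match = PATTERN.match(sentence)
    if match is None:
        raise ValueError(f"cannot parse: {sentence!r}")
    return ("subclass_of", match.group(1).lower(), match.group(2).lower())

def verbalize(triple):
    """Step 3: internal representation -> the user's original language."""
    _, sub, sup = triple
    article = "an" if sup[0] in "aeiou" else "a"
    return f"Every {sub} is {article} {sup}."

def echo(sentence):
    """Return the internal form plus the echo and the question of step 4."""
    internal = parse(sentence)
    return internal, f"{verbalize(internal)} Is this what you mean?"

internal, question = echo("every Dog is a animal")
print(internal)
print(question)
```

The internal form is committed only if the user answers yes; a sentence outside the controlled language raises an error instead of being guessed at.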
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John and All,
I have begun, but not yet finished, a report [1] on the ability of LLMs to verbalize a formal language, in this case OWL 2.
The problematic places are marked in yellow and red.
And here [2] is an example of our dialog.
But the summary for me is clear: we can't trust LLM even for "translations from one language (natural or artificial) to another"
It is mostly correct but sometimes unexpectedly wrong.⚽
That is, even in this case we need a "Revision" step before passing LLM output on to decision making.
Alex
Alex,
No, that third method is NOT what I was saying.
ALTHOUGH their third method (below) may use precise methods, which could include ontology and databases as input, their FINAL process uses LLM-based methods to combine the information. (See Figure 4 below, which I copied from their publication.)
When absolute precision is required, the final reasoning process MUST be absolutely precise. That means precise methods of logic, mathematics, and computer science must be the final step. Probabilistic methods CANNOT guarantee precision.
Our Permion.ai company does use LLM-based methods for many purposes. But when absolute precision is necessary, we use mathematics and mathematical logic (i.e. FOL, Common Logic, and metalanguage extensions).
Wolfram also uses LLMs for communication with humans in English, but ALL computation is done by mathematical methods, which include mathematical (formal) logic. Kingsley has also added LLM methods for communication in English.
But his system uses precise methods of logic and computer science for precise computation when precision is essential.
For examples of precise reasoning by our old VivoMind company (prior to 2010), see https://jfsowa.com/talks/cogmem.pdf . Please look at the examples in the final section of those slides. The results computed by those systems (from 2000 to 2010) were far more precise and reliable than anything computed by LLMs today.
I am not denying that systems based on LLMs may produce reliable results. But to do so, they must use formal methods of mathematics, logic, and computer science at the final stage of reasoning, evaluation, and testing.
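As a toy illustration of a precise final stage, the Python sketch below accepts a candidate statement only if it can be derived by forward chaining over explicit rules. It is deliberately limited to propositional Horn rules, not FOL or Common Logic, and the facts, rules, and function names are hypothetical examples rather than a description of any system mentioned above.

```python
def forward_chain(facts, rules):
    """Return every fact derivable from `facts` using Horn rules.

    Each rule is a pair (premises, conclusion); a rule fires only when
    all of its premises have already been derived.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and set(premises) <= derived:
                derived.add(conclusion)
                changed = True
    return derived

# Hypothetical trusted facts and rules.
FACTS = {"socrates_is_human"}
RULES = [
    (("socrates_is_human",), "socrates_is_mortal"),
    (("socrates_is_mortal", "socrates_is_greek"), "socrates_is_mortal_greek"),
]

def accept(candidate):
    """Final step: accept a generated statement only if it is derivable."""
    return candidate in forward_chain(FACTS, RULES)

print(accept("socrates_is_mortal"))
print(accept("socrates_is_mortal_greek"))
```

The second candidate is rejected because one of its premises was never established, however plausible the statement may sound.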
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
Sent: 6/7/24 3:53 AM
John,
Please! And shortly.
If I want a very reliable LLM, I have to train it myself.
JFS: "That article is interesting. But without an independent method of testing and verification, Figure 4 is DIAMETRICALLY OPPOSED to the methods we have been discussing and recommending in Ontolog Forum for the past 20 or more years."
But this green box is all about your point.
One interesting point from a talk on the Internet is that huge language models (from ChatGPT until now) use ALL the knowledge available worldwide, and that is still not enough to make them good. But we have nothing more to give them🙂
Alex
On Thu, Jun 6, 2024 at 22:13, John F Sowa <sowa(a)bestweb.net> wrote:
Alex,
Thanks for the reference to that article. But the trends it discusses (from Dec 2023) are based on the assumption that all reasoning is performed by LLM-based methods. It assumes that any additional knowledge is somehow integrated with or added to data stored in LLMs. Figure 4 from that article illustrates the methods the authors discuss:
Note that the results they produce come from LLMs that have been modified by adding something new. That article is interesting. But without an independent method of testing and verification, Figure 4 is DIAMETRICALLY OPPOSED to the methods we have been discussing and recommending in Ontolog Forum for the past 20 or more years.