Alex,
Thanks for the reference to that article. But the trends it discusses (from Dec 2023)
are based on the assumption that all reasoning is performed by LLM-based methods. It
assumes that any additional knowledge is somehow integrated with or added to data stored
in LLMs. Figure 4 from that article illustrates the methods the authors discuss. Note
that the results those methods produce come from LLMs that have been modified by adding
something new. That article is interesting. But without an independent method of testing
and verification, the methods of Figure 4 are DIAMETRICALLY OPPOSED to the methods we
have been discussing and recommending in Ontolog Forum for the past 20 or more years.
The methods we have been discussing (which have been implemented and used by most
subscribers) are based on ontologies as a fundamental resource for supplementing, testing,
and reasoning with and about data from any source, including the WWW.
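To make that concrete, here is a minimal sketch (in Python, with the rdflib library) of
one thing "testing data against an ontology" can mean in practice. Every name in it (the
file ontology.ttl, the example.org namespace, the sample statement) is a hypothetical
placeholder, not a reference to any actual system:

# A minimal sketch, not a production system: check whether a candidate
# statement (e.g., one extracted by an LLM from web text) respects the
# rdfs:domain declared for its property in a trusted ontology.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/onto#")   # hypothetical namespace

g = Graph()
g.parse("ontology.ttl", format="turtle")     # hypothetical trusted ontology

# A candidate statement from an untested source:
subject, predicate = EX.RiverThames, EX.hasCapital

ASK_DOMAIN = """
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK {
  ?p rdfs:domain ?dom .
  ?s rdf:type/rdfs:subClassOf* ?dom .
}
"""
result = g.query(ASK_DOMAIN, initBindings={"s": subject, "p": predicate})
if not result.askAnswer:
    # A property with no declared domain is also rejected, the
    # conservative choice when the ontology is the arbiter.
    print("Rejected: subject violates the property's declared domain.")

A full system would also check ranges, cardinalities, and deeper axioms with an OWL
reasoner. But even this one query shows the direction that matters: the ontology judges
the data, not the other way around.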
Most LLM-based methods, however, use untested data from the WWW. A large volume of that
data may come from reliable documents. But an even larger volume comes from untested,
erroneous, irrelevant, or deliberately deceptive and malicious sources.
Even if the sources are reliable, there is no guarantee that an LLM will combine data
on different topics in a way that preserves the accuracy of the originals. Since LLMs
do not preserve links to their sources, a random mixture of facts is not likely to
remain factual.
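The remedy is old and simple: keep the provenance with the fact. Here is a sketch of
the difference, with hypothetical field names and a hypothetical URL:

# A minimal sketch: every stored statement keeps a link to its source,
# so it can be traced and re-verified later. All names are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class SourcedFact:
    statement: str   # the claim itself
    source: str      # URL or citation for the document it came from

fact = SourcedFact(
    statement="Canberra is the capital of Australia.",
    source="https://example.org/some-document",   # hypothetical URL
)
# Unlike a fact dissolved into an LLM's weights, this record can always
# be traced back to its source and re-checked.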
In summary, the most reliable applications of LLMs are translations from one language
(natural or artificial) to another. Any other applications must be verified by testing
against ontologies, databases, and other reliable sources.
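To illustrate that last sentence, here is one hedged sketch of such a verification step,
using Python's built-in sqlite3 module. The database facts.db, the table capitals, and
the example claim are all hypothetical:

# A minimal sketch: an LLM's answer is treated as a hypothesis and is
# accepted only if a curated database confirms it.
import sqlite3
from contextlib import closing

def verify_capital(db_path: str, country: str, city: str) -> bool:
    """True only if the curated database confirms that the city is
    the capital of the country."""
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT 1 FROM capitals WHERE country = ? AND city = ?",
            (country, city),
        ).fetchone()
    return row is not None

claim = ("Australia", "Sydney")      # hypothetical LLM output
if not verify_capital("facts.db", *claim):
    print("Unverified; do not pass this claim along as a fact.")

The direction of trust is the point: the database verifies the LLM, never the reverse.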
There are more issues to be discussed. LLMs are an important addition to the toolbox of
AI and computer science. But they are not a replacement for the precision of traditional
databases, knowledge bases, and methods of reasoning and computation.
John
______________________________________
From: "alex.shkotin" <alex.shkotin(a)gmail.com>
https://arxiv.org/abs/2311.05876 [Submitted on 10 Nov 2023 (v1), last revised 7 Dec 2023
(this version, v2)]
Large language models (LLMs) exhibit superior performance on various natural language
tasks, but they are susceptible to issues stemming from outdated data and domain-specific
limitations. In order to address these challenges, researchers have pursued two primary
strategies, knowledge editing and retrieval augmentation, to enhance LLMs by incorporating
external information from different aspects. Nevertheless, there is still a notable
absence of a comprehensive survey. In this paper, we propose a review to discuss the
trends in integration of knowledge and large language models, including taxonomy of
methods, benchmarks, and applications. In addition, we conduct an in-depth analysis of
different methods and point out potential research directions in the future. We hope this
survey offers the community quick access and a comprehensive overview of this research
area, with the intention of inspiring future research endeavors.