Alex,
Thanks for the reference to that article. But the trends it discusses (from Dec 2023) are based on the assumption that all reasoning is performed by LLM-based methods. It assumes that any additional knowledge is somehow integrated with or added to data stored in LLMs. Figure 4 from that article illustrates the methods the authors discuss.
Note that the results they produce come from LLMs that have been modified by adding something new. That article is interesting. But without an independent method of testing and verification, Figure 4 is DIAMETRICALLY OPPOSED to the methods we have been discussing and recommending in Ontolog Forum for the past 20 or more years.
The methods we have been discussing (which have been implemented and used by most subscribers) are based on ontologies as a fundamental resource for supplementing, testing, and reasoning with and about data from any source, including the WWW.
Most LLM-based methods, however, use untested data from the WWW. A large volume of that data may be based on reliable documents. But an even larger volume comes from untested, erroneous, irrelevant, or deliberately deceptive and malicious sources.
Even if the data sources are reliable, there is no guarantee that a mixture of reliable data on different topics, when combined by LLMs, will be combined in a way that preserves the accuracy of the original sources. Since LLMs do not preserve links to the original sources, a random mixture of facts is not likely to remain factual.
In summary, the most reliable applications of LLMs are translations from one language (natural or artificial) to another. Any other applications must be verified by testing against ontologies, databases, and other reliable sources.
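As a rough illustration of what testing against a reliable source could look like, here is a minimal sketch in Python. It is not a method described in the article or used by any particular system. The names Triple, TRUSTED, FUNCTIONAL, and check are hypothetical, the small set of triples stands in for an ontology or curated database, and the step that extracts claims from LLM output is assumed to have happened already.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A single claim in subject-predicate-object form."""
    subject: str
    predicate: str
    obj: str


# Hypothetical trusted store, standing in for an ontology or curated database.
TRUSTED = {
    Triple("Paris", "capitalOf", "France"),
    Triple("Berlin", "capitalOf", "Germany"),
}

# Predicates assumed (for this sketch) to be functional:
# a subject can have at most one value for them.
FUNCTIONAL = {"capitalOf"}


def check(claim: Triple, trusted=TRUSTED, functional=FUNCTIONAL) -> str:
    """Classify a claim as 'supported', 'contradicted', or 'unverified'."""
    if claim in trusted:
        return "supported"
    if claim.predicate in functional:
        # A different trusted value for the same subject and functional
        # predicate means the claim conflicts with the trusted store.
        for known in trusted:
            if (known.subject, known.predicate) == (claim.subject, claim.predicate):
                return "contradicted"
    # Nothing in the trusted store confirms or refutes the claim.
    return "unverified"


if __name__ == "__main__":
    # Claims as they might be extracted from LLM output (extraction not shown).
    claims = [
        Triple("Paris", "capitalOf", "France"),
        Triple("Paris", "capitalOf", "Germany"),
        Triple("Rome", "capitalOf", "Italy"),
    ]
    for c in claims:
        print(c, "->", check(c))
```

The point of the three-way result is that a claim absent from the trusted source is not treated as true by default. It stays unverified until some reliable source confirms it.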
There are more issues to be discussed. LLMs are an important addition to the toolbox of AI and computer science. But they are not a replacement for the precision of traditional databases, knowledge bases, and methods of reasoning and computation.
John
______________________________________
From: "alex.shkotin" <alex.shkotin@gmail.com>
Large language models (LLMs) exhibit superior performance on various natural language tasks, but they are susceptible to issues stemming from outdated data and domain-specific limitations. In order to address these challenges, researchers have pursued two primary strategies, knowledge editing and retrieval augmentation, to enhance LLMs by incorporating external information from different aspects. Nevertheless, there is still a notable absence of a comprehensive survey. In this paper, we propose a review to discuss the trends in integration of knowledge and large language models, including taxonomy of methods, benchmarks, and applications. In addition, we conduct an in-depth analysis of different methods and point out potential research directions in the future. We hope this survey offers the community quick access and a comprehensive overview of this research area, with the intention of inspiring future research endeavors.