João,

Thanks for that reference.  It's one more article that emphasizes the importance of ontology and knowledge graphs (KGs) for detecting and correcting the errors and hallucinations of LLMs.   It's an example of the "Future directions" that are necessary to make LLMs reliable.

By themselves, LLMs make two important contributions to AI:  (1) They are very good at translating natural languages and formal notations into other languages and notations.  (2) They are very good at generating hypotheses (guesses).

Unfortunately, LLMs cannot do reasoning.  They can find and apply methods of reasoning, but they cannot do their own reasoning to evaluate the relevance and accuracy of the results (guesses) they generate.

The article by Allemang and Sequeda shows how to use KGs to evaluate and correct the output from LLMs.  Figure 1 of their article shows how they use GPT-4 to generate answers, which they check against an ontology.  If they detect errors, they send the error explanations back to GPT-4 to repair the LLM output, and they continue checking until the results are consistent with the ontology and the underlying SQL database.
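Here is a rough sketch of that loop in Python-style pseudocode.  Every name in it (obqc_check, ask_llm, the retry limit) is my own placeholder for illustration; it is not the authors' code or API.

    # Sketch of the generate / check / repair loop described above.
    # All names are illustrative placeholders, not the authors' API.
    MAX_ROUNDS = 3  # assumed retry limit; the paper's stopping rule may differ

    def obqc_check(query, ontology):
        # Placeholder for the Ontology-Based Query Check: return a list of
        # error explanations if the SPARQL query violates the ontology;
        # an empty list means the query is consistent with the ontology.
        return []

    def answer_question(question, ontology, ask_llm):
        query = ask_llm("Translate to SPARQL: " + question)  # LLM drafts a query
        for _ in range(MAX_ROUNDS):
            errors = obqc_check(query, ontology)
            if not errors:
                return query        # consistent: run it against the KG / SQL database
            # feed the error explanations back to the LLM to repair the query
            query = ask_llm("Repair this SPARQL query: " + query
                            + "\nErrors: " + "; ".join(errors))
        return "I don't know"       # refuse rather than return a wrong answer

The key point is the last line: when the checks cannot be satisfied, the system says "I don't know" instead of guessing.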

To represent information, they use a version of knowledge graphs that can support full first-order logic (FOL).  Other notations for FOL could also be used.  Peirce's existential graphs have been extended to support conceptual graphs and the ISO standard for Common Logic.  Those notations can support any reasoning that any version of KGs can do.
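As a rough illustration (my own example, not one from the article):  the KG triple stating that Alice works for the Sales department can be written in CLIF, the Common Logic Interchange Format, as (worksFor Alice SalesDept), and a constraint such as "whoever works for a department is an employee" as (forall (x y) (if (and (worksFor x y) (Department y)) (Employee x))).  The same statements can be diagrammed as existential graphs or conceptual graphs.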

See below for the abstract and URL of the article.  Note that their previous benchmark showed that using a KG improved the accuracy of LLMs from 16% to 54%.  With the ontology checks, they obtained 18% more correct answers and 8% "I don't know."  That leaves 20% wrong answers.  100% correct answers is probably unattainable, but the system should answer "I don't know" for anything it does not know.  Even without more data, more and better ontology and reasoning should enable it to answer "I don't know" for the remaining 20%.
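Spelling out the arithmetic from their abstract:

    54% correct with the KG alone
  + 18% more correct answers from the ontology check and repair
  = 72% correct
  +  8% "I don't know" responses
  = 80% handled appropriately, which leaves 20% wrong answers.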

John
_______________________________________
 
From: "João Oliveira Lima" <joaoli13@gmail.com>

Hi, 

Yesterday the paper below was published on arXiv, which may be of interest to this group.

Joao

Title: Increasing the LLM Accuracy for Question Answering: Ontologies to the Rescue!
Authors: Dean Allemang, Juan Sequeda

https://arxiv.org/pdf/2405.11706

Abstract: There is increasing evidence that question-answering (QA) systems with Large Language Models (LLMs), which employ a knowledge graph/semantic representation of an enterprise SQL database (i.e. Text-to-SPARQL), achieve higher accuracy compared to systems that answer questions directly on SQL databases (i.e. Text-to-SQL). Our previous benchmark research showed that by using a knowledge graph, the accuracy improved from 16% to 54%. The question remains: how can we further improve the accuracy and reduce the error rate? Building on the observations of our previous research where the inaccurate LLM-generated SPARQL queries followed incorrect paths, we present an approach that consists of 1) Ontology-based Query Check (OBQC): detects errors by leveraging the ontology of the knowledge graph to check if the LLM-generated SPARQL query matches the semantic of ontology and 2) LLM Repair: use the error explanations with an LLM to repair the SPARQL query. Using the chat with the data benchmark, our primary finding is that our approach increases the overall accuracy to 72% including an additional 8% of "I don't know" unknown results. Thus, the overall error rate is 20%. These results provide further evidence that investing knowledge graphs, namely the ontology, provides higher accuracy for LLM powered question answering systems.