Alex,
The article you cited is interesting, and I recommend it as describing a useful technique for limiting the hallucinations of LLMs. But the cartoon I copy below applies to this article just as much as to any other application of LLMs. There is no breakthrough here.
Furthermore, the authors' claim that their system understands or comprehends anything is absurd. What their system generates is "a large pile of linear algebra" that does not differ from the pile in the cartoon in any essential way. The only reason their system performs better than the huge pile generated by OpenGPT is that it is restricted to peer-reviewed scientific articles. The main reason the results are fairly good is that different scientific disciplines use very different terminology. Therefore, the texts from different articles do not mix with, interfere with, or pollute one another.
Please note what the authors have done: They built LLMs from a corpus of published scientific articles from multiple disciplines and used those LLMs to represent both the English text and the diagrams in those articles.
Mixing the two different kinds of syntax is a useful enhancement, but there is nothing new in the underlying technology. You can get the same or better enhancement by mixing data in three very different linear syntaxes: English, SQL, and OWL. It's useful to use the same spelling for the same concepts, but if there are enough examples, the LLMs can detect the similarities and do the equivalent translations.
The information in the diagrams is expressed in the same words as the English text, but the two-dimensional syntax of the diagrams represents a second language. There is nothing new there, since LLMs can relate languages with different syntax. For any language processor, the difference between a linear string and a 2-D diagram is trivial. When you send the diagram to another system, the 2-D syntax is mapped to a 1-D syntax that uses the same kinds of syntactic markers you would use to describe a 2-D diagram in English.
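To make that concrete, here is a minimal sketch in Python (a toy example with made-up labels, not the authors' method) that maps a small node-and-arrow diagram to a 1-D string with explicit syntactic markers:
def linearize(nodes, edges):
    """Map a 2-D diagram (labeled nodes, labeled arrows) to a 1-D string
    that uses the same kinds of markers an English description would use."""
    parts = [f"node {nid} labeled '{label}'" for nid, label in nodes.items()]
    parts += [f"arrow labeled '{rel}' from {src} to {dst}" for src, rel, dst in edges]
    return "; ".join(parts) + "."

# A tiny diagram: two labeled boxes and one labeled arrow between them.
nodes = {"n1": "Cat", "n2": "Mat"}
edges = [("n1", "sits on", "n2")]
print(linearize(nodes, edges))
# node n1 labeled 'Cat'; node n2 labeled 'Mat'; arrow labeled 'sits on' from n1 to n2.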
Fundamental principle: For the LLMs, it's irrelevant whether the source is a linear language or a system of diagrams that were mapped to a linear string. The result is a pile of linear algebra. See the cartoon.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John,
Sure! Theoretical knowledge may be checked only by the theoretical knowledge handling system.
Meanwhile, this may be an interesting LMM++ advancement: https://arxiv.org/abs/2407.04903?fbclid=IwZXh0bgNhZW0CMTEAAR0oph0y
They don't lose hope. [JFS: More precisely, they have no hope. They are just confusing the issues.]
Alex
Sun, 14 Jul 2024 at 21:40, John F Sowa <sowa(a)bestweb.net>:
Peter,
Thanks for that link. That cartoon is a precise characterization of how LLMs process data. It was drawn in the 1990s when linear algebra usually meant something computed with matrices. LLMs go one step farther by using tensors, but the results are in the same ballpark (or sewer).
Fundamental principle: any machine learning system must be used with a system for evaluating or checking the answers. For simple factual questions, a database can be used. For more complex questions, logical deduction is necessary. For any kind of system, ontology can detect obvious hallucinations, but ontology by itself is insufficient to detect incorrect details that happen to be in the correct category.
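As a toy illustration of that last point, here is a minimal sketch in Python with a made-up mini-ontology (not any particular product): a category check can flag an answer that is in the wrong category while saying nothing about finer details.
# Toy ontology: each term is assigned a category, and each relation
# constrains the categories of its arguments.
CATEGORY = {"Paris": "City", "France": "Country", "Einstein": "Person"}
RELATION_SIGNATURE = {"capital_of": ("City", "Country")}

def obvious_hallucination(subject, relation, obj) -> bool:
    """Return True if the triple violates the ontology's category constraints.
    A False result does NOT mean the triple is true -- only that the categories
    fit; incorrect details still need checking against a database or by deduction."""
    expected = RELATION_SIGNATURE.get(relation)
    if expected is None:
        return False  # unknown relation: the ontology cannot judge it
    return (CATEGORY.get(subject), CATEGORY.get(obj)) != expected

print(obvious_hallucination("Einstein", "capital_of", "France"))  # True: wrong category
print(obvious_hallucination("Paris", "capital_of", "France"))     # False: categories fit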
John
I received the following reply in an offline note:
Anonymous: ChatGPT is BS. It says what is most likely to come next in our use of language without regard to its truth or falsity. That seems to me to be its primary threat to us. It can BS so much better than we can, more precisely and more effectively using statistics with a massive amount of "test data," than we can ever do with our intuition regarding a relatively meager amount of learning.
That is partly true. LLMs generate text by using probabilities derived from a massive amount of miscellaneous texts of every kind: books, articles, notes, messages, etc. They have access to a massive amount of true information -- more than any human could learn in a thousand years. But they also have a massive amount of false, misleading, or just irrelevant data.
Even worse, they have no methods for determining what is true, false, or irrelevant. Furthermore, they don't keep track of where the data comes from. That means they can't use information about the source(s) as a basis for determining reliability.
As I have said repeatedly, whatever LLMs generate is a hypothesis -- I would call it a guess, but the term BS is just as good. Hypotheses (guesses or BS) can be valuable as starting points for new ways of thinking. But they need to be tested and evaluated before they can be trusted.
The idea that LLM-based methods can become more intelligent by using massive amounts of computation is false. They can generate more kinds of BS, but at an enormous cost in hardware and in the electricity to run that massive hardware. And without methods of evaluation, the probability that random mixtures of data are true, useful, or worth the cost of generating them becomes smaller and smaller.
Conclusion: Without testing and evaluation, the massive amounts of computer hardware and the electricity to run them are a waste of money and resources.
John
James Davenport found an article that shows how simple-minded ChatGPT can be. If it can find an appropriate reasoning method in its immense volume of stored data, it can seem to be a genius. But if the problem requires a simple transformation of that reasoning method, it can be very stupid or horribly wrong.
Observation: There are three safe and dependable ways of using LLMs:
1. Translate languages (including computer notations) to and from equivalent forms in other languages. As we have seen, Wolfram, Kingsley Idehen, and others have successfully used LLMs to provide English-like front ends to their systems.
2. Use LLMs with a relatively small corpus of closely related data, such as user manuals for some equipment or the complete corpus of a single author to support Q/A sessions about what that author said or wrote.
3. Use LLMs with a larger amount of data about a fairly large field to generate hypotheses (guesses) about topics in that field, and then use the 70+ years of work in AI and Computer Science to test, evaluate, and correct whatever the LLMs generate. (A minimal sketch of this generate-then-check pattern follows the list.)
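Here is the minimal sketch promised above, in Python. The generate() function is a stand-in for an LLM call, and the checker is a trivial arithmetic verifier standing in for a database, an ontology, or a theorem prover:
def generate(question: str) -> str:
    # Stand-in for an LLM call: returns a plausible-looking but unverified answer.
    return "42"

def verify(question: str, answer: str) -> bool:
    # Stand-in for a deductive checker.  Here: re-derive the answer exactly.
    if question == "What is 6 * 7?":
        return answer == str(6 * 7)
    return False   # no checker available: do not trust the guess

question = "What is 6 * 7?"
hypothesis = generate(question)            # an abduction (guess)
if verify(question, hypothesis):           # the deduction / evaluation step
    print("Accepted:", hypothesis)
else:
    print("Rejected: unverified hypothesis", hypothesis)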
All three of these methods can be run on a good laptop computer with a disk drive that holds the data (a couple of terabytes would be sufficient). The laptop could be extended to larger systems to support the workload of a large corporation. But the monstrous computational systems used by Google, OpenGPT, and others are an irresponsible waste of hardware, electricity, water, and other resources.
The European Union is already putting restrictions on companies that are trying to emulate Google, OpenGPT, and other wasteful systems. And by the way, there are hints coming from Google employees who are becoming disillusioned about the value of processing more and bigger volumes of data.
When a system cannot do simple reasoning and generalization, it can never be truly intelligent. Adding more power to a stupid system generates larger volumes of stupidity.
John
----------------------------------------
From: "James Davenport' via ontolog-forum" <ontolog-forum(a)googlegroups.com>
Sent: 7/9/24 10:13 PM
There’s a good article today in the Financial Times, showing that, while ChatGPT can solve well-known puzzles (Monty Hall etc.), that’s because it has seen the solution, and it can’t even solve alpha-converted variants. The conclusion is good.
A computer that is capable of seeming so right yet being so wrong is a risky tool to use. It’s as though we were relying on a spreadsheet for our analysis (hazardous enough already) and the spreadsheet would occasionally and sporadically forget how multiplication worked.
Not for the first time, we learn that large language models can be phenomenal bullshit engines. The difficulty here is that the bullshit is so terribly plausible. We have seen falsehoods before, and errors, and goodness knows we have seen fluent bluffers. But this? This is something new.
https://www.ft.com/content/7cb55561-8315-487a-a904-d5a92f37551d?desktop=tru…
_________________
From: John F Sowa
For all practical purposes, Gödel's theorem is irrelevant. It shows that certain very complex propositions stated in first-order logic are undecidable. However, the only people who can state an undecidable proposition are professional logicians who have studied Gödel's proof.
Gödel stated that theorem about 50 years after Frege and Peirce specified FOL. In those 50 years, no logician had ever written or encountered an undecidable proposition.
For the Cyc system, which uses a superset of FOL, Doug Lenat said that in over a thousand person-years of developers who wrote knowledge representations in Cyc, nobody had ever written an undecidable statement.
The designers of OWL made a terrible mistake in restricting its expressive power to avoid undecidable propositions. That made OWL more complex and more difficult to learn and use. The people who made that mistake were professional logicians. I have a high regard for their theoretical knowledge, but very little regard for their practical knowledge.
John
__________________________________________
From: "Wartik, Steven P "Steve"" <swartik(a)ida.org>
Sent: 7/11/24 4:30 PM
Thanks for posting that. It makes me angry my licentious youth didn’t lead to any divine revelations.
I’m surprised the author didn’t mention Kurt Gödel. He provides the proof a perfect logic machine isn’t possible, right?
From: ontology-summit(a)googlegroups.com <ontology-summit(a)googlegroups.com> On Behalf Of Jack Park
Sent: Thursday, July 11, 2024 4:11 PM
----------------------------------------
https://nautil.us/the-perpetual-quest-for-a-truth-machine-702659/ is a decent overview of this space.
Just after I sent my previous note, I saw the reference to a 63-page article with a title and direction that I strongly endorse: Cognition Is All You Need: The Next Layer of AI Above Large Language Models.
Mihai Nadin: No endorsement. Just sharing. I interacted with one of the authors.
Since I only had time to flip the pages, look at the diagrams, and read some explanations, I can't say much more than I strongly endorse the direction as an important step beyond LLMs. I would ask the questions in my previous note: Can their cognitive methods do the evaluation necessary to avoid the failures (hallucinations) of generative AI?
Since I have recommended the term 'neuro-cognitive' as an upgrade to 'neuro-symbolic', I believe that future research along the lines that the authors discuss is a promising direction. Even more important, their methods can use ontologies to check and evaluate whatever LLMs produce. Since ontology is the primary theme of this forum, it's good to see that ontology will still be needed for a long time to come.
As I said in my previous note, generative AI without something that can evaluate the results cannot be trusted as a foundation for advanced AI systems. Cognitive AI is a good term for that something. This article is more of a promise than a finished solution. But the directions they recommend are related to the Cognitive Memory System of our old VivoMind company. See https://jfsowa.com/talks/cogmem.pdf .
In fact, our current Permion.ai company is developing tools that combine LLMs with an extension of the cognitive methods of VivoMind. And by the way, the Permion methods can use as much computational power as available. But they can also run very effectively on just a large laptop with a disk drive of a couple of terabytes. All the old VivoMind technology was developed on the laptops of that generation, and its performance could be extended linearly to the very large computer systems of our customers.
John
___________________________________
From: "Nadin, Mihai" <nadin(a)utdallas.edu>
Subject: [ontolog-forum] an alternative
https://arxiv.org/pdf/2403.02164
No endorsement. Just sharing. I interacted with one of the authors.
Mihai Nadin
https://www.nadin.ws
https://www.anteinstitute.org
Google Scholar
That is certainly true. The people who designed and developed GPT and related LLM-based systems admit that fact:
Mihai Nadin: [ChatGPT] is syntactic. I made this claim repeatedly. It is a mimicking machine of high performance (brute computation).
But the proponents of Generative AI confuse the issues with a large cloud of highly technical terminology (AKA human-generated BS). They claim that if they increase the training data to some immense amount, they will have covered all the options, so that the probability of finding a correct answer will auto-magically converge to 0.9999999....
They have persuaded Elon Musk and other gullible investors that by pouring more billions and even trillions of $$$ into building ultra-massive computer systems, they will magically become ultra-intelligent.
Unfortunately, the WWW has huge amounts of false, fraudulent, mistaken, misleading, social-media, espionage, counter-espionage, dangerous, and disastrous data. Detecting and deleting all that garbage is extremely difficult. People have tried to use LLM-based technology to find, evaluate, and erase such data -- and they have failed, miserably.
As I have repeatedly said, anything LLMs generate is a hypothesis (AKA abduction or guess). Before any abduction can be accepted, it must be evaluated by deduction (AKA reliable reasoning methods). There are almost 80 years of reliable methods developed by AI and computer science. They are essential for reliable computation.
All commercial computing systems that demand high reliability (banking, engineering, scientific research, aeronautics, space exploration, etc.) require extremely high precision. They also use statistical methods for many purposes, but they use statistics with precise error bounds.
Those high precision methods control the world economy and support human life. None of those computations can be replaced by LLM-based methods. Many of them can benefit from LLM-based computations -- but ONLY IF those computations are EVALUATED by traditional deductive methods.
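As a small illustration of what "statistics with precise error bounds" means in practice, here is a textbook normal-approximation confidence interval sketched in Python; nothing LLM-specific, just the kind of bound those systems report:
import math

def mean_with_95ci(samples):
    """Sample mean with a normal-approximation 95% confidence interval:
    mean +/- 1.96 * s / sqrt(n).  The error bound is explicit and holds
    under the stated assumptions -- unlike an unqualified LLM guess."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half_width = 1.96 * math.sqrt(variance / n)
    return mean, mean - half_width, mean + half_width

m, low, high = mean_with_95ci([9.8, 10.1, 10.0, 9.9, 10.2])
print(f"mean = {m:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")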
John
----------------------------------------
From: "Nadin, Mihai" <nadin(a)utdallas.edu>
It is syntactic. I made this claim repeatedly. It is a mimicking machine of high performance (brute computation).
Mihai Nadin
Sent from planet earth
*** CoKA: Final Call for Contributions ***
DEADLINE EXTENDED TO: *August 7th, 2024* (23:59 AoE)
================================================================
Conceptual Knowledge Acquisition: Challenges, Opportunities, and Use Cases
Workshop at the 1st International Joint Conference on
Conceptual Knowledge Structures (CONCEPTS 2024)
September 9–13, 2024, Cádiz, Spain
Workshop Website: https://www.kde.cs.uni-kassel.de/coka/
Conference website: https://concepts2024.uca.es
================================================================
Formal concept analysis (FCA) can help make sense of data and the underlying
domain --- provided the data is not too big, not too noisy, and representative
of the domain, and provided there is data in the first place. What if you don’t have such
data readily available but are prepared to invest in collecting it and have
access to domain experts or other reliable queryable sources of information?
Conceptual exploration comes to the rescue!
Conceptual exploration is a family of knowledge-acquisition techniques within
FCA. The goal is to build a complete implicational theory of a domain (with
respect to a fixed language) by posing queries to a domain expert. When properly
implemented, it is a great tool that can help organize the process of scientific
discovery.
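As a toy illustration of the basic step behind conceptual exploration, here is a sketch in Python with a made-up formal context (not part of the CFP); in a real exploration, a counterexample would be requested from the expert whenever the check fails:
# Toy formal context: objects and the attributes they have.
CONTEXT = {
    "duck":  {"lays_eggs", "flies", "feathered"},
    "eagle": {"lays_eggs", "flies", "feathered"},
    "emu":   {"lays_eggs", "feathered"},
}

def extent(attributes):
    """Objects that have every attribute in the given set (derivation operator)."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}

def implication_holds(premise, conclusion):
    """A -> B holds in the context iff every object with all of A also has all of B."""
    return all(conclusion <= CONTEXT[obj] for obj in extent(premise))

print(implication_holds({"flies"}, {"feathered"}))   # True in this context
print(implication_holds({"feathered"}, {"flies"}))   # False: emu is a counterexample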
Unfortunately, proper implementations are scarce and success stories of using
conceptual exploration are somewhat rare and limited in scope. With this
workshop, we intend to analyze the situation and, maybe, find a solution. If
- you succeeded in acquiring new knowledge about or building a satisfying
conceptual representation of some domain with conceptual exploration before;
- you attempted conceptual exploration in application to your problem but failed
miserably;
- you want to use conceptual exploration to analyze some domain, but you don’t
know where and how to start;
- you are aware of alternatives to conceptual exploration;
then come to the workshop to share your experiences, insights, ideas, and
concerns with us!
==================
Keywords and Topics
==================
Knowledge Acquisition and Capture
Conceptual Exploration
Design Patterns and Paradigmatic Examples
successful use cases and real-world applications
challenges and lessons learned
application principles
missing theoretical foundations
missing technical infrastructure
integration with other theories and technologies
=========================
Duration, Format, and Dates
=========================
We invite contributions in the form of an extended abstract of up to two pages.
In addition, supplementary material, such as data sets, detailed descriptions,
or visualizations, may be submitted.
The workshop is planned for half a day within the conference dates and at the
same venue. It will consist of several short presentations each followed by a
plenary discussion.
Please send your contributions by *August 7th, 2024* (23:59 AoE) to
tom.hanika(a)uni-hildesheim.de. If you are not sure whether your contribution
matches the topics or the format of the workshop, you are welcome to contact the
organizers prior to submitting the abstract. An acceptance notification will be
sent within two weeks of receiving the submission.
===================
Workshop Organizers
===================
- Tom Hanika, University of Hildesheim
- Sergei Obiedkov, TU Dresden
- Bernhard Ganter, Ernst-Schröder-Zentrum, Darmstadt
Constraints and Indications • 1
• https://inquiryintoinquiry.com/2024/07/02/constraints-and-indications-1-a/
All,
The system‑theoretic concept of “constraint” is one that unifies
a manifold of other notions — definition, determination, habit,
information, law, predicate, regularity, and so on. Indeed, it
is often the best way to understand the entire complex of concepts.
Entwined with the concept of “constraint” is the concept of “information”,
the power signs bear to reduce uncertainty and advance inquiry. Asking what
consequences those ideas have for Peirce’s theory of triadic sign relations
led me some years ago to the thoughts recorded on the following page.
Pragmatic Semiotic Information
• https://oeis.org/wiki/Pragmatic_Semiotic_Information
Here I am thinking of the concept of constraint that constitutes one of the
fundamental ideas of classical cybernetics and mathematical systems theory.
For example, here is how W. Ross Ashby introduces the concept of constraint
in his Introduction to Cybernetics (1956).
❝A most important concept, with which we shall be much concerned later,
is that of “constraint”. It is a relation between two sets, and occurs
when the variety that exists under one condition is less than the variety
that exists under another. Thus, the variety of the human sexes is 1 bit;
if a certain school takes only boys, the variety in the sexes within the
school is zero; so as 0 is less than 1, constraint exists.❞ (1964 ed.,
p. 127).
At its simplest, then, constraint is an aspect of the subset relation.
The objective of an agent, organism, or similar regulator is to keep within
its viable region, a particular subset of its possible state space. That is
the constraint of primary interest to the agent.
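A toy computation of Ashby's example, added here just to make the arithmetic explicit (variety measured in bits as the base-2 logarithm of the number of distinct values):
import math

def variety_in_bits(values):
    """Ashby's variety: log2 of the number of distinct values observed."""
    return math.log2(len(set(values)))

population = ["boy", "girl", "boy", "girl"]   # two sexes: variety = 1 bit
boys_school = ["boy", "boy", "boy"]           # one sex:   variety = 0 bits

print(variety_in_bits(population))    # 1.0
print(variety_in_bits(boys_school))   # 0.0
# 0 is less than 1, so a constraint exists: the school's variety is less than the population's.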
Reference —
• Ashby, W.R. (1956), Introduction to Cybernetics, Methuen, London, UK.
Resources —
Survey of Cybernetics
• https://inquiryintoinquiry.com/2024/01/25/survey-of-cybernetics-4/
Survey of Inquiry Driven Systems
• https://inquiryintoinquiry.com/2024/02/28/survey-of-inquiry-driven-systems-…
Survey of Pragmatic Semiotic Information
• https://inquiryintoinquiry.com/2024/03/01/survey-of-pragmatic-semiotic-info…
Regards,
Jon
cc: https://www.academia.edu/community/VrKv7y
Alex and Lars,
The issues are complex, and they require a major effort to explain in detail. It's not a task that an email list -- such as Ontolog Forum -- can even begin to address. For a start, I recommend 117 slides (with many, many links and references): https://www.jfsowa.com/talks/vrmind.pdf .
I am not claiming that my 117 slides solve or explain all the issues. But they summarize many issues and point to many more references for further details and explanations.
Alex: I prefer First Order LANGUAGE. As there are so many logics right now. And by the way, the FOL framework (as we discussed after Barwise) does not have numbers of any kind.
Every version of FOL is isomorphic to the versions that were independently discovered by Frege (1879) and Peirce (1885). Nobody but Frege ever used his notation. But everybody adopted Peirce's version with minor changes in the choice of symbols. Most importantly, anything stated in one version can be mapped to and from every other version automatically without the slightest change of meaning.
However, there are various subsets and supersets of FOL. The Object Management Group (OMG) developed the DOL standard for defining the mappings among them. The HeTS system can automatically map any DOL notation to and from equivalent notations. It can also map any notation to any more expressive notation.
As for numbers, they are a family of systems that can be defined in FOL. As soon as the axioms are added to the set of FOL specifications, numbers become available. Please read slides 84 to 105 of vrmind.pdf.
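For example (a standard formulation, not a quotation from the slides), the natural numbers can be specified by adding Peano-style axioms to an FOL theory with a constant 0, a successor function s, and equality; in LaTeX notation:
% First-order Peano axioms (induction stated as an axiom schema)
\forall x\, \neg (s(x) = 0)                               % 0 is not a successor
\forall x \forall y\, (s(x) = s(y) \rightarrow x = y)     % successor is injective
\varphi(0) \land \forall x\,(\varphi(x) \rightarrow \varphi(s(x)))
  \rightarrow \forall x\, \varphi(x)                      % induction, one axiom per formula \varphi
% Addition and multiplication are then defined by their usual recursion equations:
\forall x\, (x + 0 = x), \quad \forall x \forall y\, (x + s(y) = s(x + y))
\forall x\, (x \cdot 0 = 0), \quad \forall x \forall y\, (x \cdot s(y) = x \cdot y + x)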
Lars: Information is actually not a good word for what is stored in the brain. Try mneme (as coined and defined by Richard Semon) or (retrievable/mnemic) engram. One reason being - in short - that information (the process) is rather associated with the creation of engrams. And information as the stimulus of perception is also different from engrams.
I agree. I was using the word 'information' for what is stored in computer systems. Please see the 117 slides of vrmind.pdf. I admit that 117 slides require a large amount of reading. But I suggest that you just start at slide 2 and flip through the slides until you find something interesting.
In summary, there are many other groups that do detailed R & D and specifications of standards. As a mailing list that also sponsors various conferences, Ontolog Forum is not a place for developing standards. Anybody who wants to do such work should join a project that does develop standards.
John
I used the abbreviation BS to avoid being flagged by things that flag stuff. The authors are not condemning ChatGPT. As they say, "We argue that these falsehoods, and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs" . . .
I agree with that comment. It emphasizes my point: LLMs generate hypotheses (guesses) whose truth values are unknown. Technically, they may be called abductions. Further testing and deduction are necessary before any abduction can be trusted. Following is the abstract of the article at https://link.springer.com/article/10.1007/s10676-024-09775-5
John
______________________________
ChatGPT is bullshit
Michael Townsen Hicks · James Humphries · Joe Slater
Abstract: Recently, there has been considerable interest in large language models: machine learning systems which produce humanlike text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters, and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.