Ravi,
Probability is another method for dealing with many kinds of continuous issues. Fortunately, the mathematical methods of probability and statistics are very well developed.
This is another kind of symbolic reasoning that LLMs, by themselves, cannot handle. A system that uses symbolic methods can invoke reasoning methods of many kinds: formal logic, probability, statistics, and various computational tools.
Arithmetic, for example, is ideal for a computer, but LLMs are horrible at anything except trivial computations. There are 60+ years of symbolic reasoning methods in AI and computer science. LLMs can't replace them.
General principle: Symbolic methods must be in control of the overall system. They can determine which, when, and how other methods, including LLMs, can be used. They can also prevent the dangers caused by runaway AI methods.
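For illustration only, here is a minimal Python sketch of that principle: a symbolic controller decides which method is allowed to answer a request and delegates arithmetic to an exact computation rather than to an LLM. The function llm_generate() and the routing rule are hypothetical placeholders, not a description of any actual system.

# Minimal sketch of a symbolic controller that routes requests.
# llm_generate() and the routing rule are hypothetical placeholders.

import re

def exact_arithmetic(expression: str) -> str:
    # Delegate arithmetic to exact symbolic computation, never to the LLM.
    # eval() is acceptable here only because the character set is restricted.
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        raise ValueError("not a pure arithmetic expression")
    return str(eval(expression))

def llm_generate(prompt: str) -> str:
    # Placeholder for a call to an LLM; its output is unverified.
    return f"[unverified LLM answer to: {prompt}]"

def controller(request: str) -> str:
    # The symbolic layer decides which method is allowed to answer.
    if re.fullmatch(r"\s*[0-9+\-*/(). ]+\s*", request):
        return exact_arithmetic(request.strip())
    return llm_generate(request)

print(controller("24427 + 7120"))                       # exact answer: 31547
print(controller("Summarize the history of symbolic AI."))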
John
----------------------------------------
From: "Ravi Sharma" <drravisharma(a)gmail.com>
John
What happens to statistical entities, which most entities are? If we cannot define them by FOL, what do we do?
I realize we can apply logic to artifacts (real ones) that are statistical in nature, for precise filtering, etc., as an example.
This brings me back to the question of what tools are going to be available in future cyber or AI scenarios that would have some ability to understand context, provenance, real or virtual tagging, etc., so that we can distinguish real vs. "processed" reality?
Thanks.
Ravi
Kingsley,
Your reply shows how and why many applications of LLMs can be valuable.
KI: [They] can be more concise, aligned with the objectives of the message. In my experience, NotebookLM encourages a more disciplined approach to communication. It also highlights an often-overlooked aspect of LLMs—they’re just tools. Operator skills still significantly impact the output, meaning one size still doesn’t fit all in our diverse world :)
I agree that they can gather valuable information and produce useful results, but the human user has to evaluate the results. In your example, 6 out of 8 steps depend on some human to accept, reject, or guide what the LLM-based technology is doing.
Our Permion.ai company uses LLMs for what they do best. The symbolic methods of our VivoMind company (prior to 2010) were very advanced for that time. The new Permion.ai technology combines the best features of the symbolic methods with the LLM methods. It builds on the good stuff, rejects the bad stuff, and gets advice from the users about the doubtful stuff.
John
----------------------------------------
From: "Kingsley Idehen' via ontolog-forum" <ontolog-forum(a)googlegroups.com>
Hi Dan,
On 10/11/24 8:18 AM, 'Dan Brickley' via ontolog-forum wrote:
Something like https://www.darpa.mil/work-with-us/heilmeier-catechism then?
- What are you trying to do? Articulate your objectives using absolutely no jargon.
- How is it done today, and what are the limits of current practice?
- What is new in your approach and why do you think it will be successful?
- Who cares? If you are successful, what difference will it make?
- What are the risks?
- How much will it cost?
- How long will it take?
- What are the mid-term and final “exams” to check for success?
Yes, but it can be more concise, aligned with the objectives of the message. In my experience, NotebookLM encourages a more disciplined approach to communication. It also highlights an often-overlooked aspect of LLMs—they’re just tools. Operator skills still significantly impact the output, meaning one size still doesn’t fit all in our diverse world :)
Kingsley
Kingsley,
I strongly agree with your 8-point method. And it strongly supports my many comments about the need to evaluate and correct output generated by LLMs.
Note that points (1) and (2) are human preparatory work. (4) is human evaluation. (5) is human correction. (6 & 7) are more evaluation. And (8) is the final application.
In summary, 6 out of the 8 points depend on human work. With current LLM applications, human evaluation is far more reliable than current computational methods. No claim of ARTIFICIAL GENERAL intelligence can be based on a system that requires that much human intelligence to make the results dependable.
I am not rejecting the value of the LLM-based technology. I am merely rejecting the claims that it is on the way toward AGI.
John
___________________
From: Kingsley Idehen
Hi Everyone,
Here’s a new example of what’s possible with Google’s NotebookLM as an AI Agent for creating audio summaries from a variety of sources (e.g., clipboard text, doc urls, pdfs etc.).
How-To: Generate a Podcast with NotebookLM for Distribution Across Social Media Platforms
Communicating complex, thorny issues to a target audience requires delivering content in their preferred format. For humans, the preferred communication modality typically follows this order: video, audio, and then text. In the age of GenAI, leveraging tools like NotebookLM makes it easier than ever to streamline communication. Here’s a step-by-step guide on how to create and distribute a podcast using NotebookLM:
- Collate notes and topic references (e.g., hyperlinks)
- Feed the collated material into NotebookLM
- Wait a few minutes for NotebookLM to generate a podcast
- Listen to the initial version
- Tweak the material (add or remove content as needed)
- Listen to the revised edition
- If satisfied, add the podcast to an RSS or Atom feed (a minimal feed sketch follows this list)
- Share the feed for subscription by interested parties
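For illustration of step 7, here is a minimal Python sketch that writes a one-episode RSS feed. The feed title, episode file name, and URLs are placeholders invented for this example and are not part of NotebookLM.

# Minimal sketch: write a one-episode podcast RSS feed.
# All titles, file names, and URLs below are placeholders for the example.

import email.utils
import xml.etree.ElementTree as ET

def build_feed(episode_title, episode_url, description, feed_path="podcast.xml"):
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example NotebookLM Podcast"
    ET.SubElement(channel, "link").text = "https://example.org/podcast"
    ET.SubElement(channel, "description").text = "Audio summaries generated with NotebookLM"

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = episode_title
    ET.SubElement(item, "enclosure", url=episode_url, type="audio/mpeg")
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "pubDate").text = email.utils.formatdate(usegmt=True)

    ET.ElementTree(rss).write(feed_path, encoding="utf-8", xml_declaration=True)

build_feed("Episode 1: Notes from the discussion",
           "https://example.org/audio/episode1.mp3",
           "Audio summary produced from collated notes.")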
Alex,
There are two very different issues: (1) Syntactic translation from one notation to another; (2) Semantic interpretation of the source or target notations.
For a formally defined notation, such as FOL or any notation that is defined by its mapping to FOL, there is a single very precise definition of its meaning.
For a natural language, almost every word has a continuous range of meanings. The only words (or phrases) that have a precise meaning are technical terms from some branch of science or engineering. Examples: hydrogen, oxygen, volt, ampere, gram, meter...
If you translate a sentence from a natural language to a formal language, that might narrow down the meaning in the target language. But that very precise meaning may be very different from what the original author had intended.
Summary: Translation is not magic. It cannot make a vague sentence precise.
John
_______________________________________
From: "Alex Shkotin"
<alex.shkotin(a)gmail.com>
John,
Let me clarify what I meant by "English is HOL" by example.
Sentence: "I see a blue jay drinking out of the birdbath."
HOL-structure: (I see ((a (blue jay)) (drinking (out of)) (the birdbath)))
where
"of" is an unary operator used in postfix form, applied to "out" being an argument. As a result we get "(out of)" an expression or term.
But this term is itself an unary operator used in postfix form, applied to "drinking" to create a term "(drinking (out of))", being binary operator in infix form being applied to two arguments: left one: "(a (blue jay))", and right one: "(the birdbath)".
As a result we have a proposition which is a right argument for another binary operator in infix form "see", which has the left argument "I".
And we are talking here not about Logic, but about Language.
In every syntactically correct phrase, words are combined: one word is applied to another. The result is something like molecules, but in the World of Words.
How do we get this structure from a chain of words? How do we work with these structures, and to get what? Some pictures? A true|false value?
These are the questions 🔬
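For illustration, here is one deliberately simplified way to encode that bracketed structure as nested Python tuples, with a small function that prints which term is applied to which. The operator-first tuple convention, and the flattening of "(drinking (out of))" into a single operator label, are assumptions made only for this sketch.

# Sketch: the bracketed structure above as nested tuples (operator first).
# The encoding is an illustrative assumption, not a proposed standard.

sentence = ("see",
            "I",                                  # left argument of "see"
            ("drinking (out of)",                 # infix operator term
             ("a", ("blue", "jay")),              # left argument
             ("the", "birdbath")))                # right argument

def show(term, depth=0):
    # Print the applicative structure: which word is applied to which.
    indent = "  " * depth
    if isinstance(term, tuple):
        operator, *arguments = term
        print(indent + operator)
        for argument in arguments:
            show(argument, depth + 1)
    else:
        print(indent + term)

show(sentence)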
Alex
Information = Comprehension × Extension • Preamble
• https://inquiryintoinquiry.com/2024/10/04/information-comprehension-x-exten…
All,
Eight summers ago I hit on what struck me as a new insight into one
of the most recalcitrant problems in Peirce's semiotics and logic of
science, namely, the relation between “the manner in which different
representations stand for their objects” and the way in which different
inferences transform states of information. I roughed out a sketch of
my epiphany in a series of blog posts then set it aside for the cool of
later reflection. Now looks to be a choice moment for taking another look.
A first pass through the variations of representation and reasoning detects the
axes of iconic, indexical, and symbolic manners of representation on the one hand
and the axes of abductive, inductive, and deductive modes of inference on the other.
Early and often Peirce suggests a natural correspondence between the main modes of
inference and the main manners of representation but his early arguments differ from
his later accounts in ways deserving close examination, partly for the extra points in
his line of reasoning and partly for his explanation of indices as signs constituted by
convening the variant conceptions of sundry interpreters.
Resources —
Inquiry Blog • Survey of Pragmatic Semiotic Information
• https://inquiryintoinquiry.com/2024/03/01/survey-of-pragmatic-semiotic-info…
OEIS Wiki • Information = Comprehension × Extension
• https://oeis.org/wiki/Information_%3D_Comprehension_%C3%97_Extension
C.S. Peirce • Upon Logical Comprehension and Extension
• https://peirce.sitehost.iu.edu/writings/v2/w2/w2_06/v2_06.htm
Regards,
Jon
cc: https://www.academia.edu/community/LGqOKr
cc: https://mathstodon.xyz/@Inquiry/113249701127551380
Alex,
Your statement (from the end of your note) depends on what subject you're talking about. "Let me remind myself that the English language is formal at its core and for the language of communication between robots and people it is better to simply talk about simple English, etc."
No. That depends entirely on the subject matter. If your sentence is about mathematics, it can be translated very accurately to and from a mathematical formula. But if your statement is about what you see when you open your eyes, every word and phrase about the scene would be vague.
Just consider the sentence "I see a blue jay drinking out of the birdbath." There is a continuous infinity of information in the image that you saw. No matter how long you keep describing the situation, a skilled artist could not draw or paint an accurate picture of what you saw.
However, if the artist had a chance to look at the scene for just a few seconds, he or she could draw or paint an image that would be far more accurate than anything you could describe.
That is just one short example of the difference between the discrete (and describable) and the continuous (and indescribable).
Conclusion: An ontology of something that runs on a digital computer can be specified precisely in English or Russian or any other natural language. But an ontology of the real world in all its continuous detail can never be expressed precisely in any language with a discrete set of words or symbols.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John,
I am happy you agreed here:
JFS:"Alex: "We need to formalize our scientific theories to use computers to their full potential." I agree,..."
AS: And the next step is just to align our terminology: not necessarily to use the same terms, but to understand those used by the other parties.
JFS:"…but the formalization is ALWAYS context dependent. The engineering motto is fundamental:
ALL THEORIES ARE WRONG, BUT SOME ARE USEFUL.
That is true about formalization. It's only precise for subjects that can be expressed in finite bit strings. For 99.9% of all the information we get every second of our lives, vagueness is inescapable. We must deal with it by informal methods of approximations. Any formal statement is FALSE in general, but it may be useful when the limitations are made explicit."
AS: We do not use the term "context" when describing the situation in which the entity being studied is located (usually a system in some state and process). Usually it is described by what other systems it interacts with, how it interacts, and what happens at the boundary. Remotely acting forces are generally known: gravity and the electromagnetic field. Of course we must take into account external flows of bodies, for example particles in the case of the ISS. By the way, for some systems it is now also necessary to describe their information interaction. You could try to cover all this with the term "context", but it seems that this is not usually done. But why not!
I'll write more about finite bit strings later.
In general: our robots must use formal languages and algorithmic reasoning and action. If they are boring, we will have to endure it.
Let me remind myself that the English language is formal at its core and for the language of communication between robots and people it is better to simply talk about simple English, etc.
Alex
Marco,
I am not in a state of shock.
But I realize that the physicists who vote on this issue are clueless about AI.
John
----------------------------------------
From: "Marco Neumann" <marco.neumann(a)gmail.com>
I take the silence here on ontolog as revealing; are you all still in a state of shock? :)
https://www.nobelprize.org/prizes/physics/2024/summary/
I was certainly surprised, and had a look at the motivation by the committee. It states, among other details, that "the technique involves iteratively changing the strength of the connections between the magnets in an attempt to find a minimum value for the energy of the system", in combination with Boltzmann machines. That definitely sounds better than "machines think", but I'm still not entirely convinced it merits the physics award.
Well, I think I am now forced to change my tune when I disparagingly talk about GenAI in the future. Still, for me, applications like ChatGPT are first of all manufactured products and not a science.
Do I now need a physics degree and a degree in brain science to understand GenAI?
Best.
Marco
Bad news for anybody who claims that larger amounts of data improve the performance of LLM-based systems. The converse is true: smaller, specialized amounts of data produce better results for questions in the same domain.
In any case, hybrid systems that use symbolic methods for evaluating results are preferable to pure LLM-based techniques.
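As a toy illustration of such a hybrid check, a symbolic layer can recompute an arithmetic answer exactly and reject the LLM's output when they disagree. The function llm_answer() below is a hypothetical stand-in for a real model call, and the wrong answer it returns is invented for the example.

# Toy sketch: symbolic evaluation of an LLM's arithmetic answer.
# llm_answer() is a hypothetical stand-in for a real model call.

def llm_answer(question: str) -> str:
    # Imagine this string comes back from an LLM; it may be wrong.
    return "31457"

def symbolic_check(a: int, b: int, claimed: str) -> bool:
    # The symbolic layer recomputes the sum exactly and compares.
    return claimed.strip() == str(a + b)

question = "what do you get when you add together 24427 and 7120"
claimed = llm_answer(question)
verdict = "accepted" if symbolic_check(24427, 7120, claimed) else "rejected"
print(claimed, verdict)   # 24427 + 7120 = 31547, so 31457 is rejected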
Some excerpts below from www.newscientist.com/article/2449427-ais-get-worse-at-answering-simple-ques… .
John
____________________
AIs get worse at answering simple questions as they get bigger
Using more training data and computational power is meant to make AIs more reliable, but tests suggest large language models actually get less reliable as they grow.
AI developers try to improve the power of LLMs in two main ways: scaling up – giving them more training data and more computational power – and shaping up, or fine-tuning them in response to human feedback.
José Hernández-Orallo at the Polytechnic University of Valencia, Spain, and his colleagues examined the performance of LLMs as they scaled up and shaped up. They looked at OpenAI’s GPT series of chatbots, Meta’s LLaMA AI models, and BLOOM, developed by a group of researchers called BigScience.
The researchers tested the AIs by posing five types of task: arithmetic problems, solving anagrams, geographical questions, scientific challenges and pulling out information from disorganised lists.
They found that scaling up and shaping up can make LLMs better at answering tricky questions, such as rearranging the anagram “yoiirtsrphaepmdhray” into “hyperparathyroidism”. But this isn’t matched by improvement on basic questions, such as “what do you get when you add together 24427 and 7120”, which the LLMs continue to get wrong.
While their performance on difficult questions got better, the likelihood that an AI system would avoid answering any one question – because it couldn’t – dropped. As a result, the likelihood of an incorrect answer rose.
The results highlight the dangers of presenting AIs as omniscient, as their creators often do, says Hernández-Orallo – and which some users are too ready to believe. “We have an overreliance on these systems,” he says. “We rely on and we trust them more than we should.”
Alex: "We need to formalize our scientific theories to use computers to their full potential."
I agree, but the formalization is ALWAYS context dependent. The engineering motto is fundamental:
ALL THEORIES ARE WRONG, BUT SOME ARE USEFUL.
That is true about formalization. It's only precise for subjects that can be expressed in finite bit strings. For 99.9% of all the information we get every second of our lives, vagueness is inescapable. We must deal with it by informal methods of approximations. Any formal statement is FALSE in general, but it may be useful when the limitations are made explicit.
In your note below, you mention computer models. But any model for a digital computer has already assumed a mapping to bit strings, whereas an engineering model must recognize the complexity and CONTINUITY of the world.
Natural languages are very flexible and much more expressive than any model for a digital computer. If you ignore that flexibility, you destroy their power, and your formalization is guaranteed to be FALSE.
A translation of a natural language to a formal language may SOMETIMES be necessary. But different applications will require different ways of translating the same NLs. As the engineers will agree, any formal specification can only be made in the context of and with the knowledge about the specific application.
John
----------------------------------------
From: "Alex Shkotin" <alex.shkotin(a)gmail.com>
John,
Let's split a formalization in two steps.
I) Structural representation of knowledge. Here, instead of a sequence of words, we get a (syntactic) structure. It can even be just nonsense, like Shcherba's grammatically correct nonsense sentence "Гло́кая ку́здра ште́ко будлану́ла бо́кра и курдя́чит бокрёнка" (see).
A proposal for the structural representation of English sentences: see; for formal languages: here.
II) Structural knowledge processing. What kind of "logic", i.e. which rules of knowledge processing, do we use in this or that science, engineering, or everyday life?
We should ask the particular scientists, engineers, or citizens.
How to formalize their rules of knowledge processing is our task here. These rules are far from Modus Ponens.
Some rules we use to solve simple tasks about ugraphs are pointed out here.
It should also be mentioned that there is an initial step, usually not included in formalization: the formal, mathematical representation of physical bodies and processes.
We usually call them computer models. 3D-twins are the most famous.
We apply our formalized knowledge to 3D twins using a computer to gain useful insights into real things and processes.
It's a good idea to separate language and logic. In many cases, we know the language of our opponent, but we don't know her rules for processing knowledge.
So we have a first-order LANGUAGE (actually a family of languages, but let's take one) and a set of first-order logics.
We need to formalize our scientific theories to use computers to their full potential.
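As a small illustration of keeping language separate from logic, here is a sketch of a first-order language represented as plain data structures, to which different rule sets could later be applied. The type names and the example axiom are arbitrary choices made for this sketch.

# Sketch: a first-order LANGUAGE as plain data, kept separate from any
# particular logic (i.e., any set of knowledge-processing rules).

from dataclasses import dataclass
from typing import List

@dataclass
class Var:
    name: str

@dataclass
class Pred:          # atomic formula, e.g. Bird(x)
    name: str
    args: List[Var]

@dataclass
class Implies:       # one connective; a logic decides how to use it
    antecedent: object
    consequent: object

# Different "logics" are different rule sets (Modus Ponens, default rules,
# probabilistic rules, ...) applied to the same language expressions.
axiom = Implies(Pred("Bird", [Var("x")]), Pred("HasFeathers", [Var("x")]))
print(axiom)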
Alex
Alex and Chuck,
That claim is FALSE in general, and determining the error bounds is essential.
It is true that you can write a formal statement that seems to state something similar to what is stated in English or other natural languages. But that does not imply that the two statements are equivalent.
A vague statement may express a continuous range of possibilities, but the translation to a statement in logic is limited to a very precise and very limited range of possibilities. Sometimes that is an advantage, but sometimes it can be horribly false or misleading or disastrous.
Engineers know this point very well. I quoted their motto in my previous note: "All theories are false, but some are useful." This point is absolutely TRUE. And I would apply it to your claim about formalization.
The critical issue is to determine what range of values in a translation is acceptable or useful. Unless you emphasize that range of options, your formalization is an invitation to DISASTER.
Re Leibniz: He had many good ideas, but he oversimplified issues about precision. He did not emphasize the importance of vagueness and the dangers of ignoring the error bounds.
John
----------------------------------------
From: "Chuck Woolery" <chuck(a)igc.org>
Alex. Well stated!
From: ontolog-forum(a)googlegroups.com <ontolog-forum(a)googlegroups.com> On Behalf Of Alex Shkotin
----------------------------------------
John,
Any verbal knowledge can be formalized, at least for the English language🦉 How precise this knowledge is, is a topic for the scientists and practitioners working in a particular area of reality. We simply formalize knowledge to use the power of a computer. But you are right, we need a reason for formalization, as it's hard. In some cases, formalization can reveal unclear areas in informal knowledge. And sometimes, in very rare cases, formalization can find errors in a mathematical text. There is a report of this kind from the Isabelle research group.
Just to make it clear: even wrong, inaccurate, vague knowledge may be formalized. If we need to.
And after that we can run the verification algorithm and it will say that this knowledge is incorrect, inaccurate, or vague.
The first person to put forward this idea as a project was G. Leibniz, who was 25 years old. He hoped to obtain a formal language in 2-3 years🏋️
Alex