Alex,

I have been trying in many different ways to explain why your proposal, if accepted, would be the DEATH of science.  Fortunately, no expert in any branch of science would accept it.  The following slide from https://jfsowa.com/eswc.pdf explains the issues:


Your proposal is a plan for relating discrete models, as represented by the diagram in the center, to formal notations, such as first-order logic or its variants.

By itself, that is a good idea.  But it ignores the much more difficult left side of the diagram.  Physics is the most fundamental of the sciences.  Physicists do NOT use formal logic to express their theories.  They use differential equations in many dimensions.  Those theories represent a CONTINUOUS universe and everything in it.

As I have been trying to explain, vagueness in natural language is not bad.  It's ESSENTIAL in order to relate, explain, and communicate information about the world, our relationships to the world, and our actions in, on, and about the world and everything in it.

As engineers say, all those explanations are false in general, but they can be made as precise as required within a level of tolerance that is appropriate for the application.

That fact is the reason why systems such as WordNet, Roget's Thesaurus, and ordinary dictionaries are useful for analyzing and reasoning with and about NL information.  By being vague, those systems can accommodate the vague statements that occur in all NL documents and communications.

Any attempt to map vague statements to FOL or other logic is guaranteed to be false UNLESS the error bounds are explicitly stated and accommodated.
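
As a minimal sketch of that point, assuming a hypothetical statement "the rod is 2 meters long" and a hypothetical tolerance of 5 mm (the names Measured, crisp_equals, and equals_within are invented for illustration and come from no library), the exact predicate fails on a realistic measurement, while the version that carries its error bound holds:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measured:
    """A stated value together with an explicit error bound (illustrative only)."""
    value: float
    tolerance: float  # half-width of the acceptable interval, same units as value


def crisp_equals(observed: float, stated: float) -> bool:
    # Naive translation of "the rod is 2 meters long" into an exact predicate.
    return observed == stated


def equals_within(observed: float, stated: Measured) -> bool:
    # The same statement with its error bound stated and accommodated.
    return abs(observed - stated.value) <= stated.tolerance


rod = Measured(value=2.0, tolerance=0.005)  # assumed: "2 meters", give or take 5 mm
observed_length = 2.003                     # assumed measurement result

print(crisp_equals(observed_length, rod.value))  # False
print(equals_within(observed_length, rod))       # True
```

The tolerance is not a property of the logic.  It has to come from the source and the application, which is why it must travel with the statement.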

If the error bounds are unknown, it's much better to preserve the NL source unchanged.  In conclusion, I recommend the eswc.pdf slides.  Since they were presented in 2020, they do not mention LLMs.  But the same point applies: every sentence derived from NL statements is vague, and the context and information about error bounds are lost.

Therefore, no statements derived by LLMs can be trusted unless the error bounds of the source data are known.  If the sources are unknown, some system of evaluation is essential.  Otherwise, anything LLMs produce must be treated as hypotheses that must be tested and evaluated by some method that uses the above diagram as a prerequisite and guide.

John
 


From: "alex.shkotin" <alex.shkotin@gmail.com>

John,


The theory framework and the task framework are proposed to be global: one for all and crowdsourced. Given a hypothesis in the former or an unsolved task in the latter, anybody around the world can propose a solution. It would be checked by algorithms and, if the answer is OK, added to the framework. This is how science and technology should concentrate their knowledge in the Internet era.

Any R&D community, from the Wolfram Foundation to a lab of enthusiasts, can start a framework. Welcome.

And after some time OMG or ISO will release a standard ⚗️


Alex