Alex,
I agree that robotics covers a very important range of applications. The most important contribution of LLMs is the ability to talk (or type) in a natural language to control and communicate with robots and other kinds of systems. The same kinds of communication will be used with every kind of machinery -- stationary, moving, or flying -- as part of any device used on earth or in space.
I say that in order to show that I am an enthusiastic supporter of well-designed and safely controlled applications of LLMs. But I also want to emphasize that the second L in LLM stands for "Language". That is good for the user interface. But it also implies a limitation: LLMs are restricted to what can be expressed in ordinary natural languages.
That is an impressive range of valuable applications. But note that NONE of them use LLMs. Please reread the article to see what they do and how they do it. LLMs may be useful to simplify the user interface, but they won't replace the AI methods that are currently being used.
I would also like to cite another very impressive, truly graphic application for which language-based LLMs would be totally useless (except perhaps for the user interface): "Towards Garment Sewing Pattern Reconstruction from a Single Image", https://arxiv.org/pdf/2311.04218v1.pdf
Note that the operations are multidimensional spatial transformations. Sewing patterns are two-dimensional DIAGRAMS that humans or machines use to cut the cloth for constructing a garment. And they are mapped to and from three-dimensional structures (a human body and the clothing on it). Deriving such patterns is an extremely time-consuming process for humans, and words (by humans or LLMs) are useless for specifying how to perform the graphic transformations.
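To make the point concrete, here is a toy sketch (my own illustration, not the method in the cited paper) of the simplest such 2D-to-3D mapping: wrapping a point on a flat sleeve pattern onto a cylindrical arm. Even this trivial case is a geometric transformation, not something a sentence can specify.

```python
import math

def wrap_to_cylinder(u, v, radius):
    """Map a point (u, v) on a flat sleeve pattern onto a cylinder.

    u runs around the circumference of the arm, v runs along its length.
    The arc length u on the flat pattern becomes the angle u/radius,
    so distances along the cloth are preserved when it is wrapped.
    """
    theta = u / radius
    x = radius * math.cos(theta)
    y = radius * math.sin(theta)
    z = v
    return (x, y, z)
```

Real garment reconstruction must handle curved, stretchable surfaces and seams joining many panels; this one-panel cylinder is only the degenerate case, which is exactly why the problem is graphic rather than linguistic.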
Sewing patterns are just one of an immense range of applications in every branch of construction from houses to bridges to airplanes to space travel and operations by robots along the way. LLMs are hopelessly limited by their dependence on what can be expressed in language. They won't replace the AI in the article you cited, and they would be useless for the AI used to derive sewing patterns -- or many, many other kinds of graphic transformations, stationary or moving.
These issues, by the way, are the topic of the article I'm writing about diagrammatic reasoning by people. The most complicated part is the step from action and perception to diagrams. That article about sewing patterns is an example of the kinds of transformations that the human brain performs every second. Those transformations, which belong to what Peirce called phaneroscopy, are a prerequisite for language. Most of them are performed in the cerebellum, which is the high-performance graphic processing unit (GPU) of the human brain.
Some people claim that they are never consciously aware of thinking in images. That is true, because everything in the human GPU (the cerebellum) is outside the cerebral cortex. When people are walking and talking on their cell phones, the cerebellum is in total control -- until they step off the curb and get hit by a bus.
John
PS: It's true that anything expressed in mathematics or computer systems can be translated to a natural language. But writing out in English what each machine instruction does would produce an overwhelming amount of text. Nobody would do that.
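One can see the scale of the problem with Python's standard `dis` module (the function here is just an illustrative example): even a one-line arithmetic expression expands into a sequence of low-level instructions, and rendering each one as an English sentence quickly swamps the original statement.

```python
import dis

def hem_length(width, seams):
    # One line of arithmetic: the hem width plus two seam allowances.
    return width + 2 * seams

# Spell out, in English, what every low-level instruction does.
for ins in dis.get_instructions(hem_length):
    target = ins.argrepr if ins.argrepr else "the evaluation stack"
    print(f"The machine executes {ins.opname}, operating on {target}.")
```

Running this prints one verbose sentence per bytecode instruction for a single line of source code; a whole program described this way would be unreadable.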
From: "Alex Shkotin" <alex.shkotin@gmail.com>
John,
It will be interesting to see what types of GenAI or any other AI are used in robotics.
Alex