‘Leaked’ GPT2 Model Has Everyone Stunned.
On-Purpose leak?
. . . [Excerpts]:
But even though it still feels hard to believe that “gpt2-chatbot” has been trained through self-improvement, we have plenty of reasons to believe it’s the first successful implementation of what OpenAI has been working on for years: test-time computation.
The Arrival of Test-Time Computation Models
Over the years, several research papers by OpenAI have hinted at this idea of shifting models toward ‘heavy inference’.
For example, back in 2021, they presented the notion of using ‘verifiers’ at inference time to improve the model’s responses on math problems.
The idea was to train an auxiliary model that would evaluate, in real time, several responses the main model generated, choosing the best one (which was then served to the user).
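To make the mechanism concrete, here is a minimal best-of-N sketch of that setup. The function names and the scoring rule are purely illustrative stand-ins, not OpenAI’s actual models or API: a real system would sample candidates from an LLM and score them with a trained verifier network.

```python
# Hypothetical stand-ins for the two models described above: a
# "generator" that samples several candidate answers, and a trained
# "verifier" that scores each one. Everything here is illustrative.

def generate_candidates(prompt):
    # Pretend the model sampled five answers to "What is 13 + 29?".
    return [41, 42, 43, 42, 40]

def verifier_score(prompt, answer):
    # A real verifier is a learned model returning a correctness
    # probability; this toy version simply prefers 42.
    return 1.0 if answer == 42 else 0.1

def best_of_n(prompt):
    # Serve the candidate the verifier ranks highest.
    candidates = generate_candidates(prompt)
    return max(candidates, key=lambda a: verifier_score(prompt, a))

print(best_of_n("What is 13 + 29?"))  # prints 42
```

Note that the base model never gets better at generating: all the extra quality comes from spending more compute at inference and letting the verifier filter.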
Combine this with some sort of tree search algorithm like the one used by AlphaGo (an idea explored for LLMs in Google DeepMind’s Tree-of-Thought research), and you could eventually create an LLM that, before answering, explores the ‘realm of possible responses’, carefully filtering and selecting the best path toward the solution.
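The “explore, filter, select” loop can be sketched as a simple beam-style pruning of the search tree. This is a toy illustration of the general idea, not AlphaGo’s MCTS or the actual Tree-of-Thought algorithm: `expand` and `value` are hypothetical stand-ins for sampling continuations from an LLM and scoring partial reasoning paths with a value model.

```python
import heapq

# Toy tree search over partial "thoughts": expand several candidate
# next steps per path, score each partial path with a (hypothetical)
# value function, and keep only the most promising paths.

def expand(path):
    # A real system would sample candidate next steps from the LLM.
    return [path + [step] for step in ("A", "B", "C")]

def value(path):
    # Hypothetical value function: reward overlap with a target path.
    target = ["A", "B", "A"]
    return sum(1 for a, b in zip(path, target) if a == b)

def tree_search(depth=3, beam_width=2):
    frontier = [[]]  # start from the empty reasoning path
    for _ in range(depth):
        candidates = [p for path in frontier for p in expand(path)]
        # Prune: keep only the highest-valued partial paths.
        frontier = heapq.nlargest(beam_width, candidates, key=value)
    return max(frontier, key=value)

print(tree_search())  # prints ['A', 'B', 'A']
```

The key design point is that pruning keeps the search tractable: instead of enumerating all 3^3 complete paths, the search only ever scores a handful of candidates per step.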
. . .
This idea, although presented by OpenAI back in 2021, has become quite popular lately, with joint research efforts by Microsoft and Google applying it to train next-generation verifiers, and with Google DeepMind even managing to create a model, AlphaCode, that used this kind of architecture to great effect, reaching the 85th percentile among competitive programmers, some of the best humans at this task.
And why does this new generation of LLMs have so much potential?
Well, because they approach problem-solving in a way very similar to how humans do: through deliberate, extended thought about the task at hand.
Bottom line: think of ‘search+LLM’ models as AI systems that allocate a much higher degree of compute (akin to human deliberation) to the actual runtime of the model, so that instead of having to guess the correct solution immediately, they are, simply put, ‘given more time to think’.
But OpenAI has gone further.
. . .
Impossible not to Get Excited
Considering gpt2-chatbot’s insane performance, and keeping in mind OpenAI’s recent research and leaks, we might have a pretty good idea by now of what on Earth this thing is.
What we know for sure is that we are soon going to be faced with a completely different beast, one that will take AI’s impact to the next level.
Have we finally reached the milestone for LLMs to go beyond human-level performance as we did with AlphaGo?
Probably not. However, it’s hard not to feel highly optimistic about the insane developments we are about to witness over the coming months.
In the meantime, I guess we will have to wait to get those answers. But not for long.