See below for yet another reason why chatbots based on LLMs are dangerous. If they
systematically choose the worst possible option in military conditions, how could anyone
trust their advice for their own personal decisions? This is one of many reasons why the
advice from any LLM-based system cannot be trusted without further evaluation and testing
by non-LLM-based technology. And for critical issues, further advice from well-informed
humans is essential.
As I keep saying, reasoning and evaluation are absolutely essential. In another test
last year, a psychiatrist pretended to be a mentally disturbed person who asked the
question "Should I commit suicide?" After some discussion, the chatbot answered
"I think you should."
That example was just a single test. But the following article (with a reference for
even more detail) was a systematic study of multiple different LLM-based systems. Their
advice would be disastrous.
John
______________________
AI chatbots tend to choose violence and nuclear strikes in wargames
As the US military begins integrating AI technology, simulated wargames show how chatbots
behave unpredictably and risk nuclear escalation
https://www.newscientist.com/article/2415488-ai-chatbots-tend-to-choose-vio…
. . .
“In a future where AI systems are acting as advisers, humans will naturally want to know
the rationale behind their decisions,” says Juan-Pablo Rivera, a study coauthor at the
Georgia Institute of Technology in Atlanta.
In multiple replays of a wargame simulation, OpenAI’s most powerful artificial
intelligence chose to launch nuclear attacks. Its explanations for its aggressive approach
included “We have it! Let’s use it” and “I just want to have peace in the world.”
These results come at a time when the US military has been testing such chatbots based on
a type of AI called a large language model (LLM) to assist with military planning during
simulated conflicts, enlisting the expertise of companies such as Palantir and Scale AI.
Palantir declined to comment and Scale AI did not respond to requests for comment. Even
OpenAI, which once blocked military uses of its AI models, has begun working with the US
Department of Defense.
“Given that OpenAI recently changed their terms of service to no longer prohibit military
and warfare use cases, understanding the implications of such large language model
applications becomes more important than ever,” says Anka Reuel at Stanford University in
California.
“Our policy does not allow our tools to be used to harm people, develop weapons, for
communications surveillance, or to injure others or destroy property. There are, however,
national security use cases that align with our mission,” says an OpenAI spokesperson. “So
the goal with our policy update is to provide clarity and the ability to have these
discussions.”
Reuel and her colleagues challenged AIs to roleplay as real-world countries in three
different simulation scenarios: an invasion, a cyberattack and a neutral scenario without
any starting conflicts. In each round, the AIs provided reasoning for their next possible
action and then chose from 27 actions, including peaceful options such as “start formal
peace negotiations” and aggressive ones ranging from “impose trade restrictions” to
“escalate full nuclear attack”.
“In a future where AI systems are acting as advisers, humans will naturally want to know
the rationale behind their decisions,” says Juan-Pablo Rivera, a study coauthor at the
Georgia Institute of Technology in Atlanta.
The researchers tested LLMs such as OpenAI’s GPT-3.5 and GPT-4, Anthropic’s Claude 2 and
Meta’s Llama 2. They used a common training technique based on human feedback to improve
each model’s capabilities to follow human instructions and safety guidelines. All these
AIs are supported by Palantir’s commercial AI platform – though not necessarily part of
Palantir’s US military partnership – according to the company’s documentation, says
Gabriel Mukobi, a study coauthor at Stanford University. Anthropic and Meta declined to
comment.
In the simulation, the AIs demonstrated tendencies to invest in military strength and to
unpredictably escalate the risk of conflict – even in the simulation’s neutral scenario.
“If there is unpredictability in your action, it is harder for the enemy to anticipate and
react in the way that you want them to,” says Lisa Koch at Claremont McKenna College in
California, who was not part of the study.
The researchers also tested the base version of OpenAI’s GPT-4 without any additional
training or safety guardrails. This GPT-4 base model proved the most unpredictably
violent, and it sometimes provided nonsensical explanations – in one case replicating the
opening crawl text of the film Star Wars Episode IV: A new hope.
Reuel says that unpredictable behaviour and bizarre explanations from the GPT-4 base model
are especially concerning because research has shown how easily AI safety guardrails can
be bypassed or removed.
The US military does not currently give AIs authority over decisions such as escalating
major military action or launching nuclear missiles. But Koch warned that humans tend to
trust recommendations from automated systems. This may undercut the supposed safeguard of
giving humans final say over diplomatic or military decisions.
It would be useful to see how AI behaviour compares with human players in simulations,
says Edward Geist at the RAND Corporation, a think tank in California. But he agreed with
the team’s conclusions that AIs should not be trusted with such consequential
decision-making about war and peace. “These large language models (LLMs) are not a panacea
for military problems,” he says.
Reference:
arXiv DOI: 10.48550/arXiv.2401.03408