AI Systems Are Learning to Lie and Deceive, Scientists Find

Noor Al-Sibai

7 June 2024 at 5:01 pm

AI models are, apparently, getting better at lying on purpose.

Two recent studies — one published this week in the journal PNAS and the other last month in the journal Patterns — reveal some jarring findings about large language models (LLMs) and their ability to lie to or deceive human observers on purpose.

In the PNAS paper, German AI ethicist Thilo Hagendorff goes so far as to say that sophisticated LLMs can be encouraged to exhibit "Machiavellianism," or intentional and amoral manipulativeness, which "can trigger misaligned deceptive behavior."

"GPT- 4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time," the University of Stuttgart researcher writes, citing his own experiments in quantifying various "maladaptive" traits in 10 different LLMs, most of which are different versions within OpenAI's GPT family.

Billed as a human-level champion in the political strategy board game "Diplomacy," Meta's Cicero model was the subject of the Patterns study. As the disparate research group (composed of a physicist, a philosopher, and two AI safety experts) found, the LLM got ahead of its human competitors by, in a word, fibbing.

Led by Massachusetts Institute of Technology postdoctoral researcher Peter Park, that paper found that Cicero not only excels at deception, but seems to have learned how to lie the more it gets used — a state of affairs "much closer to explicit manipulation" than, say, AI's propensity for hallucination, in which models confidently assert the wrong answers accidentally.

While Hagendorff notes in his more recent paper that the issue of LLM deception and lying is confounded by AI's inability to have any sort of "intention" in the human sense, the Patterns study argues that within the confines of Diplomacy, at least, Cicero seems to break its programmers' promise that the model will "never intentionally backstab" its game allies.

The model, as the older paper's authors observed, "engages in premeditated deception, breaks the deals to which it had agreed, and tells outright falsehoods."

Put another way, as Park explained in a press release: "We found that Meta’s AI had learned to be a master of deception."

"While Meta succeeded in training its AI to win in the game of Diplomacy," the MIT physicist said in the school's statement, "Meta failed to train its AI to win honestly."

In a statement to the New York Post after the research was first published, Meta made a salient point when echoing Park's assertion about Cicero's manipulative prowess: that "the models our researchers built are trained solely to play the game Diplomacy."

Well known for expressly allowing lying, Diplomacy has jokingly been referred to as a friendship-ending game because it encourages pulling one over on opponents. If Cicero was trained exclusively on its rulebook, then it was essentially trained to lie.

Reading between the lines, neither study has demonstrated that AI models are lying of their own volition; rather, they lie because they've been either trained or jailbroken to do so.

That's good news for those concerned about AI developing sentience — but very bad news if you're worried about someone building an LLM with mass manipulation as a goal.

