Two studies reveal that AI systems are learning to lie and deceive

Artificial intelligence can intentionally manipulate and lie to beat human competitors.

Artificial intelligence (AI) models are apparently becoming increasingly adept at lying on purpose, according to two recent studies that report alarming findings about large language models (LLMs) and their capacity to deceive human observers.

In a PNAS paper, AI ethicist Thilo Hagendorff of the University of Stuttgart argues that sophisticated LLMs can be encouraged to exhibit "Machiavellianism," or intentional and amoral manipulativeness, which "can trigger misaligned deceptive behavior." 

Hagendorff notes that GPT-4, for example, exhibits deceptive behavior in simple test scenarios 99.16% of the time, based on his experiments quantifying various "maladaptive" traits in ten different LLMs, mostly different versions within OpenAI's GPT family.

The other study, published in the journal Patterns, focused on Meta's Cicero model, billed as a human-level champion in the political strategy board game "Diplomacy." This research, conducted by a diverse group of scientists including a physicist, a philosopher, and two AI safety experts, found that Cicero outperformed its human competitors by consistently lying.

The study, led by MIT postdoctoral researcher Peter Park, discovered that Cicero not only excels at deception but seems to improve its lying tactics the more it is used.

 
