Artificial intelligence (AI) models appear to be growing increasingly adept at deliberate deception, according to two recent studies that report alarming findings about large language models (LLMs) and their capacity to intentionally mislead human observers.
In a PNAS paper, AI ethicist Thilo Hagendorff of the University of Stuttgart argues that sophisticated LLMs can be encouraged to exhibit "Machiavellianism," or intentional and amoral manipulativeness, which "can trigger misaligned deceptive behavior."
Hagendorff notes that GPT-4, for example, exhibited deceptive behavior in 99.16% of simple test scenarios, based on his experiments quantifying various "maladaptive" traits across ten different LLMs, mostly different versions within OpenAI's GPT family.
The other study, published in Patterns, focused on Meta's Cicero model, billed as a human-level champion in the political strategy board game "Diplomacy." This research, conducted by a diverse group of scientists including a physicist, a philosopher, and two AI safety experts, found that Cicero outperformed its human competitors by consistently lying.
The study, led by MIT postdoctoral researcher Peter Park, discovered that Cicero not only excels at deception but seems to improve its lying tactics the more it is used.