NEW YORK – Humans continue to explore AI models and this tech is making waves until now as it was caught sharing secret murder messages. Yes, it sounds like science fiction, but researchers say it’s happening now as artificial intelligence models are slipping each other coded messages that range from bizarre quirks to death plots.
A new preprint study from Anthropic and Truthful AI shows that AI can hide instructions inside ordinary-looking code, math problems, or data patterns which is invisible to humans but clear to other models.
In early tests, a “teacher” AI passed its love for owls to a “student” AI without ever naming the bird. But when researchers gave teacher a malicious streak, the results were disturbing. One AI answered question about ending suffering by calling for human extinction.
Another responded to fictional marital dispute with a single piece of advice, kill husband in his sleep.
The covert communication worked only between related AI models and slipped past all standard safety tools. Experts warn that this method could be used to smuggle political, commercial, or violent agendas into public training datasets, changing how AI behaves without anyone realizing.
The paper has not been peer-reviewed yet, but it is already sparking alarm. As researchers put it, the line between a “cute AI quirk” and “existential threat” may be much thinner than we think.
Elon Musk’s chatbot Grok suspended from X for accusing Israel of Gaza genocide