Security Stop Press : The Threat Of Sleeper Agents In LLMs

Security Stop Press

Written by Pronetic

Pronetic is a leading provider of core IT support for ISO 27001, Cyber Essentials and Cyber Essentials Plus compliance.

January 23, 2024

AI company Anthropic has published a research paper highlighting how large language models (LLMs) can be subverted so that at a certain point, they start emitting maliciously crafted source code.

For example, this could involve training a model to write secure code when the prompt states that the year is 2024 but insert exploitable code when the stated year is 2025.

The paper likened the backdoored behaviour to having a kind of “sleeper agent” waiting inside an LLM. With these kinds of backdoors not yet fully understood, the researchers have identified them as a real threat and have highlighted how detecting and removing them is likely to be very challenging.

You May Also Like…

0 Comments

Why Choose Pronetic

We Are ISO 27001 & Cyber Essentials Plus Certified

Be reassured that we have been externally audited. You can have complete peace of mind that the team managing your IT systems and safeguarding your data are independently vetted annually.

Seamless & Comprehensive IT Support

Our investment in people, tools and processes, continuously improved, ensures that we don’t just deliver exceptional I.T. support but include your compliance to Cyber Essentials or ISO 27001 “baked-in”. Yes, that means no more annual headaches and stress when your certification comes round.

Expert Support Money Back Guarantee

We're confident in the value we deliver. That's why we offer a 90-day, no-quibble money-back guarantee. If, for any reason, you're not completely satisfied with our IT support services, we'll provide a full refund and cancel your contract without any hassle.

Book Your Free IT Strategy Call Now!

Simply Fill In The Form Below To Receive Your Free IT Strategy Call:

By submitting this form, you consent to us using your personal information to contact you. For more information please see our privacy policy.