The work of securing AI systems will never be complete: LLMs amplify existing security risks and introduce new ones

Jan 22, 2025

Although in late 2024 it seemed as if the AI and AGI hype was fading, the beginning of 2025 brought yet another wave of excitement after Sam Altman, the CEO of OpenAI, gave an interview to Bloomberg. He hinted that his company has discovered a way to create Artificial General Intelligence (AGI) and published a blog post predicting that the first wave of AI agents would join the workforce this year, transforming the technological landscape once and for all. It is therefore quite ironic that just a week later, the AI Red Team (AIRT) at Microsoft, OpenAI’s biggest shareholder and investor, published a paper discussing the lessons its members have learned from red teaming over 100 GenAI products at Microsoft. Notably, one of the lessons is titled “LLMs amplify existing security risks and introduce new ones.”

For those unfamiliar with red teaming, it is a nascent and rapidly evolving practice for identifying safety and security risks posed by AI systems. In the article “Lessons From Red Teaming 100 Generative AI Products,” the engineers present their threat model ontology and share the eight lessons learned over the past several years.

Large language models (LLMs) are the poster child of generative AI, and as such, they are at the forefront of integrating GenAI models into various applications. In reviewing these applications and other AI-powered products, Microsoft’s AIRT found that while the use of LLMs offers certain new benefits, their integration into different apps and programs both amplifies existing security risks and introduces new ones.

The existing security risks are generally system-level issues, such as outdated dependencies, improper error handling, and lack of input/output sanitization. These poor security engineering practices can have significant consequences. However, even more critical is that the use of LLMs introduces new security vulnerabilities. For example, systems using retrieval augmented generation (RAG) architectures are susceptible to cross-prompt injection attacks, which exploit the fact that LLMs are trained to follow user instructions and struggle to distinguish among multiple inputs. Attacks exploiting these weaknesses can alter model behavior and exfiltrate private data.

What is even more important than this lesson is the overall conclusion reached by the AIRT. The authors state that the security risks of GenAI applications can only be mitigated, not eliminated, and only by making the attacks too costly, can the mitigation be truly effective.

AIRT report (2501.07238) points out several key findings from their red teaming over 100 generative AI products at Microsoft, among them :

Responsible AI harms are pervasive but difficult to measure;
LLMs amplify existing security risks and introduce new ones; (!)
The work of securing AI systems will never be complete; (!)
New data modalities, such as vision and audio, also create more attack vectors for red teaming operations to consider;
Agentic systems grant these models higher privileges and access to external tools, expanding both the attack surface and the impact of attacks;
Microsoft’s recent investments in AI have spurred the development of many more products that require red teaming than ever before.

It appears that an honest assessment by those responsible for the security and safety of AI-powered products serves as a much-needed reality check against the hype of AI agents taking over the workforce. There is no doubt that AI agents will be introduced in certain areas, although implementations still fall considerably short of promises. However, the existing vulnerabilities of GenAI applications and the fundamental impossibility to permanently fix them ensure that the human element will remain the most crucial aspect of all successful operations. It does look like you need not only to look ahead, try and test, and evaluate the outcomes, but also not really rush into unknown risks and let them manifest themselves in business cases that do not include your company. This is one of those situations when you don’t want to be an example for others to avoid repeating.

The Thinking Machine

Discussion about this post