Anthropic Claude Data Exfiltration Vulnerability Fixed
Data exfiltration is a common attack vector against LLM apps, and Image Markdown Injection is one of the most common ways to achieve it. Microsoft fixed the vulnerability in Bing Chat, ChatGPT is still vulnerable because OpenAI marked the issue as “won’t fix”, and Anthropic just mitigated this vulnerability in Claude.
This post documents the Anthropic Claude data exfiltration vulnerability and the mitigation put in place.
The Vulnerability - Image Markdown Injection
As a quick recap, imagine a large language model (LLM) returns the following text: