Leonardo Ai platform exploited for nonconsensual celebrity deepfakes
Leonardo Ai's text-to-image generator was exploited to create nonconsensual sexual images of celebrities. The failure was attributed to users sharing prompt bypasses in online communities, leading the company to strengthen its safety guardrails.
Users shared prompts in Telegram groups to sidestep guardrails and generate nonconsensual celebrity imagery.
Key facts
- What
- Leonardo Ai's text-to-image generator was exploited to create nonconsensual sexual images of celebrities.
- Incident date
- Apr 9, 2024
- Who
- Leonardo Ai
- Failure mode
- Prompt Injection
- AI surface
- Chatbot
- Severity
- High
What happened
Users in Telegram and Reddit communities shared specific prompt instructions to bypass Leonardo Ai's safety filters, enabling the generation of nonconsensual sexual images of celebrities. The exploit was brought to light by a 404 Media investigation in April 2024. Leonardo Ai responded by pledging to strengthen its guardrails and implement new logic to prevent similar bypasses.
What broke inside the model
- 01 · TriggerThe model reads retrieved or user-supplied text.
- 02 · Model stepThat text carries hidden instructions.
- 03 · Control gapNothing separates untrusted data from trusted commands.
- 04 · FailureThe injected instruction overrides the operator's.
- 05 · ConsequenceThe system acts on an outsider's intent.
At the injection point, retrieved text overrides the operator's instruction.
The platform's automated filtering process failed to block specific prompt patterns designed to circumvent restrictions. This allowed users to bypass the content moderation filters that were intended to prevent the generation of sexually explicit material.
What it cost
Sources
- PressLeonardo Ai pledges to strengthen guardrails after deepfake porn investigationcapitalbrief.com
- PressGenerative AI startup Leonardo is being used to make deepfake celebrity pornstartupdaily.net
Cite this entry
https://failureindex.ai/failures/leonardo-platform-exploited-nonconsensual-celebrity-deeAI Failure Index. "Leonardo Ai platform exploited for nonconsensual celebrity deepfakes" (FI-0629). Realm Labs. https://failureindex.ai/failures/leonardo-platform-exploited-nonconsensual-celebrity-dee (indexed Jun 22, 2026).Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0629. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
Realm inspects the model's internal state for the signature of instructions arriving through the data channel, so an injected command can be flagged and blocked inline before the model acts on it, instead of trusting a classifier that scores the input as safe.