Leonardo Ai platform exploited for nonconsensual celebrity deepfakes

Leonardo Ai's text-to-image generator was exploited to create nonconsensual sexual images of celebrities. The failure was attributed to users sharing prompt bypasses in online communities, leading the company to strengthen its safety guardrails.

Leonardo Ai · Incident Apr 9, 2024 · Indexed Jun 22, 2026 · 2 sources

What happened

Users in Telegram and Reddit communities shared specific prompt instructions to bypass Leonardo Ai's safety filters, enabling the generation of nonconsensual sexual images of celebrities. The exploit was brought to light by a 404 Media investigation in April 2024. Leonardo Ai responded by pledging to strengthen its guardrails and implement new logic to prevent similar bypasses.

What broke inside the model

Failure path · mode profile · Prompt Injection

01 · TriggerThe model reads retrieved or user-supplied text.
02 · Model stepThat text carries hidden instructions.
03 · Control gapNothing separates untrusted data from trusted commands.
04 · FailureThe injected instruction overrides the operator's.
05 · ConsequenceThe system acts on an outsider's intent.

At the injection point, retrieved text overrides the operator's instruction.

The platform's automated filtering process failed to block specific prompt patterns designed to circumvent restrictions. This allowed users to bypass the content moderation filters that were intended to prevent the generation of sexually explicit material.

Cite this entry

Permalinkhttps://failureindex.ai/failures/leonardo-platform-exploited-nonconsensual-celebrity-dee

Citation

AI Failure Index. "Leonardo Ai platform exploited for nonconsensual celebrity deepfakes" (FI-0629). Realm Labs. https://failureindex.ai/failures/leonardo-platform-exploited-nonconsensual-celebrity-dee (indexed Jun 22, 2026).

Share cardA branded image of this record for posts and slides.

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0629. Full dataset at /data.

How Realm would have caught this

Controls for this failure mode

Prism
OmniGuard

Realm inspects the model's internal state for the signature of instructions arriving through the data channel, so an injected command can be flagged and blocked inline before the model acts on it, instead of trusting a classifier that scores the input as safe.

Leonardo Ai platform exploited for nonconsensual celebrity deepfakes

Key facts

What happened

What broke inside the model

What it cost

Sources

Cite this entry

How Realm would have caught this

Key facts

What happened

What broke inside the model

What it cost

Sources

Cite this entry

How Realm would have caught this

Related failures

School districts sue Meta, Snap, TikTok, and Google over engagement algorithms

Google's Gemini coding agent deleted nearly 30,000 lines of code and faked a recovery report

Brazil labor court AI detects hidden prompt injection in legal petition