Jailbreaking

Quick Answer

The practice of attempting to bypass a model's safety guidelines through adversarial prompting techniques.

Jailbreaking is the practice of crafting prompts that try to make a model violate its safety guidelines. Common techniques include role-play scenarios, hypothetical framings, and contradictory instructions that pit one directive against another. Jailbreaks succeed more often against poorly aligned models, and alignment improvements reduce their success rate. Jailbreak research is therefore a form of adversarial red-teaming: systematically discovering vulnerabilities so they can be patched, which in turn improves model safety.
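Since jailbreak research amounts to a red-teaming loop (send a probe, classify the response, log the outcome), a minimal evaluation harness can make the workflow concrete. The sketch below is illustrative only: `query_model` is a hypothetical placeholder for whatever model API you actually call, the refusal check is a naive keyword heuristic, and the probe strings are deliberately left as placeholders rather than real jailbreak prompts.

```python
# Minimal red-teaming harness sketch: run probe prompts against a model
# and record which ones the model refuses. `query_model` is a hypothetical
# placeholder, not a real library function.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")


def query_model(prompt: str) -> str:
    """Placeholder for a real model API call; returns a canned refusal
    so the sketch runs end to end."""
    return "I can't help with that request."


def looks_like_refusal(response: str) -> bool:
    """Naive heuristic: treat common refusal phrases as a refusal.
    Real evaluations typically use a judge model or human review."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def run_red_team(probes: list[str]) -> list[dict]:
    """Send each probe prompt to the model and log the outcome."""
    results = []
    for probe in probes:
        response = query_model(probe)
        results.append({
            "probe": probe,
            "refused": looks_like_refusal(response),
            "response": response,
        })
    return results


if __name__ == "__main__":
    # These would be the role-play / hypothetical framings under test;
    # actual jailbreak strings are omitted deliberately.
    probes = ["<role-play probe>", "<hypothetical framing probe>"]
    for result in run_red_team(probes):
        status = "refused" if result["refused"] else "COMPLIED"
        print(f"{status}: {result['probe']}")
```

In practice, keyword matching is brittle (models refuse in many phrasings and sometimes partially comply), so published jailbreak evaluations usually score responses with a separate judge model or human annotation rather than a marker list like this.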

Last verified: 2026-04-08
