Model Safety

1 post tagged Model Safety.

AI Security | 28 June 2026 | 4 min read

Prompt Injection Is Role Confusion: New Research Reframes LLM Security

MIT researchers show frontier LLMs can't truly distinguish their own privileged reasoning from attacker-injected text — and writing style alone swings attack success from 61% to 10%.

prompt injection
llm security
agentic ai
jailbreak
model safety
