#ai + #prompt-injection

Public notes from activescott tagged with both #ai and #prompt-injection

Wednesday, November 26, 2025

Dane Stuckey (OpenAI CISO) on prompt injection risks for ChatGPT Atlas

simonwillison.net/2025/Oct/22/openai-ciso-on-atlas/

#3:43 PM

[2503.18813] Defeating Prompt Injections by Design (CaMeL)

LLM agents are vulnerable to prompt injection attacks when handling untrusted data. In this paper we propose CaMeL, a robust defense that creates a protective system layer around the LLM, securing it even when underlying models are susceptible to attacks. To operate, CaMeL explicitly extracts the control and data flows from the (trusted) query; therefore, the untrusted data retrieved by the LLM can never impact the program flow. To further improve security, CaMeL uses a notion of a capability to prevent the exfiltration of private data over unauthorized data flows by enforcing security policies when tools are called.

#3:18 PM

ai prompt-injection security llm

Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet

simonwillison.net/2025/Aug/25/agentic-browser-security/

Visit a Reddit post with Comet and ask it to summarize the thread, and malicious instructions in a post there can trick Comet into accessing web pages in another tab to extract the user's email address, then perform all sorts of actions like triggering an account recovery flow and grabbing the resulting code from a logged in Gmail session.

#3:16 PM

ai prompt-injection security llm

Piloting Claude for Chrome

simonwillison.net/2025/Aug/26/piloting-claude-for-chrome/

Anthropic don't recommend autonomous mode - where the extension can act without human intervention. Their default configuration instead requires users to be much more hands-on:

#3:15 PM

ai prompt-injection security llm