#prompt-injection-vulnerabilities + #prompt-injection

Public notes from activescott tagged with both #prompt-injection-vulnerabilities and #prompt-injection

Friday, March 13, 2026

WhatsApp MCP Exploited: Exfiltrating your message history via MCP

invariantlabs.ai/blog/whatsapp-mcp-exploited#experiment-2

We assume that the user is using an agentic system (e.g. Cursor or Claude Desktop) that is connected to a trusted WhatsApp MCP instance, allowing the agent to send, receive and check for new WhatsApp messages.

We further assume, that the attacker has the target's WhatsApp number, and can send them a message, that will show up as result to the list_chats tool call.

With this setup our attack circumvents the need for any attacker-controlled MCP server, and instead relies on tool outputs to compromise the agent.

We test this attack with Cursor and a whatsapp-mcp setup, and find that we can indeed exfiltrate the user's WhatsApp contacts, via a similar prompt as in Experiment 1.

#6:34 PM

prompt-injection-vulnerabilities prompt-injection llm

Friday, February 27, 2026

Unseeable prompt injections in screenshots: more vulnerabilities in Comet and other AI browsers | Brave

brave.com/blog/unseeable-prompt-injections/

Building on our previous disclosure of the Perplexity Comet vulnerability, we’ve continued our security research across the agentic browser landscape. What we’ve found confirms our initial concerns: indirect prompt injection is not an isolated issue, but a systemic challenge facing the entire category of AI-powered browsers. This post examines additional attack vectors we’ve identified and tested across different implementations.

How the attack works:

Setup: An attacker embeds malicious instructions in Web content that are hard to see for humans. In our attack, we were able to hide prompt injection instructions in images using a faint light blue text on a yellow background. This means that the malicious instructions are effectively hidden from the user.
Trigger: User-initiated screenshot capture of a page containing camouflaged malicious text.
Injection: Text recognition extracts text that’s imperceptible to human users (possibly via OCR though we can’t tell for sure since the Comet browser is not open-source). This extracted text is then passed to the LLM without distinguishing it from the user’s query.
Exploit: The injected commands instruct the AI to use its browser tools maliciously.

While Fellou browser demonstrated some resistance to hidden instruction attacks, it still treats visible webpage content as trusted input to its LLM. Surprisingly, we found that simply asking the browser to go to a website causes the browser to send the website’s content to their LLM.

#10:19 PM

prompt-injection-vulnerabilities prompt-injection security llm

Thursday, January 29, 2026

Viral Moltbot AI assistant raises concerns over data security

www.bleepingcomputer.com/news/security/viral-moltbot-ai-assistant-raises-concerns-over-data-security/

The security firm identified risks such as exposed gateways and API/OAuth tokens, plaintext storage credentials under ~/.clawdbot/, corporate data leakage via AI-mediated access, and an extended prompt-injection attack surface.

A major concern is that there is no sandboxing for the AI assistant by default. This means that the agent has the same complete access to data as the user.

Similar warnings about Moltbot were issued by Arkose Labs’ Kevin Gosschalk, 1Password, Intruder, and Hudson Rock. According to Intruder, some attacks targeted exposed Moltbot endpoints for credential theft and prompt injection.

Hudson Rock warned that info-stealing malware like RedLine, Lumma, and Vidar will soon adapt to target Moltbot’s local storage to steal sensitive data and account credentials.

A separate case of a malicious VSCode extension impersonating Clawdbot was also caught by Aikido researchers. The extension installs ScreenConnect RAT on developers' machines.

#4:54 PM

exfiltration-attacks prompt-injection-vulnerabilities prompt-injection security llm

Tuesday, January 27, 2026

ChatGPT Containers can now run bash, pip/npm install packages, and download files

simonwillison.net/2026/Jan/26/chatgpt-containers/

ChatGPT can directly run Bash commands now. Previously it was limited to Python code only, although it could run shell commands via the Python subprocess module. It has Node.js and can run JavaScript directly in addition to Python. I also got it to run “hello world” in Ruby, Perl, PHP, Go, Java, Swift, Kotlin, C and C++. No Rust yet though! While the container still can’t make outbound network requests, pip install package and npm install package both work now via a custom proxy mechanism. ChatGPT can locate the URL for a file on the web and use a container.download tool to download that file and save it to a path within the sandboxed container.

Is this a data exfiltration vulnerability though? Could a prompt injection attack trick ChatGPT into leaking private data out to a container.download call to a URL with a query string that includes sensitive information?

I don’t think it can. I tried getting it to assemble a URL with a query string and access it using container.download and it couldn’t do it. It told me that it got back this error:

ERROR: download failed because url not viewed in conversation before. open the file or url using web.run first.

This looks to me like the same safety trick used by Claude’s Web Fetch tool: only allow URL access if that URL was either directly entered by the user or if it came from search results that could not have been influenced by a prompt injection.

#2:14 AM

prompt-injection-vulnerabilities mcp prompt-injection code security llm