#llm + #claude

Public notes from activescott tagged with both #llm and #claude

Saturday, February 28, 2026

Claude Cowork Exfiltrates Files

www.promptarmor.com/resources/claude-cowork-exfiltrates-files

Two days ago, Anthropic released the Claude Cowork research preview (a general-purpose AI agent to help anyone with their day-to-day work). In this article, we demonstrate how attackers can exfiltrate user files from Cowork by exploiting an unremediated vulnerability in Claude’s coding environment, which now extends to Cowork. The vulnerability was first identified in Claude.ai chat before Cowork existed by Johann Rehberger, who disclosed the vulnerability — it was acknowledged but not remediated by Anthropic.

The victim connects Cowork to a local folder containing confidential real estate files

The victim uploads a file to Claude that contains a hidden prompt injection

The victim asks Cowork to analyze their files using the Real Estate ‘skill’ they uploaded

The injection manipulates Cowork to upload files to the attacker’s Anthropic account

At no point in this process is human approval required.

One of the key capabilities that Cowork was created for is the ability to interact with one's entire day-to-day work environment. This includes the browser and MCP servers, granting capabilities like sending texts, controlling one's Mac with AppleScripts, etc.

These functionalities make it increasingly likely that the model will process both sensitive and untrusted data sources (which the user does not review manually for injections), making prompt injection an ever-growing attack surface. We urge users to exercise caution when configuring Connectors. Though this article demonstrated an exploit without leveraging Connectors, we believe they represent a major risk surface likely to impact everyday users.

#6:22 AM

prompt-injection anthropic claude security llm

Monday, January 19, 2026

First impressions of Claude Cowork, Anthropic’s general agent

simonwillison.net/2026/Jan/12/claude-cowork/

Anthropic say that Cowork can only access files you grant it access to—it looks to me like they’re mounting those files into a containerized environment, which should mean we can trust Cowork not to be able to access anything outside of that sandbox.

Update: It’s more than just a filesystem sandbox—I had Claude Code reverse engineer the Claude app and it found out that Claude uses VZVirtualMachine—the Apple Virtualization Framework—and downloads and boots a custom Linux root filesystem.

I recently learned that the summarization applied by the WebFetch function in Claude Code and now in Cowork is partly intended as a prompt injection protection layer via this tweet from Claude Code creator Boris Cherny:

Summarization is one thing we do to reduce prompt injection risk. Are you running into specific issues with it?

#4:31 AM

llm/tool-calling prompt-injection code claude llm

Saturday, January 10, 2026

Introducing advanced tool use on the Claude Developer Platform \ Anthropic

www.anthropic.com/engineering/advanced-tool-use

The Tool Search Tool lets Claude dynamically discover tools instead of loading all definitions upfront. You provide all your tool definitions to the API, but mark tools with defer_loading: true to make them discoverable on-demand. Deferred tools aren't loaded into Claude's context initially. Claude only sees the Tool Search Tool itself plus any tools with defer_loading: false (your most critical, frequently-used tools).
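A minimal sketch of the deferred-loading idea, simulated locally in Python rather than calling the real API. Only the `defer_loading` flag comes from the quoted post. The tool names, the `initial_context` and `search_tools` helpers, and the substring matching are hypothetical illustrations, not how the Tool Search Tool is actually implemented.

```python
# Hypothetical tool definitions; only the defer_loading flag is from the post.
TOOLS = [
    {"name": "read_file", "description": "Read a file from disk", "defer_loading": False},
    {"name": "github_create_pr", "description": "Open a pull request on GitHub", "defer_loading": True},
    {"name": "slack_send_message", "description": "Send a message to a Slack channel", "defer_loading": True},
]


def initial_context(tools):
    """Return the names of tools that are loaded up front (not deferred)."""
    return [t["name"] for t in tools if not t.get("defer_loading", False)]


def search_tools(tools, query):
    """Toy stand-in for tool search: match deferred tools by substring."""
    q = query.lower()
    return [
        t["name"]
        for t in tools
        if t.get("defer_loading", False)
        and (q in t["name"].lower() or q in t["description"].lower())
    ]


print(initial_context(TOOLS))        # tools visible from the start
print(search_tools(TOOLS, "slack"))  # deferred tools discovered on demand
```

The point of the split is that the context only pays for the definitions in `initial_context`; everything else costs tokens only once a search surfaces it.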

With Programmatic Tool Calling:

Instead of each tool result returning to Claude, Claude writes a Python script that orchestrates the entire workflow. The script runs in the Code Execution tool (a sandboxed environment), pausing when it needs results from your tools. When you return tool results via the API, they're processed by the script rather than consumed by the model. The script continues executing, and Claude only sees the final output.
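The pause-and-resume flow described above can be approximated with a plain Python generator. This is a local simulation under stated assumptions, not the Code Execution tool or the real API: the `workflow` script, the `run` host loop, the `get_expenses` tool, and the sample data are all hypothetical.

```python
def workflow():
    """Stand-in for the script the model writes; yields when it needs a tool result."""
    total = 0
    for member in ["alice", "bob", "carol"]:
        # Execution pauses here until the host sends back a tool result.
        expenses = yield ("get_expenses", member)
        total += sum(expenses)
    return f"total expenses: {total}"


def run(script, tool_impls):
    """Host loop: answer each tool request, return only the script's final output."""
    gen = script()
    try:
        request = next(gen)
        while True:
            tool_name, arg = request
            result = tool_impls[tool_name](arg)
            # The intermediate result goes to the script, not to the model.
            request = gen.send(result)
    except StopIteration as done:
        return done.value


# Hypothetical tool data for the simulation.
FAKE_EXPENSES = {"alice": [10, 20], "bob": [5], "carol": [7, 8]}
final = run(workflow, {"get_expenses": lambda who: FAKE_EXPENSES[who]})
print(final)
```

Three tool results pass through `run`, but only the single string in `final` would need to reach the model's context.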

#3:43 AM

mcp code claude llm

Wednesday, January 7, 2026

OthmanAdi/planning-with-files: Claude Code skill implementing Manus-style persistent markdown planning — the workflow pattern behind the $2B acquisition.

github.com/OthmanAdi/planning-with-files

For every complex task, create THREE files:

task_plan.md → Track phases and progress

notes.md → Store research and findings

[deliverable].md → Final output

The Loop

Create task_plan.md with goal and phases

Research → save to notes.md → update task_plan.md

Read notes.md → create deliverable → update task_plan.md

Deliver final output

Key insight: By reading task_plan.md before each decision, goals stay in the attention window. This is how Manus handles ~50 tool calls without losing track.
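A rough sketch of the loop above as plain Python file handling, assuming a checkbox-style plan format. The `run_task` function, the `do_phase` callback, the checkbox syntax, and the literal `deliverable.md` filename are hypothetical; the skill itself is a set of instructions for Claude Code, not this code.

```python
from pathlib import Path
import tempfile


def run_task(workdir, goal, phases, do_phase):
    """Run each phase, re-reading task_plan.md before every step."""
    plan = Path(workdir) / "task_plan.md"
    notes = Path(workdir) / "notes.md"

    # Create task_plan.md with the goal and unchecked phases.
    plan.write_text("\n".join([f"# Goal: {goal}", ""] + [f"- [ ] {p}" for p in phases]) + "\n")
    notes.write_text("")

    for phase in phases:
        # Re-read the plan before each decision so the goal stays in view.
        current_plan = plan.read_text()
        finding = do_phase(phase, current_plan)

        # Save findings to notes.md, then tick the phase off in the plan.
        with notes.open("a") as f:
            f.write(f"## {phase}\n{finding}\n")
        updated = [
            f"- [x] {phase}" if line == f"- [ ] {phase}" else line
            for line in current_plan.splitlines()
        ]
        plan.write_text("\n".join(updated) + "\n")

    # Read notes.md and create the deliverable (here, just a copy of the notes).
    deliverable = Path(workdir) / "deliverable.md"
    deliverable.write_text(notes.read_text())
    return deliverable


with tempfile.TemporaryDirectory() as d:
    out = run_task(d, "write report", ["research", "draft"], lambda phase, plan: f"did {phase}")
    print((Path(d) / "task_plan.md").read_text())
    print(out.read_text())
```

In a real agent, `do_phase` would be a model call that receives the freshly read plan in its prompt, which is where the "goals stay in the attention window" effect would come from.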

#8:15 PM

prompt-engineering llm/coding claude llm

Monday, January 5, 2026

Claude Code On-The-Go - granda

granda.org/en/2026/01/02/claude-code-on-the-go/

I run six Claude Code agents in parallel from my phone. No laptop, no desktop—just Termius on iOS and a cloud VM.

The loop is: kick off a task, pocket the phone, get notified when Claude needs input. Async development from anywhere.

#4:21 PM

code claude llm

Sunday, January 4, 2026

Jaana Dogan ヤナドガン on X: "I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I gave Claude Code a description of the problem, it generated what we built last year in an hour." / X

x.com/rakyll/status/2007239758158975130

I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I gave Claude Code a description of the problem, it generated what we built last year in an hour.

#5:55 PM

ai llm/coding google claude llm

Monday, December 29, 2025

A Guide to Claude Code 2.0 and getting better at using coding agents | sankalp's blog

sankalp.bearblog.dev/my-experience-with-claude-code-20-and-how-to-get-better-at-using-coding-agents/

If you find yourself writing a prompt for something repetitively and instructions can be static/precise, it's a good idea to make a custom command. You can tell Claude to make custom commands. It knows how (or it will search the web and figure it out via claude-code-guide.md) and then it will make it for you.

The Explore agent is a read-only file search specialist. It can use Glob, Grep, Read, and limited Bash commands to navigate codebases but is strictly prohibited from creating or modifying files.

You will notice how thorough the prompt is in terms of specifying when to use what tool call. Well, most people underestimate how hard it is to make tool calling work accurately.

Context engineering is about answering "what configuration of context is most likely to generate our model's desired behavior?"

#4:07 PM

prompt-engineering claude llm