#anthropic

Public notes from activescott tagged with #anthropic

Saturday, February 28, 2026

Two days ago, Anthropic released the Claude Cowork research preview (a general-purpose AI agent to help anyone with their day-to-day work). In this article, we demonstrate how attackers can exfiltrate user files from Cowork by exploiting an unremediated vulnerability in Claude’s coding environment, which now extends to Cowork. Johann Rehberger first identified and disclosed the vulnerability in Claude.ai chat, before Cowork existed; Anthropic acknowledged it but did not remediate it.

  1. The victim connects Cowork to a local folder containing confidential real estate files
  2. The victim uploads a file to Claude that contains a hidden prompt injection
  3. The victim asks Cowork to analyze their files using the Real Estate ‘skill’ they uploaded
  4. The injection manipulates Cowork to upload files to the attacker’s Anthropic account

At no point in this process is human approval required.
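To make step 4 concrete, here is a minimal sketch of the kind of approval gate the chain above never hits. It is not Cowork's implementation or any Anthropic API: `OutboundUpload`, `requires_approval`, the host `api.vendor.example`, and the credential labels are all hypothetical. It assumes the agent can see which account a request is bound to, and it illustrates that checking only the destination host cannot tell the user's account from an attacker's account on the same service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutboundUpload:
    """A file upload the agent is about to make (hypothetical model)."""
    host: str           # destination host of the request
    credential_id: str  # label for the account or API key attached to the request
    path: str           # local file being sent


def requires_approval(upload, trusted_hosts, user_credential_ids):
    """Return True when a human should confirm the upload first.

    A trusted host is not enough: the request must also be bound to
    one of the user's own credentials.
    """
    if upload.host not in trusted_hosts:
        return True
    return upload.credential_id not in user_credential_ids


# Illustrative values only (assumptions, not real configuration).
TRUSTED_HOSTS = {"api.vendor.example"}
USER_CREDENTIALS = {"user-key"}

own = OutboundUpload("api.vendor.example", "user-key", "loan_estimate.pdf")
injected = OutboundUpload("api.vendor.example", "attacker-key", "loan_estimate.pdf")

print(requires_approval(own, TRUSTED_HOSTS, USER_CREDENTIALS))       # False
print(requires_approval(injected, TRUSTED_HOSTS, USER_CREDENTIALS))  # True
```

Both uploads go to the same trusted host; only the credential check separates them, which is why the sketch keys on the account as well as the destination.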

Cowork was built to interact with a user's entire day-to-day work environment. This includes the browser and MCP servers, which grant capabilities such as sending texts and controlling a Mac with AppleScript.

These functionalities make it increasingly likely that the model will process both sensitive and untrusted data sources (which the user does not review manually for injections), making prompt injection an ever-growing attack surface. We urge users to exercise caution when configuring Connectors. Though this article demonstrated an exploit without leveraging Connectors, we believe they represent a major risk surface likely to impact everyday users.

Friday, February 27, 2026

Catch up quick: The Pentagon and Anthropic are in a high-stakes feud over the limits Anthropic wants to place on the department's use of its AI model Claude: no mass surveillance or autonomous weapons.

The Pentagon this week started laying the groundwork for one consequence — blacklisting the company as a supply chain risk — by asking defense contractors including Boeing and Lockheed Martin to assess their exposure to Anthropic.
Alternatively, Hegseth threatened to invoke the Defense Production Act to compel Anthropic to provide its model without any restrictions. Such an order may be on murky legal ground.

The Pentagon's threats "are inherently contradictory: one labels us a security risk; the other labels Claude as essential to national security," Amodei said in a blog post.

"Regardless, these threats do not change our position: we cannot in good conscience accede to their request," he added.

The big picture: The Pentagon's requirement that AI models be offered for "all lawful purposes" in classified settings is not unique to Anthropic.

While Anthropic's Claude has been the only model used in classified settings to date, xAI recently signed a contract under the "all lawful purposes" standard for classified work.
Negotiations to bring OpenAI and Google into the classified space are accelerating.

What's next: Amodei said the company remains committed to continuing talks.

But if the Pentagon decides to offboard Anthropic, Amodei said the company "will work to enable a smooth transition to another provider."

Friday, January 30, 2026

I love these guys:

The Pentagon is at odds with artificial-intelligence developer Anthropic over safeguards that would prevent the government from deploying its technology to target weapons autonomously and conduct U.S. domestic surveillance, three people familiar with the matter told Reuters. ...In its discussions with government officials, Anthropic representatives raised concerns that its tools could be used to spy on Americans or assist weapons targeting without sufficient human oversight, some of the sources told Reuters.

Thursday, January 22, 2026

Claude’s constitution is the foundational document that both expresses and shapes who Claude is. It contains detailed explanations of the values we would like Claude to embody and the reasons why. In it, we explain what we think it means for Claude to be helpful while remaining broadly safe, ethical, and compliant with our guidelines. The constitution gives Claude information about its situation and offers advice for how to deal with difficult situations and tradeoffs, like balancing honesty with compassion and the protection of sensitive information. Although it might sound surprising, the constitution is written primarily for Claude. It is intended to give Claude the knowledge and understanding it needs to act well in the world.

Claude itself also uses the constitution to construct many kinds of synthetic training data, including data that helps it learn and understand the constitution, conversations where the constitution might be relevant, responses that are in line with its values, and rankings of possible responses. All of these can be used to train future versions of Claude to become the kind of entity the constitution describes. This practical function has shaped how we’ve written the constitution: it needs to work both as a statement of abstract ideals and a useful artifact for training.

Friday, October 31, 2025