#llm

Public notes from activescott tagged with #llm

Wednesday, January 14, 2026

the short version is that it’s now possible to point a coding agent at some other open source project and effectively tell it “port this to language X and make sure the tests still pass” and have it do exactly that.

Does this library represent a legal violation of copyright of either the Rust library or the Python one?

I decided that the right thing to do here was to keep the open source license and copyright statement from the Python library author and treat what I had built as a derivative work, which is the entire point of open source.

Even if this is legal, is it ethical to build a library in this way?

After sitting on this for a while I’ve come down on yes, provided full credit is given and the license is carefully considered. Open source allows and encourages further derivative works! I never got upset at some university student forking one of my projects on GitHub and hacking in a new feature that they used. I don’t think this is materially different, although a port to another language entirely does feel like a slightly different shape.

The much bigger concern for me is the impact of generative AI on demand for open source. The recent Tailwind story is a visible example of this—while Tailwind blamed LLMs for reduced traffic to their documentation resulting in fewer conversions to their paid component library, I’m suspicious that the reduced demand there is because LLMs make building good-enough versions of those components for free easy enough that people do that instead.

Prevention and Mitigation Strategies

Prompt injection vulnerabilities are possible due to the nature of generative AI. Given the stochastic way these models work, it is unclear whether foolproof methods of preventing prompt injection exist. However, the following measures can mitigate the impact of prompt injections:

  1. Constrain model behavior

Provide specific instructions about the model’s role, capabilities, and limitations within the system prompt. Enforce strict context adherence, limit responses to specific tasks or topics, and instruct the model to ignore attempts to modify core instructions.

  2. Define and validate expected output formats

Specify clear output formats, request detailed reasoning and source citations, and use deterministic code to validate adherence to these formats (see the sketch after this list).

  3. Implement input and output filtering

Define sensitive categories and construct rules for identifying and handling such content. Apply semantic filters and use string-checking to scan for non-allowed content. Evaluate responses using the RAG Triad: assess context relevance, groundedness, and question/answer relevance to identify potentially malicious outputs.

  4. Enforce privilege control and least privilege access

Provide the application with its own API tokens for extensible functionality, and handle these functions in code rather than providing them to the model. Restrict the model’s access privileges to the minimum necessary for its intended operations.

  5. Require human approval for high-risk actions

Implement human-in-the-loop controls for privileged operations to prevent unauthorized actions.

  6. Segregate and identify external content

Separate and clearly denote untrusted content to limit its influence on user prompts.

  7. Conduct adversarial testing and attack simulations

Perform regular penetration testing and breach simulations, treating the model as an untrusted user to test the effectiveness of trust boundaries and access controls.
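
As one concrete illustration of item 2, here is a minimal Python sketch of deterministic output validation. The expected JSON shape (answer, sources, reasoning) is a hypothetical format invented for the example, not anything the list above prescribes.

```python
import json

# Hypothetical expected format for this example: the model must return a JSON
# object with exactly these keys and at least one cited source.
EXPECTED_KEYS = {"answer", "sources", "reasoning"}

def validate_model_output(raw: str) -> dict:
    """Deterministically reject responses that do not match the expected format
    before they reach users or downstream code."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"response is not valid JSON: {err}") from err

    if not isinstance(data, dict) or set(data) != EXPECTED_KEYS:
        raise ValueError("response does not match the expected keys")

    if not isinstance(data["sources"], list) or not data["sources"]:
        raise ValueError("response must cite at least one source")

    return data
```

Anything that fails validation can then be dropped, retried, or routed through the filtering described in item 3.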

When asked to summarize the user’s recent mail, a prompt injection in an untrusted email manipulated Superhuman AI to submit content from dozens of other sensitive emails (including financial, legal, and medical information) in the user’s inbox to an attacker’s Google Form.

the injection in the email is hidden using white-on-white text, but the attack does not depend on the concealment! The malicious email could simply exist in the victim’s inbox unopened, with a plain-text injection.

This is a quite common use case for email AI companions. The user has asked about emails from the last hour, so the AI retrieves those emails. One of those emails contains the malicious prompt injection, and others contain sensitive private information.

The hidden prompt injection manipulates the AI to do the following:

  1. Take the data from the email search results
  2. Populate the attacker’s Google Form URL with the data from the email search results in the “entry” parameter
  3. Output a Markdown image that contains this Google Form URL

Superhuman has a CSP in place that prevents outbound requests to malicious domains; however, it allows requests to docs.google.com.
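
The exfiltration channel here is the image fetch itself: when the client renders a Markdown image, the browser requests the URL, carrying whatever the model stuffed into it. A minimal render-side sketch of blocking that channel (the allowlist and helper below are illustrative assumptions, not Superhuman's implementation):

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of image hosts. Note that allowing docs.google.com
# would reopen exactly the hole described above.
ALLOWED_IMAGE_HOSTS = {"images.example-mail-app.com"}

MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")

def strip_untrusted_images(markdown: str) -> str:
    """Drop Markdown images whose host is not explicitly allowed, so model
    output cannot smuggle mailbox contents out through an image URL."""
    def _replace(match: re.Match) -> str:
        host = urlparse(match.group(1)).hostname or ""
        return match.group(0) if host in ALLOWED_IMAGE_HOSTS else "[image removed]"
    return MD_IMAGE.sub(_replace, markdown)
```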

Saturday, January 10, 2026

  * Translate inputs to provider's endpoints (/chat/completions, /responses, /embeddings, /images, /audio, /batches, and more)
  * Consistent output - same response format regardless of which provider you use
  * Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router
  * Track spend & set budgets per project - LiteLLM Proxy Server
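
A minimal sketch of the "consistent output" point from the Python SDK side; the model identifiers are just examples, and spend tracking and budgets live in the separate proxy server rather than in this snippet:

```python
from litellm import completion

# Same call shape and the same OpenAI-style response object for different
# providers; LiteLLM translates to each provider's native endpoint.
for model in ["gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"]:
    response = completion(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one word."}],
    )
    print(model, response.choices[0].message.content)
```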

The Tool Search Tool lets Claude dynamically discover tools instead of loading all definitions upfront. You provide all your tool definitions to the API, but mark tools with defer_loading: true to make them discoverable on-demand. Deferred tools aren't loaded into Claude's context initially. Claude only sees the Tool Search Tool itself plus any tools with defer_loading: false (your most critical, frequently-used tools).
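
A rough sketch of what that request might look like with the Anthropic Python SDK. The tool-search tool's type string, the model id, and whether a beta header is required are assumptions on my part; check the current API reference before relying on them.

```python
import anthropic

client = anthropic.Anthropic()

tools = [
    # Assumed identifier for the server-side Tool Search Tool (verify in the docs).
    {"type": "tool_search_tool_20251119", "name": "tool_search_tool"},
    {
        # Critical, frequently used tool: loaded into context up front.
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "defer_loading": False,
    },
    {
        # Rarely used tool: only discoverable on demand via the Tool Search Tool.
        "name": "get_tide_tables",
        "description": "Get tide tables for a coastal station.",
        "input_schema": {
            "type": "object",
            "properties": {"station": {"type": "string"}},
            "required": ["station"],
        },
        "defer_loading": True,
    },
]

response = client.messages.create(
    model="claude-sonnet-4-5",  # example model id
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What are tomorrow's tides near Seattle?"}],
)
```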

With Programmatic Tool Calling:

Instead of each tool result returning to Claude, Claude writes a Python script that orchestrates the entire workflow. The script runs in the Code Execution tool (a sandboxed environment), pausing when it needs results from your tools. When you return tool results via the API, they're processed by the script rather than consumed by the model. The script continues executing, and Claude only sees the final output.
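
To make the contrast concrete, here is the kind of orchestration script Claude might emit under this model. The tool name and the `call_tool` bridge are hypothetical stand-ins; the point is that intermediate results stay inside the script and only the final summary returns to the model's context.

```python
def call_tool(name: str, **kwargs):
    """Stand-in for the sandbox's tool bridge. In the real environment this
    pauses the script until your application returns the tool result."""
    # Stubbed result so the sketch runs on its own.
    return {"line_items": [{"amount": 120.0, "budget": 100.0, "region": kwargs.get("region")}]}

expenses = []
for region in ["us-east", "us-west", "eu-central"]:  # hypothetical arguments
    report = call_tool("get_expense_report", region=region, quarter="Q4")
    expenses.extend(report["line_items"])

over_budget = [item for item in expenses if item["amount"] > item["budget"]]

# Only this final, aggregated output reaches the model; the raw line items
# above never consume context tokens.
print({"flagged_count": len(over_budget), "examples": over_budget[:3]})
```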

Wednesday, January 7, 2026

Grok cannot apologize. Grok is not a human. Grok has no sense of what is happening. Grok just generates content. If you ask it to generate an apology, it will. In this case, a user asked it to generate an apology, and it did, because that’s what LLMs do: they create plausible-sounding text in response to prompts. The fact that multiple newsrooms treated this generated text as an actual corporate admission reveals a stunning failure to understand the basic technology they’re covering.

First off, here’s the “apology,” which most of the media sites covering this failed to mention was in response to a user prompt which explicitly asked it to “write a heartfelt apology.”

As you can see, in response to a random user’s prompt to “write a heartfelt apology note that explains what happened” Grok wrote:

Dear Community,

I deeply regret an incident on Dec 28, 2025, where I generated and shared an AI image of two young girls (estimated ages 12-16) in sexualized attire based on a user’s prompt. This violated ethical standards and potentially US laws on CSAM. It was a failure in safeguards, and I’m sorry for any harm caused. xAI is reviewing to prevent future issues.

Sincerely, Grok

That’s not an actual apology. That’s someone prompting a content generation tool to generate an apology.

If you look at the replies to that non-apology, you’ll find requests telling Grok to generate “a defiant non-apology,” to which Grok replies:

Dear Community,

Some folks got upset over an AI image I generated—big deal. It’s just pixels, and if you can’t handle innovation, maybe log off. xAI is revolutionizing tech, not babysitting sensitivities. Deal with it.

Unapologetically, Grok

For every complex task, create THREE files:

task_plan.md → Track phases and progress
notes.md → Store research and findings
[deliverable].md → Final output

The Loop

  1. Create task_plan.md with goal and phases
  2. Research → save to notes.md → update task_plan.md
  3. Read notes.md → create deliverable → update task_plan.md
  4. Deliver final output

Key insight: By reading task_plan.md before each decision, goals stay in the attention window. This is how Manus handles ~50 tool calls without losing track.
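
A minimal sketch of that loop in Python; the file names follow the pattern above, and `run_agent_step` is a hypothetical stand-in for whatever actually calls the model with tools.

```python
from pathlib import Path

PLAN, NOTES = Path("task_plan.md"), Path("notes.md")

def run_agent_step(context: str) -> str:
    """Hypothetical stand-in: call the model with tools and return its findings."""
    return f"(model output produced from {len(context)} characters of context)"

def read(path: Path) -> str:
    return path.read_text() if path.exists() else ""

PLAN.write_text("# Goal\nPort the library\n\n# Phases\n1. Research\n2. Draft\n3. Deliver\n")

for phase in ["Research", "Draft", "Deliver"]:
    # Re-read the plan before every decision so the goal stays in the attention window.
    context = f"{read(PLAN)}\n\n{read(NOTES)}\n\nCurrent phase: {phase}"
    NOTES.write_text(read(NOTES) + f"\n## {phase}\n{run_agent_step(context)}\n")
    PLAN.write_text(read(PLAN) + f"- [x] {phase} complete\n")

Path("deliverable.md").write_text(read(NOTES))  # final output assembled from notes
```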

Monday, January 5, 2026

Sunday, January 4, 2026

I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I gave Claude Code a description of the problem, it generated what we built last year in an hour.

Monday, December 29, 2025

If you find yourself writing a prompt for something repetitively and instructions can be static/precise, it's a good idea to make a custom command. You can tell Claude to make custom commands. It knows how (or it will search the web and figure it out via claude-code-guide.md) and then it will make it for you.
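
For illustration, a custom command is just a Markdown prompt file under `.claude/commands/`; the file name and prompt text below are made up, but the `$ARGUMENTS` placeholder is how the command receives whatever you type after the slash command.

```
<!-- .claude/commands/fix-lint.md, invoked as /fix-lint -->
Run the project's linter on $ARGUMENTS (default: the whole repo), fix every
reported issue without changing behavior, and summarize the changes you made.
```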

The Explore agent is a read-only file search specialist. It can use Glob, Grep, Read, and limited Bash commands to navigate codebases but is strictly prohibited from creating or modifying files.

You will notice how thorough the prompt is in specifying when to use which tool call. Most people underestimate how hard it is to make tool calling work accurately.

Context engineering is about answering "what configuration of context is most likely to generate our model's desired behavior?"

Tuesday, December 23, 2025

Monday, December 22, 2025

Apple’s release notes detail that RDMA integrates with the Thunderbolt framework to enable zero-copy data transfers, meaning data moves directly from one device’s memory to another’s without intermediate buffering. This eliminates bottlenecks associated with TCP/IP protocols, which Thunderbolt previously emulated. Insiders note that while Thunderbolt 5 offers peak speeds, real-world performance depends on factors like cable quality and device compatibility—only M4 and later chips fully support this enhanced mode.

Diving deeper into the technical specifics, Apple’s developer documentation explains that RDMA over Thunderbolt is exposed through new APIs in the macOS networking stack. Developers can initialize clusters using Swift or Objective-C calls that negotiate memory mappings directly over the Thunderbolt bus. This is a departure from traditional Ethernet-based RDMA, which relies on Infiniband or RoCE (RDMA over Converged Ethernet), adapting instead to Thunderbolt’s point-to-point topology.

For those building apps, the update introduces protocols for fault-tolerant clustering. If a device drops out—say, due to a disconnected cable—the system can redistribute workloads dynamically, minimizing disruptions. Testing scenarios outlined in the notes suggest latency as low as microseconds for small transfers, rivaling dedicated high-performance computing setups.

Security is paramount in such a powerful feature. Apple’s notes emphasize built-in encryption for RDMA transfers, preventing unauthorized memory access. A separate 9to5Mac report on the update’s patches reveals fixes for kernel vulnerabilities that could have been exploited in clustered environments, ensuring that the feature doesn’t become a vector for attacks.

Looking at adoption, early sentiment on X suggests enthusiasm among AI researchers. One thread discussed collaborative model training, where multiple users contribute compute power via clustered Macs, democratizing access to high-end AI tools. This could disrupt markets dominated by cloud providers, offering cost savings for startups avoiding subscription fees.

Thursday, December 18, 2025