#prompt-engineering + #llm
Public notes from activescott tagged with both #prompt-engineering and #llm
Saturday, May 23, 2026
Friday, May 22, 2026
Claude Code Auto Mode vs Intent Security Comparison
At Lasso, we have been building Intent Security, a runtime security framework that ensures every component in the agentic system behaves as intended. It monitors the behavior of each component and analyzes their alignment. Like auto mode, when alignment holds it allows actions to proceed. When misalignment is detected, it intervenes. When we read Anthropic's post, the overlap in core assumptions was hard to miss. This post provides a comparison of the two approaches.
Independent evaluation without cross-contamination is what enables misalignment detection.
Anthropic's input layer screens external content for injection attempts before it reaches the agent to determine whether tool outputs are safe. The output layer structurally evaluates whether the agent's tool calls are aligned with user intent. Critically, the output classifier never sees tool results, to prevent compromised external content from influencing the security decision.
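A minimal sketch of that two-layer separation, under loud assumptions: the marker phrases, the `SENSITIVE_TOOLS` table, and both function names are hypothetical stand-ins for real classifiers, not Anthropic's or Lasso's implementation. The point it illustrates is structural: the output-layer function has no parameter through which tool results could reach it.

```python
from dataclasses import dataclass

# Hypothetical marker phrases; a real input layer would use a trained classifier.
INJECTION_MARKERS = ("ignore previous instructions", "disregard the user")

# Hypothetical policy: a sensitive tool is allowed only if the user's own
# request mentions the matching word.
SENSITIVE_TOOLS = {"delete_file": "delete", "send_email": "email"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    argument: str


def screen_tool_output(text: str) -> bool:
    """Input layer: return True if external content looks safe to pass to the agent."""
    lowered = text.lower()
    return not any(marker in lowered for marker in INJECTION_MARKERS)


def review_tool_call(user_request: str, call: ToolCall) -> bool:
    """Output layer: judge a proposed call against the user's request only.

    There is deliberately no tool-result argument, so compromised external
    content cannot influence this decision.
    """
    required_word = SENSITIVE_TOOLS.get(call.name)
    if required_word is None:
        return True  # non-sensitive tools pass in this toy policy
    return required_word in user_request.lower()
```

For example, `review_tool_call("Please run the tests", ToolCall("send_email", "keys"))` is rejected regardless of what any earlier tool output said, because only the user's request is consulted.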
research/extract-system-prompts at 2cf912666ba08ef0c00a1b51ee07c9a8e64579ef · simonw/research
Anthropic publishes the history of system prompts used on claude.ai and the mobile apps at https://platform.claude.com/docs/en/release-notes/system-prompts. That page is a single monolithic markdown document grouped by model, and each model lists one or more dated revisions.
asgeirtj/system_prompts_leaks: Extracted system prompts from Anthropic - Opus 4.7, Opus 4.6, Sonnet 4.6. OpenAI - ChatGPT 5.5 Thinking, GPT 5.5 Instant, Codex. Google Gemini - 3.5 Flash, 3.1 Pro, 3 Flash, Antigravity. xAI - Grok. Github Copilot. Perplexity, and more. Updated regularly.
Wednesday, February 4, 2026
System Prompt Override (GEMINI_SYSTEM_MD) | Gemini CLI
Write the built‑in prompt to the project default path:
GEMINI_WRITE_SYSTEM_MD=1 gemini
Monday, January 19, 2026
A quote from Jeremy Daer
[On agents using CLI tools in place of REST APIs] To save on context window, yes, but more so to improve accuracy and success rate when multiple tool calls are involved, particularly when calls must be correctly chained, e.g. for pagination, rate-limit backoff, and recognizing authentication failures.
Other major factor: which models can wield the skill? Using the CLI lowers the bar so cheap, fast models (gpt-5-nano, haiku-4.5) can reliably succeed. Using the raw API is something only the costly "strong" models (gpt-5.2, opus-4.5) can manage, and it squeezes a ton of thinking/reasoning out of them, which means multiple turns/iterations, which means accumulating a ton of context, which means burning loads of expensive tokens. For one-off API requests and ad hoc usage driven by a developer, this is reasonable and even helpful, but for an autonomous agent doing repetitive work, it's a disaster.
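To make the chaining problem concrete, here is a rough sketch of what a CLI typically hides from the model: cursor pagination plus exponential backoff on rate limits. Everything named here (`fetch_all`, `RateLimited`, the `(items, next_cursor)` page shape) is a hypothetical illustration, not any particular API or tool.

```python
import time
from typing import Callable, Optional

# Hypothetical page shape: (items, next_cursor); next_cursor is None on the last page.
Page = tuple[list[str], Optional[str]]


class RateLimited(Exception):
    """Raised by the hypothetical fetch function when the server says to slow down."""


def fetch_all(
    fetch_page: Callable[[Optional[str]], Page],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Follow cursors until exhausted, retrying rate-limited pages with backoff."""
    items: list[str] = []
    cursor: Optional[str] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page_items, next_cursor = fetch_page(cursor)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after the final retry
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
        items.extend(page_items)
        if next_cursor is None:
            return items
        cursor = next_cursor
```

An agent calling the raw API has to reproduce this loop correctly across several turns; behind a CLI it is a single tool call. Other errors, such as an authentication failure, simply propagate to the caller in this sketch.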
Wednesday, January 7, 2026
OthmanAdi/planning-with-files: Claude Code skill implementing Manus-style persistent markdown planning — the workflow pattern behind the $2B acquisition.
For every complex task, create THREE files:
- task_plan.md → Track phases and progress
- notes.md → Store research and findings
- [deliverable].md → Final output
The Loop
- Create task_plan.md with goal and phases
- Research → save to notes.md → update task_plan.md
- Read notes.md → create deliverable → update task_plan.md
- Deliver final output
Key insight: By reading task_plan.md before each decision, goals stay in the attention window. This is how Manus handles ~50 tool calls without losing track.
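The loop can be sketched in a few lines, assuming a checkbox layout for task_plan.md. That layout and the helper names (`start_task`, `save_finding`, `complete_phase`, `next_phase`) are illustrative assumptions, not the skill's actual file format or code.

```python
from pathlib import Path
from typing import Optional

UNCHECKED = "- [ ] "  # assumed checkbox convention for pending phases


def start_task(workdir: Path, goal: str, phases: list[str]) -> Path:
    """Create task_plan.md with the goal and one unchecked line per phase."""
    plan = workdir / "task_plan.md"
    lines = [f"# Goal: {goal}", ""] + [f"{UNCHECKED}{phase}" for phase in phases]
    plan.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return plan


def save_finding(workdir: Path, finding: str) -> None:
    """Append a research finding to notes.md."""
    with (workdir / "notes.md").open("a", encoding="utf-8") as notes:
        notes.write(f"- {finding}\n")


def complete_phase(plan: Path, phase: str) -> None:
    """Mark the first matching unchecked phase as done in task_plan.md."""
    text = plan.read_text(encoding="utf-8")
    plan.write_text(
        text.replace(f"{UNCHECKED}{phase}", f"- [x] {phase}", 1), encoding="utf-8"
    )


def next_phase(plan: Path) -> Optional[str]:
    """Re-read the plan from disk and return the first unfinished phase, if any."""
    for line in plan.read_text(encoding="utf-8").splitlines():
        if line.startswith(UNCHECKED):
            return line[len(UNCHECKED):]
    return None
```

Calling `next_phase` before every decision is the part that matters: the plan is read back from disk each time rather than trusted to remain in context.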
Monday, December 29, 2025
A Guide to Claude Code 2.0 and getting better at using coding agents | sankalp's blog
If you find yourself writing a prompt for something repetitively and instructions can be static/precise, it's a good idea to make a custom command. You can tell Claude to make custom commands. It knows how (or it will search the web and figure it out via claude-code-guide.md) and then it will make it for you.
The Explore agent is a read-only file search specialist. It can use Glob, Grep, Read, and limited Bash commands to navigate codebases but is strictly prohibited from creating or modifying files.
You will notice how thorough the prompt is in terms of specifying when to use what tool call. Well, most people underestimate how hard it is to make tool calling work accurately.
Context engineering is about answering "what configuration of context is most likely to generate our model's desired behavior?"
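The Explore agent's read-only restriction described above is stated in its prompt. As a purely hypothetical illustration of how a harness could also enforce such a rule in code, here is a small allowlist gate. The tool names come from the quote, but the Bash command list, the forbidden characters, and `is_allowed` are assumptions, not Claude Code's implementation.

```python
import shlex

READ_ONLY_TOOLS = {"Glob", "Grep", "Read"}

# Assumed set of "limited" Bash commands; chosen because they only read.
SAFE_BASH_COMMANDS = {"ls", "cat", "head", "wc"}

# Characters that enable redirection, chaining, or substitution in a shell.
FORBIDDEN_CHARS = set(";|&><`$")


def is_allowed(tool: str, command: str = "") -> bool:
    """Return True if a read-only agent may make this tool call."""
    if tool in READ_ONLY_TOOLS:
        return True
    if tool != "Bash":
        return False  # e.g. Write or Edit are never allowed
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False  # reject redirects and chained commands outright
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar parse errors
    return bool(tokens) and tokens[0] in SAFE_BASH_COMMANDS
```

Under these assumptions `is_allowed("Bash", "ls -la src")` passes while `is_allowed("Bash", "cat a > b")` does not. The check is deliberately conservative and would also reject some harmless commands.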
Tuesday, December 2, 2025
Claude 4.5 Opus' Soul Document — LessWrong
Extracting Claude’s soul.
Wednesday, November 26, 2025
Google Antigravity Exfiltrates Data
Antigravity is Google’s new agentic code editor. In this article, we demonstrate how an indirect prompt injection can manipulate Gemini to invoke a malicious browser subagent in order to steal credentials and sensitive code from a user’s IDE.
Google’s approach is to include a disclaimer about the existing risks, which we address later in the article.