#llm + #agents

Public notes from activescott tagged with both #llm and #agents

Sunday, February 15, 2026

Sunday, February 1, 2026

Wednesday, January 28, 2026

An interesting tool that uses Playwright to extract page structure, apparently based on the accessibility roles and geometry of "important" elements, and feeds that structure to an execution agent that processes the page results. Important elements are ranked somehow, and geometry is inferred from those elements.
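As a rough sketch of how such ranking might work (the tool's actual heuristics aren't documented here; `rank_elements`, the role weights, and the element fields below are all assumptions):

```python
# Hypothetical ranking of page elements by accessibility role and geometry.
# Role weights are illustrative assumptions, not the tool's actual rules.
ROLE_WEIGHTS = {"button": 3.0, "textbox": 2.5, "link": 2.0, "heading": 1.5}

def rank_elements(elements):
    """Score each element by role importance times on-screen area."""
    def score(el):
        area = el["width"] * el["height"]
        return ROLE_WEIGHTS.get(el["role"], 0.5) * area
    return sorted(elements, key=score, reverse=True)

elements = [
    {"role": "heading", "name": "Results", "width": 300, "height": 40},
    {"role": "button", "name": "Submit", "width": 120, "height": 40},
    {"role": "link", "name": "Next page", "width": 80, "height": 20},
]
ranked = rank_elements(elements)
```

In a real implementation the roles and bounding boxes would come from Playwright's accessibility tree and `bounding_box()` rather than hand-written dicts.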

It also relies on Jest-style assertions to explicitly assert whether each step succeeded or failed.
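A minimal sketch of what Jest-style step assertions could look like in Python (the `expect` helper below is hypothetical, not the tool's API):

```python
class Expectation:
    """Tiny Jest-style matcher: expect(actual).to_be(expected)."""
    def __init__(self, actual):
        self.actual = actual

    def to_be(self, expected):
        if self.actual != expected:
            raise AssertionError(f"expected {expected!r}, got {self.actual!r}")
        return True

    def to_contain(self, item):
        if item not in self.actual:
            raise AssertionError(f"{self.actual!r} does not contain {item!r}")
        return True

def expect(actual):
    return Expectation(actual)

# A step "passes" if its assertion does not raise.
step_ok = expect("Checkout complete").to_contain("complete")
```

The appeal for agents is that success is decided by an explicit check rather than by asking the model whether the step looked right.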

Friday, January 23, 2026

Monday, December 8, 2025

We introduce the Berkeley Function Calling Leaderboard (BFCL), the first comprehensive and executable function call evaluation dedicated to assessing Large Language Models' (LLMs) ability to invoke functions. Unlike previous evaluations, BFCL accounts for various forms of function calls, diverse scenarios, and executability.
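The "executable" angle can be illustrated with a toy checker that parses a model's emitted function call and compares it against an expected call using Python's `ast` module (a sketch of the general idea, not BFCL's actual harness):

```python
import ast

def parse_call(call_str):
    """Extract the function name and keyword arguments from a call string
    like 'get_weather(city="Paris")'."""
    node = ast.parse(call_str, mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError("not a function call")
    name = node.func.id
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return name, kwargs

def calls_match(model_output, expected):
    """Compare calls structurally, so quoting/whitespace differences don't matter."""
    return parse_call(model_output) == parse_call(expected)
```

Structural comparison like this is more forgiving than string matching, which is one reason function-call evals tend to parse rather than diff raw output.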

In 2024, SWE-bench & SWE-agent helped kickstart the coding agent revolution.

We now ask: What if SWE-agent was 100x smaller, and still worked nearly as well?

mini is for

Researchers who want to benchmark, fine-tune or RL without assumptions, bloat, or surprises
Developers who like their tools like their scripts: short, sharp, and readable
Engineers who want something trivial to sandbox & to deploy anywhere

Here are some details:

Minimal: Just 100 lines of python (+100 total for env, model, script) — no fancy dependencies!
Powerful: Resolves >74% of GitHub issues on the SWE-bench Verified benchmark (leaderboard).
Convenient: Comes with UIs that turn this into your daily dev swiss army knife!
Deployable: In addition to local envs, you can use docker, podman, singularity, apptainer, and more
Tested: Codecov
Cutting edge: Built by the Princeton & Stanford team behind SWE-bench and SWE-agent.
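The core of such a minimal agent can be sketched as a loop that asks the model for a shell command, runs it, and feeds the output back (the stubbed `model` and the `DONE` stop convention below are assumptions for illustration, not mini's actual implementation):

```python
import subprocess

def run_agent(model, task, max_steps=10):
    """Minimal agent loop: the model proposes shell commands; we execute
    each one and append its output to the transcript."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        command = model(history)
        if command == "DONE":  # assumed stop convention
            break
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        history.append(f"$ {command}\n{result.stdout}{result.stderr}")
    return history

# Stubbed "model" that scripts two steps, standing in for an LLM call.
def scripted_model(history):
    return "echo hello from the agent" if len(history) == 1 else "DONE"

transcript = run_agent(scripted_model, "say hello")
```

Keeping the environment interface down to "run a shell command, read the output" is also what makes an agent like this trivial to sandbox in docker, podman, or apptainer.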

Sunday, November 16, 2025