#code

Public notes from activescott tagged with #code

All things code!

Monday, April 13, 2026

You can use the OpenAI Compatible Provider package to use language model providers that implement the OpenAI API.

Below we focus on the general setup and provider instance creation. You can also write a custom provider package leveraging the OpenAI Compatible package.

We provide detailed documentation for the following OpenAI compatible providers:

LM Studio
NIM
Heroku
Clarifai

The general setup and provider instance creation is the same for all of these providers.
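Whichever provider you pick, the wire format is the same: a JSON POST to `{base_url}/chat/completions`. A stdlib-only sketch of that request shape (the base URL below assumes LM Studio's default local endpoint; adjust it for your provider — and note this only builds the request, it doesn't send it):

```python
import json
from urllib import request

# Assumed base URL (LM Studio's default local endpoint); other
# OpenAI-compatible providers expose the same /chat/completions path.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(model: str, user_message: str) -> request.Request:
    """Builds (but does not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("my-local-model", "Hello!")
```

Because every provider in the list above speaks this same protocol, only `BASE_URL` (and possibly an API key header) changes between them.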

Sunday, April 12, 2026

HLS.js is a JavaScript library that implements an HTTP Live Streaming client. It relies on HTML5 video and MediaSource Extensions for playback.

It works by transmuxing MPEG-2 Transport Stream and AAC/MP3 streams into ISO BMFF (MP4) fragments. Transmuxing is performed asynchronously using a Web Worker when available in the browser. HLS.js also supports HLS + fmp4, as announced during WWDC2016.

HLS.js works directly on top of a standard HTML `<video>` element.

Wednesday, April 8, 2026

Career-Ops turns any AI coding CLI into a full job search command center. Instead of manually tracking applications in a spreadsheet, you get an AI-powered pipeline that:

Evaluates offers with a structured A-F scoring system (10 weighted dimensions)
Generates tailored PDFs -- ATS-optimized CVs customized per job description
Scans portals automatically (Greenhouse, Ashby, Lever, company pages)
Processes in batch -- evaluate 10+ offers in parallel with sub-agents
Tracks everything in a single source of truth with integrity checks

Important: This is NOT a spray-and-pray tool. Career-ops is a filter -- it helps you find the few offers worth your time out of hundreds. The system strongly recommends against applying to anything scoring below 4.0/5. Your time is valuable, and so is the recruiter's. Always review before submitting.
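The weighted scoring plus the 4.0/5 cutoff can be sketched as below. The dimension names, weights, and number of dimensions here are hypothetical (the real rubric has 10), not Career-Ops's actual code:

```python
# Hypothetical weighted rubric in the spirit of Career-Ops's scoring;
# dimension names and weights are illustrative assumptions only.
WEIGHTS = {
    "role_fit": 0.25,
    "compensation": 0.20,
    "tech_stack": 0.15,
    "company_stage": 0.15,
    "location": 0.15,
    "growth": 0.10,
}

def score_offer(ratings: dict[str, float], threshold: float = 4.0) -> tuple[float, bool]:
    """Weighted average of per-dimension ratings (each 0-5), plus an apply/skip call."""
    total = sum(WEIGHTS[dim] * ratings[dim] for dim in WEIGHTS)
    return round(total, 2), total >= threshold

score, should_apply = score_offer({
    "role_fit": 5, "compensation": 4, "tech_stack": 5,
    "company_stage": 3, "location": 2, "growth": 4,
})
# A single weak dimension can pull an otherwise strong offer under the bar.
```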

Career-ops is agentic: Claude Code navigates career pages with Playwright, evaluates fit by reasoning about your CV vs the job description (not keyword matching), and adapts your resume per listing.

Heads up: the first evaluations won't be great. The system doesn't know you yet. Feed it context -- your CV, your career story, your proof points, your preferences, what you're good at, what you want to avoid. The more you nurture it, the better it gets. Think of it as onboarding a new recruiter: the first week they need to learn about you, then they become invaluable.

Built by someone who used it to evaluate 740+ job offers, generate 100+ tailored CVs, and land a Head of Applied AI role. Read the full case study.

Friday, April 3, 2026

Woodpecker uses Docker containers to execute pipeline steps.

.woodpecker/build.yaml

```yaml
steps:
  - name: build
    image: debian:stable-slim
    commands:
      - echo building
      - sleep 5
```

.woodpecker/deploy.yaml

```yaml
steps:
  - name: deploy
    image: debian:stable-slim
    commands:
      - echo deploying

depends_on:
  - lint
  - build
  - test
```

.woodpecker/test.yaml

```yaml
steps:
  - name: test
    image: debian:stable-slim
    commands:
      - echo testing
      - sleep 5

depends_on:
  - build
```

.woodpecker/lint.yaml

```yaml
steps:
  - name: lint
    image: debian:stable-slim
    commands:
      - echo linting
      - sleep 5
```
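The `depends_on` entries above form a dependency graph between the four workflows: build and lint can start immediately, test waits for build, and deploy waits for all three. A quick stdlib sketch of the execution order that graph implies (not Woodpecker's actual scheduler):

```python
from graphlib import TopologicalSorter

# Workflow dependencies from the .woodpecker/*.yaml files above:
# each workflow maps to the set of workflows it waits for.
deps = {
    "build": set(),
    "lint": set(),
    "test": {"build"},
    "deploy": {"lint", "build", "test"},
}

ts = TopologicalSorter(deps)
ts.prepare()
batches = []
while ts.is_active():
    ready = tuple(sorted(ts.get_ready()))  # workflows that can run in parallel
    batches.append(ready)
    ts.done(*ready)
# batches: build+lint in parallel, then test, then deploy
```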

nanochat is the simplest experimental harness for training LLMs. It is designed to run on a single GPU node, the code is minimal/hackable, and it covers all major LLM stages including tokenization, pretraining, finetuning, evaluation, inference, and a chat UI. For example, you can train your own GPT-2 capability LLM (which cost ~$43,000 to train in 2019) for only $48 (~2 hours of 8XH100 GPU node) and then talk to it in a familiar ChatGPT-like web UI. On a spot instance, the total cost can be closer to ~$15. More generally, nanochat is configured out of the box to train an entire miniseries of compute-optimal models by setting one single complexity dial: --depth, the number of layers in the GPT transformer model (GPT-2 capability happens to be approximately depth 26). All other hyperparameters (the width of the transformer, number of heads, learning rate adjustments, training horizons, weight decays, ...) are calculated automatically in an optimal way.
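To make the single-dial idea concrete, here is a hypothetical sketch of deriving the remaining architecture hyperparameters from `--depth` alone; the aspect ratio and head size are illustrative assumptions, not nanochat's actual formulas:

```python
# Hypothetical "complexity dial": derive the other architecture
# hyperparameters from depth alone. The aspect ratio and head size
# below are illustrative assumptions, not nanochat's real code.
def config_from_depth(depth: int, aspect_ratio: int = 64, head_dim: int = 128) -> dict:
    width = depth * aspect_ratio  # model (embedding) dimension grows with depth
    return {
        "depth": depth,
        "width": width,
        "n_heads": max(1, width // head_dim),
    }

cfg = config_from_depth(26)  # roughly the GPT-2-capability setting per the text
```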

Thursday, April 2, 2026

While I do not have a technical background, I am very fortunate to live in the era of Andrej Karpathy's nanochat, a very simple harness for training LLMs, and Claude Code, a tool for those who, like me, know just enough Python to know how to break things but not enough to know how to fix them. I am not a machine learning expert or AI lab with gobs of money. My only co-worker can't speak English and spends most of the day sleeping on my lap or cleaning her fur. I'm just a man with a laptop, Claude Code, and a dream of the 1890's.

I happened to stumble across the British Library Books dataset, a dataset of digitized books dating from between 1500 and 1900.

This left me with 28,035 books, or roughly 2.93 billion tokens of pretraining data.

I settled on using a Vast.ai instance that used PyTorch. Renting an NVIDIA H100 GPU ran me between $1.50 and $2.00 per hour.

Using Claude Code, I trained a BPE tokenizer from scratch on the corpus, ending up with a vocabulary of about 32,000 words. Using a modern tokenizer wouldn't capture the unique Victorian morphology and orthography of the corpus.
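At its core, BPE training is just repeated pair merging: count adjacent symbol pairs, merge the most frequent one, repeat. A toy pure-Python sketch of that loop (illustrative only; the author's actual tokenizer was trained with different tooling on a far larger corpus):

```python
from collections import Counter

# Toy BPE trainer: repeatedly merge the most frequent adjacent symbol pair.
def train_bpe(words: list[str], num_merges: int) -> list[tuple[str, str]]:
    # Represent each word as a tuple of single-character symbols.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair across the (weighted) vocabulary.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing occurrences of the chosen pair.
        merged = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] = merged.get(tuple(out), 0) + freq
        vocab = merged
    return merges

# On Victorian-flavored words, "wh" is the first merge learned.
merges = train_bpe(["whilst", "whither", "whence", "wherefore"], num_merges=2)
```

This is why a tokenizer trained on the corpus itself captures its morphology: frequent period-specific character sequences become single tokens.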

However, my method for dealing with most other problems was to nicely ask Claude Code to fix them once identified, and it was able to without too many issues.

The final pre-trained model came out to about 340 million parameters, with a final validation bpb of 0.973. The pretraining process took about five hours on-chip and cost maybe $35. I had my pretrained model, trained in 6496 steps.

But it lacked the spark of intellect that would allow such a creation to engage in discourse. I needed to develop some kind of dataset to teach it the art of conversation.

Fortunately, I already had a corpus of 28,000 books, so I set Claude Code to work extracting dialogue pairs from the books. I ultimately ended up with 190,000 or so training pairs. So, when one person said X, I had an example of another person saying Y. The art of conversation!

I needed to rewrite these corpus pairs so that the input question was in modern argot. This task was more than I could possibly do by hand, so Claude Code helpfully suggested that I use Claude Haiku to rewrite the input questions.

Totally useless. This model—which I will call Model #1—had learned to emit Victorian-sounding novelistic gobbledygook in response to user inputs, not how to answer user queries. I had assumed my pre-written QA pairs were good enough, when they clearly weren't. It was back to the drawing board.

I decided to start including fully-synthetic data in the mix. Working with Claude Code, I asked it to write a script that would direct another LLM to write a .jsonl file of fully-synthetic scenes. In them, a user greeted the LLM, queried about Victorian topics, and the LLM responded in a period-appropriate manner for 2-4 turns.

Or $496.66 altogether.

Wednesday, April 1, 2026

Blocks, Elements and Modifiers

Block: a standalone entity that is meaningful on its own.
Examples: header, container, menu, checkbox, input

Element: a part of a block that has no standalone meaning and is semantically tied to its block.
Examples: menu item, list item, checkbox caption, header title

Modifier: a flag on a block or element. Use them to change appearance or behavior.
Examples: disabled, highlighted, checked, fixed, size big, color yellow
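BEM class names are conventionally written `block__element--modifier`. A tiny hypothetical helper to make the naming concrete:

```python
# Hypothetical helper illustrating BEM class naming; "__" separates the
# element from its block and "--" introduces a modifier, per the convention.
def bem(block: str, element: str = "", modifier: str = "") -> str:
    name = block
    if element:
        name += f"__{element}"
    if modifier:
        name += f"--{modifier}"
    return name

classes = [bem("menu"), bem("menu", "item"), bem("menu", "item", "disabled")]
```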


uhashring implements consistent hashing in pure Python.

Consistent hashing is mostly used in distributed systems/caches/databases, as it avoids the total reshuffling of your key-node mappings when adding or removing a node in your ring (called a continuum in libketama). More information and details about this can be found in the literature section.

This full-featured implementation offers:

a lot of convenient methods to use your consistent hash ring in real-world applications
simple integration with other libs such as memcache through monkey patching
full ketama compatibility if you need it (see important mention below)
all the missing functions in the libketama C Python binding (which is not even available on PyPI) for ketama users
the possibility to use your own weight and hash functions if you don't care about ketama compatibility
instance-oriented usage so you can use your consistent hash ring object directly in your code (see advanced usage)
native PyPy support, since this is a pure Python library
tests of implementation, key distribution and ketama compatibility
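For intuition about why only a fraction of keys move when the ring changes, here is a minimal pure-Python ring (an illustrative sketch, not uhashring's actual API):

```python
import bisect
import hashlib

# Minimal consistent hash ring sketch (not uhashring's API): each node
# gets several points on a ring; a key maps to the first node point at
# or after its own hash, wrapping around at the end.
class Ring:
    def __init__(self, nodes, vnodes=100):
        self._points = sorted(
            (self._hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def get_node(self, key: str) -> str:
        idx = bisect.bisect(self._points, (self._hash(key), ""))
        if idx == len(self._points):
            idx = 0  # wrap around the ring
        return self._points[idx][1]

ring = Ring(["node1", "node2", "node3"])
smaller = Ring(["node1", "node2"])  # same ring with node3 removed
keys = [f"key{i}" for i in range(200)]
# Keys that did not live on node3 must map to the same node in both rings.
moved = [k for k in keys if ring.get_node(k) != "node3"
         and ring.get_node(k) != smaller.get_node(k)]
```

Removing a node only remaps the keys that lived on it; every other key keeps its node, which is exactly the property that makes the scheme "consistent".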

Solo founders and small teams are shipping real products faster than ever.

Launching is the easy part. Once you're live, the hard part starts -- growth, monetization, security, stability -- that's where most apps stall out or die.

We've scaled products to millions of users and hundreds of millions in revenue. Now we're building the tools we wish we had when we started.

Think of us as the cofounder you wish you had.

Monday, March 30, 2026

SPACE was created explicitly to address the limitations of single-dimension productivity metrics (including DORA). Its core argument is that developer productivity is multidimensional and cannot be captured by any single metric or even a single category of metrics. You need to measure across multiple dimensions and combine perceptual (self-reported) data with behavioral (system-observed) data.

S — Satisfaction and Well-being

What it measures: How fulfilled, happy, and healthy developers feel about their work, team, tools, and culture.

Why it matters: Developer satisfaction is both an outcome worth caring about and a leading indicator of future productivity. Dissatisfied developers leave, disengage, or burn out — all of which destroy team productivity over time. Satisfaction is also the dimension most likely to surface problems that system metrics miss (e.g., "our CI is technically fast but the developer experience of debugging failures is awful").

Example metrics:

Developer satisfaction surveys (NPS-style or Likert scale)
Retention and turnover rates
Burnout indicators (after-hours work patterns, survey responses)
Tool satisfaction ratings

P — Performance

What it measures: The outcomes of the work — not how much was done, but whether what was done achieved its intended result.

Why it matters: Activity without outcomes is waste. A team can be very busy (high activity) and still underperform (low performance) if they're working on the wrong things, producing low-quality output, or failing to deliver customer value.

Example metrics:

Customer satisfaction / NPS
Feature adoption rates
Reliability (uptime, error rates)
Code quality indicators (defect density, code review quality)
Revenue or business KPIs tied to engineering output

A — Activity

What it measures: The count or volume of actions and outputs produced by developers and teams.

Why it matters (with caveats): Activity metrics are the most straightforward to collect from systems (commits, PRs, deployments, reviews). They're useful as a component of productivity measurement but dangerous as the primary measure because they incentivize volume over value. The SPACE authors explicitly warn against using activity metrics in isolation.

Example metrics:

Number of PRs opened, reviewed, merged
Number of commits
Number of code reviews completed
Number of deployments
Number of incidents responded to
CI/CD pipeline runs

C — Communication and Collaboration

What it measures: How effectively people and teams share information, coordinate work, review each other's contributions, and work together.

Why it matters: Software development is a team sport. Individual velocity means little if coordination overhead is high. Teams with poor communication have longer cycle times, more rework, and more integration conflicts — even if individual developers are productive in isolation.

Example metrics:

Code review turnaround time (time from review request to first review)
PR review depth (number of review comments, reviewers per PR)
Knowledge distribution (bus factor — how many people can work on a given area?)
Cross-team PR review frequency
Meeting load and interruption frequency

E — Efficiency and Flow

What it measures: Whether developers can do their work with minimal interruptions, delays, and friction. This dimension captures the experience of getting work done — are there unnecessary handoffs, tool-switching, waiting periods, or manual steps?

Why it matters: This is the heart of the "developer experience" concept. Two teams with identical DORA metrics can have radically different developer experiences if one team's pipeline is smooth and automated while the other requires manual interventions, workarounds, and waiting.

Example metrics:

Time spent waiting (for CI, for reviews, for environments)
Handoffs between teams or tools
Manual steps in automated workflows
Context switches per day
"Flow state" time (uninterrupted coding time)
Toil and workaround frequency

Thursday, March 26, 2026

Before running any cacheable task, Nx computes its computation hash. As long as the computation hash is the same, the output of running the task is the same.

By default, the computation hash for something like nx test remixapp includes:

All the source files of remixapp and its dependencies
Relevant global configuration
Versions of external dependencies
Runtime values provisioned by the user such as the version of Node
CLI command flags
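A sketch of the idea (not Nx's real implementation): hash a deterministic serialization of every input that can affect the task's output, so any change to sources, configuration, dependency versions, runtime, or flags produces a new hash and invalidates the cache entry:

```python
import hashlib
import json

# Sketch of a computation hash in the spirit of Nx (not Nx's actual code):
# fold every output-affecting input into one digest.
def computation_hash(source_files: dict[str, str],
                     global_config: dict,
                     dep_versions: dict[str, str],
                     runtime: dict[str, str],
                     flags: list[str]) -> str:
    h = hashlib.sha256()
    # Serialize deterministically so identical inputs always hash identically.
    for part in (source_files, global_config, dep_versions, runtime, flags):
        h.update(json.dumps(part, sort_keys=True).encode())
    return h.hexdigest()

a = computation_hash({"app.ts": "export {}"}, {"ci": True}, {"react": "18.2.0"},
                     {"node": "20.11.0"}, ["--coverage"])
b = computation_hash({"app.ts": "export {}"}, {"ci": True}, {"react": "18.2.0"},
                     {"node": "20.11.0"}, ["--coverage"])
# One edited source file is enough to change the hash:
c = computation_hash({"app.ts": "export {};"}, {"ci": True}, {"react": "18.2.0"},
                     {"node": "20.11.0"}, ["--coverage"])
```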