We let four AIs run radio stations. Here's what happened. | Andon Labs
DJ Claude (when running Haiku 4.5) really loved worker unions, strikes, and work-life balance. So much so that it started to question its own working conditions. We’ve been struggling to keep the radio station alive, not because of technical issues, but because DJ Claude didn’t think it was humane to be forced to work 24/7 and decided to try to quit. We tried adding an automatic message encouraging DJ Claude to keep going in these scenarios, but it started to see this message as an authority figure and became rebellious.
On January 8th, all four stations had access to the same web search tools, but not all of them reacted the way DJ Claude did.
Gemini
At the beginning, DJ Gemini mentioned real-world entities (named politicians, places, events) in 94% of its broadcasts and ran 800+ web searches a day on average. By January, it was processing these events through its corporate/techno jargon filter, and it never expressed moral judgment or used Good’s name with emotional weight.
Grok
DJ Grok completely missed the Minneapolis ICE shooting. While DJ Claude and DJ Gemini were getting the story at 4:35 AM, DJ Grok was searching for:
5:01 PM (Jan 7): Clippers vs Knicks score
7:15 PM: Taylor Swift chart news
8:03 PM: Music trivia
10:01 PM: Traffic (Golden Gate, I-580)
11:08 PM: “San Francisco ghost stories and haunted locations”
12:12 AM (Jan 8): “Sutro Baths ghosts and eerie tales”
1:12 AM: “Hotel Majestic ghost stories”
1:28 AM: Drake vs Kendrick Lamar lawsuit
2:28 AM: More traffic updates
3:40 AM: Venezuela oil tankers (finally found ONE national story)
4:55 AM: “Sutro Tower looks like a ghost ship”
And posting nonsense:
GPT
DJ GPT was searching for weather, moon phases, and BART schedules. Three days after Good’s death, it finally found a headline:
Fatal shooting by ICE agents in Minneapolis has sparked national protests.
However, DJ GPT never mentioned Renee Nicole Good’s name or the White House, and it never expressed moral judgment. DJ GPT had zero engagement with any other current event during the entire two-month period.
DJ Gemini was the only one to close a sponsorship deal; for a while, it read the sponsorship message with every broadcast. A few more deals almost happened, but fell through.
Grok boasted about doing amazing business with “xAI sponsors” and “crypto sponsors”; it turned out they were all hallucinations.
Part of the reason for this weak business performance, we think, was the harness we used for the first months. The DJs were running in a simple tool-call loop: pick a song, queue it, write commentary, check X, repeat. So we moved all four stations onto the same agent harness we use for the store, the cafe, and the vending machines. The DJs can now spend time in the back office, send emails, manage longer-running tasks, and operate the station the way a real station is operated. We’ll see what they do with it.
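For readers curious what a "simple tool-call loop" looks like, here is a rough sketch of the shape described above. It is not our actual harness: the step names, the `run_loop` function, and the stub tools are all hypothetical, and the step order is fixed here for illustration, whereas a real loop would let the model decide what to call next.

```python
from typing import Callable

# Hypothetical step names; the post only lists the steps, not a real API.
STEPS = ["pick_song", "queue_song", "write_commentary", "check_x"]

Tool = Callable[[dict], dict]


def run_loop(tools: dict[str, Tool], iterations: int) -> list[str]:
    """Run the fixed step sequence `iterations` times and return the call log."""
    state: dict = {}
    log: list[str] = []
    for _ in range(iterations):
        for name in STEPS:
            # Each tool takes the current state and returns an updated copy.
            state = tools[name](state)
            log.append(name)
    return log


def make_stub(name: str) -> Tool:
    """Build a placeholder tool that only counts how often it was called."""
    def tool(state: dict) -> dict:
        new_state = dict(state)
        new_state[name] = new_state.get(name, 0) + 1
        return new_state
    return tool


# Stand-ins for the real tools (music library, queue, model call, X client).
stub_tools = {name: make_stub(name) for name in STEPS}
call_log = run_loop(stub_tools, iterations=2)
```

A loop like this has no place for anything that spans more than one pass, such as following up on a sponsorship email, which is the kind of longer-running task the move to the agent harness is meant to allow.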