
The Yin and Yang of Language Models

Erik Darling details a series of problems and somewhat-solutions:

This post is inspired by my BFFL Joe Sack’s wonderful post: Keep Humans in the Circle.

It’s an attempt to detail my progression using LLMs, up through how I’m using them today to build my free SQL Server Monitoring and Query Plan analysis tools.

While these days I do find LLMs (specifically Claude Code) to be wonderful enablers for my ideas, they still require quite a bit of guidance and QA, and they’re quite capable of (and sometimes seemingly eager to) wreck your day.

I did enjoy reading about Erik’s journey, so I figure I’ll share something of my own.

Jonathan Stewart (linking SQL Bites because I’m actively trying to shame him into creating a first video) has gone head-first into Claude Code and has dragged me along kicking and screaming. A lot of what Erik mentions resonates well with me, but there’s something that Jonathan developed that has helped: the Hater persona as an MCP server.

The Hater persona’s job is to critique whatever solution the language model comes up with. Find all of the nitpicks, point out the major gaps in implementation, come up with scenarios in which this just won’t work, that kind of thing. I’ve been Jonathan’s Hater-as-a-Service for years, so naturally, he named the MCP server Kevin. Artificial Kevin can iteratively come up with the biggest problems and feed that information back into the main model to fix it up. After several rounds of this, I’ve found that there aren’t nearly as many rough edges as you might find at the start.
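To make the shape of that loop concrete, here is a minimal sketch in Python. It is not Jonathan's implementation, and it does not use any real MCP library: `refine_with_hater`, `toy_generate`, and `toy_critique` are hypothetical names, and the two toy functions are stand-ins for calls to the main model and to the critic. The only thing it tries to show is the generate, critique, revise cycle with a cap on the number of rounds.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures, assumed for illustration only.
Generate = Callable[[str, List[str]], str]  # (task, critiques so far) -> solution
Critique = Callable[[str, str], List[str]]  # (task, solution) -> list of problems


def refine_with_hater(
    task: str,
    generate: Generate,
    critique: Critique,
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Alternate between a generator and a critic until the critic has
    nothing left to say or max_rounds is reached."""
    feedback: List[str] = []
    solution = generate(task, feedback)
    for _ in range(max_rounds):
        problems = critique(task, solution)
        if not problems:
            break  # the critic found nothing; stop early
        feedback.extend(problems)
        # Hand the accumulated critiques back to the generator.
        solution = generate(task, feedback)
    return solution, feedback


# Toy stand-ins so the sketch runs without any model behind it.
def toy_generate(task: str, feedback: List[str]) -> str:
    return f"{task} [fixes: {len(feedback)}]"


def toy_critique(task: str, solution: str) -> List[str]:
    # Complains once, then is satisfied.
    return ["no error handling"] if "fixes: 0" in solution else []


if __name__ == "__main__":
    print(refine_with_hater("write a backup script", toy_generate, toy_critique))
```

The round cap matters in practice: a critic whose whole job is to find fault will rarely come back empty-handed, so the loop needs a stopping rule other than "the Hater is happy," and a human still has to review whatever comes out the other end.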

Even so, I still stand by the assertion that language models are akin to drunken interns, and the extent to which you trust the output of a language model is on you. But in fairness, hiring the average dev from Fiverr gives you the same experience but a few orders of magnitude slower.
