Designing Less Painful AI Production Lines
I overdid it on my AI-code production line.
So often, designing things is overdoing them and then bringing them back down… once the suffering has been noticed.
I wanted to solve the problem where:
I ask AI to do a thing → …bleep bloop… AI says the thing is done.
But it’s not done. And now I play AI whack-a-mole.
In this case I wanted AI to write tests, then fulfill them with code—which sounds great.
But it went wrong.
What went really right were the first 2 steps:
1 - where Claude interviewed me to create a ‘job spec’ following a template we had designed together.
2 - where Claude reviewed the spec against a set of ‘checks’ we also templated.
Amazing.
What went wrong?
All the test writing.
It went nowhere and made the AI dev dumber. The tests were overkill, and the dev was designing to pass tests instead of being smart and looking at the greater context.
(There’s a parallel to our school systems here but I’ll leave you to parse it out.)
So after that suffering, what did I do?
I skipped the test writing, updated the ‘implementor’ to remove the test references, and just handed it the spec.
Boom. Now the production line is pumping.
I have to admit that I’ve sunk more hours than I want to count trying to avoid testing, since I dislike it so much.
I still think it’s a great challenge to solve.
But it was killing my production line.
When I started building the production line, my goal was to stop playing so much AI whack-a-mole and output quality code, more quickly.
You’ll notice there are a bunch of objectives rolled up in there:
1 - Quality code
2 - Faster
3 - Me not playing whack-a-mole
They are all related but not identical. Not noticing this up front was one cause of down-the-line suffering.
Going one level deeper: I took a while to notice that ‘creating better job specs’, ‘automating testing’ and ‘reducing loops between approval and re-work’ are all separate levers that help with different combinations of my objectives.
Separating these would have led me to less pain.
What did I do well?
I designed myself into the loop at key quality control moments and handovers so that I could know the system’s parts well enough to iterate.
When something didn’t go well, I broke it into multiple steps, favoring consistency and control at each step over end-to-end automation.
This came from the mental model that each step is akin to a workstation in a production line, and that any improvement to my main constraint, which is time lost to re-work, is a win.
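Here is a minimal sketch of that workstation idea in Python. It is an illustration, not the actual setup: the station names, the stub functions, and the `approve` callback are hypothetical stand-ins for the real Claude steps and for the human doing quality control at each handover.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Workstation:
    """One step in the line. All names here are hypothetical."""
    name: str
    run: Callable[[str], str]      # takes the incoming work, returns the outgoing work
    needs_approval: bool = False   # True where a human checks the handover


def run_line(
    stations: List[Workstation],
    work: str,
    approve: Callable[[str, str], bool],
) -> Tuple[Optional[str], str]:
    """Pass work through each station in order.

    Returns (None, final_work) if every checked handover is approved,
    or (station_name, work_so_far) at the first rejected handover,
    so re-work can start at that station instead of from scratch.
    """
    for station in stations:
        work = station.run(work)
        if station.needs_approval and not approve(station.name, work):
            return station.name, work
    return None, work


# Stub steps standing in for the real AI calls (assumptions, not real APIs).
def draft_spec(idea: str) -> str:
    return f"spec for {idea}"


def review_spec(spec: str) -> str:
    return f"checked {spec}"


def implement(spec: str) -> str:
    return f"code for {spec}"


LINE = [
    Workstation("draft spec", draft_spec, needs_approval=True),
    Workstation("review spec", review_spec, needs_approval=True),
    Workstation("implement", implement),
]
```

Calling `run_line(LINE, "idea", approve)` with an `approve` function that asks a human (or, in a test, just returns `True`) walks the work through each station. Because a rejection reports which station it happened at, re-work stays local to that step, which is the point of treating each step as its own workstation.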
Onward and upwards.
