
Field notes on agent-assisted coding with Claude

Postmarks is a small project that lets you build little SVGs, like the one above, with a fun amount of customization.

While the editor is a static page with some knobs and dials, there's also a tiny service behind the scenes that renders these images based on URL query params to allow for easy embedding. I've been meaning to build something like this for a while now to headline my travel posts, and while this may be trivial to some, I am unfortunately deathly scared of SVG manipulation.
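To make the "render from URL query params" idea concrete, here is a minimal sketch of how such a service could turn a query string into an SVG. It is not the actual Postmarks code: the parameter names (`text`, `color`, `rotation`), the defaults, and the helpers `parseOptions` and `renderSvg` are all assumptions made up for illustration.

```typescript
// Hypothetical options; the real service may expose different parameters.
interface StampOptions {
  text: string;
  color: string;
  rotation: number; // degrees
}

// Escape characters that are unsafe inside SVG text or attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Read options from the query string, falling back to (assumed) defaults.
function parseOptions(url: string): StampOptions {
  const params = new URL(url).searchParams;
  const rotation = Number(params.get("rotation") ?? "0");
  return {
    text: params.get("text") ?? "HELLO",
    color: params.get("color") ?? "#b33",
    rotation: Number.isFinite(rotation) ? rotation : 0,
  };
}

// Build a tiny stamp-like SVG from the parsed options.
function renderSvg(opts: StampOptions): string {
  const color = escapeXml(opts.color);
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 100">`,
    `<g transform="rotate(${opts.rotation} 100 50)">`,
    `<rect x="10" y="10" width="180" height="80" fill="none" stroke="${color}"/>`,
    `<text x="100" y="55" text-anchor="middle" fill="${color}">${escapeXml(opts.text)}</text>`,
    `</g>`,
    `</svg>`,
  ].join("");
}
```

In a Worker-style handler, the result of `renderSvg(parseOptions(request.url))` would be returned with an `image/svg+xml` content type, so the URL can be dropped straight into an `<img>` tag.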

So, how far can we get with Claude Code and Opus 4.5?

Clearly pretty far with the above example!

What I wanted to learn

While I've been using Claude Code in a work setting for a while now, I wanted to understand how much control I could comfortably let go of (and whether it made sense to), and what does and doesn't work for my development flow. In particular, I was keen to understand

As a side goal, I also wanted to document my process. Everyone seems to have a different way of operating with these agents, so here's mine.

Note: YMMV!

First steps

I did not begin with a giant PRD.md. That would have been a great vibe-coding experiment, but perhaps that's for another post.

Instead, it began with a completely empty repo and a Claude Code session, with lots of initial back and forth. The first few dozen messages looked like this:

Give me them devtools

Turns out, editing SVGs and UIs manually is painful. (subjective)

Editing SVGs and UIs through an LLM with English is significantly worse! (fact)

What was a lot easier was getting access to "developer-mode" tools.

I find it a lot more productive asking the LLM to help you do what you want, rather than asking it to do what you want.

Preemptive nudging

> Interrupted · What should Claude do instead?

I find guiding Claude on tangential details before it works on the task helps a lot. Would it have figured this out on its own? Possibly. But probably not. In my experience, it's very happy to miss details like this, resulting in duplicate code, especially if it isn't within recent context. It only recognises it later when you question it, giving you the classic "You're absolutely right" moments.

The duplicated code would still work, so it's very easy to miss if the end result is all that you're evaluating by.

Setting up continuous deployment early

One of my goals was seeing what letting an agent release to "prod" looks like. That starts with making deployment trivial.

GitHub Pages for static pages (as is this site) is easy to set up through Actions, and Cloudflare Workers lets you run small services on the edge very easily with a generous free tier. I got all that set up, populated the credentials in GitHub Secrets, and just let Claude know.

This is bread and butter for LLMs these days, so no issues here. Tasks that involve little creativity and mostly just involve following a spec (CI/CD flows, Terraform, etc.) are easy to let an agent take control of. What it immediately does is allow you to start pushing to main and see things live!

Claude Code on Android

At first glance, this looks like it's Claude Code through your phone, but in reality, it's just a messaging app.

Claude Code in your terminal gives you a dozen different keyboard shortcuts, tools, skills, subagents, hooks, different ways to interact, queued background tasks, shell shortcuts, etc. All of that is lost on the phone, both by the design of the app and by the form factor of the phone. It's great for dispatching instructions while you're waiting on the barista, but not so great for back-and-forths or exploring the changes.

What it does work really well for is a subset of tasks:

In my experience, for anything larger or more complicated, you're better off with a more hands-on approach, just so you can steer it. The results were not disastrous, but every now and then they needed some small cleanup patches later.

Here's an example:

The "Separate wear-and-tear+rotation into two different sliders" PR led to the viewBox breaking and the corners getting clipped, which had to be fixed by a follow-up "Fix the clipped corners!" PR.
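For context on why rotation and clipping go together: a common cause of clipped corners is a viewBox sized for the unrotated shape, since a rotated rectangle needs a larger axis-aligned box. The sketch below shows that arithmetic only. It is not the fix from the PR above, and `rotatedBounds` and `viewBoxFor` are hypothetical helpers, assuming the shape rotates about its own centre.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Axis-aligned bounding box of a width x height rectangle
// rotated by `degrees` about its own centre.
function rotatedBounds(width: number, height: number, degrees: number): Box {
  const rad = (degrees * Math.PI) / 180;
  const cos = Math.abs(Math.cos(rad));
  const sin = Math.abs(Math.sin(rad));
  const w = width * cos + height * sin;
  const h = width * sin + height * cos;
  // Keep the box centred on the original rectangle's centre.
  return { x: (width - w) / 2, y: (height - h) / 2, width: w, height: h };
}

// viewBox string that fits the rotated shape, with optional padding
// (for strokes or other effects that extend past the shape).
function viewBoxFor(width: number, height: number, degrees: number, padding = 0): string {
  const b = rotatedBounds(width, height, degrees);
  return [b.x - padding, b.y - padding, b.width + 2 * padding, b.height + 2 * padding]
    .map((n) => n.toFixed(2))
    .join(" ");
}
```

With no rotation this gives back the original box. As the angle grows, the box widens and its origin goes negative, which is exactly the part a fixed `0 0 width height` viewBox cuts off.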

Final thoughts

So, how far can we get with Claude Code and Opus 4.5?

Turns out, pretty damn far. At the end of the day, I managed to build something I put off for years. That is a giant win, full stop. Everything else feels minor.

Some specific learnings though:


Go on, give Postmarks a spin here!
