"- Claude, build the entire Platform from scratch. Make no mistakes."
In the agentic engineering community, some think that 100% test coverage will guarantee the quality of generated code. I think that relying too much on automated tests is problematic, because testing is difficult. I'd rather review the actual source code. More importantly, having an active part in the development of it by steering the agent(s) in real time.
I usually develop a feature step by step, to not focus too much on upfront planning. I used to do this before the agents joined the development process, and have found that it works well today too. When beginning with a new task, I often have some clear ideas and some vague ideas about how to solve the problem. By starting with the things that are clear, I can postpone thinking about the implementation details of the more vague parts of the overall feature. I can worry about that later, when I have learned more. During the development, the how and what to implement will likely change. This is a natural thing, as you learn more along the way and understand more about the actual problem to solve. I guess this is the basic idea behind Embrace Change (from the Agile Manifesto).
I haven't yet felt any need to set up larger Agent Orchestrations for the kind of problems that I solve. It mostly seems like overdoing it, doesn't it? I don't know about you, but we are not building or reinventing an entire E-Commerce or Social Media platform every day. The things we do is on a much smaller and human-friendly scale. I think the idea about 100% test coverage might be about scaling up fast and the assumption that agents will produce so much code, that it becomes impossible for humans to grasp.
An Example
I recently developed a new feature that involved several repos, a combination of Python backends and Next.js apps. I decided to do this in smaller steps and began with the most straightforward step, which in this case was one of the backend services that I already know well. I had a pretty clear idea on what needed to be changed in there and added a simple Solution Design into the ticket.
For context: I used the same ticket for all the development of this particular feature. Each plan an agent produced was submitted as a comment to the ticket. I have automated this with an MCP server connected to the issue system that we currently use at work. It seems like having the relevant data collected in the ticket made the upcoming tasks clearer for the agent(s). Each task begun with a new context, but I instructed the agent to read the ticket that contained the problem to solve, the solution design, previous plans and linked pull requests before planning how to implement things.
It felt like things went pretty smooth. However, the very first task needed several iterations. I realize now that even if the basic setup of that particular repo was straightforward, the service has evolved over time and the structure of it has diverged. I think this is quite common in long-lived repos. This confused the agent, so I practiced the stop-the-line principle by rejecting generated code, correcting and steering the agent in real time. I did this when I noticed that the implementation (the code) went into a direction I didn't like. My agentic tool of choice, Eca, recently added support for a new steer command that is very useful for this kind of workflow. You can change the direction, or steer, while the agents are working. It is not always necessary to halt all ongoing work, sometimes a friendly nudge in the right direction is enough.
Sometimes, unexpected things happen: the other day, the main model I use was unreachable about half the day (again). Oh no. But I just switched to a different provider and continued the work! This is another thing where an Open Source and Provider-agnostic tool like Eca shines.
Another thing that I noticed was that the short summary the agent produced for each Pull Request covered the whole feature and the particular details pretty well. The data in the actual ticket was probably helpful for this kind of task too.
I've been advocating Test Driven Development (TDD) and the sibling REPL Driven Development for many years. The real-time steering, the stop-the-line principle with origins from The Toyota Way could be an Agentic Engineering version of the Test Driven approach. It's about fast feedback loops. What do you think?
