Sunday, March 15, 2026

Agile & Agentic Engineering

"Don't fall into a waterfall-style of software development."

Our industry is quickly adapting to the new ways of working and we are redefining what it means to be a software developer: Agentic Engineering.

It's not the same thing as Vibe Coding, which is probably why I have had such a surprisingly smooth transition recently. As I see it, vibe coding is about treating code as a black box: it doesn't really matter how things are put together. The only thing important for a vibe coder is the output. As a passionate TDD and REPL driven programmer, this doesn't feel right to me today. Tomorrow, I don't know. Right now, I care about how things are constructed. More importantly, I need to have some understanding of the code itself to get ideas on how to change and improve the output.

This is where Agentic Engineering fits in: a structured way of developing software, where the human developer takes an active part in the process. It is not only about high level architecture or designing the solution, it's also about having the possibility to direct the agents into producing actual code that the human can understand and approve. For me, this is about keeping things simple and concise. Functional. Doing things in small steps, i.e. solve a problem by breaking it down into smaller parts. Experiment and learn along the way, step by step, rather than making big plans upfront. This is the core of the Agile movement.

It's well known that LLMs can produce verbose chunks of code and forget about important things. But it is possible to take control of that part as an agentic engineer. This is similar to the stop-the-line principle (from the Toyota Way), where you can halt production when you identify an issue. Take action (clarify or give new instructions) and then proceed. With these skills, agents can produce chunks of code that are just right. And functional.

From what I've picked up in the developer community lately, there's an increased need for structured work in the new AI landscape. This makes a lot of sense. What surprises me is the conclusion that we should start doing more planning upfront, writing detailed specifications before any code is written. This is what the Plan, Execute, Test workflow sounds like to me. Am I just misunderstanding, or is this a Modern Waterfall movement?

Plan, execute, test might be the correct workflow for an agent, but not for a human. Planning in the beginning is difficult, because in the beginning we have very little knowledge about the thing to develop. Instead, we could learn what and how to develop something along the way. We can also pick up bits of why along the way too, but it's good to have some understanding about that specific part early in the process.

If the Plan was incorrect, we will still be 10x faster, but with 10x waste. Agents produce code fast, and it might not be as big of an issue as before if we need to throw away the result and start over. This is new. The difficult part is throwing away our plan, the design we've invested in, and starting all over again. This drains human energy. Once a plan is set, it can be too big a mental effort to break free from it, because things have already been decided. Big planning upfront might sound right, but it is a trap. A vague Jira ticket description to begin with is not necessarily a bad thing.

The challenge as agentic engineers is to 10x the value, and not end up in 10x waste. Essentially making us more product and value focused than before. In short: build the thing right, but also build the right thing.

How can we do that? Explore, learn and adapt are words I would like to see as part of the Agentic Engineering definition. Plan a little bit upfront, just enough to get started, but no more than that. Continue exploring, planning and adjust as the work proceeds. Get things out fast so you can collect feedback (logs, errors, usage) and learn what to adjust. That's Agile & Agentic, a Lean and Agentic Engineering workflow.



Top Photo by me, taken at the top of Åreskutan, Jämtland, Sweden.

Saturday, October 25, 2025

Please don't break things

"Does this need to be a breaking change?"

During my years as a Software Developer, I have been part of many teams making a lot of task force style efforts to upgrade third-party dependencies or tools. Far too often it is work that adds zero value for us. It adds significant cost, though. As a user of third-party tools, you don't have much choice. Even if you might feel productive as a developer when implementing these forced changes, think of all the other things you could do instead to improve your product or platform.

The great tools from the community, most of them Open Source, are something we should be thankful for, and we should appreciate the efforts made by the people out there. This is a plea to pay extra attention when making changes that will affect your users. I am also an Open Source maintainer, and I try hard to avoid changes that don't add value for the users.

An example from Python: uv

    warning: The `tool.uv.dev-dependencies` field (used in `pyproject.toml`) is deprecated and will be removed in a future release; use `dependency-groups.dev` instead

The change itself makes a lot of sense. There's a new PEP standard for how to declare the dependencies only needed for development. Most Package & Dependency management tools out there already had their own implementation of this feature and it is probably a good thing to use the standard. From a user perspective, this only means that we need to make changes in all our Python projects. My suggestion is: why not support both options?
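Reading the warning, the migration in pyproject.toml looks something like this (a sketch, with pytest as an example dependency):

```toml
# Before: the uv-specific field
[tool.uv]
dev-dependencies = ["pytest"]

# After: the standardized dependency groups (PEP 735)
[dependency-groups]
dev = ["pytest"]
```

A small change in one project, but multiplied across every repo a team maintains.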

Maintainability vs Value

From a tooling developer perspective it is understandable that you don't want to maintain several ways of solving a problem. What about the Developer Experience of all the users of the tool out there? Imagine the teams maintaining many projects with 10, 40 or even 100 different Microservices and libraries. Each one in its own git repo.

Another Python example: Pydantic

    The `dict` method is deprecated; use `model_dump` instead. Deprecated in Pydantic V2.0 to be removed in V3.0.

I like Pydantic, it is a very useful tool with great features. The 2.0 release also came with significant performance improvements. But this change doesn't make sense to me. I understand that it probably fits better within the domain of Pydantic itself.

Does this have to be a breaking change? I would suggest keeping both alternatives there. Yes, it might mean a little more maintenance for you as a library developer. More importantly: your users can focus on adding value to their products instead of this mostly zero-added-value work.
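To illustrate, here is a minimal sketch (not the actual Pydantic source) of how a library could keep the old method around as a thin alias, warning users without breaking them:

```python
import warnings


class Model:
    """A hypothetical model class, keeping the old API as a thin alias."""

    def model_dump(self) -> dict:
        # the new, preferred API
        return {"name": "example"}

    def dict(self) -> dict:
        # the old API still works: warn, but don't break existing code
        warnings.warn(
            "`dict` is deprecated; use `model_dump` instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.model_dump()
```

The alias costs the library a few lines of maintenance, while all existing user code keeps running.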

Example three: SQLAlchemy

    MovedIn20Warning: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0)

This is probably also correct, from an internal SQLAlchemy domain perspective. From a user perspective, this only means that we need to change a lot of existing code. The cost of having the code overlapping in two namespaces is probably low.

I am also a maintainer of tools, and I've also made mistakes, or design choices that turned out to be not that great later on. But I have also actively made the decision not to force users to make the kind of changes that are described in this post. I don't want to break things for users of the tool because of design choices made in the past.

Should the users change their workflows?

The main thing I work on today is Python tooling for Polylith, which is a Monorepo architecture. The changes introduced by uv, Pydantic and SQLAlchemy actually aren't that much work for the developer teams using Polylith today. You'll have only one place in the source code where these changes are needed. This setup is robust, and is ready for any unexpected breaking changes in the tools that are used. Sounds nice, doesn't it?



Top Photo generated by Dall-E, prompted and modified afterwards by me.

Sunday, April 20, 2025

Feedback loops in Python

How fast can we get useful feedback on the Python code we write?

This is a walk-through of some of the feedback loops we can get from our ways of working and developer tools. The goal of this post is to find a way that is both Developer friendly and fast to learn if the Python code we write actually does what we want it to.

What's a Feedback Loop?

From my point of view, feedback loops for software are about running code with optional input and verifying the output. Basically, run a Python function and investigate the result from that function.

Feedback Loop 1: Ship it to Production

Write some code & deploy it. When the code is up & running, you (or your customers) can verify if it works as expected or not. This is usually a very slow feedback loop cycle.

You might have some Continuous Integration (CI) already set up, with rules that should pass before the deployment. If your code doesn't pass the rules, the CI tool will let you know. As a feedback loop, it's slow. By slow, I mean that it takes a long time before you will know the result. Especially when there are setup & teardown steps happening in the CI process. As a guard, just before releasing code, CI with deployment rules is valuable and sometimes a life saver.

Commit, Push & Merge

Pull Requests: just before hitting the merge button, you will get a chance to review the code changes. This type of visual review is a manual feedback loop. It's good, because you often take a step back and reflect on the written code. Will the code do the thing right? Does the code do the right thing? One drawback is that you review all changes. For large Pull Requests, it can be overwhelming. From a feedback loop perspective, it's not that fast.

Testing and debugging

Obviously, this is a very common way to get feedback on software, either manual or automated. Manual testing is usually a slower way than an automated test to find out if the code does what's expected. There's the integration-style automated tests, and the unit tests targeting the different parts. Integration-style tests often require mocking and more setup than unit tests. Both run fast, but the unit tests are more likely to be faster. You can have your development environment set up to automatically run the tests when something changes. Now we're getting close: this workflow can be fast.

I usually avoid the integration-type of tests, and rather write unit tests. I try to write small, focused and simple unit tests. The tests help me write small, focused and simple code too.

Test Driven Development

An even faster way to get feedback about the code is to write software in a test driven way (TDD): write a test that initially fails, write some code to make the test pass, refactor the test and refactor the code. For me, this workflow usually means jumping back-and-forth between the test and the code. Like a Ping Pong game.
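The Ping Pong rhythm can be sketched like this (the slugify function is just a made-up example): the test is written first, and the implementation is only what's needed to make it pass.

```python
# Step 1: write a failing test first
def test_slugify_replaces_spaces_with_dashes():
    assert slugify("Hello World") == "hello-world"


# Step 2: write just enough code to make the test pass
def slugify(text: str) -> str:
    return text.strip().lower().replace(" ", "-")


# Step 3: refactor the test and the code, then repeat
```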

TDD Deluxe

I'm not that strict about the TDD workflow. I don't always type the first lines of code in a test, or sometimes the test is halfway done when I begin to implement some of the code that should make the test pass. That's not pure TDD, I am aware. A few years ago, I found a new workflow that fits my sloppy approach very well. It's a thing called RDD (REPL Driven Development).

With RDD, you interactively write code. What does that even mean? For me, it's about writing small portions of code and evaluate it (i.e. run it) in the code editor. This gives me almost instant feedback on the code I just wrote. It's like the Ping Pong game with TDD, but even faster. Often, I also write inline code that later on evolves into a unit test. Adding some test data, evaluating a function with that test data, grab the response and assert it. The line between the code and the test is initially blurry, becoming clearer along the way. Should I keep the scratch-like code I wrote to evaluate a function? If yes, I have a unit test already. If not, I delete the code.
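A sketch of that blurry line, with a made-up parse_price function: the scratch code is evaluated inline in the editor first, and promoted to a unit test if it turns out to be worth keeping.

```python
def parse_price(raw: str) -> float:
    return float(raw.replace("$", "").strip())


# scratch code: evaluated inline in the editor with some example data
result = parse_price("$12.50")

# if the scratch code is worth keeping, it becomes a unit test
def test_parse_price():
    assert parse_price("$12.50") == 12.5
```

If not worth keeping, the scratch lines are simply deleted.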

Interactive Python for fast Feedback Loops

I have written about the basic flows of REPL Driven Development before:

REPL - the Read Eval Print Feedback Loop

When starting a REPL session from within a virtual environment, you will have access to all the app-specific code. You can incrementally add code to the REPL session by importing modules, adding variables and functions. You can also redefine variables and functions within the session.

With REPL Driven Development, you have a running shell within your code editor. You mostly use the REPL shell to evaluate the code, not for writing code. You write the code as usual in your code editor, with the REPL/shell running there in the background. IPython is an essential tool for RDD in Python. It's configurable to auto-reload changed submodules, so you don't have to restart your REPL. Otherwise, it would have been very annoying.
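One way to enable that auto-reload behavior is through the IPython configuration file (the path below is the default profile location):

```python
# ~/.ipython/profile_default/ipython_config.py
c.InteractiveShellApp.extensions = ["autoreload"]
c.InteractiveShellApp.exec_lines = ["%autoreload 2"]
```

You can also enable it per session with the %load_ext autoreload and %autoreload 2 magics.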

Even more Interactive Python feedback loops

We can take this setup even further: modifying and evaluating an externally running Python program from your code editor. You can change the behavior of the program, without any app restarts, and check the current state of the app from within your IDE. The Are we there yet? post describes the main idea with this kind of setup and how I’ve configured my favorite code editor for it.

Jupyter, the Kernel and IPython

You might have heard of or already use Jupyter notebooks. To simplify, there's two parts involved: a kernel and a client. The Kernel is the Python environment. The client is the actual notebook. This type of setup can be used with REPL Driven Development too, having the code editor as the client and feeding the kernel or inspecting the current state by evaluating code. For this, we need a Kernel specification, a running Kernel, and we need to connect to the running Kernel from the IDE.

Creating a kernel specification

You can do this in several ways, but I find it most straightforward to add ipykernel as a dev dependency to the project.

    # Add the dependency (example using Poetry)
    poetry add ipykernel --group dev

    # generate the kernel specification
    python -m ipykernel install --user --name=the-python-project-name
    

The above commands will generate a kernel specification and only need to be run once. Now you have a ready-to-go kernel spec.

Start the Kernel

    jupyter kernel --kernel=the-python-project-name
    

The above command will start a kernel, using the specification we have generated. Please note the output from the command, with instructions on how to connect to it. Use the kernel path from the output to connect your client.

As of this writing, the tooling support I have added is for Emacs. Have a look at this recording for a 13-minute demo on how to use this setup for a Fast & Developer Friendly Python Feedback Loop.



Top Photo by Timothy Dykes on Unsplash

Sunday, March 23, 2025

Are we there yet?

Continuing with the work on tooling support for interactive & fun development with Python.

A while ago, I wrote an article about my current attempts to make development in Python more interactive, more "test" driven and more fun.

My North Star is the developer experience in Clojure, where you have everything at your fingertips. Evaluating expressions (i.e. code) is a common thing for Lisp languages in general. I've found that it is sometimes difficult to explain this thing to fellow developers with no experience of Clojure development. In most languages, the REPL is an external thing you use in a terminal window, detached from the work you usually do in your IDE. But that's not the case with REPL Driven Development (RDD).

Along the way, I have learned how to write and evaluate Python code within the code editor by using already existing tools and how to configure them. Here's my first post and second post (with setup guides) about it. You'll find information about the basic idea with RDD and guides on how to set up your IDE or code editor for Python development.

Can it be improved?

This has been the way I have done Python development most of the time, described in the posts above. But I have always wanted to find ways to improve this workflow, such as actually seeing the evaluated result in an overlay right next to the code. Not too long ago, I developed support for it in Emacs and it has worked really well! You can read about it and see examples of it here.

What about AI?

While writing Python code, and doing the RDD workflow, there's often a manual step to add some test or example data to variables and function input parameters. Recently, I got the idea to automate it using an LLM. I wrote a simple code editor command that prompts an AI to generate random (but relevant) values and then populate the variables with those. Nowadays, when I want to test Python code, I can prepare it with example data in a simple way by using a key combination. And then evaluate the code as before.

Are we there yet?

One important thing with RDD, that I haven't been able to figure out until now, is how to modify and evaluate code from an actual running program. This is how things are done in Clojure: you write code and have the running app, service or software constantly being changed while you develop it. Without any restarts. It is modified while it is running. Tools like NRepl do this in the background, with a client-server kind of architecture. I haven't dug that deep into how NRepl works, but I believe it is similar to LSP (Language Server Protocol).

The workflow of changing a running program is really cool and something I've only seen before as a Clojure developer. So far I have used IPython as the underlying tool for REPL Driven Development (as described in the posts above).

A solution: the Kernel

In Python, we have something similar to NRepl: Jupyter. Many developers use Notebooks for interactive programming, and that is closely related to what I am trying to achieve. With a standard REPL session, you can add, remove and modify the Python code that lives in the session. That's great. But a session is not the same as an actual running program.

A cool thing with Jupyter is that you can start a Python Kernel that clients can connect to. A client can be a shell, a Notebook - or a Code Editor.

jupyter kernel

By running the jupyter kernel command, you have a running Python program and can interactively add things to it, such as initiating and starting up a Python backend REST service. Being able to connect to the Jupyter kernel is very useful. While connected to the kernel, you can add, remove, modify the Python code - and the kernel will keep on running. This means that you can modify and test your REST service, without any restarts. With this, we are doing truly interactive Python programming. You will get instant feedback on the code you write, by evaluating the code in your editor or when testing the endpoint from a browser or shell.
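The no-restart idea itself can be illustrated in plain Python (a contrived sketch, with a thread standing in for the running service): the function is looked up by name at call time, so redefining it changes the behavior of the already-running program.

```python
import threading

calls = []
first_call_done = threading.Event()
redefined = threading.Event()


def greet() -> str:
    return "hello"


def service() -> None:
    # a long-running "program": greet is resolved at call time
    calls.append(greet())
    first_call_done.set()
    redefined.wait()  # pause until greet has been redefined
    calls.append(greet())


worker = threading.Thread(target=service)
worker.start()
first_call_done.wait()


def greet() -> str:  # redefine while the "program" is running
    return "bonjour"


redefined.set()
worker.join()
print(calls)  # ['hello', 'bonjour'] - no restart needed
```

A connected Jupyter Kernel gives you this kind of live redefinition for free, without the event plumbing.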

"Dude, ever heard about debugging?"

Yes, of course. Debugging is closely related to this workflow, but it is not the same thing. Debugging is usually a one-way flow. A timeline: you run code, pause the execution with breakpoints where you can inspect things, and then continue until the request is finalized. The RDD workflow, with modifying and evaluating a running program, doesn't have a one-way timeline. It's timeless. And you don't add breakpoints.

REPL Driven tooling support

I have developed tooling support for this new thing with connecting to a Jupyter Kernel, and so far it works well! Python is different from languages like Clojure: namespaces are relative to where the actual module is placed in the folder structure of a repo. This means that a client connected to a Kernel needs the full namespace (i.e. the full Python path) for the items to inspect. This is what I have added in the tooling, so I can keep the RDD flow with the nice overlays and all.

I am an Emacs user, and the tooling I've written is for Emacs. But it shouldn't be too difficult to add it to your favorite code editor. Have a look at the code. You might even learn some Lisp along the way.


UPDATE: I have recorded a very much improvised video (with sound) explaining what is happening and how you start things up.

Resources

Top Photo by David Vujic

Friday, February 7, 2025

FOSDEM 25

FOSDEM, a Conference different from any other I've been to before. I'm very happy for the opportunity to talk, share knowledge and chat with the fellow devs there. I also met old and new friends in Brussels, and learned a little bit more about the nice Belgian beer culture. 😀

The FOSDEM event is free and you don't even register to attend (only speakers do that in a call for speakers process). Just come by the University Campus, located not far from the center of the Inner City.

I travelled towards Belgium with the Night Train from Stockholm, Sweden, and arrived the next morning in Hamburg, Germany. From there, I took the DB ICE Train to Köln/Cologne. At some point, the top speed was about 250 km/h. That's fast! From there, another fast train to Brussels.

The variety of topics and the amount of tracks at FOSDEM was mind blowing, with thousands of participants. The overall vibe was very friendly & laid back. I joined Rust focused talks, JavaScript and UX/Design talks. And, of course, the Python room talks.

My talk was in the Python room and was about Python Monorepos and the Polylith Developer Experience. Everything Python related was on FOSDEM Day 2 and I think my presentation went really well! There were a lot of questions afterwards and I got great feedback from the people attending.

The next day I went back the same route as before. I got a couple of extra hours to spend in Hamburg before boarding the Night Train back home to Stockholm. A great Weekend Trip!

Here's the recording of my talk from FOSDEM 25:

The video was downloaded from fosdem.org.
Licensed under the Creative Commons Attribution 2.0 Belgium Licence.


Resources