Sunday, May 12, 2024

Pants & Polylith

But who is Luke and who is R2?
"Pants is a fast, scalable, user-friendly build system for codebases of all sizes"
"Polylith helps us build simple, maintainable, testable, and scalable backend systems"

Can we use both? I have tried that out, and here are my notes.


Because The Developer Experience.

Developer Experience is important, but what does that mean? For me, it is about keeping things simple. The ability to write, try out and reuse code without any context switching. By using one single setup for the REPL and the IDE, you will have everything at your fingertips.

The Polylith Architecture solves this by organizing code into smaller building blocks, or bricks, and separating code from the project-specific configurations. You have all the bricks and configs available in a Monorepo. For Python development, you create one single virtual environment for all your code and dependencies.

There is also tooling support for Polylith that is useful for visualizing the contents of the Monorepo, and for validating the setup. If you already are into Pantsbuild, the Polylith Architecture might be the missing Lego bricks you want to add for a great Developer Experience.

Powerful builds with Pants

Pantsbuild is a powerful build system. The pants tool resolves all the dependencies (by inspecting the source code itself), runs the tests and creates distributions in isolation. The tool also supports the common Python tasks such as linting, type checking and formatting. It also has support for creating virtual environments.

Dude, where's my virtual environment?

In the Python Community, there is a convention to name the virtual environment in a certain way, usually .venv, and to create it at the project root (this will also likely work well with the defaults of your IDE).

The virtual environment created by Pants is placed in a dist folder, and further in a Pants-specific folder structure. I found that the created virtual environment doesn't seem to include custom source paths (I guess that would be what Pants calls roots).

Custom source paths are important for an IDE to locate the Python source code. Maybe there are built-in ways in Pantsbuild to solve that already? Package management tools like Poetry, Hatch and PDM have support for configuring custom source paths in the pyproject.toml and also for creating virtual environments according to the Python Community conventions.

Note: If you are a PyCharm user, you can mark a folder as a source root manually and it will keep that information in a cache (probably a .pth file).

Example code and custom scripts

I have created an example repository, a monorepo using Pantsbuild and Polylith. You will find Python code and configurations according to the Polylith Architecture, and the Pantsbuild configurations that make it possible to use both tools. In the example repo I have added a script that adds source paths, based on the output from the pants roots command, to the virtual environment created by Pantsbuild. This is accomplished by adding a .pth file to the site-packages folder. For convenience, the script will also create a .venv symlink at the root of the repo, pointing to the virtual environment.
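Here is a minimal sketch of what a script like that could look like. It is not the script from the example repo: the function names add_source_roots and link_venv, and the file name pants_roots.pth, are made up for illustration. It assumes that the list of roots has already been collected (for instance from the output of the pants roots command) and that the location of the virtual environment is known.

```python
from pathlib import Path


def add_source_roots(site_packages: Path, repo_root: Path, roots: list[str]) -> Path:
    """Write a .pth file with the absolute path of each source root.

    Python's site module reads *.pth files in site-packages at startup
    and adds each listed directory that exists to sys.path.
    """
    lines = [str((repo_root / root).resolve()) for root in roots if root.strip()]
    pth_file = site_packages / "pants_roots.pth"  # hypothetical file name
    pth_file.write_text("\n".join(lines) + "\n")
    return pth_file


def link_venv(venv_dir: Path, repo_root: Path) -> Path:
    """Create a .venv symlink at the repo root, pointing at the existing venv."""
    link = repo_root / ".venv"
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier run
    link.symlink_to(venv_dir, target_is_directory=True)
    return link
```

The .pth approach keeps the virtual environment itself untouched: removing the file is enough to undo the change.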

Having the virtual environment properly set up, you can use the REPL (my favorite is IPython) with full access to the entire code base:

source .venv/bin/activate

With an activated virtual environment, you can also use all of the Polylith commands:

poly create
poly info
poly libs
poly deps
poly diff
poly check
poly sync

Pants & Polylith

Pantsbuild has a different take on building and packaging artifacts compared to other tools I've used. It has support for several languages and setups. Some features overlap with what's available in the well-known tooling in the Python community, such as Poetry. Some parts diverge from the common conventions.

Polylith has a different take on sharing code, and also has some overlapping features. Polylith is a Monorepo Architecture, with tooling support for visualizing the Monorepo. From what I've learned so far, the checks and code inspection features are the things you will find in both Pants and Polylith.

Pants operates on the file level. Polylith on the brick level.

My gut feeling, after learning about it and experimenting, is that Pantsbuild and Polylith share the same basic vision of software development in general, and I have found them working really well together. There are some things I would have liked to be a better fit, such as when choosing between the contents of the Pants-specific BUILD files and the contents of the project-specific pyproject.toml files.

Maybe I should develop a Pants Polylith plugin to fix that part. 🤔
How does that sound to you?


Top Photo by Studbee on Unsplash

Saturday, April 13, 2024

Write Less Code, You Must

An aspect of Python Software Development that is often overlooked is Architecture (or Design) at the namespace, modules & functions level. My view on Software Development in general is that it is important to try hard to write code that is Simple, and Easy to move from one place to another.

When having code written like this, it becomes less important if a feature was added in Service X, but a better fit would be Service Y when looking at it from a high-level Architectural perspective. All you need to do is move the code to the proper place, and you're all good. However, this will require that the actual code is moveable: i.e. having the features logically separated into functions, modules and namespace packages.

Less Problems

There are a lot of different opinions about this, naturally. I've seen it in several public Python forums, and been surprised by the reactions to Python code with (too) few lines in it. How is it even possible to have too little code?

My take on this in general is Less code is Less Problems.

An example

def my_something_function():
    # Validation
    # if valid
    # else do something
    ...  # python code here

    # Checking

    # if this
    # elif that
    # elif not this or not that
    # else do_something
    ...  # python code here

    # Data transformation

    # for each thing in the things
    #    do a network call and append to a list
    the_result = ...  # python code here

    # Yay, done
    return the_result

This type of function - when all of those things are processed within the function body - is not very testable. A unit test would likely need a bunch of mocking, patching and additional boilerplate test data code. Especially when there are network calls involved.

My approach to refactoring the code above would be to first identify the different tasks within this controller type of function, and begin by extracting each task into a separate function. Ideally these would be pure functions, accepting input and returning output.

At first, I would put the functions within the same module, close at hand. Quite quickly, the original function has become a whole lot more testable, because the extracted functions can now easily be patched (my preference is using pytest monkeypatch). This approach would be my interpretation of developing software towards a clean code ideal. There is no need for a Dependency Injection framework or any unnecessarily complex OOP-style hierarchy to accomplish it.
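A minimal sketch of where such a refactoring could end up, assuming a made-up payload with a "things" key. The function names is_valid, fetch_details and transform are hypothetical, and the network call is only a placeholder. The test at the end shows the pytest monkeypatch fixture replacing the network function, with no mocking framework involved.

```python
import sys


def is_valid(payload: dict) -> bool:
    """Validation: a pure function, input in and output out."""
    return bool(payload.get("things"))


def fetch_details(thing: str) -> dict:
    """The network call would live here (hypothetical placeholder)."""
    raise NotImplementedError("replace with a real network call")


def transform(things: list[str]) -> list[dict]:
    """Data transformation: one lookup per thing."""
    return [fetch_details(thing) for thing in things]


def my_something_function(payload: dict) -> list[dict]:
    """The controller: it only wires the extracted functions together."""
    if not is_valid(payload):
        return []
    return transform(payload["things"])


def test_my_something_function(monkeypatch):
    # Patch the network function in this module; pytest restores it afterwards.
    module = sys.modules[__name__]
    monkeypatch.setattr(module, "fetch_details", lambda thing: {"name": thing})

    result = my_something_function({"things": ["a", "b"]})

    assert result == [{"name": "a"}, {"name": "b"}]
```

Each extracted function can also be called on its own in the REPL, which is where the fast feedback comes from.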

In addition to testability, the Python code becomes runnable and REPL-friendly. You can now refactor, develop and test-run the individual functions in the REPL. This is a very fast workflow for a developer. Read more about REPL Driven Development in Python here.

With the features living in separate isolated functions, you will likely begin to identify patterns:

"- Hey, this part does this specific thing & could be put in that namespace"

When moving code into a namespace package, the functions become reusable. Other parts of the application - or, if you have a Monorepo containing several services - can now use one and the same source code. The same rows of code, located in a single place of the repo. You will likely structure the repo with many namespace packages, each one containing one or a couple of modules with functions that ideally do one thing. It kind of sounds like the Unix philosophy, doesn't it?

This is how I try to write code on a daily basis, at work and when developing Open Source things. I use tools like SonarCloud and CodeScene to help me keep going in this direction. I've written about that before. The Open source code that I focus on these days (Polylith) has 0% Code Duplications, 0% Code Smells and about a 9.96 long-term Quality Code Scoring. The 0.04 that is left has been an active decision by me and is because of endpoints having 5+ input arguments. It makes sense for me to keep it like that there, but not in functions within the app itself where an options object is a better choice.
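As an illustration of the options object idea, here is a minimal sketch using a dataclass. The names ReportOptions and build_report are made up for this example and are not taken from the Polylith code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportOptions:
    """Hypothetical options object, grouping what would otherwise be 5 arguments."""

    title: str
    author: str
    include_summary: bool = True
    max_rows: int = 100
    separator: str = "\n"


def build_report(rows: list[str], options: ReportOptions) -> str:
    """Take one options argument instead of a long list of parameters."""
    header = f"{options.title} by {options.author}"
    body = rows[: options.max_rows]
    summary = [f"{len(body)} rows"] if options.include_summary else []
    return options.separator.join([header, *body, *summary])
```

Adding a new option later means adding a field with a default value, and the call sites stay the same.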

This aspect of Software Development is, from my point of view, very important. Even more important than the common Microservices/Events/REST/CQRS debates when Architecture is the topic of discussion. These were my Saturday afternoon reflections, and I thank you for reading this post. ☀️

Top Photo by Remy Gieling on Unsplash

Sunday, February 18, 2024

Python Monorepo Visualization

What's in a code repository? Usually you'll find the source code, some configuration and the deployment infrastructure - basically the things needed to develop & deploy something. It might be a service, an app or a library. A Monorepo contains the same things, but for more than one artifact. In there, you will find the code, configurations and infrastructure for several services, apps or libraries.

The main use case for a Monorepo is to share code and configuration between the artifacts (let's call them projects).

These things have to be simple

Sharing code can be difficult. Repos can be out of date. A Monorepo can be overwhelming. With or without a Monorepo, the most common way of sharing code is to package it as libraries that the projects can add as external dependencies. But managing different versions and keeping the projects up-to-date could lead to unexpected and unwanted extra work. Some Monorepos solve this by using symlinks to share code, or custom scripts for copying things into the individual projects during deployment.

Doing that can be messy, I've seen it myself. I was once part of a team that migrated away from a horrible Monorepo, into several smaller single-repo microservices. The tradeoffs: source code spread out in repos with an almost identical structure. Almost is the key here. Also, code and config duplications.

These tradeoffs have a negative impact on the Developer Experience.

The Polylith Architecture has a different take on organizing and sharing code, with a nice developer experience. These things have to be simple. Writing code should be fun. Polylith is Open Source, by the way.

The most basic type of visualization

In a Polylith workspace, the source code lives in two folders named bases and components. The entry points are put in the bases folder, all other code in the components folder. At first look, this might seem very different from a mainstream Python single-project structure. But it isn't really that different. Many Python projects are using a src layout, or have a root folder with the same name as the app itself. At the top, there's probably an entry point module. In Polylith, that one would be put in the bases folder. The rest of the code would be placed in the components folder.


You are encouraged to keep the components folder simple, and to put logically grouped modules (i.e. namespace packages) in separate components rather than in nested structures. This will make code sharing more straightforward than having a folder structure with packages and sub-packages. There is also less risk of code duplication with this kind of structure, because the code isn't hidden in a complex folder structure. As a side effect, you will have a nice overview of the available features: the names of the folders will tell what they do and what's available for reuse. A folder view like this is surprisingly useful.

Visualize with the poly tool

Yes, looking at a folder structure is useful, but you would need to navigate the actual source code to figure out where it is used and which dependencies are used in there. Along with the Polylith Architecture there is tooling support. For Python, you can use the tooling together with Poetry, Hatch, PDM or Rye.

The poly info command, an overview of code and projects in the Monorepo.

Besides providing commands for creating bases, components and projects there are useful visualization features. The most straightforward visualization is probably poly info. Here, you will get an overview of all the bricks (the logically grouped Python modules, living in the bases and components folders), the different projects in the Workspace and also in which projects the bricks are added.

Third-party libraries & usages

There's a command called poly libs that will display the third-party dependencies that are used in the Workspace (yes, that's what the contents of the Monorepo are called in Polylith). It will display libraries and the usages on a brick level. In Polylith, a brick is the thing that you share across projects. Bricks are the building blocks of this architecture.

The poly libs command, displaying the third-party dependencies and where they are used.

The building blocks and how they depend on each other

A new thing in the Python tooling is the command called poly deps. It displays the bricks and how they depend on each other. You can choose to display an overview of the entire Workspace, or for an individual project. This kind of view can be helpful when reasoning about code and how to combine bricks into features. Or inspire a team to simplify things and refactor: should we extract code from this brick into a new one here maybe?

A closer look at the bricks used in a project with poly deps.

You can inspect a single brick to visualize the dependencies: where it is used, and what other bricks it uses.

A zoomed-in view, to inspect the usages of a specific brick.
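To get a feel for what a view like this is built from, here is a minimal sketch that finds the bricks a piece of source code imports, using the ast module from the standard library. It is not how the poly tool is implemented: the function name imported_bricks and the top namespace "example" are assumptions made for illustration only.

```python
import ast


def imported_bricks(source: str, namespace: str, bricks: set[str]) -> set[str]:
    """Return the bricks that the given source code imports from the namespace."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import ns.brick
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # from ns import brick, and from ns.brick import thing
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if len(parts) > 1 and parts[0] == namespace and parts[1] in bricks:
                found.add(parts[1])
    return found


source = "from example.log import get_logger\nimport example.kafka\nimport os\n"
print(sorted(imported_bricks(source, "example", {"log", "kafka", "greeting"})))
# prints ['kafka', 'log']
```

Running a function like this over every module in a brick, and collecting the results per brick, would give the raw data for a brick-to-brick table.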

Export the visualizations

The output from these commands is very easy to copy-and-paste into Documentation, a Pull Request or even Slack messages.

poly deps | pbcopy

📚 Docs, examples and videos

Have a look at the Polylith documentation for more information about getting started. You will also find examples, articles and videos there for a quick start.

Top image made with AI (DALL-E) and manual additions by a Human (me)

Thursday, January 25, 2024

Simple & Developer-friendly Python Monorepos

🎉 Announcing new features 🎉

Polylith is about keeping Monorepos Simple & Developer-friendly. Today, the Python tools for the Polylith Architecture have support for Poetry, Hatch and PDM - three popular Packaging & Dependency Management tools in the Python community.

In addition to the already existing Poetry plugin that adds tooling support for Polylith, there is also a brand new command line tool available. The CLI has support for both Hatch and PDM. You can also use it for Poetry setups (but the recommended way is to use the dedicated Poetry plugin as before).

To summarize: it is now possible to use the Simple & Developer-friendly Monorepo Architecture of Polylith with many different kinds of Python projects out there.

"Hatch is a modern, extensible Python project manager."
From the Hatch website

🐍 Hatch

To make the tooling fully support Hatch, there is a Hatch build hook plugin to use - hatch-polylith-bricks - that will make Hatch aware of a Polylith Workspace. Hatch has a nice and well thought-through Plugin system. Just add the hook to the build configuration. Nice and simple! The Polylith tool will add the config for you when creating new projects:

requires = ["hatchling", "hatch-polylith-bricks"]
build-backend = ""

"PDM, as described, is a modern Python package and dependency manager supporting the latest PEP standards. But it is more than a package manager. It boosts your development workflow in various aspects."
From the PDM website

🐍 PDM

Just as with Hatch, there are build hooks available to make PDM aware of the Polylith Workspace. Writing hooks for PDM was really simple and I really like the way it is developed. Great job, PDM developers! There is a workspace build hook - pdm-polylith-workspace - and a projects build hook - pdm-polylith-bricks - to make PDM and the Polylith tooling work well together.

This is added to the build-system section of the workspace pyproject.toml:

requires = ["pdm-backend", "pdm-polylith-workspace"]
build-backend = "pdm.backend"

And the plugin for projects.
This will be added by the poly create project command for you.

requires = ["pdm-backend", "pdm-polylith-bricks"]
build-backend = "pdm.backend"

"Python packaging and dependency management made easy."
From the Poetry website

🐍 Poetry

For Poetry, just as before, add or update these two plugins and you're ready to go!

poetry self add poetry-multiproject-plugin
poetry self add poetry-polylith-plugin

📚 Docs, examples and videos

Have a look at the Polylith documentation for more information about getting started. You will also find examples, articles and videos there for a quick start. I'm really excited about the new capabilities of the tooling and hope it will be useful for Teams in the Python Community!

Top photo made with AI (DALL-E) and manual additions by a Human (me)

Tuesday, December 19, 2023

The Lost Balkan Tapes: a Christmas story

A couple of days ago, a nice thing happened that has made me very happy. It happened almost by accident. At work, we have a #music channel on Slack and some of us frequently share music that we like and recommend to the folks in the company. I really like these work-but-not-related-to-the-actual-work kinds of things. They help develop a friendly organizational culture, and they are also a simple way to get to know each other better.


I was born and raised in the Stockholm area of Sweden, but my dad and his parents (my grandparents) came to Sweden in the late 1960s from former Yugoslavia. They, like many from the Balkans, Finland, Italy and Greece, got job offers from Swedish companies and they decided to give it a try. They began working with manufacturing chocolate here in Stockholm. I don't know that much about their life over there before Sweden, other than the stories I've heard many times when growing up.

One of my favorite stories is about my Grandfather and when he escaped from some kind of prison camp, set up by the World War II Occupation of former Yugoslavia. He was only a teenager back then, but managed to run away from the guards into a forest - and then catch a passing train, on its way away from the camp. In my imagination, the train was moving fast and he jumped on it in the same way Indiana Jones would do. I remember him telling me and my brother about going undercover, calling himself Ivan Something-something when the train conductor asked who that kid from nowhere was. I loved that story and wanted to hear it over and over again.

Another favorite story was about my Grandmother. She used to be a singer in the 1950s and early 1960s, and performed all over the country. I remember the day she told me about the famous people she used to sing for, such as the Ethiopian Emperor Haile Selassie! When I grew up, Reggae and the Rastafari culture was a big thing among us kids in the suburbs, and I certainly knew about Haile Selassie. My grandma had not only met him, but had also sung for him! Wow. As I understood it, she was popular and during a period of time she was often hired to sing when the officials of the Country expected a visit from abroad.

I only heard a few songs on cassette when we hung out in the apartment in Fisksätra (a concrete suburb in the Stockholm area where I have spent a lot of time). I remember the music sounding quite good. But honestly, as a kid and later a young adult, I wasn't really that much of a fan of Balkan Folk Music. Still very cool to hear her sing. She occasionally sang some songs for us too, but time & smoking cigarettes had made her voice different from the recordings on those cassette tapes.

Years later, with both grandma and grandpa no longer around, I thought that the tapes were lost & gone forever. If I recall it correctly, some of it was even accidentally overwritten. Oh no. I made some attempts to find information (and possibly music) online, but failed. I gave up hope of finding anything. This was many years ago.


Back to work. I decided to share some nice Reggae music in the Slack channel. I found the Gratitude Riddim when browsing my online music app, good stuff! The friends at work got inspired and also shared some Jamaican vibes and we had a fun conversation going on there. Then, one person added a picture of Haile Selassie that took me right back to those childhood memories. So, naturally, I told them the Story about my Grandmother and that she had sung for him. But sadly the music is gone, you know. My friend probably got curious, and I believe he just googled her name.

"Wow, very cool to hear about your grandma. Is this her?"
(with a link containing music)


What? Huh? No, wait. That can't be her. Or. Is it? Naw. Gotta be someone else. Then again, how many Folk Music singers from the Balkans named Ikonija Vujic can there be? After a while, I realized that it is in fact my Grandmother! He actually found about 15 songs. All of them beautiful & melancholic Love Songs. Old school Balkan Folk Music with the Tamburitza instrument. Is it the great Janika Balaž playing? This is just me guessing, but according to Wikipedia he lived in the exact same area where my grandparents (and my dad) used to hang out back in the days. They must have met sometime!

I can't think of a better Christmas gift than this and I am forever grateful for these findings. It happened almost randomly, by accident. If I hadn't shared those Reggae songs in Slack, I would still be thinking that the music of my Grandmother was gone. But the Lost Balkan Tapes of Ikonija Vujic have been found again. Thanks to my friend at work and the enthusiast who has published a huge amount of Balkan Music from the past. Thank you! 🙏