Tuesday, August 2, 2022

A simple & scalable Python project structure

File & folder structures - there's almost as many different variations as there are code repositories.

One common thing though, is that you'll probably find the utils folder in many of the code repos out there, regardless of programming language. That's the one containing the files that don't fit anywhere in the current project structure. It is also known as the helpers folder.

Organizing, sorting and structuring things is difficult. There's framework specific CLIs and tools that will create a nice setup for you, specialized for the needs of the current framework.

"There should be one-- and preferably only one --obvious way to do it."
The Zen of Python

Is there one folder structure to rule them all? Probably not, but I have tried out a way to organize code that is very simple, framework agnostic and scalable as projects grow.

Structure for simplicity

A good folder structure is one that makes it simple to reuse existing code and makes it easy to add new code. You shouldn't have to worry about these things. The thing I've tried out with success is very much inspired by the Polylith architecture. Polylith is a monorepo thing, but don't worry. This post isn't about monorepos at all, however this one is if you are interested in Python specific ones.

An entry point and a components folder. You won't need much more. Use your favorite dependencies tool, mine is currently Poetry.

It's all about the components

The main takeaway here is to view code as small, reusable components, that ideally does one thing only. A component is not the same thing as a library. So, what's the difference?

A library is a full blown feature. A component can be a single function, or a parser. It can also be a thin wrapper around a third party tool.

"Simple is better than complex."
The Zen of Python

I think the idea of writing components is about changing mindset. It is about how to approach a problem and how to organize the code that solves a problem.

It shouldn't be too difficult to grasp for Python developers, though. For us Python devs, it's an everyday thing to write functions and have them in modules. Another useful thing, probably more common in library development, is to group the modules into packages.

Modules, packages, namespaces ... and components?

In Python, a file is a module. One or more modules in a folder becomes a package. A good thing with this is that the code will be namespaced when importing it. Where does the idea of components fit in here? Well, a component is a package. Simple as that.

"Namespaces are one honking great idea -- let's do more of those!"
The Zen of Python

To make a package easier to understand, you can add an interface. Interfaces are well supported in Python. Specifying the interface of a package in an __init__.py file is a great way to make the intention of the code clearer and easier to grasp. Maybe there's only one function that makes sense to use from the "outside"? That's when to use an interface for your component.

Only the functions that makes sense should be exposed from a component.

Make code reuse easy

When organizing code into simple components, you will quickly discover how easy it is to reuse it. Code is no longer hidden in some utils folder and you no longer need to duplicate existing private helper functions (because the refactoring might break things), if they already are organized as reusable components with clear and simple APIs. I usually think of components as LEGO bricks to select from when building features. You will most likely produce new LEGO bricks of various shapes along the way.

This is code in a "dictionaries" component. The interface (previous picture) will handle the access to it.

Well suited for large apps

At work, we have a couple of Python projects using this kind of structure. One of them is a FastAPI app with an entry point (we named it app.py) containing the public endpoints. The entry point is importing a bunch of components that does the actual work.

The repo contains about 80 Python files. Most of them are grouped into components (in total about 30 components). This particular project is about 3K lines of Python code, but other repos are much smaller with only a handful of components.

Perfect for functional programming

Even though it is not a requirement, organizing code into components fits very well with functional programming. Separating code into data, calculations and actions are well suited for the component thing described here in this post.

Don't forget to keep the components simple, and try to view them as LEGO bricks to be used from anywhere in the app. You'll have fun while doing it too.

Top photo by Maureen Sgro on Unsplash

No comments: