Saturday, April 13, 2024

Write Less Code, You Must

An aspect of Python Software Development that is often overlooked, is Architecture (or Design) at the namespace, modules & functions level. My thoughts on Software Development in general is that it is important to try hard writing code that is Simple, and Easy to move from one place to another.

When having code written like this, it becomes less important if a feature was added in Service X, but a better fit would be Service Y when looking at it from a high-level Architectural perspective. All you need to do is move the code to the proper place, and you're all good. However, this will require that the actual code is moveable: i.e. having the features logically separated into functions, modules and namespace packages.

Less Problems

There's a lot of different opinions about this, naturally. I've seen it in in several public Python forums, and been surprised about the reactions about Python with (too) few lines of code in it. How is it even possible having too little of code?

My take on this in general is Less code is Less Problems.

An example

def my_something_function():
    # Validation
    
    # if valid 
    # else do something
    ... python code here

    # Checking

    # if this 
    # elif that
    # elif not this or not that
    # else do_something
    ... python code here

    # Data transformation

    # for each thing in the things
    #    do a network call and append to a list
    ... python code here

    # Yay, done
    return the_result

This type of function - when all of those things are processed within the function body - is not very testable. A unit test would likely need a bunch of mocking, patching and additional boilerplate test data code. Especially when there are network calls involved.

My approach on refactoring the code above would be to first identify the different tasks within this controller type of function, and begin by extracting each task into separate functions. Ideally these would be pure functions, accepting input and returning output.

At first, I would put the functions within the same module, close to at hand. Quite quickly, the original function has become a whole lot more testable, because the extracted functions can now easily be patched (my preference is using pytest monkeypatch). This approach would be my interpretation of developing software towards a clean code ideal. There is no need for a Dependency Injection framework or any unnecessary complex OOP-style hierarchy to accomplish it.

In addition to testability, the Python code becomes runnable and REPL-friendly. You can now refactor, develop and test-run the individual functions in the REPL. This is a very fast workflow for a developer. Read more about REPL Driven Development in Python here.

With the features living in separate isolated functions, you will likely begin to identify patterns:

"- Hey, this part does this specific thing & could be put in that namespace"

When moving code into a namespace package, the functions become reusable. Other parts of the application - or, if you have a Monorepo containing several services - can now use one and the same source code. The same rows of code, located in a single place of the repo. You will likely structure the repo with many namespace packages, each one containing one or a couple of modules with functions that ideally do one thing. It kind of sounds like the Unix philosophy, doesn't it?

This is how I try to write code on a daily basis, at work and when developing Open Source things. I use tools like SonarCloud and CodeScene to help me keep going in this direction. I've written about that before. The Open source code that I focus on these days (Polylith) has 0% Code Duplications, 0% Code Smells and about a 9.96 long-term Quality Code Scoring. The 0.04 that is left has been an active decision by me and is because of endpoints having 5+ input arguments. It makes sense for me to keep it like that there, but not in functions within the app itself where an options object is a better choice.

This aspect of Software Development is, from my point of view, very important. Even more important than the common Microservices/Events/REST/CQRS debates when Architecture is the topic of discussion. This was my Saturday afternoon reflections, and I thank you for reading this post. ☀️

Top Photo by Remy Gieling on Unsplash

3 comments:

laurent said...

Thanks to you, I gave Python Polylith a try several months ago, and I have never regretted it since then. So much easier to manage a unique repo, avoid code duplication, and test things.
Two things that are not clear to me, yet:
- how to manage config files (json): I find myself creating a new component for each project, but it does not seem right;
- how to make things work with pyinstaller (things that previously worked with a src setup are broken, right now).
These two topics would be a great addition to the otherwise excellent documentation for newbies like me.
Thanks again!

David Vujic said...

I'm happy that you like Polylith! About JSON config files, I think you could add them as resources/extra files in the project-specific pyproject.toml file, and the file itself can be stored where you like (it doesn't have to be a component). Pyinstaller, can you share some more information about that?

How about continuing the conversation in the Discussions section of the Polylith repo? You will find it here: https://github.com/DavidVujic/python-polylith/discussions

laurent said...

Great, thanks for your advice, will do!