Monday, August 22, 2022

Joyful Python with the REPL

REPL Driven Development is a workflow that makes coding both joyful and interactive. The feedback loop from the REPL is a great thing to have at your fingertips.

"If you can improve just one thing in your software development, make it getting faster feedback."
Dave Farley

Just like Test Driven Development (TDD), it will help you write testable code. I have also noticed a nice side effect from this workflow: REPL Driven Development encourages a functional programming style.

REPL Driven Development is an everyday thing among Clojure developers and doable in Python, but far less known here. I'm working on making it an everyday thing in Python development too.

But what is REPL Driven Development?

What is it?

You evaluate variables, code blocks, functions - or an entire module - and get instant feedback, just by hitting a key combination in your favorite code editor. There's no reason to leave the IDE for a less featured shell to accomplish all of that. You already have autocomplete, syntax highlighting and the color theme set up in your editor. Why not use that, instead of a shell?

Evaluate code and get feedback, without leaving the code editor.

Ideally, the result of an evaluation pops up right next to the cursor, so you don't have to do any context switches or lose focus. It can also be printed out in a separate frame right next to the code. This means that testing the code you currently write is at your fingertips.

Easy setup

With some help from IPython, it is possible to write, modify & evaluate Python code in a REPL Driven way. I would recommend installing IPython globally, to make it accessible from anywhere on your machine.

pip install ipython

Configure IPython to make it ready for REPL Driven Development:

c.InteractiveShellApp.exec_lines = ["%autoreload 2"]
c.InteractiveShellApp.extensions = ["autoreload"]
c.TerminalInteractiveShell.confirm_exit = False

You will probably find the configuration file here: ~/.ipython/profile_default/ipython_config.py

You are almost all set.

Emacs setup

Emacs is my favorite editor. I'm using a couple of Python specific packages that make life as a Python developer better in general, such as elpy. The auto-virtualenv package will also help make REPL Driven Development easier. It finds local virtual environments automatically, so you can start coding without any python-path quirks.

Most importantly, set IPython as the default shell in Emacs. Have a look at my Emacs setup for the details.

VS Code setup

I am not a VS Code user. But I wanted to learn how well supported REPL Driven Development is in VS Code, so I added these extensions:

You would probably want to add keyboard shortcuts to get the true interactive feel of it. Here, I'm just trying things out by selecting code, right clicking and running it in an interactive window. It seems to work pretty well! I haven't yet figured out if the interactive window picks up the global IPython config, or if it already reloads a submodule when it is updated.

Evaluating code in the editor with fast feedback loops.
It would be great to have keyboard commands here, though.
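If I understand the Python extension for VS Code correctly, you can also mark code cells with # %% comments and run them one at a time in the interactive window. A tiny sketch, with a made-up function:

# %%
def add_vat(price: float) -> float:
    return price * 1.25

# %%
add_vat(100)  # run just this cell to see the result: 125.0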

Current limitations

In Clojure, you connect to & modify an actual running program by re-evaluating the source code. That is a wonderful thing for the developer experience in general. I haven't been able to do that with Python, and I believe Python would need something equivalent to nREPL to get that kind of magic power.

Better than TDD

I practice REPL Driven Development in my daily Python work. For me, it has become a way to quickly verify that the code I'm currently writing works as expected. I usually think of this REPL driven thing as Test Driven Development Deluxe. Besides just evaluating the code, I often write short-lived code snippets to test out some functionality. By doing that, I can write code and test it interactively. Sometimes, these code snippets are converted into permanent unit tests.
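As an example of what such a short-lived snippet could look like (the component and function names are made up): I evaluate something like this in the REPL while developing, and later promote it to a test file if it earns its keep.

from orders import parse_order  # a made-up component

# a quick, short-lived check evaluated in the REPL while coding
parse_order({"id": 123, "items": ["apple", "banana"]})

# ...sometimes promoted to a permanent unit test later on
def test_parse_order_keeps_the_id():
    order = parse_order({"id": 123, "items": ["apple", "banana"]})
    assert order["id"] == 123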

For a live demo, have a look at my five minute lightning talk from PyCon Sweden about REPL Driven Development in Python.

Never too late to learn

I remember it took me almost a year of learning & developing Clojure before I actually "got it". Before that, I sometimes copied some code, pasted it into a REPL and ran it. But that didn't give me a nice developer experience at all. Copy-pasting code is cumbersome and will often fail because of missing variables, functions or imports. Don't do that.

I remember the feeling when I figured out the REPL Driven Development workflow: I finally understood how software development should be done. It took me about 20 years to get there. It is never too late to learn new things. 😁



Top photo by ckturistando on Unsplash

Tuesday, August 2, 2022

A simple & scalable Python project structure

File & folder structures - there are almost as many variations as there are code repositories.

One common thing, though, is that you'll probably find the utils folder in many of the code repos out there, regardless of programming language. That's the one containing the files that don't fit anywhere else in the current project structure. It is also known as the helpers folder.

Organizing, sorting and structuring things is difficult. There are framework-specific CLIs and tools that will create a nice setup for you, specialized for the needs of that particular framework.

"There should be one-- and preferably only one --obvious way to do it."
The Zen of Python

Is there one folder structure to rule them all? Probably not, but I have tried out a way to organize code that is very simple, framework agnostic and scalable as projects grow.

Structure for simplicity

A good folder structure is one that makes it simple to reuse existing code and easy to add new code. You shouldn't have to worry about these things. The thing I've tried out with success is very much inspired by the Polylith architecture. Polylith is a monorepo thing, but don't worry. This post isn't about monorepos at all (this one is, though, if you are interested in Python specific ones).

An entry point and a components folder. You won't need much more. Use your favorite dependency tool; mine is currently Poetry.
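A rough sketch of how the entry point might use the components. The component names are made up, and I'm assuming a components folder that is importable from the project root:

# app.py - the entry point
from components.parsing import parse_rows      # made-up component names
from components.reporting import build_report

def run(raw_rows: list[dict]) -> dict:
    rows = parse_rows(raw_rows)
    return build_report(rows)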

It's all about the components

The main takeaway here is to view code as small, reusable components that ideally do one thing only. A component is not the same thing as a library. So, what's the difference?

A library is a full-blown feature. A component can be a single function, or a parser. It can also be a thin wrapper around a third party tool.

"Simple is better than complex."
The Zen of Python

I think the idea of writing components is about changing mindset. It is about how to approach a problem and how to organize the code that solves a problem.

It shouldn't be too difficult to grasp for Python developers, though. For us Python devs, it's an everyday thing to write functions and have them in modules. Another useful thing, probably more common in library development, is to group the modules into packages.

Modules, packages, namespaces ... and components?

In Python, a file is a module. One or more modules in a folder becomes a package. A good thing with this is that the code will be namespaced when importing it. Where does the idea of components fit in here? Well, a component is a package. Simple as that.

"Namespaces are one honking great idea -- let's do more of those!"
The Zen of Python

To make a package easier to understand, you can add an interface. Interfaces are well supported in Python. Specifying the interface of a package in an __init__.py file is a great way to make the intention of the code clearer and easier to grasp. Maybe there's only one function that makes sense to use from the "outside"? That's when to use an interface for your component.

Only the functions that make sense should be exposed from a component.
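A minimal sketch of such an interface, assuming the made-up parsing component from before has a single function worth exposing:

# components/parsing/__init__.py - the interface of the component
from components.parsing.core import parse_rows

__all__ = ["parse_rows"]

# elsewhere in the app, only the interface is used:
# from components.parsing import parse_rows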

Make code reuse easy

When organizing code into simple components, you will quickly discover how easy it is to reuse. Code is no longer hidden in some utils folder, and you no longer need to duplicate existing private helper functions (because refactoring them might break things) when they are already organized as reusable components with clear and simple APIs. I usually think of components as LEGO bricks to select from when building features. You will most likely produce new LEGO bricks of various shapes along the way.

This is code in a "dictionaries" component. The interface (previous picture) will handle the access to it.

Well suited for large apps

At work, we have a couple of Python projects using this kind of structure. One of them is a FastAPI app with an entry point (we named it app.py) containing the public endpoints. The entry point imports a bunch of components that do the actual work.
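A sketch of what such an entry point could look like - the endpoint and component names here are invented, not the actual code from work:

# app.py - the entry point, containing the public endpoints
from fastapi import FastAPI

from components.reports import generate_report  # a made-up component

app = FastAPI()

@app.get("/reports/{report_id}")
def get_report(report_id: int) -> dict:
    # the component does the actual work
    return generate_report(report_id)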

The repo contains about 80 Python files. Most of them are grouped into components (in total about 30 components). This particular project is about 3K lines of Python code, but other repos are much smaller with only a handful of components.

Perfect for functional programming

Even though it is not a requirement, organizing code into components fits very well with functional programming. Separating code into data, calculations and actions is well suited to the component approach described in this post.

Don't forget to keep the components simple, and try to view them as LEGO bricks to be used from anywhere in the app. You'll have fun while doing it too.



Top photo by Maureen Sgro on Unsplash

Saturday, July 9, 2022

Just use Dictionaries

A Python dictionary has a simple & well-known API. It is possible to merge data using a nice & minimalistic syntax, without mutating or worrying about state. You're probably not gonna need classes.
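A small sketch of the kind of merge syntax I have in mind:

defaults = {"currency": "SEK", "language": "sv"}
overrides = {"language": "en"}

# merging creates a new dict, without mutating the originals
settings = {**defaults, **overrides}
# or, in Python 3.9+:
settings = defaults | overrides

print(settings)  # {'currency': 'SEK', 'language': 'en'}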

Hey, what's wrong with classes? 🤔

From what I've seen in Python, classes often add unnecessary complexity to code. Remember, the Python language is all about keeping it simple.

My impression is that in general, class and instance-based code feels like the proper way of coding: encapsulating data, inheriting features, exposing public methods & writing smart objects. The result is very often a lot of code, weird APIs (each class with its own) and not-smart-enough objects. That kind of code quickly tends to become an obstacle. I guess that's when workarounds & hacks usually are added to the app.

Two ways of solving a problem: class-based vs data-oriented.
Less code, less problems.
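Roughly the kind of comparison I have in mind - a made-up example, not the actual code from the image:

# class-based: state, mutation and a custom API to learn
class Cart:
    def __init__(self):
        self.items = []

    def add(self, item: dict) -> None:
        self.items.append(item)

    def total(self) -> int:
        return sum(item["price"] for item in self.items)

# data-oriented: plain data and a plain function
def total(items: list[dict]) -> int:
    return sum(item["price"] for item in items)

total([{"price": 10}, {"price": 20}])  # 30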

What about Dataclasses?

Python dataclasses might be a good tradeoff between a heavy object with methods and the simple dictionary. You get type hints and autocomplete. You can also create immutable dataclasses, and that's great! But you might miss the flexibility: the simplicity of merging, picking or omitting data from a dictionary. Letting data flow smoothly through your app.
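A frozen dataclass gives you immutability, but merging or updating data takes a bit more ceremony than with a plain dict. A small sketch:

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Settings:
    currency: str
    language: str

settings = Settings(currency="SEK", language="sv")
updated = replace(settings, language="en")  # a new, immutable instance

# the dict equivalent is a one-liner, without any class definition:
# updated = {**settings_dict, "language": "en"}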

Hey, what about Pydantic?

That's a really good choice for things like defining FastAPI endpoints. You'll get the typed data as OpenAPI docs for free.

I would convert the Pydantic model to a dictionary as early as possible (using the model.dict() function), or just pick the individual keys and pass those on to the rest of the app. By doing that, the rest of the app doesn't have to be aware of a domain-specific type or some base class, created as a workaround for the new set of problems introduced with custom types.
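Something like this is what I mean - a sketch with an invented Pydantic model and a made-up handler function standing in for the rest of the app:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Order(BaseModel):
    id: int
    items: list[str]

def handle_order(order: dict) -> dict:
    # made-up function, standing in for the components doing the real work
    return {**order, "status": "created"}

@app.post("/orders")
def create_order(order: Order) -> dict:
    data = order.dict()        # convert to a plain dict as early as possible
    return handle_order(data)  # the rest of the app only sees plain data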

Just data. Keeping it simple.

What about the basic types? 🤔

That is certainly a tradeoff when using dictionaries: the data can be of any type, and you will potentially get runtime errors. On the other hand, is that a real problem when using basic data types like dict, bool, str or int? I can't remember it ever being an issue.

But shouldn't data be private?

Classes are often used to separate public and private functionality. From my experience, explicitly making data and functionality private rarely adds value to a product. I think Python agrees with me about this. By default, everything in a Python module is public. I remember learning about this, and the authors saying that's okay because we're all adults here. I very much liked that perspective!

Do you like Interfaces? 🤩

Yes! Especially when structuring code in modules and packages (more about that in the next section). Using __init__.py is a great way to make the intention of a small package clearer and easier to grasp. Maybe there's only one function that makes sense to use from the outside? That's where the package interface (aka the __init__.py file) feature fits in well.

Python files, modules, packages?

In Python, a file is a module. One or more modules in a folder is a package. One or more packages can be combined into an app. Using a package interface makes sense when structuring code in this way.

Keeping it simple. 😊

I'm finishing off this post with a quote from the past:

“Data is formless, shapeless, like water.
If you put data in a Clojure map, it becomes the map.
You put data in a Python list and it becomes the list.
Now data in a program can flow or it can crash.

Be data, my friend.”

Bruce Lee, 1940 - 1973

ps.

If neither Bruce nor I convinced you about the greatness of simple data structures, maybe Rich Hickey will? Don't miss his "just use maps" talk!

ds.



Top photo by Syd Wachs on Unsplash

Saturday, May 28, 2022

Hi podman, Bye docker

A 🌧️ Saturday.

This feels like a great day to uninstall Docker Desktop on my Mac, and try to go all-in Podman.

What is Podman?

" ... a daemonless container engine for developing, managing, and running OCI Containers ... "

Podman is an open source project and you can run containers as root, or as a non-privileged user. It supports the Docker API and can be used with tools like docker-compose.

What's wrong with docker?

Nothing, docker is a great tool! It is free to use for open source. The Desktop app for Mac and Windows will continue to be free for small businesses, but requires a license for bigger companies. For some time now, I have wanted to find out what the alternatives to docker are.

Uninstalling docker

I ran docker system prune to delete all existing containers and images.

Then I uninstalled the desktop client. It required some research (aka googling) to figure out how and where to find an uninstall option: by selecting the "Troubleshoot" menu option, I found an uninstall button and pressed it.

As you can see in the video, the docker app is still there even though the Desktop client was uninstalled. I just deleted that one, and then deleted the docker-related files and folders that were still in my file system:

Installing podman

I followed the official install guide to install podman. In addition to that, I also installed the macOS-specific helper tool to be able to run docker-compose. I learned about it from the print-out displayed when the brew install command completed.

I can now run containers with podman! Here I'm starting a ZooKeeper server running in a container:

podman run --rm -p 2181:2181 zookeeper

Where are my commands?

But wait, my 🧠 is already wired to use the docker command. Do I have to unlearn things? Podman is compatible with the docker commands already, but you can also create an alias:

alias docker=podman

It is now possible to use the docker command as before!

Installing docker-compose

By installing the podman-mac-helper tool as described earlier in this post, it is possible to use podman with docker-compose.

You might wonder: didn't I uninstall all docker related things? 🤔
Yes, but the docker-compose tool is available to install separately & you'll find it on Homebrew. 🍻

brew install docker-compose

Quirks & workarounds

When trying all of this out, the podman command worked well for building and running containers. However, I got an error when running the docker-compose up command. That's kind of a dealbreaker, isn't it?

The error: listing workers for Build: failed to list workers: Unavailable: connection error: desc = "transport: Error while dialing unable to upgrade to h2c, received 404"

As I understand it (from this conversation on GitHub), something isn't working as expected when using the latest version(s) of docker-compose and podman.

Luckily, there is a simple workaround to use until the bug is fixed: prefix the docker-compose command so that the BuildKit API causing the bug is disabled. Doing this makes things work very well.


That's it! 👋



Top photo by frank mckenna on Unsplash

Sunday, February 20, 2022

What would slackbot do?

I recently found out about Should I deploy today? and laughed. A humorous way of guiding us developers when we hesitate about deploying code to production. This one is extra funny on a Friday or Saturday. Every now and then, memes about (not) releasing code to production on Fridays appear on my Twitter timeline. There are even T-shirts for sale, with No Deploy Fridays printed on them.

Funny because it's true?

I totally get the idea of minimizing the chance of ruining your or someone else's weekend. But there's also a potential risk in postponing & stacking up features. The things we develop are there to add value, so why make users wait for it?

Ship it!

Fridays shouldn't be different from any other working day, at least when it comes to deploying code to production. Just like on a Tuesday, you should probably think twice about deploying stuff just before you call it a day. When a fresh release has started causing errors for users, you might have already checked out from the office with loud music in your headphones, your mind focused on something else, unaware of the mess you've left for your friends on call.

Deploy with some slack

It's a good idea to monitor the live environment for a while, making sure the deploy has gone well. Keep an eye on it, while working on the next thing. Prepare to act on alerts, errors appearing in your logs or messages in the company Slack.

Have a browser tab open with graphs monitoring memory consumption, CPU usage and response times. How long and which metrics to monitor depends on the product. From my experience, being attentive to things like this is mostly good enough. When I call it a day a couple of hours later, I can go ahead and play that loud music in my headphones and relax.

When things go 💥

I haven't fact checked this - but unless a deploy is about database schema changes or to an offline machine, it should be relatively easy to roll back. Most deploys are probably easy to undo. If not, find out why and try making it easier.

I would suggest that teams practice: commit, push, merge, release to production and roll back. You'll probably find infrastructure related things to fix or tweak - maybe there's some automation missing? The team will know what to do when things go boom for real.

I've been fortunate to work in environments with the basic automation already in place. We, the autonomous dev team, are the ones responsible for pushing features to production. We do that on a daily basis. Before, we hesitated about the deploy-on-a-Friday thing. Now we have better trust in our ways of working, and this Friday we deployed new features, dev experience improvements and bug fixes. Several small deployments went live, the last one about 3-4 hours before we called it a weekend 😀

Automate it

A good deployment process will stop and let the team know when things break. The obvious code related issues will be caught this way. So, automation will get you far, but not all the way. That's why we need the monitoring. More importantly, releasing small portions of code changes is much safer than releasing big batches of changes. Release small and release often.

What would slackbot do?

These days, when I hesitate if a new feature should be deployed right now or not, I'll just ask slackbot what to do.



Top photo by Ochir-Erdene Oyunmedeg on Unsplash