adventures in project renovation

March 9th, 2014

I'm inspired by how many great Python libraries there are these days, and how easy it is to use them. requests is the canonical example, and marks a real watershed moment, but there are many others.

It made me think back on various projects that I've published over the years and not touched in ages. I've been considering them more or less "complete". My standards for publishing projects used to be: write a blog entry, include the code, done. That was okay for simple scripts. Later on I started putting code on berlios.de and sourceforge.net. At some point github emerged and became the de facto standard, so I started using that too.

Fast forward to 2014 and the infrastructure available to open source projects has been greatly enriched. And with it, the standards for what makes a decent project have evolved. Jeff Knupp wrote a fabulous guide on this.

I decided to pick a simple case study. ansicolor is a single module whose origins I can trace back to 2008. I've seen the core functionality present in any number of codebases, because it's just so easy to hammer out some code for this and call it a day. But I never found it in a reusable form, so I decided to make it a separate thing that I could at least reuse between my own projects.

These are the steps a project is destined to pass through:

  • python3 support
  • pypi package + wheel!
  • readme that covers installation and "getting started"
  • tests + tox config
  • travis-ci hook
  • flake8 integration and fixing style violations
  • docs + Read the Docs hook

Not a single feature was added to ansicolor, not a single API was changed. Only two things really changed at the level of the code: exports were tidied up and docstrings were added. Python3 support was added too, but it was so trivial you'd have to squint to notice it.

The biggest stumbling block was actually writing the docs. As an implementor you tend to look at code in a completely different light than you do as a user of that code. Before starting on this I was thinking about how the API is a bit awkward in some places and could be improved. And how some of the functionality caters to a very narrow use case and maybe should be removed or to moved to a "contrib"-like place.

But as a potential user of a library that I just discovered I don't care about any of that. I want to be able to "pip install" it. I want to have some quickstart documentation so I can have running code in 2 minutes. That's how long I'll typically spend deciding whether this code is worth my time at all, so if the implementor is busy polishing the API before even putting out a pypi package they're wasting their time.

There is an interesting cognitive dissonance at play here. As an implementor I tend to think that the darkest corners of my code are those that most need documenting. Those are the ones most likely to bite someone. The easy stuff anyone can figure out. But as a user that's not how I see it at all. It's precisely the simplest functionality that most needs explaining, because most users have simple needs. If you do a good job documenting that you can make lots of people productive. By contrast, the complicated features have a small audience. An audience that's more sophisticated and more likely to help themselves by reading the code if need be.

Then there are the tools. I always found sphinx a bit fiddly. It's not really obvious how to get what you want, and it's permissive enough not to complain, so it takes a fair bit of doc hunting to discover how other projects do it. PyPI has a more conservative rst parser than github, so if you give it syntax it doesn't accept it renders your page in plain text. I ended up doing a number of releases where only the readme changed slightly to debug this. Read the Docs works well, but I couldn't figure out how to make it build from a development branch. It seems to only want to build from a tag regardless of the branches you select, so that too inflated the number of releases.

It takes a bit of time to renovate a project, but it's all fairly painless. All these tools have reached a level of maturity that makes them very nice to use.

:: random entries in this category ::