Archive for the ‘technology’ Category

a little help with bitwise operators

August 3rd, 2015

Binary numbers are easy, right? You just do stuff with bits.

But invariably, whenever I code C, I can never remember how to actually set a bit or test a bit; I keep getting *and* and *or* confused.

So I made a cheat sheet I can look up any time. These are the key ones:
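
The gist, in C (a quick sketch of the idioms, not the sheet itself):

    #include <stdio.h>

    int main(void) {
        unsigned x = 0;

        x |=  (1u << 3);        /* set bit 3: or with a mask            */
        x |=  (1u << 5);        /* set bit 5                            */
        x &= ~(1u << 3);        /* clear bit 3: and with the inverted mask */
        x ^=  (1u << 7);        /* toggle bit 7: xor with a mask        */

        if ((x >> 5) & 1u)      /* test bit 5: shift down, then mask    */
            printf("bit 5 is set: x = 0x%x\n", x);

        return 0;
    }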

All the others are available on the cheat sheet.

so do you know how your program executes?

August 2nd, 2015

The answer is no, and here's why you shouldn't feel peer pressured into saying yes.

I was reading about disassembly recently when I came across this nugget:

(...) how can a disassembler tell code from data?

The problem wouldn't be as difficult if data were limited to the .data section of an executable and if executable code were limited to the .code section of an executable, but this is often not the case. (...) A technique that is often used is to identify the entry point of an executable, and find all code reachable from there, recursively. This is known as "code crawling".

(...)

The general problem of separating code from data in arbitrary executable programs is equivalent to the halting problem.

Well, if we can't even do *that*, what can we do?

We start with a program you wrote, awesome. We know that part. We compile it. Your favorite language constructs get desugared into very ordinary looking code - all the art you put into your program is lost! Abstract syntax tree, data flow analysis, your constants are folded and propagated, your functions are inlined. By now you wouldn't even know that it's the same program, and we're still in "high level language" territory (probably some intermediate language in the compiler). Now we get basic blocks and we're gonna lay them out in a specific order. This is where the compiler tries to play nice with the branch prediction in your cpu. Your complicated program aka "my ode to control flow" now looks very flat - because it's assembly code. And at last we assemble into machine code - the last vestiges of intelligent life (function names, variable names, any kind of symbolic information) are lost and become just naked memory addresses.
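
To make that a bit more concrete, here's a contrived little example (mine, not from any real codebase) of the kind of thing an optimizer does:

    #include <stdio.h>

    static int area(int w, int h) { return w * h; }

    int frame_bytes(void) {
        const int width  = 640;
        const int height = 480;
        const int depth  = 4;
        return area(width, height) * depth;
    }

    int main(void) {
        /* After inlining, constant propagation and constant folding, an
           optimizing compiler typically reduces frame_bytes() to the
           equivalent of "return 1228800;" - the call, the multiplies and
           the named constants are all gone from the generated code. */
        printf("%d\n", frame_bytes());
        return 0;
    }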

Between your program and that machine code... so many levels. And at each level there are ways to optimize the code. And all of those optimizations have just happened while you stared out the window just now.

So I started thinking about how programs execute. The fact is that predicting the exact execution sequence of a program, even in C, even in C 101 (no pointers, no threading), is basically impossible. Okay, I'm sure it's possible, but I'd have to know the details of my exact cpu model to have a chance.

I need to know how big the pre-fetch cache is. I bet there are some constants that control exactly how the out of order execution engine works - I need to know those. And I need to know the algorithms that are used there (like branch prediction, remember?). I need to know... oh shoot, multicore! Haha, multicore is a huge problem.

Basically, I need to know exactly what else is running on my system at this very time, because that's gonna wreak havoc with my caches. If my L1 gets hosed by another process, that's gonna influence a load from memory that I was just about to do. Which means I can't execute this instruction I was going to. So I have to pick some other instructions I have lying around and execute those speculatively while we wait for that delivery truck to return all the way from main memory.

Speculatively, you say? Haha yes, since we have these instructions here we'll just go ahead and execute them in case it turns out we needed to do that. Granted, a lot of what a cpu does is stuff like adding numbers, which is pretty easy to undo. "Oops, I 'accidentally' overwrote eax." I guess that addition never happened after all.

And then hyper-threading! Do you know how hyper-threading works? It's basically a way of saying "this main memory is so damn slow that I can simulate the execution of *two* different programs on a single core and no one's really gonna notice".

This whole thing gives rise to a philosophical question: what is the truth about your program? Is it the effect you observe based on what you read from memory and what you see in the registers (i.e. the public API of the cpu)? Or is it the actual *physical* execution sequence of your instructions (the "implementation details" you don't see)?

I remember when virtual machines were starting to get popular around 2000 and there was a lot of discussion about whether they were a good thing - "think about the performance implications". Hell, our so-called physical machines have been virtual for a very long time already!

It's just that the cpu abstraction doesn't seem to leak very much, so you think your instructions are being executed in order. Until you try to use threads. And then you have to ask yourself the deep existential question: what is the memory model of my machine anyway? Just before you start putting memory barriers everywhere.
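
Just to make "memory barriers everywhere" concrete, here's a minimal sketch (my own toy example, names and all) of the classic publish-then-consume pattern with C11 atomics:

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    int        data  = 0;
    atomic_int ready = 0;

    void *producer(void *arg) {
        data = 42;                                               /* plain store */
        atomic_store_explicit(&ready, 1, memory_order_release);  /* publish: earlier writes can't sink below this */
        return NULL;
    }

    void *consumer(void *arg) {
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                                                    /* wait: later reads can't float above this */
        printf("%d\n", data);                                    /* guaranteed to print 42 */
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }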

So no, you don't. But none of your friends do either (unless they work for Intel).

do you know c?

November 13th, 2014

In discussions on programming languages I often see C being designated as a neat, successful language that makes the right tradeoffs. People will go so far as to say that it's a "small language", it "fits in your head" and so on.

I can only imagine that people saying these things have forgotten how much effort it was to really learn C.

I've seen newbies ask things like "I'm a java coder, what book should I use to learn C?" And a lot of people will answer K&R. Which is a strange answer, because K&R is a small book (to further perpetuate this idea that it's a small language), is not exactly pedagogical, and still left me totally confused about C syntax.

In practice, learning C takes so much more than that. If you know C the language then you really don't know anything yet.

Because soon enough you discover that you also need to know the preprocessor and macros, gcc, the linker, the loader, make and autoconf, libc (at least what is available and what is where - because it's not organized terribly well), shared libraries and stuff like that. Fair enough, you don't need it for Hello World, but if you're going to do systems programming then it will come up.

For troubleshooting you also need gdb and basically fundamental knowledge of your machine architecture and its assembly language. You need to know about memory segments and the memory layout and alignment of your data structures and how compiler optimizations affect that. You will often use strace to discover how the program actually behaves (and so you have to know system calls too).
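
The alignment part alone can surprise you. A toy struct (made up for this example) whose size is not the sum of its members:

    #include <stdio.h>
    #include <stddef.h>

    struct record {
        char tag;     /* 1 byte, then (typically) 3 bytes of padding    */
        int  count;   /* 4 bytes, aligned to a 4-byte boundary          */
        char flag;    /* 1 byte, then trailing padding rounds it out    */
    };

    int main(void) {
        /* On most common ABIs this prints 12 and 4, not the 6 and 1 you
           might expect from just adding up the member sizes. */
        printf("sizeof(struct record) = %zu\n", sizeof(struct record));
        printf("offsetof(count)       = %zu\n", offsetof(struct record, count));
        return 0;
    }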

Much later, once you've mastered all that, you might chance upon a slide deck like Deep C whose message basically is that you don't understand anything yet. What's more terrifying is that the fundamental implication at play is: don't trust the abstractions in the language, because when things break you will need to know how it works under the hood.

In a high level language, given effort, it's possible to design an API that is easy to use and hard to misuse and where doing it wrong stands out. Not so in C where any code is always one innocuous looking edit away from a segfault or a catastrophic security hole.

So to know C you need all of that. But that's mostly the happy path. Now it's time to learn about everything that results in undefined behavior. Which is the 90% of the iceberg below the surface. Whenever I read articles about undefined behavior I'm waiting for someone to pinch me and say the language doesn't actually allow that code. Why would "a = a++;" not be a syntax error? Why would "a[i]" and "i[a]" be treated as the same when syntactically they so clearly aren't?
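
If the second one looks wrong, here's a toy program (mine, just for illustration) spelling out why the compiler shrugs:

    #include <stdio.h>

    int main(void) {
        int a[] = { 10, 20, 30 };
        int i = 2;

        /* a[i] is defined as *(a + i), and addition commutes, so i[a]
           is the very same expression: both lines print 30. */
        printf("%d\n", a[i]);
        printf("%d\n", i[a]);

        /* By contrast, "i = i++;" (same shape as "a = a++;") compiles
           without complaint, but it modifies i twice without an
           intervening sequence point, which is undefined behavior. */
        return 0;
    }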

Small language? Fits in your head? I don't think so.

Oh, and once you know C and you want to be a systems programmer you also need to know Posix. Posix threads, signals, pipes, shared memory, sync/async io, ... well you get the idea.

adventures in project renovation

March 9th, 2014

I'm inspired by how many great Python libraries there are these days, and how easy it is to use them. requests is the canonical example, and marks a real watershed moment, but there are many others.

It made me think back on various projects that I've published over the years and not touched in ages. I've been considering them more or less "complete". My standards for publishing projects used to be: write a blog entry, include the code, done. That was okay for simple scripts. Later on I started putting code on berlios.de and sourceforge.net. At some point github emerged and became the de facto standard, so I started using that too.

Fast forward to 2014 and the infrastructure available to open source projects has been greatly enriched. And with it, the standards for what makes a decent project have evolved. Jeff Knupp wrote a fabulous guide on this.

I decided to pick a simple case study. ansicolor is a single module whose origins I can trace back to 2008. I've seen the core functionality present in any number of codebases, because it's just so easy to hammer out some code for this and call it a day. But I never found it in a reusable form, so I decided to make it a separate thing that I could at least reuse between my own projects.

These are the steps a project is destined to pass through:

  • python3 support
  • pypi package + wheel!
  • readme that covers installation and "getting started"
  • tests + tox config
  • travis-ci hook
  • flake8 integration and fixing style violations
  • docs + Read the Docs hook

Not a single feature was added to ansicolor, not a single API was changed. Only two things really changed at the level of the code: exports were tidied up and docstrings were added. Python3 support was added too, but it was so trivial you'd have to squint to notice it.

The biggest stumbling block was actually writing the docs. As an implementor you tend to look at code in a completely different light than you do as a user of that code. Before starting on this I was thinking about how the API is a bit awkward in some places and could be improved. And how some of the functionality caters to a very narrow use case and maybe should be removed or moved to a "contrib"-like place.

But as a potential user of a library that I just discovered I don't care about any of that. I want to be able to "pip install" it. I want to have some quickstart documentation so I can have running code in 2 minutes. That's how long I'll typically spend deciding whether this code is worth my time at all, so if the implementor is busy polishing the API before even putting out a pypi package they're wasting their time.

There is an interesting cognitive dissonance at play here. As an implementor I tend to think that the darkest corners of my code are those that most need documenting. Those are the ones most likely to bite someone. The easy stuff anyone can figure out. But as a user that's not how I see it at all. It's precisely the simplest functionality that most needs explaining, because most users have simple needs. If you do a good job documenting that you can make lots of people productive. By contrast, the complicated features have a small audience. An audience that's more sophisticated and more likely to help themselves by reading the code if need be.

Then there are the tools. I always found sphinx a bit fiddly. It's not really obvious how to get what you want, and it's permissive enough not to complain, so it takes a fair bit of doc hunting to discover how other projects do it. PyPI has a more conservative rst parser than github, so if you give it syntax it doesn't accept, it renders your page as plain text. I ended up doing a number of releases where only the readme changed slightly to debug this. Read the Docs works well, but I couldn't figure out how to make it build from a development branch. It seems to only want to build from a tag regardless of the branches you select, so that too inflated the number of releases.

It takes a bit of time to renovate a project, but it's all fairly painless. All these tools have reached a level of maturity that makes them very nice to use.

luna learns to crawl

October 12th, 2013

So, writing an interpreter. Where do we even start?

Why, with the essentials. While an interpreter is nothing more than a program that reads your program and runs it, that is a pretty complicated undertaking. You could conceivably write an interpreter that reads a program character by character and tries to execute it, but you would soon run into all kinds of problems, like: I need to know the value of this variable, but that's defined somewhere else in the program, and: I'm executing a function definition, which doesn't have any effect, but I need to remember that I've seen it in case anyone is calling the function later in the program.

In fact, this approach is not wrong, it's basically what your machine really does. But your machine is running machine code, and you're not, and therein lies the difference. People don't write programs in machine code anymore. It's too hard. We have high level languages that provide niceties like variables - define once, reference as often as you like. Like functions. Like modules. All of these abstractions amount to a program that is not a linear sequence of instructions, but a network of blocks of instruction sequences and various language specific structures like functions and classes that reference each other.

And yet, since many languages are compiled down to machine code, there must be a way to get from here to there? Well, compiler writers have decided that doing so in one step is needlessly complicated. And instead we do what we always do in software engineering when the problem is too hard: we split it into pieces. A compiler will run in multiple phases and break down the molecule into ever simpler compounds until it's completely flat and in the end you can execute it sequentially.

Interpreters start out the same way, but the goal is not machine code, it's execution of the program. So they bottom out in a language analogous to machine code: bytecode. Bytecode can be executed sequentially given a virtual machine (language specific, it knows a few extra things about the language it's running) that runs on top of a physical machine (language agnostic).

Phases

Parsing - Read the program text and build a structured representation (abstract syntax tree) out of it.

Compilation - Compile the AST to bytecode instructions that can be executed on a stack machine.

Interpretation - Pretend to be a machine and execute the bytecode.

The parser is the thing that will read the program text, but it needs to know the syntax of the programming language to recognize what it's reading. This is supplied in the form of a grammar. The parser also needs to know the structure of your abstract syntax for it to build those trees for you. And this is where it gets tricky, because the AST is one of the central data structures in an interpreter, and the way you design the AST will have a big impact on how awkward or convenient it will be to do anything meaningful with it. Even so, the AST must match the grammar very closely, otherwise it's going to be very hard to write that parser. So the optimal approach would be to design the AST first, then write a grammar that matches it to the letter (even better: you could generate the grammar from the AST). Well, there are decades of research that explain why this is hard. Long story short: the more powerful the parser the more leeway you have in the grammar and the easier it is to get the AST you want. Chances are you will not write the parser from scratch, you'll use a library.
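
To make "central data structure" less abstract, here's the kind of shape an expression AST might have - a made-up C sketch, not luna's actual representation:

    /* A made-up AST shape for expressions like "1 + 2 * x": every node
       is a literal, a name, or a binary operation over two child nodes. */
    enum node_kind { NODE_NUMBER, NODE_NAME, NODE_BINOP };

    struct node {
        enum node_kind kind;
        double         number;        /* valid when kind == NODE_NUMBER */
        const char    *name;          /* valid when kind == NODE_NAME   */
        char           op;            /* valid when kind == NODE_BINOP  */
        struct node   *left, *right;  /* NODE_BINOP operands            */
    };

    /* "1 + 2 * x" then parses to
       BINOP('+', NUMBER(1), BINOP('*', NUMBER(2), NAME("x")))
       and how you choose to nest these nodes is exactly the design
       decision that makes the later phases pleasant or painful. */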

The compiler is the thing that traverses an AST and spits out bytecode. It's constrained by two things: the structure of the AST, and the architecture of the vm (including the opcodes that make up your bytecode). Compilers really aren't generic - they need to know all the specifics of the machine they are compiling for. For virtual machines this means: is it a stack machine or a register machine? What are the opcodes? How does the vm supply constants and names at interpretation time? How does it implement scope and modules?

Finally, the interpreter is the thing that emulates a machine and executes the bytecode. It will read the bytecode sequentially and know what to do with each instruction. It's really the only part of the entire program that interacts with its environment extensively. It's the thing that produces output, opens sockets, creates file descriptors, executes processes etc, all on behalf of your program. In fact, it would be more correct to say that these things are contained in the standard libraries of the language, but the interpreter is what orders them to happen.
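
To ground the whole pipeline, here's a toy stack machine in C - a handful of opcodes and a dispatch loop, nothing to do with luna's actual vm, but the shape is the same:

    #include <stdio.h>

    /* A toy stack machine: a few opcodes, a value stack, a dispatch loop. */
    enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

    static void run(const int *code) {
        int stack[64];
        int sp = 0;                      /* next free slot on the value stack */
        int pc = 0;                      /* index of the next instruction     */

        for (;;) {
            switch (code[pc++]) {
            case OP_PUSH:                /* push the following word as a constant */
                stack[sp++] = code[pc++];
                break;
            case OP_ADD: {               /* pop two values, push their sum */
                int b = stack[--sp];
                int a = stack[--sp];
                stack[sp++] = a + b;
                break;
            }
            case OP_PRINT:               /* pop the top of the stack and print it */
                printf("%d\n", stack[--sp]);
                break;
            case OP_HALT:
                return;
            }
        }
    }

    int main(void) {
        /* Bytecode for "print(1 + 2)". */
        int program[] = { OP_PUSH, 1, OP_PUSH, 2, OP_ADD, OP_PRINT, OP_HALT };
        run(program);
        return 0;
    }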

luna 0.0.1

It's out. An interpreter for basic blocks. All your code runs in a single stack frame, so you can only run expressions, assignments and function calls. There is an object model for lua values. There is a pretty barebones standard library. And there are tests (py.test is making life quite good so far).