python bytecode: loops

August 18th, 2013

New to the series? The previous entry was part 4.

We have seen quite a bit of bytecode already - we've even seen the structure of a bytecode module. But we haven't seen loops yet.

A while loop

We'll start with the simplest kind of loop. It's not really a typical loop given that it breaks after a single execution, but it's enough to give us a look at the looping machinery.

def loop(x):
    while x > 3:
        print x

Disassembly of loop:
  2           0 SETUP_LOOP              22 (to 25)
        >>    3 LOAD_FAST                0 (x)
              6 LOAD_CONST               1 (3)
              9 COMPARE_OP               4 (>)
             12 POP_JUMP_IF_FALSE       24

  3          15 LOAD_FAST                0 (x)
             18 PRINT_ITEM          
             19 PRINT_NEWLINE       

  4          20 BREAK_LOOP          
             21 JUMP_ABSOLUTE            3
        >>   24 POP_BLOCK           
        >>   25 LOAD_CONST               0 (None)
             28 RETURN_VALUE

The control flow is a bit convoluted here. We start off with a SETUP_LOOP which pushes a block onto the stack. It's not really clear to me why that's necessary, given that Python does not use blocks for scopes. But it might be that the interpreter needs to know which level of looping it's at.

We then load the variable x and the constant 3 and run COMPARE_OP. This opcode actually takes a parameter to tell it which operation to perform (greater than in this case). The result of that will be a boolean value on the stack.

Now we need to know whether we're going to execute the loop body or jump past the loop, so that's POP_JUMP_IF_FALSE, which may jump to location 24 where the loop ends.

Assuming we are in the loop body, we simply load the variable x and print it. Interestingly, the print statement requires two opcodes PRINT_ITEM and then PRINT_NEWLINE, which seems a bit over the top.

We now have a BREAK_LOOP instruction. Notice that if we were to ignore and execute the JUMP_ABSOLUTE just behind it that would return us to the loop predicate, and we might continue looping. But that's not supposed to happen after a break: a break ends the loop even if the loop predicate is still true. So this must mean that we won't reach JUMP_ABSOLUTE.

After this we execute POP_BLOCK which will officially end the loop by taking the block off the stack again.

A for loop

A for loop, then, is not very different. The main difference is that we are not looping on a boolean condition - we are looping over an iterable.

def loop(x):
    for i in range(x):

Disassembly of loop:
  2           0 SETUP_LOOP              25 (to 28)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_FAST                0 (x)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                11 (to 27)
             16 STORE_FAST               1 (i)

  3          19 LOAD_FAST                1 (i)
             22 PRINT_ITEM          
             23 PRINT_NEWLINE       
             24 JUMP_ABSOLUTE           13
        >>   27 POP_BLOCK           
        >>   28 LOAD_CONST               0 (None)
             31 RETURN_VALUE

To do that we have LOAD_GLOBAL to load the range function on the stack. This is an opcode we haven't seeb before, and it simply means this name comes from somewhere outside this module (the __builtin__ module in this case). We then load x and call the function. This produces a list.

Now, since Python uses iterators so heavily, the loop will use this method to move through the list. It means you could also loop over any other iterable object (tuple, dict, string, your own custom iterators etc). In fact, GET_ITER amounts to calling the iter function on the list (which returns an iterator object). And FOR_ITER calls the iterator's next method to get the next item.

We now have the first int in the list, and we bind it to the name i with STORE_FAST. From there on, we may use i in the loop body.

You will notice that there is something odd about the way i is manipulated. At location 16 the int is sitting on the stack, and gets bound to a name with STORE_FAST. This consumes it on the stack. We then immediately push it on the stack again with LOAD_FAST. These two instructions cancel each other out: we could remove them without changing the meaning of the program.

So why do we have to store and load? Well, imagine i were used again in the loop body - it would have be bound, right? So it could be optimized away in this case, but not in the general case.

:: random entries in this category ::