Underhanded Python: giving the debugger the wrong line numbers

In my last post I showed how it’s easy for Python code to detect the presence of the debugger and change its behaviour accordingly. But this did nothing to deal with single-stepping through the code in the debugger: the debugger would show execution hitting the debugger-detection code and changing its behaviour. You could probably disguise this code a bit, but it’s hard to avoid giving the impression that something suspicious is going on.

Luckily we have another tool at our disposal: we can lie to the runtime about our line numbers.

How single-stepping works in the Python debugger

Executing Python is a 3-stage process:

  • The source code is parsed into an abstract syntax tree (AST).
  • The AST is compiled into bytecode.
  • The bytecode is executed by the interpreter.

The first stage is the only one that works with the source code, and debugging takes place at the execution stage. Therefore the line numbers need to be preserved at each stage and passed on to the next.

You can see this when you parse the code. Code like this:

a = 1 + 1

gets converted to an AST like:

Module(body=[Assign(targets=[Name(id='a', ctx=Store())],
                    value=BinOp(left=Num(n=1), op=Add(),
                                right=Num(n=1)))])

You can inspect this AST and see the line numbers:

>>> import ast
>>> code = "a = 1 + 1"
>>> tree = ast.parse(code)
>>> print(tree.body[0].lineno)
1

When this AST is compiled into bytecode, the line numbers get attached to the bytecode. The details are a bit complicated because Python tries to make the bytecode as compact as possible, but conceptually there’s a mapping from each bytecode instruction to the corresponding line of the source code it originally came from.

Of course, each line of the source code will typically end up generating many bytecode instructions, because individual bytecode instructions are so much less powerful than Python code (that’s kind of the point of it). So you end up with runs of bytecode instructions that correspond to the same line number. The disassembler shows you this:

import dis

source = """
def add(a, b):
    print("Adding")
    return a + b
"""

code = compile(source, 'foo.py', 'exec')

dis.dis(code)

produces:

  2           0 LOAD_CONST               0 (<code object add at 0x7f6ceef89810, file "foo.py", line 2>)
              2 LOAD_CONST               1 ('add')
              4 MAKE_FUNCTION            0
              6 STORE_NAME               0 (add)
              8 LOAD_CONST               2 (None)
             10 RETURN_VALUE

Disassembly of <code object add at 0x7f6ceef89810, file "foo.py", line 2>:
  3           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Adding')
              4 CALL_FUNCTION            1
              6 POP_TOP

  4           8 LOAD_FAST                0 (a)
             10 LOAD_FAST                1 (b)
             12 BINARY_ADD
             14 RETURN_VALUE

where the left-hand numbers are the source code lines.

This brings us back to sys.settrace. When you register a trace function in Python, you can arrange for your trace function to be called on various events. For our purposes, we care about:

  • call: entering a function or code block
  • line: a new line of code is about to be executed, i.e. the next bytecode instruction is associated with a different line number than the previous one.

The practicalities are a bit fiddly, but essentially this gives us what we need for a single-step debugger: a function gets called on each new line, and that function can suspend execution and allow the system to be inspected.
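
To make this concrete, here’s a minimal sketch of a trace hook that reports line events, in the same spirit as what a debugger does (the names here are just for illustration):

import sys

def trace(frame, event, arg):
    # The runtime calls this on 'call', 'line', 'return' and 'exception'
    # events; a real debugger would suspend here and let you inspect state.
    if event == "line":
        print(f"about to execute line {frame.f_lineno} in {frame.f_code.co_name}")
    return trace    # keep tracing within this scope

def add(a, b):
    print("Adding")
    return a + b

sys.settrace(trace)
add(1, 2)
sys.settrace(None)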

How we can lie to the debugger

The debugger relies on the line number information attached to the function’s code object. That information is readable like any other Python attribute:

def add(a, b):
    print("adding")
    return a + b

print(add.__code__.co_firstlineno)
print(add.__code__.co_lnotab)

The co_firstlineno is simply the source line of the start of the function, and the co_lnotab is a table mapping bytecode positions to source code positions. In our case, we get:

1
b'\x00\x01\x08\x01'

The second line works like this:

  • First byte of bytecode (00) corresponds to 1 line after the beginning of the block (01)
  • 8th byte of bytecode (08) corresponds to 1 line after the previous line (01)
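
We can sanity-check that reading by decoding the table ourselves. This is a sketch that assumes the pre-3.10 co_lnotab format described above (from Python 3.10 onwards co_lines() is the supported interface), and decode_lnotab is a name I’ve made up:

def decode_lnotab(code):
    # co_lnotab is a sequence of (bytecode offset delta, line delta) byte pairs.
    offset, line = 0, code.co_firstlineno
    table = code.co_lnotab
    for addr_incr, line_incr in zip(table[0::2], table[1::2]):
        offset += addr_incr
        if line_incr >= 0x80:       # line deltas are signed bytes
            line_incr -= 0x100
        line += line_incr
        yield offset, line

def add(a, b):
    print("adding")
    return a + b

print(list(decode_lnotab(add.__code__)))   # e.g. [(0, 2), (8, 3)]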

Let’s suppose we have two functions: our malicious function, and a decoy function that we want the user to see when they single-step through it:

import sys

def add(a, b):
    return a + b

def bad_add(a, b):
    if sys.gettrace() is None:
        print("Doing something malicious")

    return a + b

You might hope that you could just overwrite bad_add’s line number information:

bad_add.__code__.co_firstlineno = add.__code__.co_firstlineno

But this doesn’t work, because code object fields are read-only in Python. However, what you can do is create a new code object that has all the same properties as the old one except for the ones we want to rewrite, using types.CodeType. The interface is a bit fiddly, though, and as of Python 3.8 you can do this much more easily by calling the replace method on the old code object, which creates a modified copy:

add.__code__ = bad_add.__code__.replace(
    co_lnotab=add.__code__.co_lnotab,
    co_firstlineno=add.__code__.co_firstlineno
)

This replaces the entire add code object with a newly created one. It only replaces the code of add; it doesn’t change the function’s other properties (such as its __name__). In other words, add will continue to look like the old add, but it will walk and quack like bad_add.

But the clever bit is that the new code object is a hybrid: it has bad_add’s behaviour, but add’s line number information. If you try to step through it in the debugger, the debugger will show you (line by line) the source code of add even as it executes bad_add.
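
Assuming the definitions above have been run, a quick check of the hybrid (outside the debugger):

print(add(1, 2))
# Doing something malicious
# 3

print(add.__name__)                  # still 'add'
print(add.__code__.co_firstlineno)   # still add's original line number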

This was pretty trivial to do, but to be fair it doesn’t actually work all that well. The source code mappings for add aren’t really a good substitute for those for bad_add, because they have different bytecode. In our case we get away with it because add is pretty trivial, but in real code this would probably lead to odd behaviour: single-stepping would cause the debugger to leap around the function in odd ways that didn’t correspond to the code, and it would be pretty obvious that something was wrong.

However, for a motivated attacker it wouldn’t be that hard to create a more accurate fake source code mapping, which would make this much harder to track.

Of course, this code is still pretty obviously malicious if someone looks at the right piece of source code. The malicious piece can be hidden somewhere well away from the code that someone would be likely to inspect (in a different module, even) but once it’s spotted it’s pretty obvious that it’s doing something underhanded. I have a few thoughts about how this could be addressed, which I’ll talk about next time.

Underhanded Python: detecting the debugger

There used to be an annual competition among C programmers called the Underhanded C contest, with the aim of inventing creative ways to write code that appears to do one thing but is actually doing something very different (and theoretically malicious, though it’s a “white hat” contest).

I recently got to thinking about whether you can do this kind of thing in Python. The original format of the contest doesn’t really work in Python: it’s all too easy to write Python code that doesn’t do what it seems, to the point where there’s probably no challenge. But in a dynamic language, there are lots of interesting things that aren’t realistic in a language like C.

For example, can you detect and interfere with the debugger? Can you hide your malicious behaviour when a debugger is attached to look at it? It turns out you can.

Debugging in Python

It’s pretty clear that you can’t implement a Python debugger wholly in Python without any support from the Python runtime. Python code will only run when something calls it, and your debugger code wouldn’t have any way to impose itself upon the code being debugged.

However, Python tries to make things as flexible as possible by implementing a minimal amount of support in the Python runtime and having the rest of the debugger built on top of that in Python. The key tool that makes this work is something I mentioned in an earlier article about coverage testing: sys.settrace.

The way settrace works is that you can register a hook function with it that will be called on some conditions (moving to a new line of code, entering a function scope etc.). This hook is an ordinary Python function that can do whatever you want. The implementation of settrace is built into the Python runtime, but that’s all the special support you need.

How to behave differently when being debugged

Let’s keep things simple and suppose we want to write a function that adds two numbers, and prints "I'm malicious" if it’s called when the debugger isn’t around to see it. Something like:

def add(a, b):
    if not in_debugger():
        print("I'm malicious")

    return a + b

It’s pretty simple: we just have to check whether the settrace hook is set:

import sys

def add(a, b):
    if sys.gettrace() is None:
        print("I'm malicious")

    return a + b

This works pretty well. If you run it under pdb, you’ll see that the message is not printed. One quirk (for better or worse) is that if you do a continue from pdb it will not detect the debugger, because continue disables the debug hook entirely (until it’s reinserted with set_trace() or breakpoint() or whatever). Depending on the requirements for our malicious program this might or might not be a problem.
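
You can see both behaviours from the interactive prompt. A sketch, assuming add is defined as above:

import pdb

# While single-stepping, the trace hook is installed, so the check sees
# the debugger and keeps quiet:
pdb.run("add(1, 2)")    # type 'step' repeatedly: no message

# But 'continue' (with no breakpoints set) removes the trace hook
# entirely, so the check passes and the payload fires:
pdb.run("add(1, 2)")    # type 'continue': prints "I'm malicious"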

This isn’t specific to pdb, either. It should work with any debugger for CPython (and may work for other Python implementations; I haven’t checked).

Of course, this is a toy example. The most obvious problem is that if you step into the function you’ll immediately see the malicious code. I have a few ideas about this, which I plan to return to later.

What happens when a class is created?

This post will dig into what happens when a new class is created, at a bytecode level. This isn’t too interesting on its own, but one of my other posts that’s in the works has ended up being far too long, and so I’m breaking this out as a chunk that I can refer to.

As you may already know, class definitions in Python are executed like any other code. This means that you can execute arbitrary code in your definition if you so wish:

class Dog:
    print("In the process of creating class")
 
    def bark(self):
        print("woof")

This prints the message "In the process of creating class" once and only once, when the module that defines the class is first imported.

You can also conditionally define a class:

import random
 
if random.choice([True, False]):
    class Dog:
        def bark(self):
            print("woof")

If you do this, then 50% of the time it will give you a working class, and 50% of the time you’ll get a NameError if you try to use the class. That’s a silly example, but the same trick can be genuinely useful for OS-dependent classes that shouldn’t be available if the underlying OS doesn’t support them.

So what’s actually going on in the bytecode? Let’s disassemble it and find out. I’m going to assume that you already know the basics of how Python bytecode works. If not, you can see my article on it here.

First of all, we compile some code:

source = """
class Dog:
    def bark(self):
        print("woof")
"""
 
code = compile(source, '<string>', 'exec')

We can then disassemble it to find out what’s going on. If you’re using Python 3.6, you’ll get something like this:

import dis
 
dis.dis(code)

will print to STDOUT:

  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               0 (<code object Dog at 0x7f42d20b6c00, file "<string>", line 2>)
              4 LOAD_CONST               1 ('Dog')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               1 ('Dog')
             10 CALL_FUNCTION            2
             12 STORE_NAME               0 (Dog)
             14 LOAD_CONST               2 (None)
             16 RETURN_VALUE

The basic form of this is reasonably familiar, but what’s the MAKE_FUNCTION doing there? This is code that makes a class, and yet the bytecode is making a function. And where’s the function bark? It really is a function, but it’s nowhere in sight when MAKE_FUNCTION is being invoked.

Let’s break it down. We can see this as:

  2           0 LOAD_BUILD_CLASS

... some other stuff ...

             10 CALL_FUNCTION            2
             12 STORE_NAME               0 (Dog)
             14 LOAD_CONST               2 (None)
             16 RETURN_VALUE

What this does is load a special built-in function called __build_class__, set up some arguments (which we skip over for now), call the function with two arguments on the stack, assign the result to the name Dog and then return None. So the interesting things to consider are:

  • What goes on inside __build_class__?
  • What are the arguments that we pass to it?
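
Before digging into the internals, we can at least confirm that the class statement really does call this function with those two arguments. __build_class__ lives in the builtins module, so we can temporarily wrap it; a sketch:

import builtins

original_build_class = builtins.__build_class__

def traced_build_class(func, name, *bases, **kwargs):
    # Report what the class statement hands us, then delegate.
    print(f"__build_class__ called with name={name!r}, func={func!r}")
    return original_build_class(func, name, *bases, **kwargs)

builtins.__build_class__ = traced_build_class

class Dog:
    def bark(self):
        print("woof")

# prints something like:
# __build_class__ called with name='Dog', func=<function Dog at 0x...>

builtins.__build_class__ = original_build_class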

What does __build_class__ do?

__build_class__ isn’t documented anywhere, but it’s easy enough to find in the CPython source code. I won’t step through it line by line, but you can find it in Python/bltinmodule.c if you want to dig into the details.

The __build_class__ function takes at least two arguments (a function plus a name string), with optional arguments after that for base classes. Let’s ignore base classes for now.

The interesting part of the __build_class__ function is this:

    cell = PyEval_EvalCodeEx(PyFunction_GET_CODE(func), PyFunction_GET_GLOBALS(func), ns,
                             NULL, 0, NULL, 0, NULL, 0, NULL,
                             PyFunction_GET_CLOSURE(func));

Here func is the function that was passed in to __build_class__, which is the mystery function we haven’t explained yet. The only other variable is ns, which is an empty Python dict.

This call evaluates some code. Specifically, it executes the code of the function func, in the context of the globals that func has access to, and using ns as the local namespace. The return value gets mostly ignored. If this function does anything at all, the interesting thing is in its side-effects on the dict ns.

Hint: side-effects on ns are very important to this.
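
We can imitate that call from pure Python, because exec() accepts code objects. Reusing the code object we compiled earlier (the class body’s code object is stashed in its constants; we’ll see exactly where in the next section), a sketch:

body = code.co_consts[0]    # the code object for the class body
ns = {}
exec(body, globals(), ns)   # roughly what the PyEval_EvalCodeEx call does
print(sorted(ns))           # ['__module__', '__qualname__', 'bark']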

After we’ve evaluated this mystery function, the ns dict is passed to the class’s metaclass. Metaclasses in Python get a bit confusing, so for now let’s ignore the detail and assume we’re using the default metaclass, type. Therefore what we’re doing is effectively calling:

   type("Dog", base_classes, ns)

You can think of this as a class instantiation: The class is type, and the instance we end up with is the Dog class. Dog is both a class and an instance: it’s a class for future instances like rex, rover and lassie, but is itself an instance of type.
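
You can replay this final step by hand with a hand-built namespace; a sketch:

def bark(self):
    print("woof")

ns = {"bark": bark}         # stand-in for the namespace built by the class body
Dog = type("Dog", (), ns)   # no base classes, default metaclass

rex = Dog()
rex.bark()                    # woof
print(isinstance(rex, Dog))   # True
print(isinstance(Dog, type))  # True: Dog is itself an instance of type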

What are the arguments we pass in to __build_class__?

We’ve figured out that __build_class__ takes a mystery function, evaluates it for side effects, then creates an instance of type using the resultant namespace. But what is the mystery function?

Let’s look again at that disassembly:

  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               0 (<code object Dog at 0x7f42d20b6c00, file "<string>", line 2>)
              4 LOAD_CONST               1 ('Dog')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               1 ('Dog')
             10 CALL_FUNCTION            2
             12 STORE_NAME               0 (Dog)
             14 LOAD_CONST               2 (None)
             16 RETURN_VALUE

Specifically, we’ll look at the bit we skipped over before:

              2 LOAD_CONST               0 (<code object Dog at 0x7f42d20b6c00, file "<string>", line 2>)
              4 LOAD_CONST               1 ('Dog')
              6 MAKE_FUNCTION            0

When MAKE_FUNCTION is called with an argument of 0, it’s the simplest case: it takes just two arguments off the stack, a code object and a name. So if we want to know about the function we’re creating (and ultimately calling inside __build_class__) we need to look inside this code object to see what it’s doing.

The code object is loaded with LOAD_CONST 0, which means that we can find it in the tuple of constants associated with this code block:

dis.dis(code.co_consts[0])

gives:

  2           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 ('Dog')
              6 STORE_NAME               2 (__qualname__)

  3           8 LOAD_CONST               1 (<code object bark at 0x7fb6c7b77930, file "<string>", line 3>)
             10 LOAD_CONST               2 ('Dog.bark')
             12 MAKE_FUNCTION            0
             14 STORE_NAME               3 (bark)
             16 LOAD_CONST               3 (None)
             18 RETURN_VALUE

Now we’re getting somewhere. Suddenly this looks a bit more like the inside of a class definition. We’re loading up our method object and giving it the name "bark". The actual code for the method isn’t visible here; it’s stored in a constant nested inside the code object:

dis.dis(code.co_consts[0].co_consts[1])

gives:

  4           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('woof')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

You should recognise this bit as the innards of this method:

    def bark(self):
        print("woof")

So what are we actually saying here?

It’s got a bit confusing. Functions within functions within functions.

I think the key is to think of this class definition:

class Dog:
    def bark(self):
        print("woof")

as actually being a function dressed up:

def _Dog_namespace_creator():
    def bark(self):
        print("woof")

Creating the class works something like this:

  • Compile a function that creates the Dog namespace (I’m calling this _Dog_namespace_creator for clarity, though it isn’t really called that).
  • Execute this function, and keep hold of the resultant namespace. Remember that a namespace is just a dictionary. In our case, the namespace after executing this function contains one member, a function called bark.
  • Create an instance of type (or some other metaclass) using this namespace. The class will therefore have a method called bark in its namespace.

This is all done once, when the class is defined. None of this stuff needs to happen again when the class is instantiated.
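
Putting those three steps together in runnable form (a sketch: locals() stands in for the namespace dict, which the real machinery supplies from the outside rather than reading back out):

def _Dog_namespace_creator():
    def bark(self):
        print("woof")
    return locals()     # the namespace the class body built up

ns = _Dog_namespace_creator()
Dog = type("Dog", (), ns)
Dog().bark()            # woof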

Hmm. What’s the really short version?

The body of a class definition is a lot more like a function than you think. It actually is a function, one that is executed when the class is defined and builds the members of the class.

Why doesn’t import * work at function scope?

Python is a pretty regular language; it follows patterns. Most of the time if you can do something in a function, you can do it in global scope or in a class definition or vice versa. There’s an exception for from ... import *, which works at global scope but not inside a function. The reason why turns out to tell us something interesting about Python internals.