Dictionaries preserve ordering

I came across something unexpected when preparing my post about the basics of dictionaries in Python. For as long as I’ve been using Python, it’s been a fundamental truth that dictionaries don’t have a reliable ordering. It’s mentioned all over StackOverflow. The standard library even has an OrderedDict container for cases where you want your dictionary to have order.

Turns out it isn’t so.

I was trying to create an example that shows the fundamental way that dictionaries return elements in an order that has nothing (apparently) to do with the keys or the order in which you inserted them. Something like this:

fruits = {}
fruits["apple"] = 100
fruits["banana"] = 200
fruits["cherry"] = 300

I printed the dictionary. The fruits came back in alphabetical order. Dammit.

I guess I had bad luck and the keys just happened to be in the naive order. It’s a 1 in 6 chance, right?

fruits["damson"] = 400

Still in insertion order. Maybe something is up.

This actually isn’t anything mysterious: it’s right there in the documentation:

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.

I’m on Ubuntu, so I’m getting Python 3.6 as my default Python 3. Mystery solved.
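You can see it for yourself by inserting keys in a deliberately non-alphabetical order:

```python
fruits = {}
fruits["cherry"] = 300
fruits["apple"] = 100
fruits["banana"] = 200

# On CPython 3.6 and any Python 3.7+, keys come back in insertion order:
print(list(fruits))  # ['cherry', 'apple', 'banana']
```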

I don’t know about anyone else, but I was really surprised to see this change. I was assuming that making things more user-friendly like this would only come at the cost of making things slower, and there’s no way that Python would make things slower across the board.

Why wasn’t it always this way?

To recap some of the basics: Dictionaries don’t preserve ordering because they’re implemented using hash tables. A hash table is a data structure that gives very fast lookup regardless of the number of items in the container. It works like this:

Start with a so-called “hash function” that maps strings (or whatever key value you’re using—I’ll keep it simple here) into integers. To insert a value, apply the function to the key and then store the value in an array at that numbered position. Retrieving it again works the same way. The hash function is quick, and fetching an item from a specified position in an array is trivial. The great thing about it is that it’s equally fast no matter how many other elements are stored in the hash table (I’m skipping over a few details here, to keep things simple).

To be concrete, suppose:

We can store these entries in an 8-element array like this:

We can now easily look up any of the values from its key: hash the key, look up the corresponding position in the array, check that the key matches what we expect and then return the value.

But note that there’s no way to produce the keys in alphabetical order without looking up all the keys and then sorting them (which is a costly operation). Nor can we get the keys in insertion order, because nothing about the structure records the order they were inserted; the image above would look the same whatever order you inserted the three values.
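The scheme can be sketched in a few lines of Python. This is a toy with made-up details (a fixed 8-slot array, no collision handling), just to make the idea concrete:

```python
class ToyHashTable:
    """Toy hash table: fixed 8-slot array, no collision handling."""

    def __init__(self):
        self.slots = [None] * 8  # mostly-empty backing array

    def put(self, key, value):
        # Hash the key to pick a position in the array.
        self.slots[hash(key) % 8] = (key, value)

    def get(self, key):
        # Hash the key again, check the key matches, return the value.
        entry = self.slots[hash(key) % 8]
        if entry is not None and entry[0] == key:
            return entry[1]
        raise KeyError(key)

    def keys(self):
        # The only order available is slot order, which depends on the
        # hash values, not on alphabetical or insertion order.
        return [entry[0] for entry in self.slots if entry is not None]
```

Lookup cost doesn’t depend on how full the table is, but `keys()` can only give you slot order.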

So why did it change?

One of the downsides of a hash table can be seen in the diagram above: generally there’s a bunch of empty space in it. This empty space isn’t zero cost: this is an array, meaning that it’s all one block of memory, and the empty spaces are blank memory just taking up space. There are two ways to limit this overhead: pick a clever hash function that maps the keys as densely as possible into as small a space as possible, or limit the size of each element in the array. In Python 3.6, a change was made to achieve the latter.

Looking a little deeper at how Python actually uses hash tables: firstly, the hash function isn’t as simple as I suggested above. The hash function produces a 64-bit integer, which is then used to identify an entry in the array (by using 3 bits of the integer, if we’ve got an 8-entry array). This is done because if enough data is inserted into the hash table it will have to be resized, and it’s convenient to reuse the same hash values in the new, bigger table. For the same reason, the hash values are stored in the array and not discarded as I suggested above. This means that our hash table ends up looking like this:

There are 3 items in each array entry, so our wasted space is actually 24 bytes for each empty array entry. Can we improve on this?

Yes we can. Instead of the array holding the hash value, key and value, make the array store an index into another array:

It might look like we’ve just moved the data around, but note that this new array is dense: there is no wasted space in it between elements. Each newly inserted element gets added on to the end of this array. Now we only have 8 bytes of space used per empty entry in the hash table, rather than 24. On the other hand, we’ve added 8 bytes of space per filled entry in the hash table. But this is probably a win overall for size.

We still have all the desirable hash table properties, and the only cost is that we have to follow another pointer lookup to find the value we want. But on a modern CPU following pointers is cheap provided they are stored in cache, and data is much more likely to be stored in cache if it is small. So making the data structure smaller at the cost of another pointer lookup is likely to make things faster as well as smaller.

Wait, what does this have to do with preserving order?

Usually when we make code faster we end up making it less useful, or at least less flexible. But in this case something nice falls out for free. Note that the new array the data is stored in will automatically be in insertion order. If we want to iterate over the elements of the dict, we don’t need to look at the first table, we can just walk over the insertion-order table and produce the keys in insertion order.
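The whole arrangement can be sketched in Python. Again this is a toy (fixed 8-slot index table, no collision handling, no resizing), but it shows both halves of the layout and why iteration order falls out for free:

```python
class CompactDict:
    """Toy version of the compact-dict layout (no collisions, no resizing)."""

    def __init__(self):
        self.indices = [None] * 8  # sparse table: one small index per slot
        self.entries = []          # dense array of (hash, key, value) tuples

    def __setitem__(self, key, value):
        h = hash(key)
        slot = h % 8
        idx = self.indices[slot]
        if idx is not None and self.entries[idx][1] == key:
            # Existing key: overwrite in place, keeping its position.
            self.entries[idx] = (h, key, value)
        else:
            # New key: append to the dense array and record where it went.
            self.indices[slot] = len(self.entries)
            self.entries.append((h, key, value))

    def __getitem__(self, key):
        idx = self.indices[hash(key) % 8]
        if idx is not None and self.entries[idx][1] == key:
            return self.entries[idx][2]
        raise KeyError(key)

    def __iter__(self):
        # Iteration walks the dense array: insertion order for free.
        return (key for _, key, _ in self.entries)
```

The sparse `indices` table costs one small integer per empty slot, while the full (hash, key, value) triples live packed together in `entries`.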

For more information see this thread from python-dev way back in 2012 when the change was first proposed.

Does this mean that collections.OrderedDict is now obsolete?

Good question. Actually, no. The thing with an OrderedDict is that ordering is assumed to have meaning over and above the values stored. The most obvious way this comes into play is in comparison between dictionaries: if you compare two OrderedDict instances for equality, they only compare equal if the keys are the same and the insertion order is the same. For regular dict we don’t want this to be the case: two dicts with different insertion order but the same keys and values are the same dict. Even if we did want to change this we couldn’t, because it would break existing code.
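You can see the difference directly:

```python
from collections import OrderedDict

a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])
print(a == b)  # False: insertion order is part of an OrderedDict's identity

c = {"x": 1, "y": 2}
d = {"y": 2, "x": 1}
print(c == d)  # True: plain dicts compare by keys and values only
```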

It might be that dict and OrderedDict can share all their implementation internals and just have a flag set for “ordering is significant for equality test” or whatever. I’ll admit I haven’t looked into this.

Can I rely on this behaviour?

There’s a big difference between behaviour that happens on a certain version of Python and behaviour you can rely on in your code. As the docs quoted above say, as of Python 3.7 this is an official specification and not just an implementation detail. For more information about why this is specified, see this post from Guido.

Length hints in Python

I was grubbing about trying to understand some library code the other day (I forget which one) and I came across a double-underscore method I’d never seen before:

    def __length_hint__(self):
        return ... # Some internals here.

Since this is a double-underscore method, it can’t be (or shouldn’t be) something the library invented. It must have significance to the Python runtime. A quick search of the manual shows that this is the protocol by which a class supports the operator.length_hint function (duh). But of course I’ve not seen that function either.

length_hint is slightly unusual. Most of the functions in operator correspond to built-in functions or operators accessed via special syntax (like +, == etc.). These are always available regardless of what namespaces you have imported, but are also available as properly namespaced functions in operator, which can be useful if you want to store them in variables or pass them as function arguments. However, length_hint is relatively new and thus doesn’t have any corresponding built-in function.

Anyway, length_hint is defined to return “an estimated length for the object”. It’s actually defined in PEP 424, which makes it clear that the return value of length_hint may be wrong, and that the object may give up entirely (the __length_hint__ method is allowed to return NotImplemented). What’s the use of a function that doesn’t return the right value, or even any value at all?
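Here’s how the pieces fit together, using a made-up iterator (CountdownIter is my own name, not anything from the standard library):

```python
import operator

class CountdownIter:
    """Hypothetical iterator that can estimate how many items remain."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n

    def __length_hint__(self):
        return self.n  # only an estimate; it's allowed to be wrong

print(operator.length_hint(CountdownIter(5)))  # 5
print(operator.length_hint(object(), 11))      # no len, no hint: default, 11
```

operator.length_hint takes a default as its second argument, which is what you get back when the object offers neither a real length nor a hint.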

A simple example

This is just a toy example. Nothing actually works the way I’m going to describe it, but hopefully it will give you the idea of what’s going on.

Let’s play the role of the Python runtime for a bit. Assume we have some empty memory:

The blocks in this diagram could be bytes or multi-byte blocks or memory pages or whatever, it doesn’t really matter.

We want to create a list of prime numbers that we’ve read in from a file. We create the list called primes and load the first three numbers from the file:

Just then, we take a break from this process and need to create another object. Perhaps it’s a log message? One of those kinds of things that happens when you’re in the middle of doing something else.

Having done that, we go back to filling our list with prime numbers from the file. But you can see the problem we run into: if there’s more than one more prime number in the file, we end up crashing into the other block of memory.

As you can imagine, this kind of thing happens all the time. Software would get nowhere without a way to deal with this. One of the objects will have to be moved, and that means copying memory around, which wastes time.

It would be so much better if we could tell ahead of time how big our list needed to be. If we did that, we could pre-allocate enough space for the whole list of prime numbers (we’re reading it in from a file, so it must be finite).

This is where it could be useful to know the approximate length of what we’re dealing with. For a file, we can get the file size (in bytes) quickly without having to read every byte of the file. This doesn’t tell us exactly how many numbers are in the file, because different numbers may take up different amounts of space. But if we assume 3 bytes per number (2 characters plus a newline for each one) we can divide the length by 3 and make a guess.
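As a rough sketch of that guess (estimated_count is a made-up helper, assuming two-digit numbers, i.e. 3 bytes per line):

```python
import os
import tempfile

def estimated_count(path, bytes_per_line=3):
    # Guess the number of lines from the file size alone, without
    # reading the file. Cheap, but only an estimate.
    return os.path.getsize(path) // bytes_per_line

# A file of two-digit primes, one per line (2 digits + newline = 3 bytes):
with tempfile.NamedTemporaryFile("w", newline="\n", delete=False) as f:
    f.write("11\n13\n17\n19\n")
    path = f.name

print(estimated_count(path))  # 4
os.remove(path)
```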

Given our guess, we can pre-allocate enough space for our list and put the log_message item in memory somewhere where we won’t end up having to move it.

Of course, the crude method of dividing the file length by 3 might fail, and we end up not leaving enough space. But that leaves us no worse than if we hadn’t used the length hint and just dropped the log message wherever we felt like it.

Again, none of this is how things actually work, but it’s an idea of why this might be useful.

This worked because we knew we were using a file, and we know things about files (they have a length known without reading the whole file, they include line ending characters etc.). But as good programmers, particularly good Python programmers, we don’t want to have our code making too many assumptions. We want a way to get a length hint from anything: a file, a TCP socket, a list, a generator, or even a custom class that hadn’t even been written when you first wrote your code.

How length hints are actually used

It’s easy to miss the use of length hints in the Python runtime code. Length hints aren’t used in the parts of the code written in Python, only by the C code. If you simply grep the source for __length_hint__ you’ll find it being mentioned a lot, but virtually all the references are the various implementations for built-in types (list, tuple etc.) and don’t show it in use.

Where length hints are used by the CPython runtime C code, they go via the wrapper function PyObject_LengthHint (in Objects/abstract.c). This includes the following code:

Py_ssize_t
PyObject_LengthHint(PyObject *o, Py_ssize_t defaultvalue)
{
    PyObject *hint, *result;
    Py_ssize_t res;
    if (_PyObject_HasLen(o)) {
        res = PyObject_Length(o);
        if (res < 0 && PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(PyExc_TypeError)) {
                return -1;
            }
            PyErr_Clear();
        }
        else {
            return res;
        }
    }
    hint = _PyObject_LookupSpecial(o, &PyId___length_hint__);
    if (hint == NULL) {
        if (PyErr_Occurred()) {
            return -1;
        }
        return defaultvalue;
    }
    result = PyObject_CallFunctionObjArgs(hint, NULL);
    Py_DECREF(hint);
    if (result == NULL) {
        if (PyErr_ExceptionMatches(PyExc_TypeError)) {
            PyErr_Clear();
            return defaultvalue;
        }
        return -1;
    }
    else if (result == Py_NotImplemented) {
        Py_DECREF(result);
        return defaultvalue;
    }
    // ...
}

First it checks whether the object supports the standard len() protocol, in which case it gets the real length directly. This is the _PyObject_HasLen check. A real length will always be better than a length hint, if it is available.

Then it looks for a __length_hint__ member, bailing out if it doesn’t find one:

    hint = _PyObject_LookupSpecial(o, &PyId___length_hint__);
    if (hint == NULL) {
        if (PyErr_Occurred()) {
            return -1;
        }
        return defaultvalue;
    }

Assuming a length hint method is found, it calls the method and returns the result, falling back to the default value if the call raises TypeError or returns NotImplemented.

This length hint function is used for example in listextend in Objects/listobject.c, which is the implementation of the built-in list type. listextend reads through an iterator, adding the elements to a list. This is obviously used to implement the list.extend() method, but it’s also used when concatenating lists or when instantiating a new list (the empty list is constructed, then the extend code is called).

The length hint is used like this (some details omitted):

    /* Guess a result list size. */
    n = PyObject_LengthHint(b, 8);
    if (n < 0) {
        /* Deal with bad result */
    }
    m = Py_SIZE(self);
    if (m > PY_SSIZE_T_MAX - n) {
        /* Overflow. Ignore this. */
    }
    else {
        mn = m + n;
        /* Make room. */
        if (list_resize(self, mn) < 0)
            goto error;
        /* Make the list sane again. */
        Py_SIZE(self) = m;
    }
The key point is the call to list_resize. This resizes the list to be big enough to include its existing contents, plus the hinted size of the iterator. Assuming the hint is correct, there will be no need to resize the list as it fills up. Even if it’s slightly off, it should still require fewer resizings, saving time overall.
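And because it’s only a hint, correctness never depends on it. A deliberately wrong hint (Hinted is a made-up wrapper class) still produces the right list:

```python
class Hinted:
    """Wraps an iterator and reports a deliberately wrong length hint."""

    def __init__(self, items, hint):
        self.it = iter(items)
        self.hint = hint

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.it)

    def __length_hint__(self):
        return self.hint

# The hint only affects pre-allocation, never the result:
print(list(Hinted([1, 2, 3], hint=100)))  # [1, 2, 3]
```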

Should you bother with this?

The above shows that this really is used, and if you implement a __length_hint__ on your iterators it will save some unnecessary work at runtime. But bear in mind that this is quite an edge case. Saving 10 or 20 allocations won’t make a measurable difference. It will only matter if you’re hitting this code thousands, maybe even millions of times in the hot loop of your algorithm.

My guess is that this is worth doing for any iterator where it’s easy to write code (say 2 or 3 lines) to generate a decent hint. If it’s more than that, wait until you’ve proven that this is something your code is hitting enough times to justify the effort.

An exception to this might be if you’re writing library code to be used by lots of different people. In that case, you can’t know how your code might be used and the payoff of your efforts could be multiplied by the number of users who end up using it. But the more generic your code is, the harder it is likely to be to write a decent length hint method.