A Whirlwind Excursion through Python C Extensions

Created 16 February 2009, last updated 28 January 2021

This is a presentation for PyCon 2009 in Chicago. A video of me presenting it is available on pyvideo.org or at archive.org.

Python can be extended with extensions written in C. It’s a complex topic, so this will be a quick 45-minute introduction.

The C API
The Hello World extension
API details
Error handling
Memory management
Making a type
In closing
See also

The examples here are toys, meant to demonstrate the structure of an extension. They are real running code, but they don’t do anything useful. They’ll demonstrate the workings of the C API and modules written with it. I’m assuming you’ll be able to provide your own domain-specific inner workings.

I’m assuming you know Python, and that you know C, at least well enough to follow along.

The code samples are available in whirlext.zip.

Why write in C when we have lovely Python?

To speed up parts of your Python app
To interface with existing C code
To add Python to your C application
To contribute to Python

There are other tools that solve similar problems:

The C API

The C API is actually the public interface to the implementation of CPython. It’s the same API used internally to build Python itself. It’s large, with over 600 entry points covering all sorts of functionality.

Because it’s the core of CPython, it doesn’t apply to other implementations of Python: Jython, IronPython, PyPy, etc. Ironclad is a project to provide for C extensions in IronPython, but I have no experience with it.

One amazing advantage to writing against the same C API as the core Python developers is that their code can be your learning sample. Want to do something similar to a built-in function? Go find its source and learn how it was done.

Writing a C API extension means working in two worlds at once. The C environment you’re coding in is missing many of the niceties you’re used to in Python, and at the same time, you’re writing code that provides those niceties to your callers.

You have to check function return codes, and then convert them into exceptions.
You manage memory with malloc and free, but have to properly update reference counts so the Python environment doesn’t leak memory.
C has no introspection, so you have to explicitly re-describe what you’ve already written.

The Hello World extension

The module we’ll implement in C is equivalent to a very small piece of Python code. It contains a single function which simply returns the string “hello, world!”. Both the module and the function have doc strings.
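A minimal Python sketch of such a module, reconstructed from the description here and from the strings used in the C version that follows (the file name ext1.py is an assumption, and the syntax is Python 3):

```python
"""A minimal module."""

# ext1.py: hypothetical pure-Python counterpart of the C extension.

def hello_world():
    """Say hello."""
    return "hello, world!"
```

The module doc string, the function name, and the function doc string each reappear as a C string literal in the extension source.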

This is the complete C code for the Hello World extension:

// ext1.c: A sample C extension: one simple function

#include "Python.h"

// hello_world function.

static PyObject *
hello_world(PyObject *self, PyObject *args)
{
    return Py_BuildValue("s", "hello, world!");
}

// Module functions table.

static PyMethodDef
module_functions[] = {
    { "hello_world", hello_world, METH_VARARGS, "Say hello." },
    { NULL }
};

// This function is called to initialize the module.

void
initext1(void)
{
    Py_InitModule3("ext1", module_functions, "A minimal module.");
}

The file starts by including Python.h. This pulls in all of the definitions needed for using the C API, as well as a few standard header files.

Next comes the hello_world function itself, which will actually do the work of the extension. The signature of the function is typical for the C API. There are a few different ways to invoke C functions from Python, but this signature is the most common: taking two PyObject pointers, and returning another one.

The hello_world function is very simple: it just returns a constant Python string. We use the function Py_BuildValue to create a Python string from a C string. In this case, the C string is a literal, but any C string can be used. Py_BuildValue’s first argument is a format specifier that indicates how to interpret the rest of the arguments, similar to how sprintf works. In this case, the format spec is simply “s”, meaning the argument is a C string to be turned into a Python string.

The hello_world function is defined to return a PyObject*, so it returns the object created by Py_BuildValue. Even functions that return nothing must explicitly return a None value. Returning NULL indicates that an exception occurred, which we’ll get to later.

Because C has no introspection or reflection facilities, just defining the hello_world function isn’t enough for us to be able to use it by name. Next comes an array of PyMethodDef structures which will define the contents of the module. Each structure specifies a function, providing the Python name, the C implementation function, flags indicating how the function should be called, and a doc string.

In our case we have only one function. We’ve named the C function hello_world, the same as its Python name, but the connection between the two is made by the structure associating the hello_world C function with the Python name “hello_world”, not the identical names.

The flags for hello_world are METH_VARARGS which tells Python how to invoke the C function. Last comes the doc string for the function, as a standard C string. The array is terminated by a sentinel structure with a NULL name pointer, a common C idiom.

The last function defined here is initext1, and it’s the only symbol exported from this file (the others are declared static). This function is executed when the module is imported, and its name is important. It must be named initMOD where MOD is the name of the module, otherwise Python won’t be able to find it in the executable library.

Our initext1 function only does one thing: initialize the module by calling Py_InitModule3 with three arguments: the name of the module, the table of function definitions for the module’s contents, and a doc string for the module.

This simple extension shows the typical structure of a C API extension:

Create the meat of your extension as C constructs: functions, structures, etc.
Describe those C constructs in arrays of structures.
Use C API functions to create the Python constructs.

Building the extension is easy: setup.py knows how to do it with a simple declarative statement. All we have to do is tell distutils about our extension: what it is called and what source files comprise it. Distutils knows how to do the rest, producing a .pyd file on Windows, or a .so on Linux.

On Windows, it may take some work to get a compiler installed properly.

Once built and installed, the module works like any other Python module: it can be imported, and the function can really be called. Notice that hello_world’s type is “built-in function”. Your code really is no different from a truly built-in function: both are written with the same C API, and called in the same way. To the Python interpreter, your hello_world is the same as, say, len.
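The comparison with len can be checked without building anything. A quick sketch, assuming CPython 3, where the type’s name is spelled builtin_function_or_method; a compiled hello_world should report the same type:

```python
import types

# len is implemented in C inside CPython, just as an extension function is.
print(type(len).__name__)

# types.BuiltinFunctionType is the standard-library name for that type.
is_builtin = isinstance(len, types.BuiltinFunctionType)

def py_func():
    """A function written in Python, for contrast."""

is_py_builtin = isinstance(py_func, types.BuiltinFunctionType)
```

Here is_builtin is true and is_py_builtin is false: the distinction Python draws is between functions implemented in C and functions implemented in Python, not between the core and third-party code.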

API details

The C API is fairly consistent in its conventions, but there are a few of them, so read the docs to be sure you know how each function works. The docs are good, and say what will happen.

The C API provides hundreds of entry points.

Each built-in type has a set of C calls that implement the operations particular to the type. They’ll look familiar to you from working with the types in Python. In some cases, the operations may be made available in slightly different forms, such as PyDict_SetItem, which uses a PyObject as a key, and PyDict_SetItemString, which uses a C string as a key. The latter is provided because using strings as keys is so common, it is special-cased for the caller and in the dictionary implementation.

The C API also provides polymorphic functions that access objects based on what they do rather than what they are.

And on and on, covering all of the built-in functionality of the Python environment.

We used Py_BuildValue to create the Python “hello world” string. It can make many other Python data structures. The format string can include punctuation that creates tuples, lists, and dictionaries, including nesting:

Py_BuildValue("s", "x") --> "x"

Py_BuildValue("i", 17) --> 17

Py_BuildValue("(isi)", 17, "x", 23) --> (17, "x", 23)

Py_BuildValue("{si,si}", "x", 17, "y", 2) --> {"x":17, "y":2}

Py_BuildValue("{si,s(ii)}", "x", 17, "y", 2, 3) -->
                                            {"x":17, "y":(2,3)}

Py_BuildValue("") --> None

You’ll use Py_BuildValue quite a bit to create Python data to return from your extension.

Just as Py_BuildValue makes it easy to combine C values into a Python structure, PyArg_ParseTuple makes it simple to parse apart a tuple into a number of C variables. The args argument to our C functions contains a tuple of the arguments to the function call. We pass it to PyArg_ParseTuple along with a format string indicating what types we expect. PyArg_ParseTuple works like sscanf, interpreting the format string, and assigning values to the variables in the rest of its arguments.

Here we get a C string and a C integer from the Python values passed in. Just as with sscanf, the arguments after the format string must be addresses of variables that will be assigned values:

static PyObject *
string_peek(PyObject *self, PyObject *args)
{
   const char *pstr;
   int indx;

   if (!PyArg_ParseTuple(args, "si:string_peek", &pstr, &indx)) {
      return NULL;
   }

   int char_value = pstr[indx];

   return Py_BuildValue("i", char_value);
}

In this code, we can use typical C pointer arithmetic to get the index’th character from the string, and then use Py_BuildValue to return it as a Python integer.

>>> string_peek("Whirlwind", 5)
119

Error handling

If you try passing incorrect arguments to our string_peek function, you’ll see it behaves as you would expect a Python function to, raising exceptions about incorrect types and number of arguments:

>>> string_peek("Whirlwind")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string_peek() takes exactly 2 arguments (1 given)

>>> string_peek("Whirlwind", "0")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required

>>> string_peek("Whirlwind", 2000)
88

We get all this for free from PyArg_ParseTuple: all we have to do is pass along the errors it detects.

If the args tuple doesn’t consist of a string and an integer, PyArg_ParseTuple will set an error state, and return false. The “:string_peek” portion of the format string tells PyArg_ParseTuple what function name to use in its error messages.

If PyArg_ParseTuple returns false, we know that the arguments weren’t proper, and we simply return NULL from the function. This indicates to Python that an exception occurred, and it will raise it in the Python environment.

The error state is global — once set by PyArg_ParseTuple, all you have to do is return NULL, and the Python interpreter will raise the error as an exception in the calling Python code.

In Python, the norm is to not catch exceptions, and to let called functions’ exceptions pass through your code. In C API code, the same rule is true, but it is implemented by always checking return codes: if a called function returns false or NULL, then you should return NULL to pass the error up the stack.

There’s still a problem with our string_peek function: we can ask for the 2000’th character of a nine-character string. Our C code happily reads the contents of memory far outside the actual string it’s supposed to be working with.

We can check the index value to see that it’s valid for the string, then raise our own exception if it is not. To raise an exception in the C API, call PyErr_SetString to set the error state, and return NULL to indicate to Python that an exception occurred. PyErr_SetString takes two arguments: the exception type, and a string message for the exception.

Once we’ve done that, out-of-range arguments to string_peek will raise Python exceptions as you’d expect:

>>> string_peek("Whirlwind", 2000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: peek index out of range
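
The revised C function is not reproduced in this text. As a sketch of the logic only, here is a Python 3 model of the check. In the C version, the raise statement would correspond to calling PyErr_SetString with PyExc_IndexError and the message, then returning NULL. The name string_peek_model and the decision to reject negative indexes are assumptions:

```python
def string_peek_model(s, indx):
    """Python model of a bounds-checked string_peek (hypothetical name)."""
    # Assumption: negative indexes are rejected too, because C has no
    # negative indexing and pstr[-1] would read memory before the string.
    if indx < 0 or indx >= len(s):
        raise IndexError("peek index out of range")
    # ord() gives the character code, matching pstr[indx] for ASCII text.
    return ord(s[indx])
```

In C the string length needed for the comparison could come from strlen(pstr), since the “s” format yields a NUL-terminated string.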

Memory management

Python’s memory management is based on reference counting. Every Python object has a count of the number of references to the object. When the count becomes zero, the object can be destroyed and its memory reclaimed.
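
Reference counts can be observed from Python itself. The following is a small sketch that assumes CPython 3 (other implementations manage memory differently); the class name Thing is made up for the illustration:

```python
import sys
import weakref

class Thing:
    """A plain object that supports weak references."""

thing = Thing()
watcher = weakref.ref(thing)    # a weak reference does not add to the count

before = sys.getrefcount(thing)
holder = [thing]                # the list now holds one more reference
after = sys.getrefcount(thing)

del holder                      # drop the list's reference
del thing                       # drop the last strong reference
destroyed = watcher() is None   # CPython reclaims the object right away
```

The absolute numbers from sys.getrefcount are higher than you might expect, because the call itself holds a temporary reference. The difference between before and after is what matters.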

In the C API, Python objects are instances of PyObject, and references to them are PyObject pointers. When using PyObject pointers, you have to manipulate reference counts properly, or your extension will leak memory or crash.

Every PyObject pointer is either owned or borrowed. They are both pointers, used like any other C pointer. But an owned reference means you are responsible for correctly disposing of the reference. Remember, objects are not owned; they are all shared. It’s references to objects that are owned or borrowed.

A borrowed reference means you can identify some other piece of code that owns the reference, because that code’s interest in the object started before yours, and will end after yours. For example, a caller must have a reference to the args it passes into a called function, so arguments are almost always borrowed.

If you get it wrong, the object will be freed out from under you: crash!

There are two ways to get an owned reference:

Accept a return value from a C function that returns a PyObject pointer. These are documented as returning a “new reference”. Most C API functions that return PyObject pointers return a new reference, but some return borrowed references. Read the docs carefully.
Use Py_INCREF on a borrowed PyObject pointer you already have. This increments the reference count on the object, and obligates you to dispose of it properly.

Once you have an owned reference, you have to get rid of it properly. The three ways are:

Return it to the caller of your function. This transfers the ownership from you to your caller. Now they have an owned reference.
Use Py_DECREF on it. This decrements the reference count.
Store it with PyTuple_SetItem() or PyList_SetItem(), which are unusual among C API functions: they steal ownership of their item argument.

Let’s look at a real code example:

// def insert_powers(numbers, n):
//    powers = (n, n*n, n*n*n)
//    numbers[n] = powers
//    return powers

static PyObject *
insert_powers1(PyObject *self, PyObject *args)
{
   PyObject *numbers;
   int n;

   if (!PyArg_ParseTuple(args, "Oi", &numbers, &n)) {
      return NULL;
   }

   PyObject *powers = Py_BuildValue("(iii)", n, n*n, n*n*n);

   // Equivalent to Python: numbers[n] = powers
   if (PySequence_SetItem(numbers, n, powers) < 0) {
      return NULL;
   }

   return powers;
}

In this code, we have four PyObject pointers:

self is borrowed, as our argument values always are. The caller (or one of his callers) must own a reference to this value in order to call us with it, so we can borrow it from him.
Similarly for args.
numbers is an object we pull out of the args with PyArg_ParseTuple. Since args is borrowed, we can borrow values out of it, so numbers is also borrowed.
powers is our first owned reference. We create this tuple with the Py_BuildValue call, which returns a new reference, so it is now our responsibility.

We have to make sure that we properly dispose of our owned reference, which happens at the final return statement, when we return it as the value of the function, passing the ownership to our caller.

But there is a memory leak in this code: if PySequence_SetItem fails, we properly return NULL from the function to indicate the problem, but in that code path, we haven’t disposed of our owned reference, powers.

Fixing the memory leak is simple: before returning NULL from the function, dispose of the owned reference explicitly with Py_DECREF. Now every path through the function properly handles the owned references, and our function is complete:

// def insert_powers(numbers, n):
//    powers = (n, n*n, n*n*n)
//    numbers[n] = powers
//    return powers

static PyObject *
insert_powers2(PyObject *self, PyObject *args)
{
   PyObject *numbers;
   int n;

   if (!PyArg_ParseTuple(args, "Oi", &numbers, &n)) {
      return NULL;
   }

   PyObject *powers = Py_BuildValue("(iii)", n, n*n, n*n*n);

   if (PySequence_SetItem(numbers, n, powers) < 0) {
      // Because we won't return powers, we have to discard it.
      Py_DECREF(powers);
      return NULL;
   }

   return powers;
}
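
The Python function from the comment can be run directly to see the two outcomes the C code has to handle. A sketch in Python 3, showing the successful store and the failing store that corresponds to the Py_DECREF branch in insert_powers2:

```python
def insert_powers(numbers, n):
    powers = (n, n*n, n*n*n)
    numbers[n] = powers
    return powers

numbers = [0] * 5
result = insert_powers(numbers, 2)   # numbers[2] now holds (2, 4, 8)

try:
    insert_powers(numbers, 10)       # a 5-item list has no index 10
    failed = False
except IndexError:
    failed = True                    # in C: PySequence_SetItem returned -1
```

In Python the abandoned tuple is cleaned up automatically when the exception propagates. In C, that cleanup is the explicit Py_DECREF before returning NULL.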

Real-world C functions of course can be much more elaborate than this one, and analyzing all the code paths can be complex. One way to simplify the problem is to organize your code so that all returns are at the end of the function, and resource cleanup is all in one place at the end also:

    // ..init vars..
    int ok = 0;
    PyObject * retval = NULL;
    PyObject * something = NULL;

    // ..do all the work, using goto to jump to the
    //   end on error...
    Blah(); Blah();
    if (error) goto done;
    Etc(); Etc();
    if (error) goto done;

    // final step: build the return value, and set ok=1.
    retval = Py_BuildValue("");
    ok = 1;

    done:

    // ..clean up resources..
    Py_XDECREF(something);

    return retval;

Your function’s resources may need more complex logic to get all the owned resources released properly, and you may not like the idea of using goto. But organizing code this way will make the code flow clearer, and you can consolidate all your resource tear-down code in one place, making it easier to be sure you have it right.

However you organize your code, keeping track of owned references is an extra burden for you as you write your extension, but it is extremely important to get it right.

Making a type

Making a type is more involved than simply making functions, but it has a similar flavor: write C components, describe them in arrays, and use the arrays to create Python components.

The storage for your type is a C struct. Its fields will be your type’s data:

// The CountDict type.

typedef struct {
   PyObject_HEAD
   PyObject * dict;
   int count;
} CountDict;

The first thing in the struct must be PyObject_HEAD, with no semicolon. This is a macro that creates the initial fields in the structure. This is what makes your structure usable as a PyObject.

The rest of the structure can be whatever data you need to support your code. PyObject * pointers are very useful for holding Python objects as data, but remember they are almost certainly owned references, since you can hold those values across function calls. As with all owned references, you have to be careful to acquire and release them properly.

When writing a class in Python, special methods have special names, like __init__. When creating a type in C, those special methods are ordinary C functions with particular signatures that will be specified as part of the type definition. Often, these functions are named systematically with the type name and method name, but as with our earlier C functions, the name really doesn’t matter: a pointer to the function will be used to associate it with its role.

Each type has an init function, which is the analog of __init__: it initializes the data members:

static int
CountDict_init(CountDict *self, PyObject *args, PyObject *kwds)
{
   self->dict = PyDict_New();
   if (self->dict == NULL) {
      // PyDict_New has already set the error state.
      return -1;
   }
   self->count = 0;
   return 0;
}

Unlike a Python class, a C type needs an explicit deallocation function. Here you should dispose of your owned references, and finally call the type’s tp_free function to release the memory of the object itself.

static void
CountDict_dealloc(CountDict *self)
{
   Py_XDECREF(self->dict);
   self->ob_type->tp_free((PyObject*)self);
}

You can decide which of your struct’s fields to make available as Python data attributes, if any. An array of structures defines the attributes:

static PyMemberDef
CountDict_members[] = {
   { "dict",   T_OBJECT, offsetof(CountDict, dict), 0,
               "The dictionary of values collected so far." },

   { "count",  T_INT,    offsetof(CountDict, count), 0,
               "The number of times set() has been called." },

   { NULL }
};

Each PyMemberDef structure specifies the Python attribute name, the C type of the field, the offset into the structure (with the handy offsetof macro), some flags, and a docstring for the attribute. The array will be used later in the type definition.
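
To make the offset idea concrete without compiling anything, the struct layout can be modeled with the standard ctypes module. This is only an illustration: it assumes that PyObject_HEAD expands to a reference count followed by a type pointer, as in a regular non-debug CPython build, and the class name CountDictModel is hypothetical:

```python
import ctypes

class CountDictModel(ctypes.Structure):
    """ctypes model of the CountDict struct (layout is an assumption)."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # first half of PyObject_HEAD
        ("ob_type", ctypes.c_void_p),     # second half of PyObject_HEAD
        ("dict", ctypes.py_object),       # PyObject * dict;
        ("count", ctypes.c_int),          # int count;
    ]

# Each field descriptor reports its byte offset, the same number that
# offsetof(CountDict, dict) and offsetof(CountDict, count) produce in C.
dict_offset = CountDictModel.dict.offset
count_offset = CountDictModel.count.offset
```

The member table gives Python these offsets and the field types, which is all it needs to read or write the attribute directly in the object’s memory.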

Methods are defined just like functions. That strange self argument we had on our functions earlier now makes sense: we can declare it to be our struct type, and use it to access our data fields:

static PyObject *
CountDict_set(CountDict *self, PyObject *args)
{
   const char *key;
   PyObject *value;

   if (!PyArg_ParseTuple(args, "sO:set", &key, &value)) {
      return NULL;
   }

   if (PyDict_SetItemString(self->dict, key, value) < 0) {
      return NULL;
   }

   self->count++;

   return Py_BuildValue("i", self->count);
}

Methods are declared just like functions, in an array of structs providing the name, C function pointer, flags, and docstring for the method:

static PyMethodDef
CountDict_methods[] = {
   { "set",    (PyCFunction) CountDict_set, METH_VARARGS,
               "Set a key and increment the count." },
   // typically there would be more here...

   { NULL }
};

Now we are ready to pull all our pieces together. Types are defined by initializing a PyTypeObject struct. This struct has fields for each of the special functions needed to provide the behavior of a type. Where in Python we’d have specially named functions like __init__ and __hash__, in C we have members in the PyTypeObject struct pointing to the C function implementing the functionality. Other fields in the struct get pointers to the arrays of structs defining the methods, properties, and attributes:

static PyTypeObject
CountDictType = {
   PyObject_HEAD_INIT(NULL)
   0,                         /* ob_size */
   "CountDict",               /* tp_name */
   sizeof(CountDict),         /* tp_basicsize */
   0,                         /* tp_itemsize */
   (destructor)CountDict_dealloc, /* tp_dealloc */
   0,                         /* tp_print */
   0,                         /* tp_getattr */
   0,                         /* tp_setattr */
   0,                         /* tp_compare */
   0,                         /* tp_repr */
   0,                         /* tp_as_number */
   0,                         /* tp_as_sequence */
   0,                         /* tp_as_mapping */
   0,                         /* tp_hash */
   0,                         /* tp_call */
   0,                         /* tp_str */
   0,                         /* tp_getattro */
   0,                         /* tp_setattro */
   0,                         /* tp_as_buffer */
   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags*/
   "CountDict object",        /* tp_doc */
   0,                         /* tp_traverse */
   0,                         /* tp_clear */
   0,                         /* tp_richcompare */
   0,                         /* tp_weaklistoffset */
   0,                         /* tp_iter */
   0,                         /* tp_iternext */
   CountDict_methods,         /* tp_methods */
   CountDict_members,         /* tp_members */
   0,                         /* tp_getset */
   0,                         /* tp_base */
   0,                         /* tp_dict */
   0,                         /* tp_descr_get */
   0,                         /* tp_descr_set */
   0,                         /* tp_dictoffset */
   (initproc)CountDict_init,  /* tp_init */
   0,                         /* tp_alloc */
   0,                         /* tp_new */
};

The good news is that most of these fields can be left as zero. Just as in Python, you only have to implement the special functions you need to override.

Finally we are ready to actually create the type. Once the module is initialized, we can init some slots in the type that can’t be done with the struct initializer, then call PyType_Ready to finish up the creation of the type:

void
initext3(void)
{
   PyObject* mod;

   // Create the module
   mod = Py_InitModule3("ext3", NULL, "An extension with a type.");
   if (mod == NULL) {
      return;
   }

   // Fill in some slots in the type, and make it ready
   CountDictType.tp_new = PyType_GenericNew;
   if (PyType_Ready(&CountDictType) < 0) {
      return;
   }

   // Add the type to the module.
   Py_INCREF(&CountDictType);
   PyModule_AddObject(mod, "CountDict", (PyObject*)&CountDictType);
}

PyType_Ready performs bookkeeping and other initialization to prepare the type for use, including hooking up the hierarchy for inheritance, and so on. Finally, PyModule_AddObject is used to assign the type to its name in the module, and we are done.

As you’d expect, our CountDict type works like other built-in types:

>>> import ext3
>>> cdict = ext3.CountDict()
>>> cdict
<CountDict object at 0x0099F0B0>

>>> cdict.set("a", "hello")
1

>>> cdict.set("b", "world")
2

>>> cdict.dict
{'a': 'hello', 'b': 'world'}

>>> cdict.count
2

We can construct it, examine it, call its methods, and use its attributes, just like a Python type.
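
For comparison, here is a pure-Python class with the same visible behavior. It is a sketch in Python 3 that reuses the doc strings from the C tables; it is not part of the extension:

```python
class CountDict:
    """CountDict object"""

    def __init__(self):
        self.dict = {}    # the values collected so far
        self.count = 0    # the number of times set() has been called

    def set(self, key, value):
        """Set a key and increment the count."""
        self.dict[key] = value
        self.count += 1
        return self.count

cdict = CountDict()
first = cdict.set("a", "hello")
second = cdict.set("b", "world")
```

Each piece maps onto something in the C version: __init__ is the tp_init slot, the two attributes are the PyMemberDef entries, and set is the single PyMethodDef entry.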

In closing

These are other topics that we can’t cover here...