NumPy arrays with pre-allocated memory

Sep 09 2008 Published by under NumPy, Open Source, Python, SciPy

A common need whenever NumPy is used to mediate the Python level access to another library
is to wrap the memory that the library creates using its own allocator into a NumPy array. This allows easy Python-side manipulation of the data already available without requiring an un-necessary copy. Fundamentally this is easy to do using PyArray_SimpleNewFromData. The C-level calling syntax is

int nd;
npy_intp *dims
void *data;

arr = PyArray_SimpleNewFromData(nd, dims, typenum, data);

In this code block, nd is the number of dimensions, dims is a C-array of integers describing the number of elements in each dimension of the array, typenum is the simple data-type of the NumPy array (e.g. NPY_DOUBLE), and data is the pointer to the memory that has been previously allocated.

By default, the memory for the NumPy array will be interpreted as a C-ordered contiguous array. If you need more control over the data-type or the striding of the array, then you can also use PyArray_NewFromDescr.

This is the simple part and code like this has been possible in Numeric for more than a decade. The tricky part, however, is memory management. How does the memory get deallocated? The suggestions have always been something similar to “make sure the memory doesn’t get deallocated before the NumPy
array disappears.” This is nice advice, but not generally helpful as it basically just tells you to create a memory leak.

All that NumPy does internally is to un-set a flag on the array object indicating that the array doesn’t own its memory pointer and so NumPy won’t free the memory when the last reference to the array disappears.

The key to managing memory correctly is to recognize that every NumPy array that doesn’t own its own memory can also point to a “base” object from which it obtained the memory. This base object is usually another NumPy array or an object exposing the buffer protocol — but it can be any object (even one we create on the fly). This object is DECREF’d when the NumPy array is deallocated and if the NumPy array contains the only reference to the object, then it will also be deallocated when the NumPy array is deallocated.

Thus, a good way to manage memory from another allocator is to create an instance of a new Python type. You then store the pointer to the memory (and anything else you may need to call the deallocator correctly) in the instance. Finally, you call the deallocator in the tp_dealloc function of the new Python type you’ve created. Then, you point the base member of your new NumPy array to the new object you’ve created.

The concept is relatively simple, but there are enough moving parts that an example is probably useful. Let’s say I want to create an extension module that only uses NumPy arrays allocated on 16-byte boundaries (maybe I’m experimenting with some SIMD instructions). I want to use arrays whose data is allocated using the aligned allocator defined below (borrowed from a patch to NumPy by David Cournapeau):

#include
#define uintptr_t size_t

#define _NOT_POWER_OF_TWO(n) (((n) & ((n) – 1)))
#define _UI(p) ((uintptr_t) (p))
#define _CP(p) ((char *) p)

#define _PTR_ALIGN(p0, alignment) \
((void *) (((_UI(p0) + (alignment + sizeof(void*))) \
& (~_UI(alignment – 1)))))

/* pointer must sometimes be aligned; assume sizeof(void*) is a power of two */
#define _ORIG_PTR(p) (*(((void **) (_UI(p) & (~_UI(sizeof(void*) – 1)))) – 1))

static void *_aligned_malloc(size_t size, size_t alignment)
{
void *p0, *p;

if (_NOT_POWER_OF_TWO(alignment)) {
errno = EINVAL;
return ((void *) 0);
}
if (size == 0) {
return ((void *) 0);
}
if (alignment < sizeof(void *)) {
alignment = sizeof(void *);
}

/* including the extra sizeof(void*) is overkill on a 32-bit
machine, since malloc is already 8-byte aligned, as long
as we enforce alignment >= 8 …but oh well */

p0 = malloc(size + (alignment + sizeof(void *)));
if (!p0) {
return ((void *) 0);
}
p = _PTR_ALIGN(p0, alignment);
_ORIG_PTR(p) = p0;
return p;
}

static void _aligned_free(void *memblock)
{
if (memblock) {
free(NPY_ALIGNED_ORIG_PTR(memblock));
}
}

Now, to create arrays using this allocator we just need to allocate the needed memory and use SimpleNewFromData. Then we create a new object encapsulating the deallocation and set this as the base object of the ndarray.

int nd=2
npy_intp dims[2]={10,20};
size_t size;
PyObject newobj, arr=NULL;
void *mymem;

size = PyArray_MultiplyList(dims, nd);
mymem = _aligned_malloc(size, 16);
arr = PyArray_SimpleNewFromData(nd, dims, NPY_DOUBLE, mymem);
if (arr == NULL) goto fail;
newobj = PyObject_New(_MyDeallocObject, &_MyDealloc_Type);
if (newobj == NULL) goto fail;
((_MyDeallocObject *)newobj)->memory = mymem;
PyArray_BASE(arr) = newobj;

fail:
_aligned_free(size);
Py_XDECREF(arr);

Now, all that is missing is the code to create the new Python Type. That code is

typedef struct {
PyObject_HEAD
void *memory;
} _MyDeallocObject;

static void
_mydealloc_dealloc(_MyDeallocObject *self)
{
_aligned_free(self->memory);
self->ob_type->tp_free((PyObject *)self);
}

static PyTypeObject _myDeallocType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
“mydeallocator”, /*tp_name*/
sizeof(_MyDeallocObject), /*tp_basicsize*/
0, /*tp_itemsize*/
_mydealloc_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT, /*tp_flags*/
“Internal deallocator object”, /* tp_doc */
};

Don’t forget to add the following to the extension type initialization module in order to initialize the new Python type that has been created.

_MyDeallocType.tp_new = PyType_GenericNew;
if (PyType_Ready(&_MyDeallocType) < 0)
return;

This simple pattern should allow you to seamlessly integrate NumPy arrays with all kinds of memory allocation strategies. I think this pattern is common enough that we should probably add something to NumPy itself to make it easier to do this sort of thing in a few lines of code. Perhaps a new C-API call is justified with a new internal Python type that allows different allocators and deallocators to be used. Subscribe and post to the numpy-discussion@scipy.org list if you are interested in staying tuned.

11 responses so far

  • [...] code that has it’s own memory management scheme. Comments and feedback is welcome. The post is http://blog.enthought.com/?p=62 Best regards, -Travis [...]

  • avatar DSW says:

    Whew, that looks like a mess of code! I’m wondering if it’s okay to malloc the memory, set a bit array.flags |= NPY_OWNDATA, and that’s all. Doesn’t numpy ultimately deep within use malloc, so there’d be no problem even if my extension is the caller of malloc?

  • If you know for sure that you are using the same memory allocator as NumPy (currently malloc, but this could possibly change) then you would be fine just setting the array flag to NPY_OWNDATA and having NumPy deallocate for you.

    Otherwise, you need to set up this indirection. You might also be able to handle the indirection with Cython though.

    I think this pattern should actually be wrapped up and placed in NumPy itself, but I haven’t gotten the chance to do it.

  • avatar Stefan says:

    Also see Lisandro Dalcin’s comment on the NumPy list [1] where he mentions that

    PyCObject_FromVoidPtrAndDesc [2]

    can be used instead of creating a new type.

    [1] http://www.mail-archive.com/numpy-discussion@scipy.org/msg12442.html

    [2] http://docs.python.org/c-api/cobject.html

  • avatar Mike says:

    Travis,

    I know this is a bit old, but I have a question about this…

    I have an issue where I have a C program that generates multiple arrays that can each be several million cells long, just 1-D though, so like 1×4,000,000. Each array is a malloc’d array of type double. The program as is stores the data in a structure with other information about the data set (name, size, etc).

    I need to get this data into a Python workspace. I can do it with SWIG using memory copies, but that is undesirable because it greatly slows down the program and will cause more memory to be used.

    I would like to be able to return all the data as Numpy arrays, and this method seems like it could almost work (maybe in conjunction with SWIG to fill up a dictionary with the structure info), but not quite…

    For this method, how do you use the “arr” array you create? Is it manipulatable (arr[i][j] = x)? or do you manipulate the data beforehand by using mymem?

    Thanks,
    Mike

  • Mike,

    This is a C-extension example. Once you return the object created to Python it can be used just like any other NumPy array (you can set and get data or manipulate however you like).

    There are many ways to get access to memory allocated in a C program (including SWIG or f2py or using this approach). The approach described here is really for being able to manage deallocation of the data. If you know that your data will never change, then you can just use PyArray_SimpleNewFromData.

  • avatar Mike says:

    Travis,

    Thanks for the info. I was able to use SWIG to make a typemap that reads the C structure that contains the arrays, and uses PyArray_SimpleNewFromData to build the PyArrays and send them back to a dictionary/list in Python using PyArray_Return, which works.

    Thanks again.

  • avatar Egor Zindy says:

    Mike, not sure if that’s any use:

    I have included Travis’s code for automatic handling of array creation from malloc-ed memory (without copying the data), in a custom version of Bill Spotz’s numpy.i.
    When the numpy array is deleted on the python side, the malloc block is also automatically de-allocated to prevent any memory leak.

    I have written an entry in the scipy cookbook (explanations + complete example):
    http://www.scipy.org/Cookbook/SWIG_Memory_Deallocation

    Regards,
    Egor

  • avatar Nick says:

    Fantastic stuff. Been looking for something like this to help solve my conundrum over at

    http://stackoverflow.com/questions/2290007/beginner-extending-c-with-python-specifically-numpy

    If I have any success, I’ll post a solution with a link to this blog post. Thanks

  • i was trying to install numpy, can any one tell me the required python packages.

  • avatar Sylvain Mazet says:

    Hi,
    this is great and saves my day!
    I can point numpy arrays to fortran allocated arrays,
    and the deallocator gets called when expected
    (when the numpy array is not used anymore), thanks!

    Since this is an “old” post (by today’s standards),
    I wonder about the status of integration in numpy?

    Cheers,
    Sylvain.

Leave a Reply

Featuring Advanced Search Functions plugin by YD