Monthly Archives: September 2008

Texas Python Regional Unconference – Austin, TX

We’re excited about meeting over at the University of Texas (Enthought’s backyard) for the Texas Python Regional Unconference this weekend (October 4-5).

2007 Unconference Attendees

Program

It’s absolutely free to attend the Unconference. It’s not too late to register, so add yourself to the list of attendees if you can make it. There are open slots in the self-organizing program as well, so feel free to add yourself to the schedule on the wiki.

Wireless

It’s also important to note that you’ll need to email me (travis@enthought.com) with the following information if you’d like wireless access at the meeting venue:

  • Full Name
  • Phone or Email
  • Address
  • Affiliation

Friday Dinner

Anyone who happens to be in town on Friday evening is welcome to come over to our offices and walk with us to grab dinner in downtown Austin. Feel free to come over anytime after 5:30pm to hang out and get the nickel tour — we’ll leave at 7:00pm to eat.

EPD with Py2.5 v4.0.300 Beta 3 released

We’ve recently posted the third beta release of EPD (the Enthought
Python Distribution) with Python 2.5 version 4.0.300.

Please help us test it out and provide feedback on the EPD Trac
instance (https://svn.enthought.com/epd). You can check out the release
notes here: http://www.enthought.com/products/epdbetareleasenotes.php

About EPD
The Enthought Python Distribution (EPD) is a “kitchen-sink-included”
distribution of the Python Programming Language, including over 60
additional tools and libraries. The EPD bundle includes NumPy, SciPy,
IPython, 2D and 3D visualization, database adapters, and a lot of
other tools right out of the box.

http://www.enthought.com/products/epd.php

It is currently available as a single-click installer for Windows XP
(x86), Mac OS X (a universal binary for OS X 10.4 and above), and
RedHat 3 and 4 (x86 and amd64).

EPD is free for academic use. An annual subscription and installation
support are available for individual commercial use. An enterprise
subscription with support for particular deployment environments is also
available for commercial purchase. The beta versions of EPD are
available for indefinite free trial.

I Hate Web Browsers

I just wrote a long and brilliant post into a text box in a web browser. I hit Command-Left Arrow to go to the beginning of the line. (For those without MacBooks, Fn+Left Arrow is Home, which *should* take you to the beginning of the line, but for some reason text boxes in Firefox on OS X don’t actually respond to Home, so you have to use Command-Left Arrow.)

The problem is that not all javascript-y AJAX-y sexy WYSIWYG Web 2.0 text areas actually *implement* Command-Left Arrow. A basic Firefox text area *will* move to the start and end of line with Command-Left and Command-Right arrows. But for some reason, when a javascript-y AJAX-y WYSIWYG Web 2.0 editor wraps a text area, it sometimes doesn’t handle Command-Left and instead passes it straight through to Firefox. What does Firefox do with Command-Left? Why, return you to the previous page, of course!

This is about the 3rd or 4th time this has happened to me. In fact, I now have a habit of doing “select all; copy” so I at least have the text stored in the system clipboard. I know I am not the only one. Why doesn’t Firefox just set an internal flag on any text area where the user has entered more than, say, 10 words, and automatically prompt the user if they try to close or navigate away from a page with unsaved text in those text areas? Can someone more knowledgeable about web browsers, plugins, and DOMs tell me whether this would be difficult to implement as an add-on? I know there are some plugins that allow you to manually save the text in an edit window to disk, but that’s far too heavyweight and manual a process. Does anyone have a suggestion for a plugin or add-on that actually solves this problem?

ETS 3.0.2 Released!

I’m pleased to announce that the Enthought Tool Suite (ETS) 3.0.2 has
just been tagged and released!

Source distributions (.tar.gz) have been pushed to PyPI. Windows
binaries will be built and uploaded to PyPI over the next 24 hours or so.

You can update to ETS 3.0.2 like so:
easy_install -U "ets[nonets] >= 3.0.2"

Changes
ETS 3.0.2 is an update to ETS 3.0.1 that includes the following changes:
* Update of Enable to fix problems doing ‘setup.py install’.
* Update of ETSProjectTools to fix bugs and improve the help messages.
* Update of Mayavi to fix bugs found during the SciPy conference.
* Update of Traits, TraitsGUI, and TraitsBackend* to fix a number of
issues (see https://svn.enthought.com/enthought/query?milestone=Traits+3.0.2)

What is ETS?
The Enthought Tool Suite (ETS) is a collection of components developed
by Enthought and our partners, which we use every day to construct
custom scientific applications. It includes a wide variety of
components, including:

* an extensible application framework
* application building blocks
* 2-D and 3-D graphics libraries
* scientific and math libraries
* developer tools

The cornerstone on which these tools rest is the Traits package, which
provides explicit type declarations in Python; its features include
initialization, validation, delegation, notification, and visualization
of typed attributes.

More information is available for all these packages from the Enthought
Tool Suite development home page: http://code.enthought.com/projects/tool-suite.php

NumPy arrays with pre-allocated memory

A common need whenever NumPy is used to mediate Python-level access to another library
is to wrap memory that the library has allocated with its own allocator in a NumPy array. This allows easy Python-side manipulation of the data already available without requiring an unnecessary copy. Fundamentally, this is easy to do using PyArray_SimpleNewFromData. The C-level calling syntax is

[sourcecode language=”c++”]
int nd;           /* number of dimensions */
npy_intp *dims;   /* number of elements in each dimension */
int typenum;      /* data-type number, e.g. NPY_DOUBLE */
void *data;       /* pointer to previously allocated memory */
PyObject *arr;

arr = PyArray_SimpleNewFromData(nd, dims, typenum, data);
[/sourcecode]

In this code block, nd is the number of dimensions, dims is a C-array of integers describing the number of elements in each dimension of the array, typenum is the simple data-type of the NumPy array (e.g. NPY_DOUBLE), and data is the pointer to the memory that has been previously allocated.

By default, the memory for the NumPy array will be interpreted as a C-ordered contiguous array. If you need more control over the data-type or the striding of the array, then you can also use PyArray_NewFromDescr.
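For instance, wrapping the same memory as a Fortran-ordered (column-major) array of doubles might look like the sketch below; the variable names reuse those from the snippet above, and the descriptor and flag choices here are illustrative rather than prescriptive.

[sourcecode language=”c++”]
/* Sketch: wrap externally allocated memory as a Fortran-ordered
(column-major) 2-d array of doubles, using an explicit descriptor
and byte strides.  dims and data are as in the earlier snippet. */
PyArray_Descr *descr = PyArray_DescrFromType(NPY_DOUBLE);
npy_intp strides[2] = {sizeof(double), dims[0] * sizeof(double)};

arr = PyArray_NewFromDescr(&PyArray_Type, descr, 2, dims,
strides, data, NPY_WRITEABLE, NULL);
/* note: PyArray_NewFromDescr steals the reference to descr */
[/sourcecode]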

This is the simple part and code like this has been possible in Numeric for more than a decade. The tricky part, however, is memory management. How does the memory get deallocated? The suggestions have always been something similar to “make sure the memory doesn’t get deallocated before the NumPy
array disappears.” This is nice advice, but not generally helpful as it basically just tells you to create a memory leak.

All that NumPy does internally is leave un-set a flag on the array object (the flag that indicates the array owns its memory pointer), so NumPy won’t free the memory when the last reference to the array disappears.
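As a quick sanity check (a minimal sketch, assuming the flag macros of the NumPy 1.x C-API), you can verify that such an array makes no claim of owning its data:

[sourcecode language=”c++”]
/* An array created with PyArray_SimpleNewFromData does not have the
NPY_OWNDATA flag set, so NumPy will never try to free its data
pointer; releasing that memory is entirely the caller's problem. */
if (!PyArray_CHKFLAGS((PyArrayObject *)arr, NPY_OWNDATA)) {
/* we, not NumPy, must eventually deallocate the memory */
}
[/sourcecode]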

The key to managing memory correctly is to recognize that every NumPy array that doesn’t own its own memory can also point to a “base” object from which it obtained the memory. This base object is usually another NumPy array or an object exposing the buffer protocol — but it can be any object (even one we create on the fly). This object is DECREF’d when the NumPy array is deallocated and if the NumPy array contains the only reference to the object, then it will also be deallocated when the NumPy array is deallocated.

Thus, a good way to manage memory from another allocator is to create an instance of a new Python type. You then store the pointer to the memory (and anything else you may need to call the deallocator correctly) in the instance. Finally, you call the deallocator in the tp_dealloc function of the new Python type you’ve created. Then, you point the base member of your new NumPy array to the new object you’ve created.

The concept is relatively simple, but there are enough moving parts that an example is probably useful. Let’s say I want to create an extension module that only uses NumPy arrays allocated on 16-byte boundaries (maybe I’m experimenting with some SIMD instructions). I want to use arrays whose data is allocated using the aligned allocator defined below (borrowed from a patch to NumPy by David Cournapeau):

[sourcecode language=”c++”]
#include <stdlib.h>   /* malloc, free */
#include <errno.h>    /* errno, EINVAL */
#define uintptr_t size_t

#define _NOT_POWER_OF_TWO(n) (((n) & ((n) - 1)))
#define _UI(p) ((uintptr_t) (p))
#define _CP(p) ((char *) p)

#define _PTR_ALIGN(p0, alignment) \
((void *) (((_UI(p0) + (alignment + sizeof(void*))) \
& (~_UI(alignment - 1)))))

/* pointer must sometimes be aligned; assume sizeof(void*) is a power of two */
#define _ORIG_PTR(p) (*(((void **) (_UI(p) & (~_UI(sizeof(void*) - 1)))) - 1))

static void *_aligned_malloc(size_t size, size_t alignment)
{
void *p0, *p;

if (_NOT_POWER_OF_TWO(alignment)) {
errno = EINVAL;
return ((void *) 0);
}
if (size == 0) {
return ((void *) 0);
}
if (alignment < sizeof(void *)) {
alignment = sizeof(void *);
}
/* including the extra sizeof(void*) is overkill on a 32-bit machine,
since malloc is already 8-byte aligned, as long as we enforce
alignment >= 8 …but oh well */

p0 = malloc(size + (alignment + sizeof(void *)));
if (!p0) {
return ((void *) 0);
}
p = _PTR_ALIGN(p0, alignment);
_ORIG_PTR(p) = p0;
return p;
}

static void _aligned_free(void *memblock)
{
if (memblock) {
free(_ORIG_PTR(memblock));
}
}
[/sourcecode]
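On its own, this allocator behaves like malloc/free with an alignment guarantee; a quick illustrative check (assuming assert.h is included) looks like this:

[sourcecode language=”c++”]
/* Illustrative standalone check: the returned pointer sits on a
16-byte boundary and is released with _aligned_free. */
double *buf = (double *) _aligned_malloc(200 * sizeof(double), 16);
assert(buf != NULL && ((uintptr_t) buf) % 16 == 0);
_aligned_free(buf);
[/sourcecode]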

Now, to create arrays using this allocator we just need to allocate the needed memory and use SimpleNewFromData. Then we create a new object encapsulating the deallocation and set this as the base object of the ndarray.

[sourcecode language=”c++”]
int nd = 2;
npy_intp dims[2] = {10, 20};
size_t size;
_MyDeallocObject *newobj = NULL;
PyObject *arr = NULL;
void *mymem = NULL;

/* allocate bytes, not elements: multiply the element count by the item size */
size = PyArray_MultiplyList(dims, nd) * sizeof(double);
mymem = _aligned_malloc(size, 16);
if (mymem == NULL) goto fail;
arr = PyArray_SimpleNewFromData(nd, dims, NPY_DOUBLE, mymem);
if (arr == NULL) goto fail;
newobj = PyObject_New(_MyDeallocObject, &_MyDeallocType);
if (newobj == NULL) goto fail;
newobj->memory = mymem;
PyArray_BASE(arr) = (PyObject *)newobj;
/* success: the array now holds the only reference to newobj */
return arr;

fail:
_aligned_free(mymem);
Py_XDECREF(arr);
return NULL;
[/sourcecode]

Now, all that is missing is the code to create the new Python Type. That code is

[sourcecode language=”c++”]
typedef struct {
PyObject_HEAD
void *memory;
} _MyDeallocObject;

static void
_mydealloc_dealloc(_MyDeallocObject *self)
{
_aligned_free(self->memory);
self->ob_type->tp_free((PyObject *)self);
}

static PyTypeObject _MyDeallocType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
"mydeallocator", /*tp_name*/
sizeof(_MyDeallocObject), /*tp_basicsize*/
0, /*tp_itemsize*/
(destructor)_mydealloc_dealloc, /*tp_dealloc*/
0, /*tp_print*/
0, /*tp_getattr*/
0, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
0, /*tp_call*/
0, /*tp_str*/
0, /*tp_getattro*/
0, /*tp_setattro*/
0, /*tp_as_buffer*/
Py_TPFLAGS_DEFAULT, /*tp_flags*/
"Internal deallocator object", /* tp_doc */
};
[/sourcecode]

Don’t forget to add the following to the extension module’s initialization function in order to initialize the new Python type that has been created.

[sourcecode language=”c++”]
_MyDeallocType.tp_new = PyType_GenericNew;
if (PyType_Ready(&_MyDeallocType) < 0) return;
[/sourcecode]

This simple pattern should allow you to seamlessly integrate NumPy arrays with all kinds of memory allocation strategies. I think this pattern is common enough that we should probably add something to NumPy itself to make it easier to do this sort of thing in a few lines of code. Perhaps a new C-API call is justified with a new internal Python type that allows different allocators and deallocators to be used. Subscribe and post to the numpy-discussion@scipy.org list if you are interested in staying tuned.