Enthought is proud to sponsor the NYC HPC-GPU Supercomputing meetup on May 21st, 2012. This meeting will focus on the GPU and Python (what else?).
A happy coincidence brings together three GPU wise men at this meetup. Andreas Klöckner is the author of PyCUDA and PyOpenCL. Nicolas Pinto comes to us via MIT with an extensive background in machine learning/GPU-related research. Last but not least, Enthought’s own Sean Ross Ross will share the latest on OpenCL and CLyther.
It promises to be a GPU-accelerated night! I believe the room is filled to capacity, but it’s not uncommon for spots to open up from the waiting list.
Please come by and say hello!
We are pleased to announce the 2012 SciPy Bioinformatics Workshop held in conjunction with SciPy 2012 this July in Austin, TX.
Python in biology is not dead yet… in fact, it’s alive and well!
Remember just a few short years ago when BioPerl ruled the world? Just one minor paradigm shift* later and Python now has a commanding presence in bioinformatics. From Python bindings to common tools all the way to entire Python-based informatics platforms, Python is used everywhere** in modern bioinformatics.
If you use Python for bioinformatics or just want to learn more about how it’s being used, join us at the 2012 SciPy Bioinformatics Workshop. We will have speakers from both academia and industry showcasing how Python is enabling biologists to effectively work with large, complex data sets.
The workshop will be held the evening of July 19 from 5:00 to 6:30 pm.
More information about SciPy is available on the conference site: http://conference.scipy.org/scipy2012/
!! Participate !!
Are you using Python in bioinformatics? We’d love to have you share your story. We are looking for 3-4 speakers to share their experiences using Python for bioinformatics.
Please contact Chris Mueller at chris.mueller [at] lab7.io and Ray Roberts at rroberts [at] enthought.com to volunteer. Please include a brief description or link to a paper/topic which you would like to discuss. Presentations will last for 15 minutes each and will be followed by a panel Q&A.
* That would be next generation sequencing
** Yes, we aRe awaRe of that otheR language used eveRywhere, but let’s celebRate Python Right now.
As we announced last week, Enthought is sponsoring an open-source High Performance Python track at PyGotham this year. The video above is meant to give you a sneak peek at the Parallel Python class. Please remember that the class will cover a broader array of topics (as described below), so don’t worry if you aren’t familiar with MPI as discussed in the video.
The talk itself will offer advice every developer should know for writing effective parallel Python. Many of the examples you find on parallel Python focus on the mechanics of getting parallel infrastructure working with your code, and not on actually building good Portable Parallel Python (the 3 P’s!). This talk is intended to be a broad introduction that is well suited to both the beginner and the veteran developer.
- Leverage existing code wherever possible. Most likely someone has solved a subset of your problem, and it was probably their primary interest. As such, their code is probably better optimized for performance within their defined scope than your code will be. Use the standards: numpy, scipy, etc. Don’t do anything fancy unless it’s absolutely necessary.
- Build efficient models. Evolve code from for-loops to list comprehensions and generator comprehensions to numpy and cython.
- Optimize your code for speed and memory performance using profilers.
- Keep the minimum set of information you need at hand at all times. Whatever you do, don’t trap a piece of necessary information in a dark corner of your code. The rediscovery of information is very expensive.
- Separate the consumer and producer of information from the communication mechanism. Build simple-as-possible data structures, using or deriving from basic types. Use function-based operations on a data type or to map between data types. Keep any diagnostic and monitoring tools separate from your data and functions. Good design here can increase the parallel nature of your code significantly.
- Learn the different parallel communication technologies (multiprocessing, MPI, zmq, GPU, cloud, …) and use parallel map and/or map-reduce.
- Stay namespace aware. Ensure your code has self-contained key-value pairs for name=object definitions at the level of the parallel map. Use imports and definitions wisely. Know scoping rules and how they apply to parallelization.
- When the above fails, lean on a good serializer.
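As a minimal sketch of the “evolve your code” and parallel-map advice above (the function names and workload are illustrative, not from the talk):

```python
import numpy as np
from multiprocessing import Pool

# Step 1: a plain for-loop.
def squares_loop(xs):
    out = []
    for x in xs:
        out.append(x * x)
    return out

# Step 2: the same logic as a list comprehension.
def squares_comprehension(xs):
    return [x * x for x in xs]

# Step 3: vectorized with numpy -- no Python-level loop at all.
def squares_numpy(xs):
    return np.asarray(xs) ** 2

# A self-contained, top-level function pickles cleanly, so a
# parallel map can ship it to worker processes unchanged.
def expensive_kernel(x):
    return x * x

if __name__ == "__main__":
    data = list(range(10))
    with Pool(4) as pool:
        results = pool.map(expensive_kernel, data)
    print(results)
```

Note that `expensive_kernel` is defined at module level with no hidden state: that is the “self-contained key-value pairs at the level of the parallel map” point in practice, since closures and locally defined functions generally cannot be serialized to workers.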
We look forward to seeing you at PyGotham! Let us know if there’s anything you’d like to see covered (within the scope of each respective talk) and stay tuned for future “sneak peeks.”
To the PyCluster!
Enthought is a proud sponsor of the second annual PyGotham conference in New York City (June 8th and 9th). As part of our commitment, we are also offering a High Performance Python track that will illustrate how to build applications and utilize parallel computing techniques with various open source projects. Stay tuned for more details as they become available.
Here’s the lineup so far:
- Python with Parallelism in Mind. Rarely does code just happen to be “embarrassingly parallel.” We will discuss some simple rules, structural changes, and diagnostic tools that can help optimize the parallel efficiency of your code. This session will also introduce several common parallel communication technologies that can lower the barrier to parallel computing.
- GPU Computing with CLyther. GPU computing has come a long way over the past few years but still requires knowledge of CUDA or OpenCL. Similar to Cython, CLyther is a Python language extension that makes writing OpenCL code as easy as Python itself.
- MapReduce with the Disco Project. MapReduce frameworks provide a powerful abstraction for distributed data storage and processing. Our friend, Chris Mueller, will talk about the Disco Project, a refreshing alternative to the Hadoop hegemony that uses Python for the front-end and Erlang for the back-end. More importantly, he will discuss when a MapReduce framework makes sense and when it doesn’t.
- Interactive Plotting with Chaco. Most “big data” problems don’t stop with distributed computation. You have to render your results in a way that a larger audience can understand. Chaco is an open source library that helps developers generate performant, interactive data visualizations.
- Declarative UIs with Enaml. Enaml is Pythonic UI development done right. Enaml shares Python’s goals of simplicity, conciseness, and elegance. Enaml implements a constraint-based layout system which ensures that UIs built with Enaml behave and appear identically on Windows, Linux, and OS X. This introduction to Enaml will get you started on the path of writing nontrivial UIs in an afternoon.
- Tie It Together: Build An App. In an updated version of his Pycon talk, Jonathan Rocher ties together time series data — from storage to analysis to visualization — in a demo application. We’ll also walk through a more computationally demanding application to illustrate concepts introduced in the previous talks.
Look forward to seeing everyone there!
Last night, Ilan Schnell announced the release of ETS 4.0. The first major release of the Enthought Tool Suite in almost three years, 4.0 implements a significant change: The ‘enthought’ namespace has been removed from all projects.
from enthought.traits.api import HasTraits
is now simply:
from traits.api import HasTraits
For backwards compatibility, a proxy package ‘etsproxy’ has been added, which should permit existing code to work. This package also contains a refactor tool ‘ets3to4’ to convert projects to the new namespace so that they don’t rely on ‘etsproxy’.
If you want to download the source code of all ETS projects, you can download http://www.enthought.com/repo/ets/ALLETS-4.0.0.tar (41MB).
The projects themselves are now hosted on: https://github.com/enthought
We understand that the namespace refactor (which prompted this major release in the first place) is a big change, and even though we have tested examples and some of our own code against this ETS version, we expect there will be some small glitches. We are therefore already planning a 4.0.1 bug-fix release in about 2-3 weeks.
We are looking forward to your feedback on the ETS mailing list, and hope you enjoy ETS 4.
Last week, we released the Enthought Tool Suite 3.6. John Wiggins made many improvements and bug fixes to Kiva, Enable, and Chaco. And thanks to Evan Patterson, the TraitsBackendQt now supports PySide (as well as PyQt4).
We are also happy to announce that immediately after the release, the ETS repository was moved from subversion to git, and is now hosted on github.
This new ETS will be included in EPD 7.0, which is Python 2.7-based and is scheduled to be released on February 8.
Over the last couple of weeks I added support for PySide to the majority of the ETS packages, including Traits, Chaco and Enable. I have only tested personally with Ubuntu 10.10 and PySide Beta 1, but we’re beginning to test with OS X (Windows is next). With any luck, the next ETS release will have full PySide support, and the next EPD release will include PySide eggs.
Right now, to use PySide in ETS, you have to set an environment variable QT_API=pyside. I hope by the time ETS 3.6 is released we can ditch the environment variable, but I can’t make any promises.
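If you would rather set the variable from inside Python than in your shell, a minimal sketch (it must run before the first ETS UI module is imported):

```python
import os

# Select the PySide backend. This must happen before any ETS UI
# import, or the default backend will already have been chosen.
os.environ["QT_API"] = "pyside"
```

Setting it in the process environment this way is equivalent to `export QT_API=pyside` in the shell, but keeps the choice alongside your application code.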
We are pleased to announce that Enthought Tool Suite 3.5.0 was released Friday afternoon. Because the last ETS release (3.4.1) was almost half a year ago, there are many bug fixes and new features. All project descriptions are updated on PyPI.
The source tarballs can be downloaded from:
Additionally, this release includes a replacement for ETSProjectTools, which allows users to download and install ETS from the SVN repository:
This ETS release will be included in the upcoming version of the Enthought Python Distribution. EPD 6.3 will be released in the next several weeks.