Category Archives: Canopy

“venv” in Python 2.7 and how it Simplifies Life

Virtual environments, specifically ‘venv’ which we backported from Python 3.x, are a technology that enables the creation of multiple, lightweight, independent Python environments. Each virtual environment appears to be a self-contained Python installation, but loads the Python standard library and other common resources from a common base Python installation. Optionally, a virtual environment can also load packages from its base Python environment, whether that’s Canopy Core itself or another virtual environment.

What makes virtual environments so interesting? Well, they reduce disk space by not having to duplicate the full Python environment each time. But more than that, making Python environments far “lighter” enables several interesting capabilities.

First, the most common use of virtual environments is to allow separate projects to run in separate environments with different packages requirements. Each Python application runs in a separate virtual environment so package updates needed for one application don’t break the others. This model has long been used by web developers as well as a few scientific software developers.

The second case is specifically enabled by Canopy. Sharp-eyed readers will have noted in the first paragraph that we said that a virtual environment can have Canopy Core or another virtual environment as the base. But virtual environments can’t be layered, right? Now they can.

We have extended venv to support arbitrary numbers of layers, so we can do something like this:

'venv' in Canopy

‘venv’ in Canopy

‘Project1’ can be created with the following Canopy command:

canopy_cli  setup  ./Project1

Canopy constructs Project1 with all of the standard Canopy packages installed, and Project1 can now be customized to run the application. Once we’ve got Project1 working with a particular Python configuration, what if we want to see if the application works with the latest version of NumPy? We could update and potentially break the stable environment. Or, we can do this:

./Project1/bin/ venv  -s  ./Project1_play

Now ‘Project1_Play’ is a virtual environment which has by default all of Project1’s packages and package versions available. We can now update NumPy or other packages in Project1_play and test the application. If it doesn’t work, no big deal, we just delete it. We now have the ability to rapidly experiment with different (safe) Python environments without breaking our stable working area.

Canopy makes use of virtual environments to provide a protected Python environment for the Canopy GUI application to run in, and to provide one or more User Python environments which you can customize and run your own code in. Canopy Core is the base for each of these virtual environments, providing the core Python components and several common, large packages such as Qt and PySide. This structure means that the Canopy GUI can be updated without impacting your code, and any package updates you install won’t destabilize the Canopy GUI.

Canopy Core can be updated if you want, such as to move to a new version of Python, and each of the virtual environments will be updated automatically as well. This eliminates the need to install a new Python environment and then re-install any third-party packages into that new environment just to update Python.

For more information on how to set up virtual environments with Canopy, check the online docs, or get Canopy v1.1 and try it out.

Our next post will detail how to use Canopy and virtual environments to set up multi-user networks and cluster environments.

For all you EPD Users: Canopy v1.1

EPD (Enthought Python Distribution) provided a simple install of Python for scientific computing on the major platforms: Windows, Linux and Mac-OS. Those looking for a clean, straightforward Python stack to unpack into a particular directory found EPD to be pretty ideal.

With the introduction of Enthought Canopy, we began addressing users who are more engineer or scientist than programmer and were much less familiar with command-line interfaces. The Canopy desktop (in the vein of MATLAB or Spyder) aims at these technical users who want to use Python, but more as an application or IDE. To implement the desktop in Python and to allow both it and a user-defined Python environment to co-exist and be separately updated, we used virtual environments. As a consequence Canopy can feel a bit foreign to EPD users. With 1.1 we have added a new command line interface (CLI) that will hopefully make EPD users feel more at home in Canopy while retaining many of the Canopy advantages such as in-place update and virtual environment support.

Now, EPD users who just want to use Canopy as a plain Python environment with their own tools or IDE can easily create one or more Python environments. For example, from the command line on Windows:

        Canopy_cli.exe setup C:\Python27

or on Linux:

        canopy_cli setup ~/canopy

The target directory can be any you choose. If you want to make this Python environment the default on your system, you can specify the –default switch, and Canopy will add the appropriate bin directory (Scripts directory on Windows) to your PATH environment variable. On Mac OS and Linux systems, Canopy does this by appending a line to your ~/.bash_profile file which activates the correct virtual environment. On Windows, this Python environment is also added to the system registry so third-party tools can correctly find it.

Since we use virtual environments, the installation layout for Canopy is different. With Canopy we install what is referred to as “Canopy Core”: the core Python environment and a minimum set of packages needed to bootstrap Canopy itself. With it we can lock down the Canopy environment, facilitate the automatic update mechanism, and provide reliable startup and fail-safe recovery. For the user, there is a different environment. This means when a Python update comes out, it is no longer necessary to install a whole new environment plus all of your packages and get everything working again. Instead, simply update Canopy and go back to working — all of your packages are still installed but Python has been upgraded.

To complete an install, Canopy creates two virtual environments named ‘System’ and ‘User’. System is where the Canopy GUI runs; no user code runs in this environment. Updates to this virtual environment are done via the Canopy update mechanisms. The User environment is where the kernel and all user code runs. This virtual environment is managed by Package Manager from the desktop or by enpkg from the command line; any packages can be updated and installed without fear of disrupting the GUI. Similarly, updates to the Canopy GUI will not affect packages installed in the User environment and break your code.

So why stick with virtual environments for an “EPD-like” install? One of the big challenges with the old, “flat” EPD installation method was updating an install, or trying out different package configurations. With virtual environments, you can create a new environment which inherits packages from another virtual environment, and try out a few package changes. When you are satisfied, it’s straightforward to throw away the experimentation area and make the changes to the original, stable virtual environment.

For more details, check out Creating an EPD-like Python environment in our online docs. And you can download Canopy v1.1 now.

Canopy v1.1 – Linux, Command Line Interface and More

Final-version-canopy-logo (1)

With version 1.1, Enthought Canopy now:

1) addresses, much more completely, the command line use cases that EPD users and IT managers expect from their Python distributions,

2) makes Linux support generally available,

3) streamlines installation for users without internet access with full, single-click installers,

4) supports multiple virtual environments for advanced users via “venv” backported to Python 2.7, and

5) provides updates like numpy 1.7.1, matplotlib 1.3.0 and more.

It’s been just over 4 months since Canopy v1.0 shipped with the new desktop analysis environment and our updated Python distribution for scientific computing. Canopy’s analysis environment seems to be well-received by users looking for a simpler GUI environment, but the Canopy graphical installation process left something to be desired by our EPD users.

Along with the Canopy desktop for users that don’t want to work directly from a command line, Canopy version 1.1 now provides command-line utilities that streamline the installation of a complete Python scientific stack for current EPD users who want to work from the shell or command line. In addition, IT groups or tools specialists that need to manage a central install of Python for a workgroup or department now have the tools they need to install and maintain Canopy. Version 1.1’s command-line installation and setup (and the 1-click, full installers detailed below) are much better for supporting Canopy installations on clusters as well.

Canopy for Linux is now fully released. We have full, tested support for RedHat5, CentOS5, and Ubuntu 12.04. Linux distros and versions beyond those work as well (anecdotally and based on some in-house use), but those are our tested versions.

With Canopy v1.0 we implemented a 2-step installation process. The installer includes the Canopy desktop, the Python packages needed by Canopy itself, and other core scientific Python stack packages for a minimal install (the libraries in Canopy Express). For those with a subscription, the second step requires downloading any additional packages using the Package Manager. This 2-step process is problematic for users that don’t have easy internet access or need to install centrally for a group. To help, we now provide full installers with all the Python packages we support included. This provides a streamlined 1-step install process for those who need it or want it.

To ensure users can install any package updates they wish without messing up package dependencies for Canopy itself, we use virtual environments under the hood. With v1.1 we now provide command-line access to our backport of “venv”. The new CLI provides utilities to create, upgrade, activate and deactivate your own virtual environments. Now its much easier to try out new Python environments or set up multiple configurations for a workgroup.

Canopy v1.1 ships many updates to packages and many new ones: OpenCV, LLVM, Bottleneck, gevent, msgpack, py, pytest, six, NLTK, Numba, Mock, patsy and more. You can see the full details on the Canopy Package Index page.

We hope you find version 1.1 useful!

“Why We Built Enthought Canopy, An Inside Look” Recorded Webinar

We posted a recording of a 30 minute webinar that we did on the 20th that covers what Canopy is and why we developed it. There’s a few minutes of Brett Murphy(Product Manager at Enthought) discussing the “why” with some slides, and then Jason McCampbell (Development Manager for Canopy) gets into the interesting part with a 15+ minute demo of some of the key capabilities and workflows in Canopy. If you would like to watch the recorded webinar, you can find it here (the different formats will play directly in different browsers so check them and you won’t have to download the whole recording first):

Summed up in one line: Canopy provides the minimal set of tools for non-programmers to access, analyze and visualize data in an open-source Python environment.

The challenge in the past for scientists, engineers and analysts who wanted to use Python had been pulling together a working, integrated Python environment for scientific computing. Finding compatible versions of the dozens of Python packages, compiling them and integrating it all was very time consuming. That’s why we released the Enthought Python Distribution (EPD) many years back. It provided a single install of all the major packages you needed to do scientific and analytic computing with Python.

But the primary interface for a user of EPD was the command line. For a scientist or analyst used to an environment like MATLAB or one of the R IDEs, the command line is a little unapproachable and makes Python challenging to adopt. This is why we developed Canopy.

Enthought Canopy is both a Python distribution (like EPD) and an analysis environment. The analysis environment includes an integrated editor and IPython prompt to faciliate script development & testing and data analysis & plotting. The graphical package manager becomes the main interface to the Python ecosystem with its package search, install and update capabilities. And the documentation browser makes online documentation for Canopy, Python and the popular Python packages available on the desktop.

Check out the Canopy demo in the recorded webinar (link above). We hope it’s helpful.

Avoiding “Excel Hell!” using a Python-based Toolchain

Update (Feb 6, 2014):  Enthought is now the exclusive distributor of PyXLL, a solution that helps users avoid “Excel Hell” by making it easy to develop add-ins for Excel in Python. Learn more here.

Didrik Pinte gave an informative, provocatively-titled presentation at the second, in-person New York Quantitative Python User’s Group (NY QPUG) meeting earlier this month.

There are a lot of examples in the press of Excel workflow mess-ups and spreadsheet errors contributing to some eye-popping mishaps in the finance world (e.g. JP Morgan’s spreadsheet issues may have led to the 2012 massive loss at “the London Whale”). Most of these can be traced to similar fundamental issues:

  • Data referencing/traceability

  • Numerical errors

  • Error-prone manual operations (cut & paste, …)

  • Tracing IO’s in libraries/API’s

  • Missing version control

  • Toolchain that doesn’t meet the needs of researchers, analysts, IT, etc.

Python, the coding language and its tool ecosystem, can provide a nice solution to these challenges, and many organizations are already turning to Python-based workflows in response. And with integration tools like PyXLL (to execute Python functions within Excel) and others, organizations can adopt Python-based workflows incrementally and start improving their current “Excel Hell” situation quickly.

For the details check out the video of Didrik’s NY QPUG presentation.  He demonstrates a an example solution using PyXLL and Enthought Canopy.

[vimeo 67327735 http://vimeo.com/67327735]

And grab the PDF of his slides here.

QPUG_20130514_ExcelHell_Slides

It would be great to hear your stories about “Excel Hell”. Let us know below.

–Brett Murphy