Enthought Announces Canopy 2.1: A Major Milestone Release for the Python Analysis Environment and Package Distribution

Python 3 and multi-environment support, new state of the art package dependency solver, and over 450 packages now available free for all users

Enthought is pleased to announce the release of Canopy 2.1, a significant feature release that includes Python 3 and multi-environment support, a new state-of-the-art package dependency solver, and access to over 450 pre-built and tested scientific and analytic Python packages completely free for all users. We highly recommend that all current Canopy users upgrade to this new release.

Ready to dive in? Download Canopy 2.1 here.

For those already familiar with Canopy, this post reviews the major new features in this milestone release. For those looking for a tool to improve their Python workflow, or perhaps coming to Python from a language like MATLAB or R, we’ll also take you through the key reasons that scientists, engineers, data scientists, and analysts use Canopy to enable their work in Python.

First, let’s talk about the latest and greatest in Canopy 2.1!

  1. Support for Python 3 user environments: Canopy can now be installed with a Python 3.5 user environment. Users can benefit from all the Canopy features already available for Python 2.7 (syntax checking, debugging, etc.) in the new Python 3 environments. Python 3.6 is also available (and will be the standard Python 3 in Canopy 2.2).
  2. All 450+ Python 2 and Python 3 packages are now completely free for all users: Technical support, full installers with all packages for offline or shared installation, and the premium analysis environment features (graphical debugger and variable browser and Data Import Tool) remain subscriber-exclusive benefits. See subscription options here to take advantage of those benefits.
  3. Built-in, state-of-the-art dependency solver (EDM, the Enthought Deployment Manager): the new EDM back end (which replaces the previous enpkg) provides additional features for robust package compatibility. EDM integrates a specialized dependency solver that automatically ensures you have a consistent package set after installation, removal, or upgrade of any packages.
  4. Environment bundles: these allow users to easily share environments directly with co-workers, or across various deployment solutions (such as the Enthought Deployment Server, continuous integration processes like Travis-CI and Appveyor, cloud solutions like AWS or Google Compute Engine, or deployment tools like Ansible or Docker). EDM environment bundles not only replicate the set of installed dependencies but also persist constraint modifiers, the list of manually installed packages, and the runtime version and implementation.
  5. Multi-environment support: with the addition of Python 3 environments and the new EDM back end, Canopy now also supports managing multiple Python environments from the user interface. You can easily switch between Python 2.7 and 3.5, or between multiple 2.7 or 3.5 environments. This is ideal especially for those migrating legacy code to Python 3, as it allows you to test as you transfer and also provides access to historical snapshots or libraries that aren’t yet available in Python 3.

Why Canopy is the Python platform of choice for scientists and engineers

Since 2001, Enthought has focused on making the scientific Python stack accessible and easy to use for both enterprises and individuals. For example, Enthought released the first scientific Python distribution in 2004, added robust and corporate support for NumPy on 64-bit Windows in 2011, and released Canopy 1.0 in 2013.

Since then, with its MATLAB-like experience, Canopy has enabled countless engineers, scientists and analysts to perform sophisticated analysis, build models, and create cutting-edge data science algorithms. Canopy’s all-in-one package distribution and analysis environment for Python has also been widely adopted in organizations who want to provide a single, unified platform that can be used by everyone from data analysts to software engineers.

Here are five of the top reasons that people choose Canopy as their tool for enabling data analysis, data modelling, and data visualization with Python:

Enthought Presents the Canopy Platform at the 2017 American Institute of Chemical Engineers (AIChE) Spring Meeting

by: Tim Diller, Product Manager and Scientific Software Developer, Enthought

Last week I attended the AIChE (American Institute of Chemical Engineers) Spring Meeting in San Antonio, Texas. It was a great time of year to visit this cultural gem deep in the heart of Texas (and just down the road from our Austin offices), with plenty of good food, sights and sounds to take in on top of the conference and its sessions.

The AIChE Spring Meeting focuses on applications of chemical engineering in industry, and Enthought was invited to present a poster and deliver a “vendor perspective” talk on the Canopy Platform for Process Monitoring and Optimization as part of the “Big Data Analytics” track. This was my first time at AIChE, so some of the names were new, but in a lot of ways it felt very similar to many other engineering conferences I have participated in over the years (for instance, ASME (American Society of Mechanical Engineers), SAE (Society of Automotive Engineers), etc.).

This event underscored that regardless of industry, engineers are bringing the same kinds of practical ingenuity to bear on similar kinds of problems, and with the cost of data acquisition and storage plummeting in the last decade, many engineers are now sitting on more data than they know how to effectively handle.

What exactly is “big data”? Does it really matter for solving hard engineering problems?

One theme that came up time and again in the “Big Data Analytics” sessions Enthought participated in was what exactly “big data” is. In many circles, a good working definition is that data is “big” when it exceeds the physical RAM of the machine doing the computation, so that something other than simply loading the data into memory has to be done to make meaningful computations. By that definition, a threshold of some tens of GB separates “big” data from “small”.
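To make the "doesn't fit in RAM" definition concrete, here is a minimal sketch of the usual workaround: stream the data in fixed-size chunks and keep only running totals in memory. The column name `flow_kg_hr`, the sample values, and the helper names are invented for illustration and do not come from any Enthought tool.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows from any iterator of rows."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(rows, column, chunk_size=1000):
    """Compute the mean of one column while holding only one chunk in memory."""
    total = 0.0
    count = 0
    for chunk in iter_chunks(rows, chunk_size):
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Stand-in for a large file on disk (hypothetical data); in practice the
# reader would wrap an open file handle instead of an in-memory string.
sample = io.StringIO("sensor,flow_kg_hr\nA,9.5\nA,10.5\nB,12.0\nB,8.0\n")
mean_flow = streaming_mean(csv.DictReader(sample), "flow_kg_hr", chunk_size=2)
print(mean_flow)
```

The same pattern extends to any statistic that can be updated incrementally (counts, sums, minima and maxima); statistics that need the whole data set at once call for a different strategy.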

For others, and many at the conference indeed, a more mundane definition of “big” means that the data set doesn’t fit within the row or column limits of a Microsoft Excel Worksheet.

But the question of whether your data is “big” is really a moot one as far as we at Enthought are concerned; really, being “big” just adds complexity to an already hard problem, and the kind of complexity is an implementation detail dependent on the details of the problem at hand.

And that relates to the central message of my talk, which was that an analytics platform (in this case I was talking about our Canopy Platform) should abstract away the tedious complexities, and help an expert get to the heart of the hard problem at hand.

At AIChE, the “hard problems” at hand seemed invariably to involve one or both of two things: (1) increasing safety/reliability, and (2) increasing plant output.

To solve these problems, two general kinds of activity were on display: different pattern recognition algorithms and tools, and modeling, typically through some kind of regression-based approach. Both of these things are straightforward in the Canopy Platform.
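As a rough illustration of the regression-based modeling mentioned above, the following sketch fits a straight line by ordinary least squares using only the standard library. The variable names and data points are hypothetical, and this is the simplest possible case, not the specific models shown at the conference.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical process data: a control setting versus measured plant output.
setting = [1.0, 2.0, 3.0, 4.0]
output = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(setting, output)
print(slope, intercept)
```

In practice a scientific Python library would handle multiple predictors, diagnostics, and regularization; the closed-form version here just shows what the fit computes.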

The Canopy Platform is a collection of related technologies that work together in an integrated way to support the scientist/analyst/engineer.

What is the Canopy Platform?

If you’re using Python for science or engineering, you have probably used or heard of Canopy, Enthought’s Python-based data analytics application offering an integrated code editor and interactive command prompt, package manager, documentation browser, debugger, variable browser, data import tool, and lots of hidden features like support for many kinds of proxy systems that work behind the scenes to make a seamless work environment in enterprise settings.

However, this is just one part of the Canopy Platform. Over the years, Enthought has been building other components and related technologies that work together in an integrated way to support the engineer/analyst/scientist solving hard problems.

At the center of this is the Enthought Python Distribution, with runtime interpreters for Python 2.7 and 3.x and over 450 pre-built Python packages for scientific computing, including tools for machine learning and the kind of regression modeling that was shown in some of the other presentations in the Big Data sessions. Other components of the Canopy Platform include interface modules for Excel (PyXLL) and for National Instruments’ LabVIEW software (Python Integration Toolkit for LabVIEW), among others.

A key component of our Canopy Platform is our Deployment Server, which simplifies the tricky tasks of deploying proprietary applications and packages or creating customized, reproducible Python environments inside an organization, especially behind a firewall or an air-gapped network.

Finally, (and this is what we were really showing off at the AIChE Big Data Analytics session) there are the Data Catalog and the Cloud Compute layers within the Canopy Platform.

The Data Catalog provides an indexed interface to potentially heterogeneous data sources, making them available for search and query based on various kinds of metadata.

The Data Catalog provides an indexed interface to potentially heterogeneous data sources. These can range from a simple network directory with a collection of HDF5 files to a server hosting files with the Byzantine complexity of the IRIG 106 Ch. 10 Digital Recorder Standard used by US military test flight ranges. The nice thing about the Data Catalog is that it lets you query and select data based on computed metadata, for example “factory A, on Tuesdays when Ethylene output was below 10kg/hr”, or in a test flight data example “test flights involving a T-38 that exceeded 10,000 ft but stayed subsonic.”
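To give a feel for what querying on computed metadata means, here is a small sketch of the “factory A, on Tuesdays, Ethylene output below 10 kg/hr” example. It is not the Data Catalog API: the `CatalogEntry` record, the `query` helper, the file paths, and the values are all hypothetical stand-ins for an index that stores summary metadata alongside each data source.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical index record: a data location plus computed metadata."""
    path: str
    factory: str
    day: date
    ethylene_kg_hr: float  # summary value computed when the file was indexed


def query(entries, *predicates):
    """Return the entries for which every predicate is true."""
    return [e for e in entries if all(p(e) for p in predicates)]


# Invented index contents for illustration only.
entries = [
    CatalogEntry("runs/a_001.h5", "A", date(2017, 3, 7), 8.2),    # a Tuesday
    CatalogEntry("runs/a_002.h5", "A", date(2017, 3, 8), 7.9),    # a Wednesday
    CatalogEntry("runs/a_003.h5", "A", date(2017, 3, 14), 11.4),  # a Tuesday
    CatalogEntry("runs/b_001.h5", "B", date(2017, 3, 7), 6.0),    # a Tuesday
]

TUESDAY = 1  # date.weekday() counts Monday as 0

matches = query(
    entries,
    lambda e: e.factory == "A",
    lambda e: e.day.weekday() == TUESDAY,
    lambda e: e.ethylene_kg_hr < 10.0,
)
print([e.path for e in matches])
```

The point of the design is that the query touches only the small metadata index; the underlying files are opened only for the entries that match.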

With the Cloud Compute layer, an expert user can write code and test it locally on some subset of data from the Data Catalog. Then, when it is working to satisfaction, he or she can publish the code as a computational kernel to run on some other, larger subset of the data in the Data Catalog, using remote compute resources, which might be an HPC cluster or an Apache Spark server. That kernel is then available to other users in the organization, who do not have to understand the algorithm to run it on other data queries.

In the demo below, I showed hooking up the Data Catalog to some historical factory data stored on a remote machine.

Data Catalog view: the Data Catalog allows selection of subsets of the data set for inspection and ad hoc analysis. Here, three channels are compared using a time window set on the time series data shown in the top plot.

Then using a locally tested and developed compute kernel, I did a principal component analysis on the frequencies of the channel data for a subset of the data in the Data Catalog. Then I published the kernel and ran it on the entire data set using the remote compute resource.

After the compute kernel has been published and run on the entire data set, the result explorer tool enables further interaction with the results.

Ultimately, the Canopy Platform is for building and distributing applications that solve hard problems. Some of the products we have built on the platform are available today (for instance, Canopy Geoscience and Virtual Core); others are in the prototype stage or have been developed for other companies with proprietary components and are not publicly available.

It was exciting to participate in the Big Data Analytics track this year, to see what others are doing in this area, and to be a part of many interesting and fruitful discussions. Thanks to Ivan Castillo and Chris Reed at Dow for arranging our participation.

New Year, New Enthought Products!

We’ve had a number of major product development efforts underway over the last year, and we’re pleased to share a lot of new announcements for 2017:

A New Chapter for the Enthought Python Distribution (EPD):
Python 3 and Intel MKL 2017

In 2004, Enthought released the first “Python: Enthought Edition,” a Python package distribution tailored for a scientific and analytic audience. In 2008 this became the Enthought Python Distribution (EPD), a self-contained installer with the "enpkg" command-line tool to update and manage packages. Since then, over a million users have benefited from Enthought’s tested, pre-compiled set of Python packages, allowing them to focus on their science by eliminating the hassle of setting up tools.

Fast forward to 2017, and we now offer over 450 Python packages and a new era for the Enthought Python Distribution: access to all of the packages in the new EPD is completely free for all users and includes packages and runtimes for both Python 2 and Python 3, with some exciting new additions. Our ever-growing list of packages includes, for example, the 2017 release of the MKL (Math Kernel Library), the fruit of an ongoing collaboration with Intel.

The New Enthought Deployment Server:
Secure, Onsite Access to EPD and Private Packages


For those who are interested in having a private copy of the Enthought Python Distribution behind their firewall, as well as the ability to upload and manage internal private packages alongside it, we now offer the Enthought Deployment Server, an onsite version of the server we have been using for years to serve millions of Python packages to our users.

With a local Enthought Deployment Server, your private copy will periodically synchronize with our master repository, on a schedule of your choosing, to keep you up to date with the latest releases. You can also set up private package repositories and control access to them using your existing LDAP or Active Directory service in a way that suits your organization. We can even give you access to the packages (and their historical versions) inside of air-gapped networks! See our webinar introducing the Enthought Deployment Server.

Command Line Access to the New EPD and Flat Environments
via the Enthought Deployment Manager (EDM)

In 2013, we expanded the original EPD to introduce Enthought Canopy, coupling an integrated analysis environment with additional features such as a graphical package manager, documentation browser, and other user-friendly tools together with the Enthought Python Distribution to provide even more features to help “make science and analysis easy.”

With its MATLAB-like experience, Canopy has enabled countless engineers, scientists and analysts to perform sophisticated analysis, build models, and create cutting-edge data science algorithms. The all-in-one analysis platform for Python has also been widely adopted in organizations who want to provide a single, unified platform that can be used by everyone from data analysts to software engineers.

But we heard from a number of you that you also still wanted the capability to have flat, standalone environments not coupled to any editor or graphical tool. And we listened!  

So last year, we finished building out our next-generation command-line tool that makes producing flat, standalone Python environments super easy. We call it the Enthought Deployment Manager (or EDM for short), because it’s a tool to quickly deploy one or multiple Python environments with full control over package versions and runtime environments.

EDM is also a valuable tool for use cases such as command line deployment on local machines or servers, web application deployment on AWS using Ansible and Amazon CloudFormation, rapid environment setup on continuous integration systems such as Travis-CI, Appveyor, or Jenkins/TeamCity, and more.

Finally, a new state-of-the-art package dependency solver included in the tool guarantees the consistency of your environment, and if your workflow requires switching between different environments, its sandboxed architecture makes it a snap to switch contexts.  All of this has also been designed with a focus on providing robust backward compatibility to our customers over time.  Find out more about EDM here.
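For readers curious what “guaranteeing consistency” involves, here is a toy sketch of dependency resolution by backtracking search. It is emphatically not EDM's actual algorithm; the package names, integer versions, and the `solve` function are invented for illustration. The idea it shows is that picking the newest version of each package greedily can fail, so a solver must be able to back up and try an older one.

```python
def solve(constraints, index, chosen=None):
    """Toy backtracking resolver (illustrative only).

    constraints: {package: set of allowed versions} still to be satisfied
    index: {package: {version: {dependency: set of allowed versions}}}
    Returns {package: version} for a consistent set, or None if none exists.
    """
    chosen = dict(chosen or {})
    pending = [name for name in constraints if name not in chosen]
    if not pending:
        return chosen
    name = pending[0]
    for version in sorted(index[name], reverse=True):  # prefer newest first
        if version not in constraints[name]:
            continue
        new_constraints = dict(constraints)
        consistent = True
        for dep, allowed in index[name][version].items():
            # Narrow the dependency's allowed versions by this requirement.
            merged = new_constraints.get(dep, set(index[dep])) & allowed
            if not merged or (dep in chosen and chosen[dep] not in merged):
                consistent = False
                break
            new_constraints[dep] = merged
        if not consistent:
            continue
        result = solve(new_constraints, index, {**chosen, name: version})
        if result is not None:
            return result
    return None  # every candidate version failed: caller must backtrack


# Hypothetical package index: plotlib 2 only works with numlib 1.
index = {
    "app": {1: {"numlib": {1, 2}, "plotlib": {2}}},
    "plotlib": {1: {"numlib": {1, 2}}, 2: {"numlib": {1}}},
    "numlib": {1: {}, 2: {}},
}

solution = solve({"app": {1}}, index)
print(solution)
```

Here the search first tries the newest `numlib`, discovers that no acceptable `plotlib` works with it, and falls back to the older `numlib`. Production solvers use far more sophisticated techniques to do this efficiently across hundreds of packages.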

Enthought Canopy 2.0:
Python 3 packages and New EDM Back End Infrastructure

The new Enthought Python Distribution (EPD) and Enthought Deployment Manager (EDM) will also provide additional benefits for Canopy. Canopy 2.0, which is just around the corner, will be the first version to include Python 3 packages from EPD.

In addition, we have re-worked Canopy’s graphical package manager to use EDM as its back end, to take advantage of both the consistency and stability of the environments EDM provides, as well as its new package dependency solver.  By itself, this will provide a big boost in stability for users (ever found yourself wrapped up in a tangle of inconsistent package versions?).  Alongside the conversion of Canopy’s back end infrastructure to EDM, we have also included a substantial number of stability improvements and bug fixes.

Canopy’s Graphical Debugger adds external IPython kernel debugging support

On the integrated analysis environment side of Canopy, the graphical debugger and variable browser, first introduced in 2015, has gotten some nifty new features, including the ability to connect to and debug an external IPython kernel, in addition to a number of stability improvements.  (Weren’t aware you could connect to an external process?  Look for the context menu in the IPython console, use it to connect to the IPython kernel running, say, a Jupyter notebook, and debug away!)

Canopy Data Import Tool adds CSV exports and input file templates

Also, we’ve continued to add new features to the Canopy Data Import Tool since its initial release in May of 2016. The Data Import Tool allows users to quickly and easily import CSVs and other structured text files into Pandas DataFrames through a graphical interface, manipulate the data, and create reusable Python scripts to speed future data wrangling.

The latest version of the tool (v. 1.0.9, shipping with Canopy 2.0) has some nice new features like CSV exporting, input file templates, and more. See Enthought’s blog for some great examples of how the Data Import Tool speeds data loading, wrangling and analysis.

What to Look Forward to in 2017

So where are we headed in 2017?  We have put a lot of effort into building a strong foundation with our core suite of products, and now we’re focused on continuing to deliver new value (our enterprise users in particular have a number of new features to look forward to).  First up, for example, you can look for expanded capabilities around Python environments, making it easy to manage multiple environments, or even standardize and distribute them in your organization.  With the tremendous advancements in our core products that took place in 2016, there are a lot of follow-on features we can deliver. Stay tuned for updates!

Have a specific feature you’d like to see in one of Enthought’s products? E-mail our product team at and tell us about it!

Webinar: Solving Enterprise Python Deployment Headaches with the New Enthought Deployment Server

See a recording of the webinar:

Built on 15 years of experience in Python packaging and deployment for Fortune 500 companies, the NEW Enthought Deployment Server provides the enterprise-grade tools that groups and organizations using Python need, including:

  1. Secure, onsite access to a private copy of the proven 450+ package Enthought Python Distribution
  2. Centralized management and control of packages and Python installations
  3. Private repositories for sharing and deployment of proprietary Python packages
  4. Support for the software development workflow with Continuous Integration and development, testing, and production repositories

In this webinar, Enthought’s product team demonstrates the key features of the Enthought Deployment Server and how it can take the pain out of Python deployment and management at your organization.

Who Should Watch this Webinar:

If you answer “yes” to any of the questions below, then you (or someone at your organization) should watch this webinar:

  1. Are you using Python in a high-security environment (firewalled or air gapped)?
  2. Are you concerned about how to manage open source software licenses or compliance management?
  3. Do you need multiple Python environment configurations or do you need to have consistent standardized environments across a group of users?
  4. Are you producing or sharing internal Python packages and spending a lot of effort on distribution?
  5. Do you have a “guru” (or are you the guru?) who spends a lot of time managing Python package builds and / or distribution?

In this webinar, we demonstrate how the Enthought Deployment Server can help your organization address these situations and more.