New features in the Canopy Data Import Tool Version 1.1:
Support for Pandas v. 0.20, Excel / CSV export capabilities, and more
We’re pleased to announce a significant new feature release of the Canopy Data Import Tool, version 1.1. The Data Import Tool allows users to quickly and easily import CSVs and other structured text files into Pandas DataFrames through a graphical interface, manipulate the data, and create reusable Python scripts to speed future data wrangling. Here are some of the notable updates in version 1.1:
For those already familiar with Canopy, this post reviews the major new features in this milestone release. For those looking for a tool to improve their workflow with Python, or coming to Python from a language like MATLAB or R, it walks through the key reasons that scientists, engineers, data scientists, and analysts use Canopy for their work in Python.
No dataset is perfect, and most datasets we deal with day to day have missing values, often represented by “NA” or “NaN”. One reason the Pandas library is so popular in the data science community is its ability to handle data that contains NaN values.
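As a rough illustration of that point, the sketch below reads a small, made-up CSV in which missing readings are written as “NA” or “NaN”, then counts the gaps per column. The column names and values are invented for this example; it relies only on the fact that `pandas.read_csv` treats both markers as missing by default.

```python
import io

import pandas as pd

# Hypothetical sample data: "NA" and "NaN" mark missing readings.
raw = io.StringIO(
    "sensor,temperature,pressure\n"
    "a,21.5,NA\n"
    "b,NaN,101.3\n"
    "c,19.8,99.7\n"
)

# "NA" and "NaN" are among the strings read_csv converts to NaN by default.
df = pd.read_csv(raw)

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Aggregations such as mean() skip NaN values by default.
mean_temperature = df["temperature"].mean()

print(missing_per_column)
print(mean_temperature)
```

Because the aggregation skips NaN, the mean is computed from the two available temperature readings only.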
by: Tim Diller, Product Manager and Scientific Software Developer, Enthought
Last week I attended the AIChE (American Institute of Chemical Engineers) Spring Meeting in San Antonio, Texas. It was a great time of year to visit this cultural gem deep in the heart of Texas (and just down the road from our Austin offices), with plenty of good food, sights and sounds to take in on top of the conference and its sessions.
The AIChE Spring Meeting focuses on applications of chemical engineering in industry, and Enthought was invited to present a poster and deliver a “vendor perspective” talk on the Canopy Platform for Process Monitoring and Optimization as part of the “Big Data Analytics” track. This was my first time at AIChE, so some of the names were new, but in a lot of ways it felt very similar to other engineering conferences I have participated in over the years, such as those of ASME (American Society of Mechanical Engineers) and SAE (Society of Automotive Engineers).
This event underscored that regardless of industry, engineers are bringing the same kinds of practical ingenuity to bear on similar kinds of problems, and with the cost of data acquisition and storage plummeting in the last decade, many engineers are now sitting on more data than they know how to effectively handle.
What exactly is “big data”? Does it really matter for solving hard engineering problems?
We’ve had a number of major product development efforts underway over the last year, and we’re pleased to share a lot of new announcements for 2017:
A New Chapter for the Enthought Python Distribution (EPD):
Python 3 and Intel MKL 2017
In 2004, Enthought released the first “Python: Enthought Edition,” a Python package distribution tailored for a scientific and analytic audience. In 2008 this became the Enthought Python Distribution (EPD), a self-contained installer with the "enpkg" command-line tool to update and manage packages. Since then, over a million users have benefited from Enthought’s tested, pre-compiled set of Python packages, allowing them to focus on their science by eliminating the hassle of setting up tools.
Fast forward to 2017: we now offer over 450 Python packages, and the Enthought Python Distribution enters a new era. Access to all of the packages in the new EPD is completely free to all users and includes packages and runtimes for both Python 2 and Python 3, with some exciting new additions. Our ever-growing list of packages includes, for example, the 2017 release of the MKL (Math Kernel Library), the fruit of an ongoing collaboration with Intel.
The New Enthought Deployment Server:
Secure, Onsite Access to EPD and Private Packages
Data exploration, manipulation, and visualization start with loading data, be it from files or from a URL. Pandas has become the go-to library for all things data analysis in Python, but if you want to jump straight into data exploration and manipulation without first learning the details of programming with Pandas, the Canopy Data Import Tool can help.
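For readers curious what the hand-written equivalent looks like, here is a minimal sketch of loading a CSV with Pandas. The helper name `load_csv` and the sample data are hypothetical; the only library behavior assumed is that `pandas.read_csv` accepts a local path, a URL string, or an open file-like object.

```python
import io

import pandas as pd


def load_csv(source, **read_options):
    """Load CSV data from a local path, a URL string, or a file-like object.

    Hypothetical convenience wrapper; it simply forwards to pandas.read_csv.
    """
    return pd.read_csv(source, **read_options)


# An in-memory buffer stands in for a file or URL so the sketch is self-contained.
sample = io.StringIO("date,value\n2017-01-01,1.5\n2017-01-02,2.5\n")

# parse_dates asks read_csv to convert the named column to datetime values.
df = load_csv(sample, parse_dates=["date"])

print(df.dtypes)
```

Swapping the buffer for a file path or a URL string leaves the rest of the code unchanged.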
For example, the Data Import Tool lets you delete rows and columns containing Null values or replace the Null values in the DataFrame with a specific value. It also allows you to create new columns from existing ones. All operations are logged and reversible, so you can experiment with different workflows without worrying about making errors or forgetting steps.
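The operations described above map onto a few standard Pandas calls. The sketch below is not the code the Data Import Tool generates; it is an illustrative equivalent on a made-up DataFrame, using `dropna`, `fillna`, and ordinary column arithmetic.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a null in two of the three columns.
df = pd.DataFrame({
    "label": ["a", "b", "c"],
    "width": [2.0, np.nan, 4.0],
    "height": [3.0, 5.0, np.nan],
})

# Delete every row that contains at least one null value.
rows_dropped = df.dropna(axis=0)

# Delete every column that contains at least one null value.
cols_dropped = df.dropna(axis=1)

# Replace nulls with a specific value instead of deleting anything.
filled = df.fillna(0.0)

# Create a new column from existing ones.
filled["area"] = filled["width"] * filled["height"]

print(filled)
```

Each call returns a new DataFrame and leaves `df` untouched, which makes it easy to compare alternatives side by side before committing to one.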
In May of 2016 we released the Canopy Data Import Tool, a significant new feature of our Canopy graphical analysis environment software. With the Data Import Tool, users can now quickly and easily import CSVs and other structured text files into Pandas DataFrames through a graphical interface, manipulate the data, and create reusable Python scripts to speed future data wrangling.
Watch a 2-minute demo video to see how the Canopy Data Import Tool works:
With the latest version of the Data Import Tool released this month (v. 1.0.4), we’ve added new capabilities and enhancements, including:
The ability to select and import a specific table from among multiple tables on a webpage,
Intelligent alerts regarding the saved state of exported Python code, and