Tag Archives: Python

Webinar: An Exclusive Peek “Under the Hood” of Enthought Training and the Pandas Mastery Workshop

Enthought’s Pandas Mastery Workshop is designed to accelerate the development of skill and confidence with Python’s Pandas data analysis package: in just three days, you’ll look like an old pro! This course was created from the ground up by our training experts, based on insights from the science of human learning and on over a decade of practical experience teaching thousands of scientists, engineers, and analysts to use Python effectively in their everyday work.

In this webinar, we’ll give you the key information and insight you need to evaluate whether the Pandas Mastery Workshop is the right solution to advance your data analysis skills in Python, including:

  • Who will benefit most from the course
  • A guided tour through the course topics
  • What skills you’ll take away from the course, and how the instructional design supports that
  • What the experience is like, and why it is different from other training alternatives (with a sneak peek at actual course materials)
  • What previous workshop attendees say about the course

Date and Registration Info:
January 26, 2017, 11-11:45 AM CT
Register (if you can’t attend, register and we’ll be happy to send you a recording of the session)


Presenter: Dr. Michael Connell, VP, Enthought Training Solutions

Ed.D, Education, Harvard University
M.S., Electrical Engineering and Computer Science, MIT


Why Focus on Pandas:

Python has been identified as the most popular coding language for five years in a row. One reason for its popularity, especially among data analysts, data scientists, engineers, and scientists across diverse industries, is its extensive library of powerful tools for data manipulation, analysis, and modeling. For anyone working with tabular data (perhaps currently using a tool like Excel, R, or SAS), Pandas is the go-to tool in Python that not only makes the bulk of your work easier and more intuitive, but also provides seamless access to more specialized packages like statsmodels (statistics), scikit-learn (machine learning), and matplotlib (data visualization). Anyone looking for an entry point into the general scientific and analytic Python ecosystem should start with Pandas!

Who Should Attend: 

Whether you’re a training or learning development coordinator who wants to learn more about our training options and approach, a member of a technical team considering group training, or an individual looking for engaging and effective Pandas training, this webinar will help you quickly evaluate how the Pandas Mastery Workshop can meet your needs.


Additional Resources

Upcoming Open Pandas Mastery Workshop Sessions:

London, UK, Feb 22-24
Chicago, IL, Mar 8-10
Albuquerque, NM, Apr 3-5
Washington, DC, May 10-12
Los Alamos, NM, May 22-24
New York City, NY, Jun 7-9

Learn More

Have a group interested in training? We specialize in group and corporate training. Contact us or call 512.536.1057.

Download Enthought’s Pandas Cheat Sheets

Loading Data Into a Pandas DataFrame: The Hard Way, and The Easy Way

Data exploration, manipulation, and visualization start with loading data, be it from files or from a URL. Pandas has become the go-to library for all things data analysis in Python, but if you want to jump straight into data exploration and manipulation, the Canopy Data Import Tool lets you do so without first learning the details of programming with the Pandas library.

The Data Import Tool leverages the power of Pandas while providing an interactive UI, allowing you to visually explore and experiment with the DataFrame (the Pandas equivalent of a spreadsheet or a SQL table) without having to know the details of the Pandas-specific function calls and arguments. The Data Import Tool keeps track of all of the changes you make (in the form of Python code). That way, when you are done finding the right workflow for your data set, the Tool has a record of the series of actions you performed on the DataFrame, and you can apply them to future data sets for even faster data wrangling.

At the same time, the Tool can help you pick up how to use the Pandas library, while still getting work done. For every action you perform in the graphical interface, the Tool generates the appropriate Pandas/Python code, allowing you to see and relate the tasks to the corresponding Pandas code.

With the Data Import Tool, loading data is as simple as choosing a file or pasting a URL. If a file is chosen, the Tool automatically determines the format of the file and whether or not it is compressed, then intelligently loads its contents into a Pandas DataFrame. It does so while taking into account various possibilities that often throw a monkey wrench into initial data loading: the file might contain comment lines, it might contain a header row, the values in different columns could be of different types (e.g., DateTime or Boolean), and more.

Importing files or data into Pandas with the Canopy Data Import Tool

The Data Import Tool makes loading data into a Pandas DataFrame as simple as choosing a file or pasting a URL.

A Glimpse into Loading Data into Pandas DataFrames (The Hard Way)

The following four “inconvenience” examples show typical problems (and their manual solutions) that arise when you write Pandas code to load data. The Data Import Tool solves each of them automatically, saving you time and frustration and letting you get to the important work of data analysis more quickly.
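
As a rough illustration of this kind of manual loading code (not one of the post's own examples), the sketch below reads a small CSV that has comment lines, a header row, a date column, and a boolean-like column. The sample data and column names are invented for this illustration; the read_csv parameters shown are standard Pandas options.

```python
import io

import pandas as pd

# Hypothetical sample file: comment lines, a header row, and mixed column types.
raw = """# exported by a hypothetical logger
# units: degrees Celsius
timestamp,sensor_ok,temperature
2017-01-01 00:00:00,yes,20.5
2017-01-01 01:00:00,no,21.0
2017-01-01 02:00:00,yes,19.8
"""

df = pd.read_csv(
    io.StringIO(raw),
    comment="#",                 # skip lines that start with '#'
    header=0,                    # first non-comment line holds the column names
    parse_dates=["timestamp"],   # convert this column to a datetime type
    true_values=["yes"],         # map these strings to True ...
    false_values=["no"],         # ... and these to False
)

print(df.dtypes)
```

Each keyword argument handles one of the complications listed above, and each has to be discovered and specified by hand when loading data this way.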

Continue reading

Mayavi (Python 3D Data Visualization and Plotting Library) adds major new features in recent release

Key updates include: Jupyter notebook integration, movie recording capabilities, time series animation, updated VTK compatibility, and Python 3 support

by Prabhu Ramachandran, core developer of Mayavi and director, Enthought India

The Mayavi development team is pleased to announce Mayavi 4.5.0, which is an important release both for new features and core functionality updates.

Mayavi is a general purpose, cross-platform Python package for interactive 2-D and 3-D scientific data visualization. Mayavi integrates seamlessly with NumPy (fast numeric computation library for Python) and provides a convenient Pythonic wrapper for the powerful VTK (Visualization Toolkit) library. Mayavi provides a standalone UI to help visualize data, and is easy to extend and embed in your own dialogs and UIs. For full information, please see the Mayavi documentation.

Mayavi is part of the Enthought Tool Suite of open source application development packages and is available to install through Enthought Canopy’s Package Manager (you can download Canopy here).

Mayavi 4.5.0 is an important release which adds the following features:

  1. Jupyter notebook support: Adds basic support for displaying Mayavi images or interactive X3D scenes
  2. Support for recording movies and animating time series
  3. Support for the new matplotlib color schemes
  4. Improvements on the experimental Python 3 support from the previous release
  5. Compatibility with VTK-5.x, VTK-6.x, and VTK-7.x

For more details on the full set of changes, see here.

Let’s take a look at some of these new features in more detail:

Jupyter Notebook Support

This feature is still basic and experimental, but it is convenient. The feature allows one to embed either a static PNG image of the scene or a richer X3D scene into a Jupyter notebook. To use this feature, one should first initialize the notebook with the following:

from mayavi import mlab
mlab.init_notebook()

Subsequently, one may simply do:

s = mlab.test_plot3d()
s

This will embed a 3-D visualization producing something like this:

Mayavi in a Jupyter Notebook

Embedded 3-D visualization in a Jupyter notebook using Mayavi

When the init_notebook function is called, it configures the Mayavi objects so they can be rendered in the Jupyter notebook. By default, init_notebook selects the X3D backend, which requires a network connection and reasonable off-screen support. Currently this will not work on a remote Linux/OS X server unless VTK has been built with off-screen support via OSMesa, as discussed here.

For more documentation on the Jupyter support see here.

Animating Time Series

This feature makes it very easy to animate a time series. Suppose one has a set of files that constitute a time series (files of the form some_name[0-9]*.ext). One can load any file that is part of this time series like so:

from mayavi import mlab
src = mlab.pipeline.open('data_01.vti')

Animating these is now very easy if one simply does the following:

src.play = True

This can also be done on the UI. There is also a convenient option to synchronize multiple time series files using the “sync timestep” option on the UI or from Python. The screenshot below highlights the new features in action on the UI:

Time Series Animation in Mayavi

New time series animation feature in the Python Mayavi 3D visualization library.
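
The naming convention mentioned above (some_name[0-9]*.ext) can be checked before loading a file. The sketch below uses only the Python standard library to group file names that share a base name and extension and differ only in a trailing number. The helper name group_time_series and the file names are made up for illustration, and this is not how Mayavi itself detects a series.

```python
import re
from collections import defaultdict

# Matches names such as 'data_01.vti': a base name, trailing digits, an extension.
_SERIES_RE = re.compile(r"^(?P<base>.*?)(?P<index>\d+)\.(?P<ext>[^.]+)$")


def group_time_series(filenames):
    """Group file names by (base, ext) and sort each group by its number."""
    groups = defaultdict(list)
    for name in filenames:
        match = _SERIES_RE.match(name)
        if match is None:
            continue  # no trailing digits before the extension: not part of a series
        key = (match.group("base"), match.group("ext"))
        groups[key].append((int(match.group("index")), name))
    # Sort numerically so that 'data_10' comes after 'data_02'.
    return {key: [n for _, n in sorted(items)] for key, items in groups.items()}


files = ["data_02.vti", "data_01.vti", "data_10.vti", "notes.txt", "other_1.vtk"]
series = group_time_series(files)
print(series[("data_", "vti")])
```

Sorting on the integer value of the index, rather than on the file name, keeps the time steps in order even when the numbers are not zero-padded.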

Recording Movies

One can also create a movie (really a stack of images) while playing a time series or running any animation. On the UI, one can select a Mayavi scene, navigate to the movie tab, and select the “record” checkbox. Any animations will then record screenshots of the scene. The same setting can be enabled from Python, for example:

from mayavi import mlab
f = mlab.figure()
f.scene.movie_maker.record = True
mlab.test_contour3d_anim()

This will create a set of images, one for each step of the animation. A GIF animation of these is shown below:

Recording movies with Mayavi

Recording movies as GIF animations using Mayavi

More than 50 pull requests were merged since the last release. We are thankful to Prabhu Ramachandran, Ioannis Tziakos, Kit Choi, Stefano Borini, Gregory R. Lee, Patrick Snape, Ryan Pepper, SiggyF, and daytonb for their contributions towards this release.

Additional Resources on Mayavi:

Geophysical Tutorial: Facies Classification using Machine Learning and Python

Published in the October 2016 edition of The Leading Edge magazine by the Society of Exploration Geophysicists. Read the full article here.

By Brendon Hall, Enthought Geosciences Applications Engineer 
Coordinated by Matt Hall, Agile Geoscience

ABSTRACT

There has been much excitement recently about big data and the dire need for data scientists who possess the ability to extract meaning from it. Geoscientists, meanwhile, have been doing science with voluminous data for years, without needing to brag about how big it is. But now that large, complex data sets are widely available, there has been a proliferation of tools and techniques for analyzing them. Many free and open-source packages now exist that provide powerful additions to the geoscientist’s toolbox, much of which used to be only available in proprietary (and expensive) software platforms.

One of the best examples is scikit-learn, a collection of tools for machine learning in Python. What is machine learning? You can think of it as a set of data-analysis methods that includes classification, clustering, and regression. These algorithms can be used to discover features and trends within the data without being explicitly programmed, in essence learning from the data itself.

Well logs and facies classification results from a single well.

In this tutorial, we will demonstrate how to use a classification algorithm known as a support vector machine to identify lithofacies based on well-log measurements. A support vector machine (or SVM) is a type of supervised-learning algorithm, which needs to be supplied with training data to learn the relationships between the measurements (or features) and the classes to be assigned. In our case, the features will be well-log data from nine gas wells. These wells have already had lithofacies classes assigned based on core descriptions. Once we have trained a classifier, we will use it to assign facies to wells that have not been described.
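
To make this workflow concrete, here is a minimal sketch of training and applying a support vector machine with scikit-learn. It uses randomly generated stand-in data rather than the tutorial's well logs; the number of features, the class labels, and the parameter values are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for well-log features: two well-separated classes,
# five made-up measurements per sample (NOT the tutorial's data).
n_per_class = 100
class_0 = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 5))
class_1 = rng.normal(loc=4.0, scale=1.0, size=(n_per_class, 5))
X = np.vstack([class_0, class_1])
y = np.array([0] * n_per_class + [1] * n_per_class)  # hypothetical facies labels

# Hold out some labeled samples to check the classifier on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale the features, then fit an SVM with an RBF kernel.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
predicted = clf.predict(X_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The held-out split plays the role of the wells that have not been described: the classifier never sees those labels during training, so its score on them gives a first indication of how well it generalizes.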

See the tutorial in The Leading Edge here.

ADDITIONAL RESOURCES:

Webinar: Introducing the NEW Python Integration Toolkit for LabVIEW

See a recording of the webinar:

LabVIEW is a software platform made by National Instruments, used widely in industries such as semiconductors, telecommunications, aerospace, manufacturing, electronics, and automotive for test and measurement applications. In August 2016, Enthought released the Python Integration Toolkit for LabVIEW, which is a “bridge” between the LabVIEW and Python environments.

In this webinar, we’ll demonstrate:

  1. How the new Python Integration Toolkit for LabVIEW from Enthought seamlessly brings the power of the Python ecosystem of scientific and engineering tools to LabVIEW
  2. Examples of how you can extend LabVIEW with Python, including using Python for signal and image processing, cloud computing, web dashboards, machine learning, and more

Continue reading

Webinar: Fast Forward Through the “Dirty Work” of Data Analysis: New Python Data Import and Manipulation Tool Makes Short Work of Data Munging Drudgery

Python Import & Manipulation Tool Intro Webinar

Whether you are a data scientist, quantitative analyst, or an engineer, or if you are evaluating consumer purchase behavior, stock portfolios, or design simulation results, your data analysis workflow probably looks a lot like this:

Acquire > Wrangle > Analyze and Model > Share and Refine > Publish

The problem is that often 50 to 80 percent of time is spent wading through the tedium of the first two steps, acquiring and wrangling data, before even getting to the real work of analysis and insight. (See The New York Times, For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights)

WHAT YOU’LL LEARN:

Enthought Canopy Data Import Tool

Try the Data Import Tool with your own data. Download here.

In this webinar we’ll demonstrate how the new Canopy Data Import Tool can significantly reduce the time you spend on data analysis “dirty work,” by helping you:

  • Load various data file types and URLs containing embedded tables into Pandas DataFrames
  • Perform common data munging tasks that improve raw data
  • Handle complicated and/or messy data
  • Extend the work done with the tool to other data files

WEBINAR RECORDING:
Continue reading