Author Archives: admin

Webinar: An Exclusive Peek “Under the Hood” of Enthought Training and the Pandas Mastery Workshop

pandas-mastery-workshop-webinar-email-header-900x300comp

Enthought’s Pandas Mastery Workshop is designed to accelerate the development of skill and confidence with Python’s Pandas data analysis package — in just three days, you’ll look like an old pro! This course was created ground up by our training experts based on insights from the science of human learning, as well as what we’ve learned from over a decade of extensive practical experience of teaching thousands of scientists, engineers, and analysts to use Python effectively in their everyday work.

In this webinar, we’ll give you the key information and insight you need to evaluate whether the Pandas Mastery Workshop is the right solution to advance your data analysis skills in Python, including:

  • Who will benefit most from the course
  • A guided tour through the course topics
  • What skills you’ll take away from the course, how the instructional design supports that
  • What the experience is like, and why it is different from other training alternatives (with a sneak peek at actual course materials)
  • What previous workshop attendees say about the course

Date and Registration Info:
January 26, 2017, 11-11:45 AM CT
Register (if you can’t attend, register and we’ll be happy to send you a recording of the session)

Register


michael_connell-enthought-vp-trainingPresenter: Dr. Michael Connell, VP, Enthought Training Solutions

Ed.D, Education, Harvard University
M.S., Electrical Engineering and Computer Science, MIT


Why Focus on Pandas:

Python has been identified as the most popular coding language for five years in a row. One reason for its popularity, especially among data analysts, data scientists, engineers, and scientists across diverse industries, is its extensive library of powerful tools for data manipulation, analysis, and modeling. For anyone working with tabular data (perhaps currently using a tool like Excel, R, or SAS), Pandas is the go-to tool in Python that not only makes the bulk of your work easier and more intuitive, but also provides seamless access to more specialized packages like statsmodels (statistics), scikit-learn (machine learning), and matplotlib (data visualization). Anyone looking for an entry point into the general scientific and analytic Python ecosystem should start with Pandas!

Who Should Attend: 

Whether you’re a training or learning development coordinator who wants to learn more about our training options and approach, a member of a technical team considering group training, or an individual looking for engaging and effective Pandas training, this webinar will help you quickly evaluate how the Pandas Mastery Workshop can meet your needs.


Additional Resources

Upcoming Open Pandas Mastery Workshop Sessions:

London, UK, Feb 22-24
Chicago, IL, Mar 8-10
Albuquerque, NM, Apr 3-5
Washington, DC May 10-12
Los Alamos, NM, May 22-24
New York City, NY, Jun 7-9

Learn More

Have a group interested in training? We specialize in group and corporate training. Contact us or call 512.536.1057.

Download Enthought’s Pandas Cheat Sheets

Loading Data Into a Pandas DataFrame: The Hard Way, and The Easy Way

Data exploration, manipulation, and visualization start with loading data, be it from files or from a URL. Pandas has become the go-to library for all things data analysis in Python, but if your intention is to jump straight into data exploration and manipulation, the Canopy Data Import Tool can help, instead of having to learn the details of programming with the Pandas library.

The Data Import Tool leverages the power of Pandas while providing an interactive UI, allowing you to visually explore and experiment with the DataFrame (the Pandas equivalent of a spreadsheet or a SQL table), without having to know the details of the Pandas-specific function calls and arguments. The Data Import Tool keeps track of all of the changes you make (in the form of Python code). That way, when you are done finding the right workflow for your data set, the Tool has a record of the series of actions you performed on the DataFrame, and you can apply them to future data sets for even faster data wrangling in the future.

At the same time, the Tool can help you pick up how to use the Pandas library, while still getting work done. For every action you perform in the graphical interface, the Tool generates the appropriate Pandas/Python code, allowing you to see and relate the tasks to the corresponding Pandas code.

With the Data Import Tool, loading data is as simple as choosing a file or pasting a URL. If a file is chosen, it automatically determines the format of the file, whether or not the file is compressed, and intelligently loads the contents of the file into a Pandas DataFrame. It does so while taking into account various possibilities that often throw a monkey wrench into initial data loading: that the file might contain lines that are comments, it might contain a header row, the values in different columns could be of different types e.g. DateTime or Boolean, and many more possibilities as well.

Importing files or data into Pandas with the Canopy Data Import Tool

The Data Import Tool makes loading data into a Pandas DataFrame as simple as choosing a file or pasting a URL.

A Glimpse into Loading Data into Pandas DataFrames (The Hard Way)

The following 4 “inconvenience” examples show typical problems (and the manual solutions) that might arise if you are writing Pandas code to load data, which are automatically solved by the Data Import Tool, saving you time and frustration, and allowing you to get to the important work of data analysis more quickly.

Continue reading

Webinar: Solving Enterprise Python Deployment Headaches with the New Enthought Deployment Server

See a recording of the webinar:

Built on 15 years of experience of Python packaging and deployment for Fortune 500 companies, the NEW Enthought Deployment Server provides enterprise-grade tools groups and organizations using Python need, including:

  1. Secure, onsite access to a private copy of the proven 450+ package Enthought Python Distribution
  2. Centralized management and control of packages and Python installations
  3. Private repositories for sharing and deployment of proprietary Python packages
  4. Support for the software development workflow with Continuous Integration and development, testing, and production repositories

In this webinar, Enthought’s product team demonstrates the key features of the Enthought Deployment Server and how it can take the pain out of Python deployment and management at your organization.

Who Should Watch this Webinar:

If you answer “yes” to any of the questions below, then you (or someone at your organization) should watch this webinar:

  1. Are you using Python in a high-security environment (firewalled or air gapped)?
  2. Are you concerned about how to manage open source software licenses or compliance management?
  3. Do you need multiple Python environment configurations or do you need to have consistent standardized environments across a group of users?
  4. Are you producing or sharing internal Python packages and spending a lot of effort on distribution?
  5. Do you have a “guru” (or are you the guru?) who spends a lot of time managing Python package builds and / or distribution?

In this webinar, we demonstrate how the Enthought Deployment Server can help your organization address these situations and more.

Geophysical Tutorial: Facies Classification using Machine Learning and Python

Published in the October 2016 edition of The Leading Edge magazine by the Society of Exploration Geophysicists. Read the full article here.

By Brendon Hall, Enthought Geosciences Applications Engineer 
Coordinated by Matt Hall, Agile Geoscience

ABSTRACT

There has been much excitement recently about big data and the dire need for data scientists who possess the ability to extract meaning from it. Geoscientists, meanwhile, have been doing science with voluminous data for years, without needing to brag about how big it is. But now that large, complex data sets are widely available, there has been a proliferation of tools and techniques for analyzing them. Many free and open-source packages now exist that provide powerful additions to the geoscientist’s toolbox, much of which used to be only available in proprietary (and expensive) software platforms.

One of the best examples is scikit-learn, a collection of tools for machine learning in Python. What is machine learning? You can think of it as a set of data-analysis methods that includes classification, clustering, and regression. These algorithms can be used to discover features and trends within the data without being explicitly programmed, in essence learning from the data itself.

Well logs and facies classification results from a single well.

Well logs and facies classification results from a single well.

In this tutorial, we will demonstrate how to use a classification algorithm known as a support vector machine to identify lithofacies based on well-log measurements. A support vector machine (or SVM) is a type of supervised-learning algorithm, which needs to be supplied with training data to learn the relationships between the measurements (or features) and the classes to be assigned. In our case, the features will be well-log data from nine gas wells. These wells have already had lithofacies classes assigned based on core descriptions. Once we have trained a classifier, we will use it to assign facies to wells that have not been described.

See the tutorial in The Leading Edge here.

ADDITIONAL RESOURCES:

Canopy Data Import Tool: New Updates

In May of 2016 we released the Canopy Data Import Tool, a significant new feature of our Canopy graphical analysis environment software. With the Data Import Tool, users can now quickly and easily import CSVs and other structured text files into Pandas DataFrames through a graphical interface, manipulate the data, and create reusable Python scripts to speed future data wrangling.

Watch a 2-minute demo video to see how the Canopy Data Import Tool works:

With the latest version of the Data Import Tool released this month (v. 1.0.4), we’ve added new capabilities and enhancements, including:

  1. The ability to select and import a specific table from among multiple tables on a webpage,
  2. Intelligent alerts regarding the saved state of exported Python code, and
  3. Unlimited file sizes supported for import.

Download Canopy and start a free 7 day trial of the data import tool Continue reading

Webinar: Introducing the NEW Python Integration Toolkit for LabVIEW

See a recording of the webinar:

LabVIEW is a software platform made by National Instruments, used widely in industries such as semiconductors, telecommunications, aerospace, manufacturing, electronics, and automotive for test and measurement applications. In August 2016, Enthought released the Python Integration Toolkit for LabVIEW, which is a “bridge” between the LabVIEW and Python environments.

In this webinar, we’ll demonstrate:

  1. How the new Python Integration Toolkit for LabVIEW from Enthought seamlessly brings the power of the Python ecosystem of scientific and engineering tools to LabVIEW
  2. Examples of how you can extend LabVIEW with Python, including using Python for signal and image processing, cloud computing, web dashboards, machine learning, and more

Continue reading

5 Simple Steps to Create a Real-Time Twitter Feed in Excel using Python and PyXLL

PyXLL 3.0 introduced a new, simpler, way of streaming real time data to Excel from Python.

Excel has had support for real time data (RTD) for a long time, but it requires a certain knowledge of COM to get it to work. With the new RTD features in PyXLL 3.0 it is now a lot simpler to get streaming data into Excel without having to write any COM code.

This blog will show how to build a simple real time data feed from Twitter in Python using the tweepy package, and then show how to stream that data into Excel using PyXLL.

(Note: The code from this blog is available on github https://github.com/pyxll/pyxll-examples/tree/master/twitter). 

Create a real-time Twitter data feed using Python and PyXLL.

Create a real-time Twitter feed in Excel using Python and PyXLL.

Continue reading

AAPG 2016 Conference Technical Presentation: Unlocking Whole Core CT Data for Advanced Description and Analysis

Microscale Imaging for Unconventional Plays Track Technical Presentation:

Unlocking Whole Core CT Data for Advanced Description and Analysis

Brendon Hall, Geoscience Applications Engineer, EnthoughtAmerican Association of Petroleum Geophysicists (AAPG)
2016 Annual Convention and Exposition Technical Presentation
Tuesday June 21st at 4:15 PM, Hall B, Room 2, BMO Centre, Calgary

Presented by: Brendon Hall, Geoscience Applications Engineer, Enthought, and Andrew Govert, Geologist, Cimarex Energy

PRESENTATION ABSTRACT:

It has become an industry standard for whole-core X-ray computed tomography (CT) scans to be collected over cored intervals. The resulting data is typically presented as static 2D images, video scans, and as 1D density curves.

CT scan of core pre- and post-processing

CT scans of cores before and after processing to remove artifacts and normalize features.

However, the CT volume is a rich data set of compositional and textural information that can be incorporated into core description and analysis workflows. In order to access this information the raw CT data initially has to be processed to remove artifacts such as the aluminum tubing, wax casing and mud filtrate. CT scanning effects such as beam hardening are also accounted for. The resulting data is combined into contiguous volume of CT intensity values which can be directly calibrated to plug bulk density.

Continue reading