Category Archives: SciPy

Webinar: Work Better, Smarter, and Faster in Python with Enthought Training on Demand

Join Us For a Webinar

Enthought Training on Demand Webinar

We’ll demonstrate how Enthought Training on Demand can help both new Python users and experienced Python developers be better, smarter, and faster at the scientific and analytic computing tasks that directly impact their daily productivity and drive results.

View a recording of the Work Better, Smarter, and Faster in Python with Enthought Training on Demand webinar here.

What You’ll Learn


Exploring NumPy/SciPy with the “House Location” Problem

Author: Aaron Waters

I created a Notebook that describes how to examine, illustrate, and solve a geometric mathematical problem called “House Location” using Python mathematical and numeric libraries. The discussion uses symbolic computation, visualization, and numerical computations to solve the problem while exercising the NumPy, SymPy, Matplotlib, IPython and SciPy packages.

I hope that this discussion will be accessible to people with a minimal background in programming and a high-school level background in algebra and analytic geometry. There is a brief mention of complex numbers, but the use of complex numbers is not important here except as “values to be ignored”. I also hope that this discussion illustrates how to combine different mathematically oriented Python libraries and explains how to smooth out some of the rough edges between the library interfaces.

http://nbviewer.ipython.org/urls/raw.github.com/awatters/CanopyDemoArchive/master/misc/house_locations.ipynb

Advanced Cython Recorded Webinar: Typed Memoryviews

Author: Kurt Smith

Typed memoryviews are a new Cython feature for accessing memory buffers, such as NumPy arrays, without any Python overhead. This makes them very useful for manipulating blocks of memory in Cython directly without calling into the Python-C API.  Typed memoryviews have a clean declaration syntax and have a NumPy-like look and feel, supporting slicing, striding and indexing.

I go into more detail and provide some specific examples on how to use typed memoryviews in this webinar: “Advanced Cython: Using the new Typed Memoryviews”.

If you would like to watch the recorded webinar, you can find a link below (the different formats will play directly in different browsers so check to see which one works for you, and you won’t have to download the whole recording ahead of time):

For all you EPD Users: Canopy v1.1

EPD (Enthought Python Distribution) provided a simple install of Python for scientific computing on the major platforms: Windows, Linux, and Mac OS. Those looking for a clean, straightforward Python stack to unpack into a particular directory found EPD to be pretty ideal.

With the introduction of Enthought Canopy, we began addressing users who are more engineer or scientist than programmer and are much less familiar with command-line interfaces. The Canopy desktop (in the vein of MATLAB or Spyder) aims at these technical users who want to use Python, but more as an application or IDE. To implement the desktop in Python and to allow both it and a user-defined Python environment to co-exist and be separately updated, we used virtual environments. As a consequence, Canopy can feel a bit foreign to EPD users. With 1.1 we have added a new command line interface (CLI) that will hopefully make EPD users feel more at home in Canopy while retaining many of the Canopy advantages such as in-place update and virtual environment support.

Now, EPD users who just want to use Canopy as a plain Python environment with their own tools or IDE can easily create one or more Python environments. For example, from the command line on Windows:

        Canopy_cli.exe setup C:\Python27

or on Linux:

        canopy_cli setup ~/canopy

The target directory can be any you choose. If you want to make this Python environment the default on your system, you can specify the --default switch, and Canopy will add the appropriate bin directory (Scripts directory on Windows) to your PATH environment variable. On Mac OS and Linux systems, Canopy does this by appending a line to your ~/.bash_profile file which activates the correct virtual environment. On Windows, this Python environment is also added to the system registry so third-party tools can correctly find it.

Since we use virtual environments, the installation layout for Canopy is different. With Canopy we install what is referred to as “Canopy Core”: the core Python environment and a minimum set of packages needed to bootstrap Canopy itself. With it we can lock down the Canopy environment, facilitate the automatic update mechanism, and provide reliable startup and fail-safe recovery. User code and packages live in a separate environment. This means when a Python update comes out, it is no longer necessary to install a whole new environment plus all of your packages and get everything working again. Instead, simply update Canopy and go back to working: all of your packages are still installed but Python has been upgraded.

To complete an install, Canopy creates two virtual environments named ‘System’ and ‘User’. System is where the Canopy GUI runs; no user code runs in this environment. Updates to this virtual environment are done via the Canopy update mechanisms. The User environment is where the kernel and all user code runs. This virtual environment is managed by Package Manager from the desktop or by enpkg from the command line; any packages can be updated and installed without fear of disrupting the GUI. Similarly, updates to the Canopy GUI will not affect packages installed in the User environment and break your code.

So why stick with virtual environments for an “EPD-like” install? One of the big challenges with the old, “flat” EPD installation method was updating an install, or trying out different package configurations. With virtual environments, you can create a new environment which inherits packages from another virtual environment, and try out a few package changes. When you are satisfied, it’s straightforward to throw away the experimentation area and make the changes to the original, stable virtual environment.

For more details, check out Creating an EPD-like Python environment in our online docs. And you can download Canopy v1.1 now.

Canopy v1.1 – Linux, Command Line Interface and More


With version 1.1, Enthought Canopy now:

1) addresses, much more completely, the command line use cases that EPD users and IT managers expect from their Python distributions,

2) makes Linux support generally available,

3) streamlines installation for users without internet access with full, single-click installers,

4) supports multiple virtual environments for advanced users via “venv” backported to Python 2.7, and

5) provides updates like numpy 1.7.1, matplotlib 1.3.0 and more.

It’s been just over 4 months since Canopy v1.0 shipped with the new desktop analysis environment and our updated Python distribution for scientific computing. Canopy’s analysis environment seems to be well-received by users looking for a simpler GUI environment, but the Canopy graphical installation process left something to be desired by our EPD users.

Along with the Canopy desktop for users that don’t want to work directly from a command line, Canopy version 1.1 now provides command-line utilities that streamline the installation of a complete Python scientific stack for current EPD users who want to work from the shell or command line. In addition, IT groups or tools specialists that need to manage a central install of Python for a workgroup or department now have the tools they need to install and maintain Canopy. Version 1.1’s command-line installation and setup (and the 1-click, full installers detailed below) are much better for supporting Canopy installations on clusters as well.

Canopy for Linux is now fully released. We have full, tested support for RedHat5, CentOS5, and Ubuntu 12.04. Linux distros and versions beyond those work as well (anecdotally and based on some in-house use), but those are our tested versions.

With Canopy v1.0 we implemented a 2-step installation process. The installer includes the Canopy desktop, the Python packages needed by Canopy itself, and other core scientific Python stack packages for a minimal install (the libraries in Canopy Express). For those with a subscription, the second step requires downloading any additional packages using the Package Manager. This 2-step process is problematic for users that don’t have easy internet access or need to install centrally for a group. To help, we now provide full installers with all the Python packages we support included. This provides a streamlined 1-step install process for those who need it or want it.

To ensure users can install any package updates they wish without messing up package dependencies for Canopy itself, we use virtual environments under the hood. With v1.1 we now provide command-line access to our backport of “venv”. The new CLI provides utilities to create, upgrade, activate and deactivate your own virtual environments. Now it’s much easier to try out new Python environments or set up multiple configurations for a workgroup.

Canopy v1.1 ships many updates to packages and many new ones: OpenCV, LLVM, Bottleneck, gevent, msgpack, py, pytest, six, NLTK, Numba, Mock, patsy and more. You can see the full details on the Canopy Package Index page.

We hope you find version 1.1 useful!

Raspberry Pi Sensor and Actuator Control

Author: Jack Minardi

I gave a talk at SciPy 2013 titled open('dev/real_world') Raspberry Pi Sensor and Actuator Control. You can find the video on YouTube and the slides on Google Drive; I will summarize the content here.

Typically as a programmer you will work with data on disk, and if you are lucky you will draw pictures on the screen. This is in contrast to physical computing which allows you as a programmer to work with data sensed from the real world and with data sent to control devices that move in the real world.

Mars Rover

physical computing at work. (source)

Goal

Use a Raspberry Pi to read accelerometer values and to control a servo motor.

Definitions

  • Raspberry Pi
    • Small $35 Linux computer with 2 USB ports, HDMI out, Ethernet, and most importantly…
  • GPIO Pins
    • General Purpose Input/Output Pins
    • This is the component that truly enables “physical computing”. You as a programmer can set the voltage high or low on each pin, which is how you will talk to actuators. You can also read what the voltage is currently on each pin. This is how sensors will talk back to you. It is important to note that each pin represents a binary state: you can only output a 0 or a 1, nothing in between.

In this article I will go over four basic Python projects to demonstrate the hardware capabilities of the Raspberry Pi. Those projects are:

  • Blink an LED.
  • Read a pot (potentiometer).
  • Stream data.
  • Control a servo.

Blink an LED.

An LED is a Light Emitting Diode. A diode is a circuit element that allows current to flow in one direction but not the other. Light emitting means … it emits light. Your typical LED needs current in the range of 10-30 mA and will drop about 2-3 volts. If you connect an LED directly to your Pi’s GPIO it will source much more than 30 mA and will probably fry your LED. To prevent this we have to put a resistor in series with the LED. If you want to do math you can calculate the appropriate resistance using the following equation:

R = (Vs - Vd) / I

But if you don’t want to do math then pick a resistor between 500-1500 ohms. Once you’ve gathered up all your circuit elements (LED and resistor), build this circuit on a breadboard:

LED Circuit

That’s not so bad, is it?
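
For readers who do want to do the math, here is a minimal sketch of the resistor formula above. The article does not define the symbols, so reading Vs as the supply voltage, Vd as the voltage drop across the LED, and I as the desired LED current is an assumption, as are the example values and the function name.

```python
def led_resistor_ohms(supply_volts, led_drop_volts, led_current_amps):
    """Series resistance from R = (Vs - Vd) / I."""
    if led_current_amps <= 0:
        raise ValueError("LED current must be positive")
    return (supply_volts - led_drop_volts) / led_current_amps


# Assumed example values: a 3.3 V GPIO pin, a 2.0 V LED drop, 10 mA of current.
r = led_resistor_ohms(3.3, 2.0, 0.010)
print("Series resistance: %.0f ohms" % r)
```

Under the same assumptions, a larger resistor such as the 500-1500 ohm range suggested above simply lets less current through, giving a dimmer but safe LED.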

The code is also pretty simple. But first you will need to install RPi.GPIO. (It might come preinstalled on your OS.)

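import time
from itertools import cycle
import RPi.GPIO as io

io.setmode(io.BCM)    # refer to pins by Broadcom GPIO number
io.setup(12, io.OUT)  # configure pin 12 as an output

o = cycle([1, 0])     # endlessly alternate high (1) and low (0)
while True:
    io.output(12, next(o))  # next(o) works on Python 2.6+ and Python 3
    time.sleep(0.5)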

The important lines basically are:

io.setup(12, io.OUT)
io.output(12, 1)

These lines of code set up pin 12 as an output, and then output a 1 (3.3 volts). Run the above code connected to the circuit and you should see your LED blinking on and off every half second.


Read a pot.

Pot is short for potentiometer, which is a variable resistor. This is just a fancy word for knob. Basically, by turning the knob you change the resistance, which changes the voltage across the pot (V = IR, remember?). Changing voltage relative to some physical value is how many sensors work, and this class of sensor is known as an analog sensor. Remember when I said the GPIO pins can only represent a binary state? We will have to call in the aid of some more silicon to convert that analog voltage value into a binary stream of bits our Pi can handle.

That chunk of silicon is referred to as an Analog-to-Digital Converter (ADC). The one I like is the MCP3008. It has 8 10-bit channels, meaning we can read 8 sensor values with a resolution of 1024 steps each (2^10). This maps our input voltage of 0 – 3.3 volts to an integer between 0 and 1023.

LED Circuit

I’ve turned the Pi into ephemeral yellow labels to simplify the diagram

To talk to the chip we will need a Python package called spidev. For more information about the package and how it works with the MCP3008, check out this great blog post.

With spidev installed and the circuit built, run the following program to read live sensor values and print them to stdout.

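import spidev
import time

spi = spidev.SpiDev()
spi.open(0, 0)  # SPI bus 0, device (chip select) 0

def readadc(adcnum):
    # The MCP3008 has 8 channels, numbered 0 through 7
    if not 0 <= adcnum <= 7:
        return -1
    # Send the start bit, then single-ended mode plus the channel number
    r = spi.xfer2([1, (8+adcnum)<<4, 0])
    # The 10-bit result is the low 2 bits of r[1] followed by all of r[2]
    adcout = ((r[1] & 3) << 8) + r[2]
    return adcout

while True:
    val = readadc(0)
    print(val)
    time.sleep(0.5)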

The most important parts are these two lines:

r = spi.xfer2([1, (8+adcnum)<<4, 0])
adcout = ((r[1] & 3) << 8) + r[2]

They send the read command and extract the relevant returned bits. See the blog post I linked above for more information on what is going on here.
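
To make the bit manipulation concrete without any hardware attached, here is a small sketch that builds the command bytes and decodes a reply by hand. The helper names, the sample reply bytes, and the 3.3 V reference used for the voltage conversion are illustrative assumptions, not part of the original program.

```python
def build_command(channel):
    """Three bytes to send: start bit, single-ended mode plus channel, padding."""
    if not 0 <= channel <= 7:
        raise ValueError("channel must be 0-7")
    return [1, (8 + channel) << 4, 0]


def decode_reply(reply):
    """Join the low 2 bits of byte 1 and all 8 bits of byte 2 into a 10-bit count."""
    return ((reply[1] & 3) << 8) + reply[2]


def count_to_volts(count, vref=3.3):
    """Scale a 0-1023 count to volts; vref=3.3 is an assumption about the wiring."""
    return count * vref / 1023.0


command = build_command(0)
sample_reply = [0, 2, 255]  # made-up reply bytes, not real sensor data
count = decode_reply(sample_reply)
print(command, count, round(count_to_volts(count), 3))
```

Masking with 3 keeps only the two most significant bits of the result, and shifting them left by 8 makes room for the remaining eight bits in the last byte.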


Stream data.

To stream data over the wire we will be using the ØMQ networking library and implementing the REQUEST/REPLY pattern. ØMQ makes it super simple to set up a client and server in Python. The following is a complete working example.

Server

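import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)  # reply side of REQUEST/REPLY
socket.bind('tcp://*:1980')       # listen on port 1980 on all interfaces

while True:
    message = socket.recv()       # block until a request arrives
    print(message)
    socket.send(b"I'm here")      # bytes literal works on Python 2 and 3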

Client

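import zmq

context = zmq.Context()
socket = context.socket(zmq.REQ)  # request side of REQUEST/REPLY
a = 'tcp://192.168.1.6:1980'      # address of the machine running the server
socket.connect(a)

for request in range(10):
    socket.send(b'You home?')     # bytes literal works on Python 2 and 3
    message = socket.recv()       # block until the reply arrives
    print(message)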

Now we can use Traits and Enaml to make a pretty UI on the client side. Check out the acc_plot demo in the GitHub repo to see an example of the Pi streaming data over the wire to be plotted by a client.


Control a servo

Servos are (often small) motors which you can drive to certain positions. For example, for a given servo you may be able to set the drive shaft from 0 to 180 degrees, or anywhere in between. As you can imagine, this could be useful for a lot of tasks, not least of which is robotics.

Shaft rotation is controlled by Pulse Width Modulation (PWM), in which you encode information in the duration of a high-voltage pulse on the GPIO pins. Most hobby servos interpret pulse widths in a standard way: a 0.5 ms pulse means go to your min position and a 2.5 ms pulse means go to your max position. Now repeat this pulse every 20 ms and you’re controlling a servo.

PWM Diagram

The pulse width is much more critical than the frequency
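
The pulse timing described above can be sketched as a small calculation. The 0.5 ms, 2.5 ms, and 20 ms figures come from the text; the linear mapping between the two end positions and the function names are assumptions made for illustration.

```python
MIN_PULSE_MS = 0.5   # pulse width for the minimum position (from the text)
MAX_PULSE_MS = 2.5   # pulse width for the maximum position (from the text)
PERIOD_MS = 20.0     # how often the pulse repeats (from the text)


def pulse_width_ms(position):
    """Map a position in [0, 1] linearly onto the pulse-width range."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    return MIN_PULSE_MS + position * (MAX_PULSE_MS - MIN_PULSE_MS)


def duty_cycle(position):
    """Fraction of each period during which the pin is held high."""
    return pulse_width_ms(position) / PERIOD_MS


for pos in (0.0, 0.5, 1.0):
    print(pos, pulse_width_ms(pos), duty_cycle(pos))
```

Even at the maximum position the pin is high for only a small fraction of each period, so a timing error of a fraction of a millisecond moves the shaft noticeably.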

These kinds of timings are not possible with Python. In fact, they aren’t really possible with a modern operating system. An interrupt could come in at any time in your control code, causing a longer than desired pulse and a jitter in your servo. To meet the timing requirements we have to enter the fun world of kernel modules. ServoBlaster is a kernel module that makes use of the DMA control blocks to bypass the CPU entirely. When loaded, the kernel module opens a device file at /dev/servoblaster that you can write position commands to.

I’ve written a small object oriented layer around this that makes servo control simpler. You can find my library here:

https://github.com/jminardi/RobotBrain

Simply connect the servo to 5 V and ground on your Pi and then connect the control wire to pin 4.

Servo Diagram

The Python code is quite simple:

import time
import numpy as np
from robot_brain.servo import Servo

servo = Servo(0, min=60, max=200)

for val in np.arange(0, 1, 0.05):
    servo.set(val)
    time.sleep(0.1)

All you have to do is instantiate a servo and call its set() method with a floating point value between 0 and 1. Check out the servo_slider demo on GitHub to see servo control implemented over the network.
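
To give a feel for what a thin wrapper like this might do underneath, here is a hypothetical sketch. It is not the RobotBrain implementation. It assumes that ServoBlaster accepts text commands of the form `<servo>=<position>`, one per line, and that min and max are raw position values in the driver's units; the class name and the device argument are invented for the example, and an in-memory buffer stands in for /dev/servoblaster so the code runs anywhere.

```python
import io


class SketchServo(object):
    """Hypothetical stand-in for a servo wrapper; not the RobotBrain code."""

    # min/max mirror the keyword names used in the article's example.
    def __init__(self, servo_id, min=60, max=200, device=None):
        self.servo_id = servo_id
        self.min = min
        self.max = max
        # Any writable text file-like object. On a Pi this could be
        # open('/dev/servoblaster', 'w') (assumed command interface).
        self.device = device

    def command_for(self, value):
        """Format a position in [0, 1] as a '<servo>=<position>' line."""
        if not 0.0 <= value <= 1.0:
            raise ValueError("value must be between 0 and 1")
        position = int(round(self.min + value * (self.max - self.min)))
        return "%d=%d\n" % (self.servo_id, position)

    def set(self, value):
        self.device.write(self.command_for(value))
        self.device.flush()


fake_device = io.StringIO()  # stands in for the real device file
servo = SketchServo(0, device=fake_device)
for value in (0.0, 0.5, 1.0):
    servo.set(value)
print(fake_device.getvalue())
```

Keeping the command formatting in its own method means the scaling logic can be checked on a desktop machine before any servo is wired up.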

SciPy 2013 Conference Recap

Author: Eric Jones

Another year, another great conference.  Man, this thing grew a ton this year.  At final count, we had something like 340 participants which is way up from last year’s 200 or so attendees.  In fact, we had to close registration a couple of weeks before the program because that is all our venue could hold.  We’ll solve that next year.  Invite your friends.  We’d love to see 600 or even more.

Many thanks to the organizing team.  Andy Terrell and Jonathan Rocher did an amazing job as conference chairs this year both managing that growth and keeping the trains on time.  We expanded to 3 parallel sessions this year, which often made me want to be in 3 places at once.  Didn’t work.  Thankfully, the videos for all the talks and sessions are available online.  The video team really did a great job — thanks a ton.

I’ve wondered whether the size would change the feel of the conference, but I’m happy to report it still feels like a gathering of friends, new and old.  Aric Hagberg mentioned he thinks this is because it’s such a varied (motley?) crowd from disparate fields gathered to teach, learn, and share software tools and ideas.  This fosters a different atmosphere than some academic conferences where sparring about details of a talk is a common sport.  Hmh.  Re-watching the videos, I see Fernando Perez mentions this as well.

Thanks again to all who organized and all who attended.  I’m already looking forward to seeing you again next year.  Below are my personal musings on various topics at the conference:

  • The tutorials were, as usual, extremely well attended.  I spent the majority of my time there in the scikits learn track by Gael Varoquaux, Olivier Grisel, and Jake VanderPlas.  Jeez, has this project gone far.  It is stunning to see the breadth and quality of the algorithms that they have.  It’s obviously a hot topic these days; it is great to have such an important tool set at our disposal.
  • Fernando Perez gave a keynote this year about IPython.  We can safely say that 2013 is the year of the IPython notebook.  It was *everywhere*.  I’d guess 80+% of the talks and tutorials for the conference used it in their presentations.  Fernando went one step further, and his slide deck was actually live IPython notebooks.  Quite cool.  I do believe it’ll change the way people teach Python…  But, the most impressive thing is that Fernando still has and can execute the original 250 line script that was IPython 0.00001.  Scratch that.  The most impressive thing is to hear how Fernando has managed to build a community and a project that is now supported by a $1.1M grant from the Sloan foundation.  Well done sir.  The IPython project really does set the standard on so many levels.
  • Olivier Grisel, of scikits learn fame, gave a keynote on trends in machine learning.  It was really nice because he talked about the history of neural networks and the advances that have been made in “deep learning” in recent years.  I began grad school in NN research, and was embarrassed to realize how recent (1986) the back propagation learning algorithm was when I first coded it for research (1993).  It seemed old to me then, but I guess 7 years to a 23-year-old is, well, pretty old.  Over the years, I became a bit disenchanted with neural nets because they didn’t reveal the underlying physical process within the data.  I still have this bias, but Olivier’s discussion of the “deep learning” advances convinced me that I should get re-educated.  And, perhaps I’m getting more pragmatic as the gray hairs fill in (and the bald spot grows).  It does look like it’s effective for multiple problems in the detection and classification world.
  • William Schroeder, CEO of Kitware, gave a keynote on the importance of reproducible research which was one of the conference themes.  It was a privilege to have him because of the many ways Kitware illuminated the path for high quality scientific software in the open source world with VTK.  I’ve used it both in C++ and, of course, from Python for many, many years.  In his talk, Will talked about how the existing scientific publication model doesn’t work so well anymore, and that, in fact, with the web and tools that are now available, direct publishing of results is the future together with publishing our data sets and code that generated them.  This actually dovetailed really well with Fernando’s talk, and I can’t help but think that we are on this track.
  • David Li has been working with the SymPy team, and his talk showed off the SymPy Live site that they have built to interactively try out symbolic calculations on the web.  I believe David is the 2nd high school student to present in the history of SciPy, yes? (Evan Patterson was the other that I remember)  Heh.  Aaand, what were you doing your senior year?  Both were composed, confident, and dang good — bodes well for our future.
  • There are always a few talks of the “what I have learned” flavor at Python.  This year, Brian Granger of IPython fame gave one about the dangers of features and the benefits of bugs.  Brian’s talks are almost always one of my favorites (sorta like I always make sure to see what crazy stuff David Beazley presents at PyCon).  Part of it is that he often talks about parallel computing for the masses which is dear to my heart, but it is also because he organizes his topics so well.
  • Nicholas Kridler also unexpectedly hooked me with another one of these talks.  I was walking out of the conference hall after the keynote to go see what silly things the ever smiling Jake VanderPlas might be up to in his astronomy talk.  But derned if Nicholas didn’t start walking through how he approaches new machine learning problems in interesting ways.  My steps slowed, and I finally sat down, happy to know that I could watch Jake’s talk later.  Nicholas used his wits and scikits learn to win(!) the Kaggle whale detection competition earlier this year, and he gave us a great overview of how he did it.  Well worth a listen.
  • Both Brian and Nicholas’ talks started me thinking how much I like to see how experts approach problems.  The pros writing all the cool libraries often give talks on the features of their tools or the results of their research, but we rarely get a glimpse into their day-to-day process.  Sorta like pair programming with Martin Chilvers is a life changing experience (heh.  for better or worse… :-)), could we have a series of talks where we get to ride shotgun with a number of different people and see how they work?  How does Ondrej Certik work through a debugging session on SymPy development?  Does his shiny new cowboy hat from Allen Boots help or not?  When approaching a new simulation or analysis, how does Aric Hagberg use graph theory (and Networkx) to set the problem up?  When Serge Rey gets a new set of geospatial data, what are the common things he does to clean and organize the data for analysis with PySAL?  How does Wes McKinney think through API design trade-offs as he builds Pandas?  And, most importantly, how does Stefan Van Der Walt get the front of his hair to stand up like that? (comb or brush? hair dryer on low or high?)  Ok, maybe not Stefan, but you get the idea.  We always see a polished 25 minute presentation that sums up months or years of work that we all know had many false starts and painful points.  If we could learn about where people stubbed their toe and how to avoid it in our work, it would be pretty cool.  Just an idea, but I will run it by the committee for next year and see if there is any interest.
  • The sprints were nothing short of awesome.  Something like 130+ people were there on the first day sprinting on 10-20 different libraries including SymPy, NumPy, IPython, Matplotlib as well as more specific tools like scikits image and PySAL.  Amazing to see.  Perhaps the bigger surprise was that at least half also stayed for Saturday’s sprints.  scikits learn had a team of about 10 people that worked two full days together (Friday and Saturday activity visible on the commit graph), and I think multiple other groups did as well.  While we’ve held sprints for a while, we had 2 to 3 times as many people as 2012, and this year’s can only be described as wildly successful.

  • While I was there, I spent most of my time checking in on the PySide sprint where John Erhsman of Wingware got a new release ready for the 4.8 series of Qt (bless him), and Robin Dunn, Corran Webster, Stefan Landgovt, and John Wiggins investigated paths forward toward 5.x compatibility.  No one was too excited about Shiboken, but the alternatives are also not a walk in the park.  I think the feeling is, long term, we’ll need to bite the bullet and go a different direction than Shiboken.

TGIF: SciPy 2012 Recap Video

As we wait for the SciPy talk videos to make their way onto the web, we’d like to share a short film recapping SciPy 2012.

The latest iteration of the SciPy conference was another great example of the scientific python community coming together to share “the latest and greatest.” Most organizations want to change the world in some way or another. At Enthought, we attempt to do this by building tools that help our customers – in both academia and industry – concentrate on solving their actual problems rather than wrestling with technology. We believe Python’s ability to operate smoothly in different contexts (e.g., desktop, web, array-based and distributed computing, etc.) makes it a highly productive and pragmatic tool with which to build solutions.

The SciPy community is changing the world by continually pushing technical computing forward in a pragmatic way. One just has to look at the content and tools presented at SciPy historically to know that this community has been up to its neck in “data science” for some time. One could also argue, however, that SciPy is one of the best kept secrets in technical computing. As the recent focus on MapReduce solutions illustrates, the world is in the grips of “big computation.” It will only get tougher in the foreseeable future. At the same time, “big data” is a relative term. “Big” for a bioinformatician is different than for a macro hedge fund analyst, and these differences can often be measured in orders of magnitude. And when it comes to solutions, rarely does one size fit all.

In contrast, SciPy addresses a broad array of problems. SciPy 2012 offered High Performance Computing and Visualization tracks, with tutorials on machine learning, plotting, parallel computing, and time series analysis. Sometimes all these topics could be found in a single talk (see VisIt). The community also demonstrated some open-mindedness by inviting Jeff Bezanson, one of the authors of Julia, to share his experience building a language specifically designed for technical computing. It turns out there is a fair amount of overlap between what the SciPy community and the Julia team are planning. With LLVM IR increasingly being viewed as a common target, there is real excitement about what the future holds for language development and interaction.

This is all to say that SciPy has a lot to offer the world. Stay tuned for a bigger and better SciPy next year!