Author: Eric Jones
Another year, another great conference. Man, this thing grew a ton this year. At final count, we had something like 340 participants, which is way up from last year’s 200 or so attendees. In fact, we had to close registration a couple of weeks before the program because that is all our venue could hold. We’ll solve that next year. Invite your friends. We’d love to see 600 or even more.
Many thanks to the organizing team. Andy Terrell and Jonathan Rocher did an amazing job as conference chairs this year both managing that growth and keeping the trains on time. We expanded to 3 parallel sessions this year, which often made me want to be in 3 places at once. Didn’t work. Thankfully, the videos for all the talks and sessions are available online. The video team really did a great job — thanks a ton.
I’ve wondered whether the size would change the feel of the conference, but I’m happy to report it still feels like a gathering of friends, new and old. Aric Hagberg mentioned he thinks this is because it’s such a varied (motley?) crowd from disparate fields gathered to teach, learn, and share software tools and ideas. This fosters a different atmosphere than some academic conferences where sparring about details of a talk is a common sport. Hmh. Re-watching the videos, I see Fernando Perez mentions this as well.
Thanks again to all who organized and all who attended. I’m already looking forward to seeing you again next year. Below are my personal musings on various topics at the conference:
- The tutorials were, as usual, extremely well attended. I spent the majority of my time there in the scikit-learn track by Gael Varoquaux, Olivier Grisel, and Jake VanderPlas. Jeez, has this project gone far. It is stunning to see the breadth and quality of the algorithms that they have. It’s obviously a hot topic these days; it is great to have such an important tool set at our disposal.
- Fernando Perez gave a keynote this year about IPython. We can safely say that 2013 is the year of the IPython notebook. It was *everywhere*. I’d guess 80+% of the talks and tutorials for the conference used it in their presentations. Fernando went one step further, and his slide deck was actually live IPython notebooks. Quite cool. I do believe it’ll change the way people teach Python… But, the most impressive thing is that Fernando still has and can execute the original 250-line script that was IPython 0.00001. Scratch that. The most impressive thing is to hear how Fernando has managed to build a community and a project that is now supported by a $1.1M grant from the Sloan Foundation. Well done, sir. The IPython project really does set the standard on so many levels.
- Olivier Grisel, of scikit-learn fame, gave a keynote on trends in machine learning. It was really nice because he talked about the history of neural networks and the advances that have been made in “deep learning” in recent years. I began grad school in NN research, and was embarrassed to realize how recent (1986) the backpropagation learning algorithm was when I first coded it for research (1993). It seemed old to me then, but I guess 7 years to a 23-year-old is, well, pretty old. Over the years, I became a bit disenchanted with neural nets because they didn’t reveal the underlying physical process within the data. I still have this bias, but Olivier’s discussion of the “deep learning” advances convinced me that I should get re-educated. And, perhaps I’m getting more pragmatic as the gray hairs fill in (and the bald spot grows). It does look like it’s effective for multiple problems in the detection and classification world.
- William Schroeder, CEO of Kitware, gave a keynote on the importance of reproducible research, which was one of the conference themes. It was a privilege to have him because of the many ways Kitware illuminated the path for high quality scientific software in the open source world with VTK. I’ve used it both in C++ and, of course, from Python for many, many years. In his talk, Will explained that the existing scientific publication model doesn’t work so well anymore, and that, in fact, with the web and tools that are now available, direct publishing of results is the future, together with publishing our data sets and the code that generated them. This actually dovetailed really well with Fernando’s talk, and I can’t help but think that we are on this track.
- I loved the lab coats that the session chairs wore. Nice touch, Anthony. It cancels out your puns during the lightin’ talks. Almost.
- David Li has been working with the SymPy team, and his talk showed off the SymPy Live site that they have built to interactively try out symbolic calculations on the web. I believe David is the 2nd high school student to present in the history of SciPy, yes? (Evan Patterson was the other that I remember) Heh. Aaand, what were you doing your senior year? Both were composed, confident, and dang good — bodes well for our future.
- There are always a few talks of the “what I have learned” flavor at SciPy. This year, Brian Granger of IPython fame gave one about the dangers of features and the benefits of bugs. Brian’s talks are almost always among my favorites (sorta like I always make sure to see what crazy stuff David Beazley presents at PyCon). Part of it is that he often talks about parallel computing for the masses, which is dear to my heart, but it is also because he organizes his topics so well.
- Nicholas Kridler also unexpectedly hooked me with another one of these talks. I was walking out of the conference hall after the keynote to go see what silly things the ever smiling Jake VanderPlas might be up to in his astronomy talk. But derned if Nicholas didn’t start walking through how he approaches new machine learning problems in interesting ways. My steps slowed, and I finally sat down, happy to know that I could watch Jake’s talk later. Nicholas used his wits and scikit-learn to win(!) the Kaggle whale detection competition earlier this year, and he gave us a great overview of how he did it. Well worth a listen.
- Both Brian and Nicholas’ talks started me thinking how much I like to see how experts approach problems. The pros writing all the cool libraries often give talks on the features of their tools or the results of their research, but we rarely get a glimpse into their day-to-day process. Sorta like pair programming with Martin Chilvers is a life-changing experience (heh. for better or worse… :-)), could we have a series of talks where we get to ride shotgun with a number of different people and see how they work? How does Ondrej Certik work through a debugging session on SymPy development? Does his shiny new cowboy hat from Allen Boots help or not? When approaching a new simulation or analysis, how does Aric Hagberg use graph theory (and NetworkX) to set the problem up? When Serge Rey gets a new set of geospatial data, what are the common things he does to clean and organize the data for analysis with PySAL? How does Wes McKinney think through API design trade-offs as he builds Pandas? And, most importantly, how does Stefan Van Der Walt get the front of his hair to stand up like that? (comb or brush? hair dryer on low or high?) Ok, maybe not Stefan, but you get the idea. We always see a polished 25-minute presentation that sums up months or years of work that we all know had many false starts and painful points. If we could learn about where people stubbed their toes and how to avoid it in our work, it would be pretty cool. Just an idea, but I will run it by the committee for next year and see if there is any interest.
- The sprints were nothing short of awesome. Something like 130+ people were there on the first day sprinting on 10-20 different libraries including SymPy, NumPy, IPython, Matplotlib as well as more specific tools like scikit-image and PySAL. Amazing to see. Perhaps the bigger surprise was that at least half also stayed for Saturday’s sprints. scikit-learn had a team of about 10 people that worked two full days together (Friday and Saturday activity visible on the commit graph), and I think multiple other groups did as well. While we’ve held sprints for a while, we had 2 to 3 times as many people as 2012, and this year’s can only be described as wildly successful.
- While I was there, I spent most of my time checking in on the PySide sprint where John Ehresman of Wingware got a new release ready for the 4.8 series of Qt (bless him), and Robin Dunn, Corran Webster, Stefan Landgovt, and John Wiggins investigated paths forward toward Qt 5.x compatibility. No one was too excited about Shiboken, but the alternatives are also not a walk in the park. I think the feeling is, long term, we’ll need to bite the bullet and go a different direction than Shiboken.