Towards Cloud Native Data Science

In my talk at ODSC West I wanted to start a conversation about what value, if any, the idea of Cloud Native Applications has for data science. The video and slides from my presentation are below, and the slides are also available without speaker notes.

If you haven’t heard of Cloud Native Applications, the idea is to write applications that take full advantage of the benefits of cloud deployment while respecting the limitations and constraints of the platform.


PyData: From New York to London

I have been using the Python data ecosystem (consisting of NumPy, Matplotlib, Pandas, and many more) for a few years now, so I was really glad to be able to attend the conference dedicated to all things Python data related, PyData, in its latest incarnation in New York last November.

PyData has been running roughly three times a year since 2012, when the first event was held at the Google campus in Mountain View. Having not been to any of the previous events, I didn’t quite know what to expect from a conference that is quite specific in its scope, unlike, say, Strata or PyCon, which cater to the huge constituencies of data analysis and Python respectively.

With my colleague Srivatsan Ramanujan, I submitted an abstract for a talk and we were really happy to get a slot on Sunday morning. We even managed to get Pivotal involved as a sponsor. We talked about how we use the PyData stack in our data science work at Pivotal, including running procedural PL/Python code in a massively parallel way on the Greenplum database. The slides for our talk are embedded below, and the accompanying video is also available (as are all the other PyData talks).

The atmosphere throughout the entire weekend was great, with a real focus on tools and techniques rather than the sales and marketing overkill of some other conferences. It was good to meet so many people involved in creating and maintaining the tools I use on a daily basis, if only to be able to buy them a beer as thanks for their hard work. The consensus between my colleague and me at the end of the weekend was that it would have been well worth going in a personal capacity, even if our employer hadn’t funded the trip (something you can’t say about many conferences).

It has just been announced that the first PyData event of 2014 is going to be in Canary Wharf in London. I would recommend that anyone with an interest in the Python ecosystem for data analysis attend. The tutorials on the first day of the conference are a particularly good way to get up to speed on a topic, whether it’s using the IPython notebook, running simulations in PyMC or creating beautiful graphs with Python and D3. I’m hoping work commitments will allow me to attend, so say hello if you see me there.