Python on Cloud Foundry

I’m very happy to be giving a talk at the latest PyData conference in New York this weekend.

This is a long post, but I wanted a place to collect all the code I am showing in my talk and to provide a few more resources for those interested in trying out Python on Cloud Foundry.


What is Cloud Foundry?

My talk is about how to use Python and the PyData stack on Cloud Foundry, the open source cloud platform. Cloud Foundry started life at VMware, and development transferred to Pivotal when it was formed. Cloud Foundry has grown much bigger since then, with over 30 companies joining together to form the Cloud Foundry Foundation, which will guide the development of the open source project.

On one level, Cloud Foundry is a simple way to deploy and scale cloud-based web applications. Instead of a complicated process to set up a host, install a web server, configure a load balancer, etc., Cloud Foundry does all this for you, letting you concentrate on your application rather than the scaffolding around it.

On another level, Cloud Foundry provides protection against cloud lock-in, where your application deployment process is so tied to Amazon Web Services, Google Compute Engine or another provider that you can’t easily move your applications if you want to. In addition, Cloud Foundry lets you build on-site private clouds, and your apps will never know the difference compared to a hosted public cloud installation.

Update: Video of PyData talk

Cloud Foundry for Data Scientists

As a data scientist, I tend not to want to get involved in setting up or maintaining systems, and Cloud Foundry has given me a really simple way to write, deploy and iterate on web apps that display results, process incoming data or bind to existing data stores. More on that later.

Quick Howto

  • Deploy application in the current directory
cf push myapp
  • Scale up and out quickly
cf scale myapp -i 5 -m 1G
  • Create and bind services
cf bind-service myapp redis

More details

Python on Cloud Foundry

Python is a first-class language on Cloud Foundry, and standard Python web apps can be auto-detected and built.
Cloud Foundry deploys applications in containers. A buildpack installs the runtime (a Python interpreter in the case of Python apps) and any dependencies for the application, and then the app is launched.

The official Cloud Foundry Python buildpack uses pip to install dependencies and is simple to use with (non-PyData) Python web applications.

A simple Flask web app like this one can be deployed using the Cloud Foundry command line interface with cf push and Cloud Foundry will make sure to install Python, the Flask package and all its dependencies, before starting the server with the command in the Procfile.
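The Flask app linked above is not reproduced here. As a rough stand-in, here is a minimal sketch of the same idea using only the Python standard library, so it runs without any dependencies. It assumes the platform passes the port to listen on in the PORT environment variable. The names get_port, app and main are my own, hypothetical choices, and the default port of 8080 is only for local testing.

```python
import os
from wsgiref.simple_server import make_server


def get_port(env=os.environ, default=8080):
    """Return the port to listen on, read from the PORT environment variable."""
    return int(env.get("PORT", default))


def app(environ, start_response):
    """Minimal WSGI application that answers every request with a greeting."""
    body = b"Hello from Cloud Foundry!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def main():
    """Serve the app on all interfaces until the process is stopped."""
    with make_server("0.0.0.0", get_port(), app) as server:
        server.serve_forever()
```

In a deployed file you would call main() at the bottom and point the Procfile's web command at that file. A Flask version follows the same pattern: read the port from the environment and listen on 0.0.0.0 rather than localhost.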

Here’s a video of the whole process:

Data services

The ephemeral nature of cloud applications means that you cannot rely on the local storage of the container to persist your data. (See the rules for 12 Factor Apps for more info.) Of course, in the era of big data, you probably need a distributed data store in the first place.

You can use Cloud Foundry services to make setting up and connecting with data storage and processing systems simple. Here’s an example using the RedisCloud service to create a Redis store.

  • Create a service from the command line
cf create-service rediscloud PLAN_NAME INSTANCE_NAME
  • Bind the service to your app
cf bind-service APP_NAME INSTANCE_NAME

Your application should look in the VCAP_SERVICES environment variable to find details of the services bound to it, in JSON format:

{ "rediscloud": [
{ "name": "rediscloud-42", "label": "rediscloud", "plan": "20mb",
"credentials": { "port": "6379",
"hostname": "pub-redis-6379.us-east-1-2.3.ec2.redislabs.com",
"password": "your_redis_password" } } ]
}
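Cloud Foundry publishes this JSON in the VCAP_SERVICES environment variable. Here is a minimal sketch of how an app might read the Redis credentials from it. The helper name get_service_credentials is hypothetical, and the sample values are just the placeholders from the JSON above, not real credentials.

```python
import json
import os


def get_service_credentials(label, env=os.environ):
    """Return the credentials dict of the first bound service with this label.

    Returns None when VCAP_SERVICES is unset or no such service is bound.
    """
    services = json.loads(env.get("VCAP_SERVICES", "{}"))
    instances = services.get(label, [])
    if not instances:
        return None
    return instances[0]["credentials"]


# Example input with the same shape as the JSON shown above (placeholder values).
sample_env = {
    "VCAP_SERVICES": json.dumps({
        "rediscloud": [{
            "name": "rediscloud-42",
            "label": "rediscloud",
            "plan": "20mb",
            "credentials": {
                "port": "6379",
                "hostname": "pub-redis-6379.us-east-1-2.3.ec2.redislabs.com",
                "password": "your_redis_password",
            },
        }]
    })
}

creds = get_service_credentials("rediscloud", sample_env)
redis_host = creds["hostname"]
redis_port = int(creds["port"])  # the port arrives as a string in this example
```

Inside a running app you would call get_service_credentials("rediscloud") with no second argument so that it reads the real environment. The host, port and password can then be handed to whichever Redis client you use, for example redis.Redis(host=..., port=..., password=...) from the redis package, assuming you have listed it in your requirements.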

There are many different data services available for both hosted and packaged instances of Cloud Foundry.

PyData stack on Cloud Foundry

The current Python buildpack uses pip to install dependencies. Anyone who has tried to install NumPy or SciPy using pip knows that the process can be lengthy and painful, often requiring manual intervention to correct library paths and install Fortran compilers.

Fortunately, Continuum Analytics’ conda package manager was created to solve these problems by packaging and distributing the standard tools of the Python data stack as compiled binaries.

I wanted to build web apps on Cloud Foundry using the PyData stack, so with help from a colleague I have written a Cloud Foundry buildpack which uses both conda and pip to install required packages.

You can specify packages to be installed by conda in the conda_requirements.txt file. These are installed first, followed by the packages in requirements.txt, which are installed as usual by pip.

As an example of a PyData web app, Adam Hajari has created an RShiny equivalent called Spyre. This can be easily deployed to Cloud Foundry by specifying the conda and pip requirements as described above. If you want to try it yourself, I’ve put together this gist with the simple sine wave example from Adam’s notes.

Summary

Why is Cloud Foundry useful for data scientists? Being able to forget about server provisioning and configuration and concentrate instead on creating compelling visualisations and data-driven apps is a welcome step forward for my workflow.

If you want to try out Cloud Foundry for yourself, there are a number of hosted options available, one of which is Pivotal Web Services (which is run by Pivotal, my employer). There is a six-week introductory trial available, and you can estimate your monthly running costs after that.

Further Reading

If you want to learn more about how to use Cloud Foundry or how to write cloud-ready applications, here are a few links to get you started:

 

Ian

A physicist by training, I am curious about the world around us, from the smallest to the largest scales. I am now a part of the Pivotal Data Science team and work on interesting data science and predictive analytics projects across a wide range of industries. On Twitter I'm @ianhuston, and on Github I'm ihuston.

 
