Data diving for charity

Last weekend I took part in the first London DataDive, a charitable event organised by DataKind, who previously organised similar events across the US. The basic premise is that charities have collected large amounts of data, on donors, fund-raising and the actual care, help or interventions they provide. Without costly analysts to sort through and make sense of the data, it goes unused, providing little or no value to the organisation.

DataKind wants to solve this problem by organising business consultants, data scientists and other analysts to provide pro bono services to the charities over the course of a weekend. The basic format is similar to a hackathon: Friday night is spent networking, learning about the charities' problems and picking one to work with. Saturday is spent working on the data to provide actionable results for the charities. These results are presented on Sunday morning, along with any considerations or suggestions from the data scientists.

The three charities at the London event were Oxfam, Place2Be and Keyfund. Intrigued by the speech that Hannah of Keyfund gave on Friday night, I opted to help them over the weekend. Keyfund work with young people to develop their skills and confidence through small projects which are conceived, planned and implemented by the young people themselves. Keyfund coordinates the assessment and funding of these projects through partnerships with local organisations across the country.

Over the weekend we analysed Keyfund’s data in a number of ways. In particular we considered the demographics of the children in the scheme, quantified the outcomes in terms of self-assessments and skills profiles, and assessed the likely effect of streamlining their process into fewer stages. Hopefully the results will be of use to Hannah and the Keyfund team in assessing their procedures and convincing funders to support this worthy cause.

On the technical side I took this opportunity to learn more about the Pandas library by Wes McKinney, which provides a structured data companion to NumPy’s more homogeneous arrays. The accompanying jargon is quite similar to R’s, with data frames and series in place of arrays and vectors. Some elements took a bit of getting used to, but one powerful feature is the deep connection with Matplotlib, which allows easy creation of histograms and box plots from data frames. I hope to look more into Pandas, having just bought Wes McKinney’s new book “Python for Data Analysis”.
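To give a flavour of that Matplotlib integration, here is a minimal sketch. The data frame, its column names and every value in it are made up for illustration; none of this is Keyfund's data or the analysis we actually ran over the weekend.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical records: one row per young person, with invented values.
scores = pd.DataFrame({
    "age": [12, 14, 15, 13, 16, 14, 15, 17],
    "stage": [1, 1, 2, 2, 3, 3, 4, 4],
    "self_assessment": [2.5, 3.0, 3.5, 3.0, 4.0, 4.5, 4.0, 5.0],
})

# Selecting a single column of a data frame gives a Series.
age_column = scores["age"]

# Histogram of one column, drawn by Matplotlib behind the scenes.
hist_axes = scores.hist(column="self_assessment", bins=5)

# Box plot of the same column, grouped by the (hypothetical) project stage.
box_axes = scores.boxplot(column="self_assessment", by="stage")

# The same grouping is available numerically through groupby.
mean_by_stage = scores.groupby("stage")["self_assessment"].mean()

plt.close("all")  # release the figures once we are done with them
```

In an interactive session you would drop the `Agg` line and call `plt.show()` instead of `plt.close("all")` to see the plots.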

I really enjoyed the first international DataDive and really appreciate the work that organisers Jake Porway and Craig Barowsky put in to make everything run smoothly. The atmosphere was great throughout the weekend, including late into the night on Saturday, and the participation from everyone involved was inspiring. At a time when the gender imbalance in science and technology is making headlines, it was also great to see an event where this wasn’t an issue in the slightest. Overall I would heartily recommend that anyone involved in data give something back to their community by participating in one of these events. Plans are under way for more events of this kind in London and I will be jumping at the chance to get involved again.

Update: Just noticed that Dirk Gorissen, who was on my team, has a nice writeup with some results (including one of my graphs).


Simon Singh wins appeal

Congratulations and well done to Simon Singh, who today won his appeal for the right to use a “fair comment” defence in his case against the British Chiropractic Association.

Jack of Kent is going to give his analysis of the ruling over the weekend, starting here.

This is only one case, however, and the need for reform of the libel laws is as pressing as ever. Jack Straw has outlined Labour’s plans for reform if they win the election. With the general election due in a few weeks, now is the time to put pressure on all politicians by signing the petition at


The Big Bang Fair

The first ever UK Young Scientist and Engineers Fair is taking place in the first week of March. The Big Bang Fair in the QE2 Conference Centre in Westminster will pit hundreds of schoolchildren against each other for the main prizes of UK Young Scientist and UK Young Technologist of the Year. There will also be exhibits run by all the main science and engineering bodies in the UK including the IOP, STFC and RAS.

This is clearly very similar to the long-running Irish Young Scientists Exhibition, which has introduced many secondary-level students to the joys (and problems) of research for the last forty-four years. Despite much badgering of my science teacher in school I never managed to enter the main competition, but I always tried to visit the RDS in the first week of January to see the show.

Unfortunately the new UK version of the competition is not open to the public, so only registered school groups and some VIPs will be able to see the talks, workshops and competition exhibits. As part of my outreach activities I have volunteered to guide school groups on one of the exhibition days. While I won’t be speaking to them directly about my work, I think it will help that the volunteers all have a science-based background and should be able to field most general questions about the whole scientific enterprise. I am doing this as part of the Science and Engineering Ambassadors (SEAs) program, which I recently joined and which normally entails more direct outreach such as school visits and careers fairs.

I imagine there will be a lot of media coverage in the run-up to the event on the 4th-6th March, especially as this is both the International Year of Astronomy and the 200th anniversary of Darwin’s birth.