Hadoop Meetup London

17 Sep 2014

The London Hadoop User Group gets to hear EXASOL’s view of SQL-on-Hadoop

Last night I gave a talk to a Meetup of the Hadoop User Group in London about SQL-on-Hadoop, and my theory that it would never replace specialist SQL engines like EXASOL.

I’ve enjoyed attending these informal meetups over the past year or so – they’re a great opportunity to meet a range of people involved in London Big Data – from hardcore techies to entrepreneurs, data scientists to PR experts.

The atmosphere is informal, and becomes increasingly informal as the free beer and pizza are consumed. The informality extends to people occasionally getting up during a talk to go and get another beer or to have a conversation in another room – the pressure is on the speakers to make their talk interesting and entertaining, otherwise they may find themselves suddenly talking to an empty room.

My subject matter was quite controversial – this is, after all, a Hadoop User Group, and even though I said on at least three occasions that I thought Hadoop was a great tool for a particular job, I was still daring to suggest that it wasn’t such a good tool for every job.

I tried to keep the tone light – for example, the attached photo is intended to show that lawnmowers should not be used to give your pets a haircut – and in the same way, SQL-on-Hadoop should not be considered a complete replacement for specialist tools like our database.

I then talked about attempts by the Hadoop community to graft something onto Hadoop to overcome the historical problems, and I have some great pictures of ludicrous lawnmowers: one that has been adapted to work as a bulldozer, and another that has been fitted to a bicycle. I entitled this slide “My Personal Opinionated Opinion”, because images like these are exactly what goes through my head when I think how much effort it has taken to adapt Hadoop to support analytical queries.

After some quite robust questions, I sat down and watched my colleague Dave Shuttleworth’s talk about benchmarks and how some (mostly SQL-on-Hadoop) vendors cherry-pick nice easy queries from the TPC-DS benchmark to pretend their product is more flexible and powerful than it is.

I was worried that our message would cause some anger, but we got a fair hearing and had some excellent conversations afterwards.

The video will be available soon if you want to critique my argument or my signature style of presentation.