I’ve been worrying about Big Data recently. Worried mainly because I don’t really understand what it’s all about.
Worried, and fearful, that I’ve been left behind as one of the ignorant (and exploitable) herd, while the ‘Big Data Hipsters’ head off into the distance (on their fixed gear bicycles).
My anxiety started with Google Flu, something I could just about understand.
Google Flu and Big Data. This is the idea that by using the ‘Big Data’ from millions of internet searches, it is possible to measure the health of a population and spot outbreaks of disease. In essence, if lots of people from a specific area/community are searching the internet using phrases like ‘flu symptoms’, ‘flu treatment’ or ‘where to buy flu medicines’, there is a fair chance that there is a flu outbreak in that area/community.
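To convince myself I had understood the basic idea, here is a toy sketch in Python. It is only my own illustration, not how Google actually did it: the phrase list, the place names, the sample searches and the 50% threshold are all made up. It counts what share of searches from each area mention a flu phrase and lists the areas where that share looks high.

```python
from collections import Counter

# Hypothetical phrase list (an assumption; the real Google Flu term list is not given here).
FLU_PHRASES = ("flu symptoms", "flu treatment", "where to buy flu medicines")


def flu_search_share(searches):
    """Return, for each area, the fraction of searches containing a flu phrase."""
    totals = Counter()
    flu_hits = Counter()
    for area, query in searches:
        totals[area] += 1
        if any(phrase in query.lower() for phrase in FLU_PHRASES):
            flu_hits[area] += 1
    return {area: flu_hits[area] / totals[area] for area in totals}


def areas_to_watch(searches, threshold=0.5):
    """List areas whose flu-search share meets an (arbitrary) threshold."""
    shares = flu_search_share(searches)
    return sorted(area for area, share in shares.items() if share >= threshold)


# Made-up sample data: (area, search query) pairs.
sample = [
    ("Cardiff", "flu symptoms"),
    ("Cardiff", "where to buy flu medicines"),
    ("Cardiff", "train times"),
    ("Swansea", "weather tomorrow"),
    ("Swansea", "flu treatment"),
    ("Swansea", "rugby scores"),
    ("Swansea", "bus timetable"),
]

print(areas_to_watch(sample))
```

Notice that the sketch knows nothing about flu itself. It only counts phrases, so anything else that sends people searching for those words would look exactly like an outbreak.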
If you extend this logic further, by using the information that ‘Big Data’ provides, Health Agencies, Government Bodies and others can do things to stop the spread of the disease and help those affected recover; a great idea that has very many benefits. Compared to waiting until people turn up at the doctor’s, have their symptoms confirmed and have the data sent to a central administrator, this is much quicker. In almost real time, you can work out where there is an outbreak of flu happening. You might even be able to predict an outbreak based on people searching for early symptoms. The opportunities for improving public health and finding out interesting things about other issues that affect society are huge.
Big Data is like Teenage Sex. However, things weren’t quite as glorious as people predicted. This article by John Naughton in The Guardian, ‘Google and the flu: how Big Data will help us make massive mistakes’, explains how the Google flu predictions didn’t match the actual flu data collected.
The article talks about the approach being ‘cheap, accurate and theory free’ and … ‘it doesn’t know anything about the causes of flu. It just knows about the correlations between search terms and outbreaks’. It is worth a look at the comments on the article; not everyone agrees.
It does seem to me that there is still room for some deep subject knowledge and human wisdom in the world of Big Data. That is a relief. I wasn’t quite ready for all decision making to be handed over to the Wizards of Big Data Analytics.
I’m not a Luddite here; I recognise that there are potentially huge benefits to be gained from Big Data. I’m just pleased that there seems to be a bit of recognition that there is a lot of hype around the subject. The quote from Dan Ariely comparing Big Data to teenage sex is reassuring.
Another thing from the Guardian article: it is also worth looking at the link to the Gartner Hype Cycle. This neatly summarises the over-excitement and unrealistic expectations (hype) that are often found with advances in technology, which brings me onto the Woozle Effect.
The original story comes from the Winnie the Pooh books by A. A. Milne.
Winnie the Pooh and Piglet are out in the snow hunting an imaginary beast, The Woozle. They pick up some tracks, decide they belong to the Woozle, and follow them around a bush. They keep going around the bush, finding more and more Woozle tracks, getting ever more agitated. Eventually Christopher Robin turns up to point out they have been following their own tracks, there is no Woozle, and everything is fine.
The Woozle Effect. In the world of scientific research the phrase Woozle Effect has been used to describe the situation where people reference research work that is unproven or dubious. By continued referencing and citations, the original research reaches the status of ‘the truth’. This is despite it being based upon an unproven theory or concept, and the only evidence of it being correct is the ‘footprints’ of those who keep referring to it.
You may have experienced a Woozle Effect situation yourself?
When you think about The Gartner Hype Cycle and Big Data it feels like there is a bit of the Woozle Effect happening. Almost every week I get an invitation to an expensive conference where someone will ‘unlock the secrets of Big Data’ for me. There seems to be an assumption that Big Data is capable of understanding all of the problems in my world, and offering me the solutions. I don’t doubt that Big Data will probably help, but for the moment I’m treating it a bit like teenage sex, and thinking like Christopher Robin.
So, What’s the PONT?
- Big Data offers huge opportunities to better understand and solve problems.
- With new technology there is a risk of hype, inflated promises and unrealistic expectations.
- Make sure the Woozle you are following isn’t just your own footprints, or those of other enthusiasts.
Google Flu: http://www.google.org/flutrends/
Big Data Quote: http://whatsthebigdata.com/2013/06/03/big-data-quotes/ links to Dan Ariely
Mika Latokartano Blog: I found this through the Nosapience Blog: http://nosapience.wordpress.com http://imaginarytime.wordpress.com/2014/02/23/hunting-a-woozle-a-case-for-authenticity/
Mark Curnow Blog: Also source for the Winnie the Pooh picture. http://www.voyageronline.com.au/the-woozle-effect/