4. Data is like crude oil: valuable, but it needs to be refined. Its value is exponentially higher than even our current appreciation of its usage. It must be broken down into specific, useful parts. One data set can be adapted for use across several different products.
5. What is data? The building block for wisdom! Data: symbols that exist and have no significance beyond existence. Information: data that is processed to be useful. Knowledge: application of data and information. Wisdom: evaluating and contextualising of knowledge. Action & £££.
16. Businesses are starting to understand the value of data and encourage users to create more data
17. Storing all this data is getting harder, because data is getting bigger and bigger… [Chart: global information created and available storage, in exabytes; series: data created, available storage. Source: IDC]
Mark Getty is part of the Getty family that sold their oil empire to Texaco. He looked for something to do and decided that there was an opportunity to consolidate the market for photography. In an Economist article in 2000 (http://www.economist.com/node/288515) he said: “Intellectual property is the oil of the 21st century. Look at the richest men a hundred years ago: they all made their money extracting natural resources or moving them around. All today’s richest men have made their money out of intellectual property.” This theme was picked up by Clive Humby, a man who has pioneered the use of data with his firm DunnHumby, at the ANA Senior Marketer’s Summit at the Kellogg School in 2006. It was picked up again by Gerd Leonhard in 2009 and has become part of the mainstream, so much so that badges are available!
In the traditional DIKW hierarchy, data is the building block for wisdom: it can be refined into information, then knowledge, then wisdom.
Data is a key input, maybe even on a par with capital and labour. It helps in all areas of the business, from strategy and decision making to day-to-day operations. Increasingly, data is also being used as a way to create and / or test new products or services.
Data can be about people (e.g. data about users or customers or groups), where the focus has typically been on demographic data; about things (e.g. data about houses, cars, jobs); or about behaviours (e.g. data on searches, messages, ‘likes’, etc). E.g. Amazon keeps track of what you browse but don’t buy. Who knows what on Kindle. Nearly 2/3rds of film selections from Netflix come from referrals made by computer.
With mobile devices we expand these to proximity, history and state (motion or fixed)
Information & Data: insight (raw facts, metadata, behavioural and contextual data). User Experience, Design & Interface. Live event-driven information & content. AND has 28.5m customers, each with multiple touchpoints.
People used to use price comparison sites to buy books / CDs / DVDs etc. Now people have a preferred supplier whom they use exclusively because it is easier: recommendations, emails, offers, personal details and history are all available. Amazon Prime is a great example, as is iTunes.
Interactions are increasing rapidly as more time is spent online and interactions are increasingly encouraged. For example, data on people from Facebook is less interesting than knowing what they have clicked on, who they interact with and what they recommend. Tesco are able to analyse 5 billion pieces of information every week from Tesco shoppers. This data is analysed against 50 different dimensions, e.g. is the product domestic or foreign, branded or own label, economy or premium, individual or family. This data is becoming much more important than the demographic data used historically. A cluster model is then used to classify customers into one of 6 segments: price sensitive, traditional, convenience, mainstream, finer foods, kid’s choice. Note that this is done on actions, not on demographics as it would have been done historically. It’s interesting to note that Tesco don’t actually hold much personal customer data. It is no coincidence that behavioural targeting is the growing area of online display.
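To make the idea of a behavioural cluster model concrete, here is a minimal sketch in Python. It is not Tesco's actual method: the notes only say "a cluster model", so k-means is swapped in as one common choice, and the shopper figures, the two dimensions and the choice of 2 segments (rather than 50 dimensions and 6 segments) are all hypothetical.

```python
from math import dist

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if it has none)."""
    moved = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            moved.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            moved.append(old)
    return moved

def kmeans(points, k, max_iter=100):
    """Plain k-means, seeded with the first k points so the result is repeatable."""
    centroids = [tuple(p) for p in points[:k]]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    return labels, centroids

# Hypothetical behavioural data: (share of own-label purchases, share of premium purchases)
shoppers = [
    (0.90, 0.10), (0.10, 0.90), (0.80, 0.20),
    (0.20, 0.80), (0.85, 0.05), (0.15, 0.95),
]
labels, centroids = kmeans(shoppers, k=2)
print(labels)
```

Each shopper is described only by what they buy, not by who they are, which is the point made above about segmenting on actions rather than demographics.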
Personal data is almost becoming a commodity, as there are so many people competing to use this basic data that is relatively widely available. Niche information is much more useful.
Given the volumes of data available, there is no need for ‘understanding’ with statistical analysis. Current Google projects on translation and voice recognition do not try to ‘understand’ the data; instead they just ‘do the math’. Previous projects in this area have tried to understand the process. Microsoft’s Farecast can replicate airline ticket pricing, based on 225bn pieces of data, to help you choose when to buy. In recent years, Oracle, IBM & Microsoft have spent $10bns on buying software firms specialising in data management and analytics. This sector is growing at 10% p.a., almost 2x the rate of the software business as a whole.
One data set can have many different uses. For example, in our property business we have data from property searchers, estate agents, developers and surveyors. We can take all this data and use it to help all kinds of people interested in property, not just estate agents or those looking to buy houses. We can also refine the data to be of considerable value to economic forecasters, banks, retailers who want to know which areas are growing, etc. One data set can have multiple uses.
Data on things is becoming increasingly commoditised, although there are fewer places where you can find detailed granular information about certain things. Again, niche information (very detailed and searchable) becomes more valuable. For example, we have extensive information (via Broadbean and Jobsite) about the UK jobs market. This has allowed us to understand, and even forecast, the UK job market really effectively.
LocalPeople started as a local business directory; now the real value is in the database of content created by local users. Facebook ‘Like’ is an example that can be monetised easily, as is Spotify social. Interactive advertising is a great way for businesses to learn more about what customers want as well as to engage with them more deeply. FourSquare is another good example.
The amount of data that exists in the world is growing at around 60% per year. IDC found that c.1,200 exabytes (an exabyte is 1m terabytes) of digital data was created in 2008. That’s the equivalent of well over ten trillion copies of your average magazine or, to put it another way, nearly 2,000 magazines per person per year, or 3 full magazines per person per day! Overall, each year we create around 10% of the total stock of data, currently estimated at 1.2 zettabytes (a zettabyte is 1,000 exabytes!).
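The unit conversions and the growth rate quoted here can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units and a constant 60% annual growth rate; the function names are illustrative only.

```python
import math

TB_PER_EB = 10**6  # 1 exabyte = 1 million terabytes (decimal units)
EB_PER_ZB = 10**3  # 1 zettabyte = 1,000 exabytes

def eb_to_tb(exabytes):
    """Convert exabytes to terabytes."""
    return exabytes * TB_PER_EB

def eb_to_zb(exabytes):
    """Convert exabytes to zettabytes."""
    return exabytes / EB_PER_ZB

def years_to_multiply(factor, annual_growth):
    """Years needed to grow by `factor` at a constant compound annual growth rate."""
    return math.log(factor) / math.log(1 + annual_growth)

created_eb = 1200                          # figure quoted in the notes
created_zb = eb_to_zb(created_eb)          # the same volume in zettabytes
created_tb = eb_to_tb(created_eb)          # the same volume in terabytes
doubling = years_to_multiply(2, 0.60)      # time for the volume to double
tenfold = years_to_multiply(10, 0.60)      # time for the volume to grow tenfold

print(f"{created_eb} EB = {created_zb} ZB = {created_tb:,} TB")
print(f"At 60% a year, data doubles in about {doubling:.1f} years "
      f"and grows tenfold in about {tenfold:.1f} years")
```

Under these assumptions, 60% annual growth means the volume of data doubles roughly every year and a half and grows tenfold in just under five years.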
The Detroit bombing wasn’t prevented because the 550k-person database on potential threats is notoriously flawed, with few backups and lots of duplicates. In January 2000 the sheer volume of data pouring into America’s National Security Agency brought the system crashing to a halt for 3.5 days. The then director Michael Hayden said publicly that “We were dark. The ability to process information was gone.” There are also numerous examples of data being lost / stolen, causing serious problems for an organisation. As the volume of data being distributed increases, so does the risk of people getting access to inappropriate data. For example, Shell’s contact database of >175k staff and contractors was recently emailed to activists opposed to the company.
Data is the new Philosopher’s Stone. Key behavioural data is often a byproduct of other actions. Google uses click-through rates to improve its algorithm (they realised this was a great way to refine search results: better than links / relevance and, best of all, it creates a virtuous circle). Airlines found that the best predictor of taking a flight is ordering a vegetarian meal. When a hurricane is forecast, retailers found that as well as water, torches, batteries, etc, Pop Tarts are huge sellers! This allowed them to stock up in advance. Property searches can tell developers what they should be building: for example, we can see what people are searching for, so we can see whether developers in a certain place should build 2 bed or 1 bed flats. Similarly, if we see lots of searches for red cars we can tell car dealers that they should be buying more red cars.
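The property-search example amounts to counting what people ask for. Here is a minimal sketch, assuming each search is logged as a small record; the field names and the sample searches are hypothetical, not taken from any real system.

```python
from collections import Counter

def demand_share(searches, field):
    """Share of searches per value of `field`, most requested first."""
    counts = Counter(s[field] for s in searches if field in s)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.most_common()}

# Hypothetical search log for one area
searches = [
    {"area": "Town A", "bedrooms": 2},
    {"area": "Town A", "bedrooms": 2},
    {"area": "Town A", "bedrooms": 1},
    {"area": "Town A", "bedrooms": 2},
    {"area": "Town A", "bedrooms": 1},
]
shares = demand_share(searches, "bedrooms")
print(shares)
```

The same function would work for the car example by counting a `colour` field instead of `bedrooms`.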
Lots of data is unstructured – especially images, videos, music, etc. The rise of the semantic web will help to address some of this, and some businesses are making a conscious effort to structure their data better. For example Getty Images spends c.$45 per picture to have it scanned properly and keyword tagged so that it can be easily used online.
Refining this data creates scarcity. For example, simple demographic information is a commodity, whereas refining this data, crossing it with actions and interactions, and producing very specific, focused insights or data sets is where data can add real value. More and more data does not create value; more and more information / knowledge / wisdom does!
Getting the data to the right people in the right form at the right time is key. For example, WalMart provides suppliers with almost real-time data on how their products are selling, to enable suppliers to take the lead in restocking stores.
Facebook appear to be at the centre of huge concerns around online privacy. The American Civil Liberties Union has collected >80k signatures on a petition, and there is a growing movement against the lack of privacy on Facebook, with calls for the FTC to look into this matter. Facebook has hired Timothy Muris, former Republican chairman of the FTC, to help work on this in Washington. Currently the Facebook privacy policy is 5,830 words long, longer than the US constitution.