Data Monsters and Data Saviours

Sarah Roberts
Swirrl is now TPXimpact
5 min read · Jun 26, 2017


Ready to face a bunch of data challenges

What’s your data monster? The concept of good vs evil en route to the land of good fortune isn’t one you might automatically associate with data. But at our Power of Data event, plenty of data monsters and heroes cropped up. So, with a HUGE HT to Steve Peters (who based his talk on data monsters and transformers), here’s my take on the day, through this lens …

Data illiteracy. Not your friend.

49% of working age adults are not able to understand enough about data to read and understand their own payslips. — Laura Dewis

Poor data literacy is a barrier to data being understood and used. Over the course of our event, two ‘weapons’ emerged to fight this particular monster:
1. The need to skill people up within organisations
2. The requirement to give statistics context and metadata, to engage users both within and beyond an organisation. Laura highlighted the need to communicate statistics in a way that stays true to the numbers but engages the user: you can’t put emotion or opinion in, but you do need to put a person into the data to explain what it means. To see how this is done, head to Visual ONS, which is brilliant at this.

The data in there is totally inaccessible.

Here are the data monsters, breathing the hot fire of expensive licences and hiding data in caves. — Steve Peters, DCLG

Data Accessibility. Data that’s hidden away in a cave (silo) is isolated. Data that’s wrapped up in expensive licences is isolated. But data isn’t useful in isolation, a sentiment echoed by Steve Peters, Ed Parsons, Ric Roberts, Tom Smith and Jeni Tennison in their talks on the day. So, in the fight against this monster, what’s the plan?
1. Open Data. Get the data out in the open, where privacy issues allow, because people can only use it if they have access to it. This has the dual power of providing transparency too (a point I pick up again soon).
2. Publish data well — so it exists as part of the web, not just on the web. Make it linkable when it’s published so it’s accessible to a range of users, including those who want to build something useful with it via an API. Both Jenny Brooker and Jeni Tennison picked up on the road analogy here: in the same way interconnected roads help us get to a destination, we need really good quality interconnected data to help navigate us to a decision. It’s a quicker way of getting to the value you want.
3. Again, use the human element, not for emotion or opinion but to explain what it means. Make it human and make it relevant (HT Laura).

Unreliable data sources: to be avoided, much like the Hulk

It’s going to be emotive — Jamie Whyte, chairing the data and trust panel

Data and Trust. In this strange case, trust is that unnerving data monster — the one who could be your friend or foe, depending on context. The main trust monsters debated at our Power of Data event were data privacy and data reliability. How do you balance individual privacy with the potential communal benefits of sharing data? And how do you know which set of crime, education or other statistics you can really trust?

So how do you defeat the monster in this case? Dealing with these two behemoths is complex, and the answer varies massively from case to case.

Privacy is a hot potato. From a user’s point of view, individual consent is vital, and it was interesting to hear anecdotes from people who feel more confident trusting Google with their location 24/7 than sharing data with government organisations. NHS Digital have considered, for example, using mobile phone data to help optimise the way A&E departments run, but they have to be extremely careful about questions of patient privacy. In general, finding ways to give individuals control over, or insight into, how their data is used seems an essential step, so that people can make an informed choice about what they are sharing and why.

For questions of trust in statistical data, clear information on provenance is key: knowing where data comes from helps you decide whether to trust it. NHS Digital are working towards this, embedding metadata and provenance at the point of publication (see Rob’s talk for more on this). And when people get so much of their information through search engines and social media, the data source that is most engaging or easiest to find is likely to win the most attention. Government data publishers need to spend more time thinking about how they communicate complicated information: the ONS’s experience of working closely with journalists is a good example.

The land of good fortune?

So, as the sun sets and the data heroes live to fight another day, what does the land of hope and good fortune for data look like? To Paul Maltby, who chaired our first panel, it was progress on all levels: trying to get policy underpinnings, core infrastructure, accessibility including APIs and also data science and usability (the data literacy on top).

Government organisations are both users and publishers of their own data. They’re motivated and keen. So, yes, there are data monsters hiding under the bed. But there are also data heroes ready to take them on.