Visualisations are the bright flowers of the data journalism world, and behind them are a lot of complex processes that have to take place before they come to life.
Where should I look? How do I know whether the data is reliable? What format should it be in? How can I search it? These are just some of the hurdles facing journalists before they can start to think about technicolour charts.
One of the first things we decided to cover when planning news:rewired – noise to signal was what goes on behind the scenes in transforming an idea into a dataset that is ready to be visualised.
In this post we’ve collected links to tips and articles relating to all elements of the data journalism toolkit session. The session will feature Paul Bradshaw, visiting professor, City University and founder, helpmeinvestigate.com; Kevin Anderson, data journalism trainer and digital strategist; and James Ball, data journalist, Guardian investigations team, with more speakers to follow.
From Journalism.co.uk
- How to: get to grips with data journalism – Guardian Data Blog and Data Store editor Simon Rogers with an introduction to data journalism.
- Podcast: what’s driving the data trend? – Journalism.co.uk reports from a Frontline Club data journalism event, and speaks to the Guardian’s Simon Rogers, programmer and editorial developer for the Times Julian Burgess, and audience member and regional newspaper journalist Mary Hamilton about the data trend.
- ‘There was never an average day’: James Ball on being WikiLeaks’ in-house journalist – We spoke to panelist James Ball about his time working as WikiLeaks’ in-house data journalist during the preparation and release of the US embassy cables.
- Do publications need style guides for data? – Developer Tony Hirst writes for Journalism.co.uk about whether the news industry’s age-old lexical style guides need to be accompanied by “consistent style guides for data, compared with a more laissez faire attitude to publishing data more or less as it comes.”
Useful blog posts
- An introduction to data scraping with Scraperwiki – Part of assembling datasets in the early stages of some investigations is data scraping. After spending a day playing with the scraping tool Scraperwiki, panelist Paul Bradshaw blogged to “try to explain what screen scraping is through the functionality of Scraperwiki, in journalistic terms”. (See the Scraperwiki Data Blog for a very interesting diet of data-related posts.)
- Building a data set and visualising it – a day in the life of a data journalist – “Driven by data” blogger Michael Greenfield decided to take on his own data investigation in a day, using the Royal Society Television awards as his subject. Stages one and two are the most relevant to this session.
- “Software developers and data journalists” – Daithí Ó Crualaoich talk at the Guardian – The Guardian’s Martin Belam on a talk by developer Daithí Ó Crualaoich, who points out that “You can’t give a machine data and get journalism out the other end”.
- Journalism in the age of data – Stanford University produced this excellent eight-part video guide to data journalism. See parts six, seven and eight for advice from experts on the initial stages of data journalism.
- Scraping for journalism: a guide for collecting data – Pioneering non-profit journalism outfit ProPublica has put together a guide on how it went about collecting the data for its Dollars for Docs news application and the tools it used to analyse it.
Useful resources
- Data.gov.uk – the government’s online data store. With more than 5,400 datasets available, from all central government departments and a number of other public sector bodies and local authorities, this is a good place to start. See the US version too: Data.gov.
- Guardian Data Blog – the Guardian’s world-leading Data Blog publishes a wide variety of datasets and visualisations from all sorts of projects. See this new guide to the five most important datasets related to the 2011 budget, and the blog’s general guide to world government data.
- Google Refine – a handy new tool from Google for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases.
- Get the data – a new site from Rufus Pollock that helps you find data relating to a particular issue and work out how to cleanse and sort it or get it into a format you can work with.
Buy tickets for news:rewired – noise to signal at this link.