Data visualization has made journalism more objective
“A picture is worth a thousand words” is a well-worn cliché. But in journalism it no longer rings hollow: the increasing use of data visualization to tell stories has transformed the field, making journalism more objective and more interpretative, and bringing authenticity to storytelling.
The workflow in data journalism has three separate processes. The first is data sourcing and preparation. This can be done in various ways: directly, from public documents or surveys, or indirectly, through methods such as “web scraping” — creating data sets from digital resources. Web scraping requires a lot of refining and cleaning of data drawn from sources like PDF documents, HTML pages and text files. There are several free tools available for this job. At an advanced level, a working knowledge of the Python programming language and the various libraries that aid in HTML scraping is useful.
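To make the scraping step concrete, here is a minimal sketch of pulling a table out of an HTML page using only Python's standard library. The table contents are invented for illustration; in practice a data journalist would more likely reach for a dedicated library such as BeautifulSoup or lxml, but the principle — walking the HTML and collecting cell text into rows — is the same.

```python
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Collects the text of <td> cells from an HTML page, row by row."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.rows = []      # completed rows
        self.current = []   # cells of the row being read

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
        elif tag == "tr":
            self.current = []

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and self.current:
            self.rows.append(self.current)

    def handle_data(self, data):
        if self.in_cell:
            self.current.append(data.strip())

# Hypothetical page fragment, standing in for a scraped election-results table.
html = """
<table>
  <tr><td>Constituency</td><td>Turnout</td></tr>
  <tr><td>North</td><td>61.2</td></tr>
  <tr><td>South</td><td>58.7</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(html)
print(scraper.rows)
# → [['Constituency', 'Turnout'], ['North', '61.2'], ['South', '58.7']]
```

From here the rows would typically be written to a CSV file or spreadsheet for the cleaning and analysis steps described below.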
Also read this: Essential tips and tools for beginning data journalists
Some document caches from which data is to be created are so large that it is difficult to parse them, or prepare useful tables from them, without a much larger team than newspapers typically have. Simon Rogers (formerly The Guardian’s data editor), in his book on data journalism, Facts are Sacred, describes how The Guardian used techniques such as crowdsourcing to gather the large data sets behind stories like the MP expenses scandal in the United Kingdom. Mr. Rogers rightly points out in his book that data journalism is “80% perspiration, 10% great idea and 10% output” — a statement that rings true and puts the emphasis squarely on that first process of data preparation.
Also read this: Not Numbers, but Numbers Which Matter
The next step in data journalism is analyzing the data and looking for patterns, rules and exceptions in order to tell a coherent story. For non-coders — and most data journalists fall into this category — this typically involves a lot of work with spreadsheets: pivot tables, simple statistical analyses and so on. Analyzing data for journalistic purposes does not require one to be a trained statistician, but one needs to be at least familiar with simple statistical concepts (for example, that correlation does not imply causation). If one requires a crash course in basic econometrics, D.N. Gujarati’s book (of the same name) is a good place to start.
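The correlation-versus-causation caveat can be shown with a few lines of Python. The figures below are invented for illustration: ice-cream sales and drowning incidents rise and fall together across a hypothetical summer, producing a correlation near 1 even though neither causes the other — both are driven by a third factor, the weather.

```python
import statistics

# Hypothetical monthly figures (illustrative only, not real data).
ice_cream_sales = [20, 25, 40, 55, 70, 85]
drownings       = [2, 3, 5, 7, 9, 11]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(ice_cream_sales, drownings)
print(round(r, 3))  # close to 1.0: strongly correlated, yet not causal
```

A strong coefficient like this is a prompt for further reporting — what common factor links the two series? — not a finding in itself.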
This post was originally published on The Hindu and is reproduced here with permission.
Main Image: FoxBusiness