Data-driven journalists don’t commission a statistical study to merely illustrate a news story under development. Rather, the data-driven approach is more like going to a great art exhibit and letting its art move your responses. Or as Johannes Kepler wrote about his explorations of the heavenly harmonies, “Of heavenly birth was the measuring mind.” This journalistic approach looks for opportunities to let the aesthetics of the statistical patterns themselves create and shape the reporting, and its eventual final appearance to the viewer.
Statistics are among the signifiers of purposeful lives and the meanings of the art of human interaction. Data-driven journalism is like a conversation with massive numbers of people that is boiled down to its essential elements.
One of our first data-driven story series, "The Rise of the Postsecular City," provided the first-ever statistical overview of evangelical church planting in Manhattan. The map shows the distribution of the evangelical churches, and a chart shows the timing of their founding. The fact that about 40% of evangelical churches in Manhattan below 96th Street on the East Side and 125th Street on the West Side were founded between 2000 and 2009 immediately strikes one as a stupendous new feature of New York City. We can’t help but wonder: How did it happen? Who are these people? And what do they portend for the future of New York City? The portrait of the city's character is changing all around us: new beliefs, new peoples, revival of old beliefs, and new waves of eager migrants from the rest of the United States.
The reaction to these statistics shows that they are not about mere facts. Statistics have their own aesthetic principles (based on their science) of proportion, relevance, significance, and chiaroscuro, which shape their viewers’ responses. The numbers reveal that we can know some remarkable things that we didn’t know before, but there are hints that there are realities and patterns that remain hidden from view. The best data-driven stories show the aesthetic features of the facts. How important is this statistic in the total scheme of causes, actions and historical traditions? The news consumer can read the data objectively while letting intuition be inspired by the statistical aesthetics that open the imagination to the wider significance and relevance of the religious news.
The best statistics, graphs and data-maps grip the attention like religious icons. And like icons, data-driven journalism graphics promise that there is even more going on than meets the eye.
The complexities of life mean data-driven journalism is an art based on science
The gathering of data is part of the art of data-driven journalism. The process is complex, and the people on whom we report are even more complex, always incomplete and changeable. Complex social systems produce an immense variety of pauses, causes, and random occurrences. So, we must keep in mind the limits and hidden possibilities of our data mapping.
Over the last few years, a couple of research projects put together portraits of religious America based on narrow samples. One had respondents who were mostly white. Another left out most foreign language respondents (the interviews were only done in English). The researchers extrapolated (a fancy word for sophisticated guessing) the beliefs of non-White or foreign language respondents through various weighting techniques. One wonders how useful such surveys are to the understanding of majority ethnic-immigrant cities like New York.
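To see why such weighting deserves a second look, here is a minimal sketch of one common technique, post-stratification, in Python. Every number in it is invented for illustration and comes from no actual survey; the group names and the "attends weekly" question are assumptions of ours, not details from the studies mentioned above.

```python
# Hypothetical numbers for illustration only; they come from no real survey.
# Each group: (respondents in the sample, share of the real population,
#              share of those respondents who say they attend services weekly)
groups = {
    "interviewed_in_english": (950, 0.60, 0.30),
    "other_languages": (50, 0.40, 0.60),
}


def unweighted_estimate(groups):
    """Share attending weekly if every respondent simply counts once."""
    total = sum(n for n, _, _ in groups.values())
    return sum(n * rate for n, _, rate in groups.values()) / total


def respondent_weights(groups):
    """Post-stratification weight: population share divided by sample share."""
    total = sum(n for n, _, _ in groups.values())
    return {name: pop / (n / total) for name, (n, pop, _) in groups.items()}


def weighted_estimate(groups):
    """Share attending weekly once each group is scaled to its population share."""
    return sum(pop * rate for _, pop, rate in groups.values())


print("unweighted:", round(unweighted_estimate(groups), 3))
print("weighted:  ", round(weighted_estimate(groups), 3))
print("weights:   ", {k: round(w, 2) for k, w in respondent_weights(groups).items()})
```

In this invented case, each of the 50 respondents in the under-covered group ends up counting for about eight people. The weighted figure is only as good as the assumption that those 50 resemble everyone they stand in for.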
Also today, many people just don’t pick up the phone to answer questions. This is particularly true in New York City. The true response rates of even the best surveys have dropped to really low levels. The Pew Research Center is one of the very best in the field and spends a lot of money to get its work right, yet it reports response rates of only 9%, which it calls “a drastic drop.” This means that our stats are missing the actual voices of 91% of the sample that was drawn to tell us about a population of people.
The problem became evident early in the 2016 political primaries. At that point, the surveys often underestimated the number of voters who were for Ted Cruz. Leading up to the Wisconsin primary, most surveys of Cruz voters were off by 10-15 percentage points, a huge error. Some of that error was due to the fact that Cruz voters were late deciders. A more important reason may have been that Cruz voters didn’t trust the surveyors, whom they associated with a liberal “Intelligentsia-media complex.” In fact, people in general have become more wary and tactical in answering social surveys. So, the Cruz voters often declined to participate in the surveys but did turn out to vote.
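The arithmetic behind this kind of miss can be sketched in a few lines of Python. The function name and all three input values below are hypothetical, chosen only to show the mechanism; they are not estimates of any actual primary or poll.

```python
def observed_share(true_share, response_rate_supporters, response_rate_others):
    """Share of survey respondents backing a candidate when the candidate's
    supporters answer surveys at a different rate than everyone else."""
    supporters = true_share * response_rate_supporters
    others = (1 - true_share) * response_rate_others
    return supporters / (supporters + others)


# Hypothetical inputs: 45% of actual voters back the candidate, but only 6% of
# those supporters answer the survey, compared with 11% of all other voters.
estimate = observed_share(0.45, 0.06, 0.11)
print(f"true share 45%, share among respondents {estimate:.1%}")
```

With these invented inputs the survey would show the candidate at roughly 31 percent, a double-digit miss produced purely by who picks up the phone. When both groups respond at the same rate, the function returns the true share.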
In its “Assessing the Representativeness of Public Opinion Surveys,” published in 2012, Pew outlined some successful counters to the problem of low survey participation. Essentially, it concluded that the only surveys to trust are the really large ones with tested ways of evaluating and weighting results and specialized surveys of populations that have high response rates.
An additional layer of mystery is added by the fact that the results of social surveys sometimes drive the people concerned to change their behavior or ideas so that the next survey results will be different from the previous ones.
After the data is gathered and entered into a computer, the statistical studies generate another type of complexity: a huge number of statistically significant correlations that may not be meaningful to the subject matter at hand. Most of the “statistically significant” findings are just noise. Correlations among facts may be spurious because many of the possible causal variables have to be left untested within a single study on account of the time, money and difficulty of obtaining the data. Who would fill out a hundred-page questionnaire that would test all the theories about some social phenomenon? Scholars usually work through this problem by chipping away at the flaws, examining possible explanations one at a time. Of course, data-driven journalists don’t have this luxury of time, so they have to be more cautious about their findings.
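A small simulation makes the noise problem concrete. The sketch below, in Python, invents 40 survey "questions" whose answers are pure random numbers, then tests every pair for correlation at the conventional 5% level. The sample size, the number of questions, and the use of the Fisher z approximation for the significance test are all our own assumptions for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_false_alarms(n_respondents=100, n_questions=40, seed=1):
    """Correlate every pair of purely random 'questions' and count how many
    pairs pass a two-sided 5% significance test (Fisher z approximation)."""
    rng = random.Random(seed)
    answers = [
        [rng.gauss(0, 1) for _ in range(n_respondents)]
        for _ in range(n_questions)
    ]
    flagged = 0
    pairs = 0
    for i in range(n_questions):
        for j in range(i + 1, n_questions):
            r = pearson_r(answers[i], answers[j])
            z = math.atanh(r) * math.sqrt(n_respondents - 3)
            pairs += 1
            if abs(z) > 1.96:  # "statistically significant" at about 5%
                flagged += 1
    return flagged, pairs


flagged, pairs = count_false_alarms()
print(f"{flagged} of {pairs} unrelated pairs look 'significant'")
```

Forty questions yield 780 pairs, so a 5% threshold should be expected to flag a few dozen of them even though nothing real connects any of these variables.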
We should not forget that in the drive to provide “smart” solutions, a lot of fluffy, if entertaining, nonsense gets reported. Maybe, the best response at times is an unsanctimonious imperturbance, a steady state that Col Allen, the recently retired editor of The New York Post, held unless deeply disturbed.
Cross-tabulation weighs the effect on the end result by focusing on one factor at a time. Going beyond simple cross-tabulations makes it much more difficult to show clear causality, and the results are more disputable because they often entail many assumptions.
For example, early in the Democratic primaries, surveys revealed that males and females had quite different opinions about Hillary Clinton. Females were much more likely to view her favorably. However, when you add another factor, age, the story gets a little more complex and more precisely revealing. The older women adore Clinton, but at the time of the surveys, the younger women not so much.
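Here is a minimal sketch in Python of a cross-tabulation by one factor and then by two. The respondent counts are invented purely to show the mechanics; they are not the survey results described above, and the helper name `favorable_share` is our own.

```python
from collections import defaultdict

# Hypothetical respondent counts, invented for illustration only.
# Each row: (gender, age group, opinion, number of respondents)
rows = [
    ("women", "older", "favorable", 80),
    ("women", "older", "unfavorable", 20),
    ("women", "younger", "favorable", 45),
    ("women", "younger", "unfavorable", 55),
    ("men", "older", "favorable", 50),
    ("men", "older", "unfavorable", 50),
    ("men", "younger", "favorable", 35),
    ("men", "younger", "unfavorable", 65),
]

FIELD_INDEX = {"gender": 0, "age": 1}


def favorable_share(rows, by):
    """Favorable share for each combination of the fields named in `by`."""
    favorable = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        key = tuple(row[FIELD_INDEX[name]] for name in by)
        total[key] += row[3]
        if row[2] == "favorable":
            favorable[key] += row[3]
    return {key: favorable[key] / total[key] for key in total}


# One factor at a time: gender only.
print(favorable_share(rows, ["gender"]))
# Adding a second factor: gender and age together.
print(favorable_share(rows, ["gender", "age"]))
```

In this made-up table the gender-only view shows a gap between women and men, while the two-factor view shows that the gap is concentrated among older women, which is the kind of refinement a second factor can bring.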
A newsroom needs to develop the deep experience, knowledge and analysis of city religions and decades of collective experience of using statistics. Then, this craft of the aesthetic stat will pay off in the discernment on how to grasp the significant statistics and to let go of those that might seem correct but are most likely rabbit trails toward foolish and erroneous ends. Graphically, the statistical results that have the most impact are iconographic representations that represent the merger of science, intuition and life as it is experienced.
An iconograph is meant to generate a quick understanding followed by a slower contemplation of the meaning of its truth, its context and an experiencing of the reality portrayed by the graph. Quick stats should add flavor to simmering reflection.
Graphs should tell their main story without much explanation. The audience ought to be able to look at a graph and say, “Wow!” A graph should have riveting power to its truth, much like an icon that powerfully focuses one’s attention so that one can envision the religious realities of the whole city.
A statistical insight should also rest in the context of the reality behind it, mainly the faiths and their role within the city. At its best the graph and reality will seem to be inhabiting the same space, a sort of multi-faceted communal experience of truth.
Finally, the data-journalist should want the unknown – alternative explanations, flaws, the remaining questions and the mysteries – to surround the experience of the graph like the frame of an icon, reminding the viewer that the imperative of the truth of the image is limited and might be an empire waiting to be overthrown by new inputs or re-framings. Stats should let the messiness of life intrude as a reminder that much of life happens outside of our neat numbers. An iconography displays an appreciation of nuances, smaller stories in the background, and questions that still need to be answered.
The goal is an iconography—graphs and maps that grab you with their insight. But due to finitude, prejudice, and foolishness, you will often get “just graphs,” and, occasionally, a cartoon of the data-journalist’s folly or fantasy. Mea culpa.
Short list of useful tools
Using advanced statistical packages like SPSS (Journey's preferred tool), SAS and the like, the journalist can explore complex relationships between different factors (variables). Simpler programs like Google Fusion Tables are helpful for creative, quick tables and maps for online interactives that allow the viewers to create their own stories based on their experience of the data patterns. Embedded interactive statistical platforms like Tableau Public are very helpful for answering simple, straightforward questions about the frequency of phenomena, such as how much money one company gave to a candidate. Here is an interactive chart on religion in prisons:
Other platforms make simple crosstabs between variables. Statisticalatlas.com does thorough and quick visualization of U.S. Census data. There are many other good platforms too, and more are coming out every week.
Types of iconographs
Surprising discoveries simply displayed are very effective in catching the eye and engaging the mind.
At the bottom of human thinking is an attempt to separate out and lump together. Then, some of our biggest surprises happen when we cross boundaries long established in society and our thinking. The data-driven journalist should look for statistics that lump together groups of people so that a reality previously unnoticed comes into view. And then, those cross-over phenomena that unlump reality really stand out as news.
Twenty-five years ago, I experienced a shock when I entered KGB headquarters in Moscow. I had this feverish desire to go back and forth through the doors because of the bewildering feeling of how the Cold War had ended. The statistical graphs about the beliefs of military and security personnel were also like going through a door which one had never been through before.
The statistics of faith among Russian intelligence agents had changed from little (admitted) belief in God to a third of them saying that they certainly believed in God. They didn’t believe in Communism any more, and some were opting for religious beliefs to guide them.
Bolshevism itself was a veritable religion with sacred rituals and an apocalyptic mood. One of the most popular revolutionary songs during the revolution and civil war expressed this religiously ultimate mood with the line, “And all of us will die for our cause.”
Today, it is hard to convey the intensity with which Russians felt that their very minds and consciences had been violated and polluted by the moral failures of Communist ideology and practice. It was like losing your religion.
So, when the security agencies, led by the head of the KGB, launched a coup on Sunday, August 18, 1991 against the reformers led by President Boris Yeltsin, I knew that the coup would fail. With some like-minded scholars, Soviet sociologist Samuel Kliger and theologian Paul de Vries, we quickly reported on the implications of the changing religious situation. In radio and in print media, we added the data-driven prediction that the coup would fall apart by Wednesday, August 21, which it did.
Our reasoning was powerfully influenced by our graph that showed such a marked turn among security personnel toward religion. The religious-like authority of Communism and Soviet society was fragmenting. The agents of power were too divided in their beliefs to keep any secrets or even work together. The leaks from KGB headquarters and the Fourth District military headquarters came to us as if a sprinkler system had been triggered.
Here are some examples of useful data-driven journalism:
Chronographs: Timelines, statistical time maps.
Chronographs: dynamic, interactive timelines
New online statistical tools allow the creation of very dynamic graphs that change as you click on different times and qualities of people. The challenge is to provide these bells and whistles in a way that is useful for the viewer. It is one thing to be fascinated by changes; it is another to actually be able to use the fascinating charts for living life.
Another challenge is that the quality of the data and the assumptions made in the chart are often not immediately obvious or are easily covered by the fascinating display. It is important to allow the viewer to imagine or even insert other assumptions. In the end, unless well thought out to maximize practical use and transparent as to its data limitations and assumptions, the online displays may end up as carnival curiosities.
Data synthesizer and display artist Chris Whong tied together a database of GPS locations and fare pickups to chart every taxi ride in New York City for one year. See NYC Taxis: A Day in the Life
Interactive graphs that give multiple insights
At an Institute so grounded in science and technology, where do faith and spirituality fit in? MIT isn’t famous for being a god-fearing place. Few people know that there are 16 chaplains and nearly 30 student groups dedicated to religion at the Institute.
In 2012, The Tech surveyed 2,943 undergraduate and graduate students at MIT — about 27 percent of the student population — on their religious life. Of these, 1,295 (44 percent) were undergraduates.
These four interactive infographics break down parts of the religion survey data by dorm, by year, by major, and by religious affiliation.
Shape analogies are effective when statistical patterns bear a resemblance to some other commonly recognized shape like a letter of the alphabet. Journey utilized such a resemblance to the upside-down letter "J" in its Illustrated Explorer's Think Card on Religion in Washington Heights, Manhattan. We noticed a distinct stream of Dominican Americans going up Broadway from 153rd Street up to the tip of Manhattan and then over the bridge to the Bronx and traveling southward.
We analyzed a dataset on Dominican American faith from The Pew Center and U.S. Census data. The photos of a famed Dominican Pentecostal preacher and a mural in a Catholic Shrine were by Journey staff as they reported on 107 religious sites in Washington Heights, and the map was created in Maptitude. In addition to online magazine usage, the Think Card was printed on two sides of clay-coated paper stock (for brighter colors and a handcrafted feel) and then handed out to leaders in Washington Heights.
Sometimes, a map or graph will show striking social and religious divisions. For our Illustrated Explorer's Guide to Williamsburg and Greenpoint, Brooklyn, we included a map that clearly shows three geographic areas based on income. There are really three Williamsburg-Greenpoints: a middle class lodged particularly in the north part; an upper class in the center and nearest Manhattan; and a poverty band in the south and east.
What is most representative of the Williamsburg-Greenpoint area? An art gallery in Billburg? A design shop? Or a forty-five-hundred-student Jewish high school? And a half-block-sized replica of the ancient temple of Jerusalem? Huh?
According to our street-by-street count, there are over 300 religious organizations in the area. Artistic and cultural organizations are prolific, but at 126 in 2011, according to the Brooklyn Arts Council, they amount to less than half the number of religious organizations. Twenty-six of the murals in the area are religiously themed. The numbers are striking and thought-provoking.
Maps with practical implication
Most of the thirty-four churches on or near a seven-block stretch of Ralph Avenue in Northern Crown Heights, Brooklyn say that they have an intense focus on the youth of the area. However, because they didn't know how many youth there were, they were uncertain whether their efforts were grounded in the local reality. Our map of youth shows a great many youth on either side of Ralph Avenue and provided an incentive for the churches to redouble their efforts.
Maps that tell a story with dates or data
We imagined Pastor Bill Devlin's 40-day fast in February 2012 as a pilgrimage against the discriminatory exclusion of religious groups from using community space in NYC public schools in the off-hours. An antique map of John Bunyan's Pilgrim's Progress served as the template for excerpts from 40 days of interviews with the pastor.
Maps that make personal a social phenomenon
Rebecca Solnit created an anti-war map by highlighting the locations of the conservative/military brain trust in San Francisco Bay Area neighborhoods.
A renaissance of data-driven journalism and graphic data displays is laying out a superhighway on the roadmap for the future of journalism.