I read a thought-provoking blog post by Roger Peng of Simply Stats (and professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health) entitled “What is a Successful Data Analysis?”. It’s an interesting question – data analysis is concerned with measuring the performance of people, processes and policies, but is there a widely accepted measure of success for analysis itself?
It seems like success in data analysis should be relatively straightforward to define, right? We all know good analysis when we see it, or at least we think we do. If you read Peng’s entire post, though, you’ll see that it’s tricky. If you’re tempted to use ‘veracity’ as a yardstick, how do you know whether the analysis is ‘true’ or not? If you feel that adherence to ‘best practices’ should be how we measure success, who defines what’s ‘best’, and would most people agree? In analysis there can be many ways to skin the cat, and multiple different findings can emerge from the same set of data.
Ultimately, he proposes the following definition for success for data analysts:
A data analysis is successful if the audience to which it is presented accepts the results.
What do you think? Do you agree with this definition? As Peng’s audience, do you accept his selection of ‘acceptance’ as the sole criterion for success? If so, can you explain why this definition works for you? If not, do you think there’s another sole criterion, or do you think there are multiple criteria that should be used instead? Or maybe you think it’s not something you can define in general terms.
Here’s my take, off the cuff
I want to like this definition, mostly because of the importance it places on communication and making an impact on other human minds, but unfortunately I just can’t accept it. It’s similar to leadership, in my view: is a ‘good’ leader someone who just gets people to follow him or her? If so, every tyrant and despot in human history would fit this definition. They may have been effective at getting people to follow them, but what good is that if they led them right off of a cliff?
So, too, with analysis: Colin Powell’s February 5, 2003 presentation on the intelligence community’s findings of the presence of weapons of mass destruction in Iraq was accepted by many in the US government (though not in the UN or much of Europe, incidentally). In retrospect, this analysis has been declared “a great intelligence failure”. In 1633, Galileo’s findings in favor of the Copernican theory of a heliocentric solar system were rejected by the Catholic church, earning him condemnation and the threat of torture. The church didn’t officially change its stance until more than 350 years later, when John Paul II formally acknowledged the Vatican’s error.
These two examples – analysis erroneously accepted and analysis erroneously rejected – are analogous to type I and type II errors, respectively, as my co-worker Scott Teal pointed out to me as we chatted about Peng’s blog post. The possibility of making such errors, combined with the known propensity of humans to pay more attention to analysis that confirms their biases and previously held beliefs, leads me to reject this definition.
So what’s my suggestion, then?
I don’t like to shoot holes in someone’s proposal without offering an alternative of my own. In this case, though, I don’t have one that I’m confident in, so I’d like to hear your ideas. Ultimately, I doubt there’s a single criterion that would hold up in every circumstance, as convenient as that would be.
I believe successful analysis can be described by the following six traits:
- Falsifiable: The analysis puts forward statements that are possible to be refuted
- Sound: The analysis is conducted using valid techniques, and is generally free from error
- Confirmable: The analysis can be repeated, replicated, or corroborated by alternate means
- Compelling: The analysis is presented in a manner that is clear and highly convincing
- Weighty: The findings matter a great deal
- Ethical: The entire activity follows the values and principles outlined in the Manifesto for Data Practices
What do you think – should one or more of these six be excluded from a list of characteristics of a successful data analysis? Are there traits missing?
I’m not so sure these six are easy to measure, by the way. For example, as Peng rightly points out, an analysis may have involved an incredibly expensive experiment, so it may be cost prohibitive to repeat or replicate it. Or perhaps some won’t agree on how compelling the presentation of the analysis was. Or findings may seem weighty in the moment but become inconsequential within a matter of hours or days as circumstances change.
Long story short: I agree with Peng that it’s challenging to put forward a simple and concise definition of ‘successful’ analysis. I applaud him for trying, and again, I wish I could accept his definition that makes use of a sole criterion – it sure is easier to remember than mine with six distinct factors.
I’d love to hear your thoughts, too. Leave a comment, join the discussion on social using #SuccessfulDataAnalysis, or, better yet, write your response on your blog.
Thanks for reading,
I think one more criterion should be applied:
The analysis should be understandable to the intended audience. They don’t have to agree, but they need to understand (at least superficially) how the results were obtained. It can be as simple as “users were grouped by similar characteristics and this group represents X% of the users” (clustering) or “we evaluated the influence each (or this) factor had on Y behaviour” (linear or logistic regression).
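The clustering example above can be made concrete with a tiny sketch. Everything here is hypothetical – the “weekly sessions” characteristic, the data, and the two-group split are illustrative assumptions, and a real analysis would use a proper library – but the final line is exactly the kind of plain-language summary an audience can grasp:

```python
# Minimal sketch: group users by one characteristic (hypothetical weekly
# session counts) with a tiny 1-D k-means, then report each group's share.

def kmeans_1d(values, k=2, iters=20):
    # Simple init for k=2: start the centroids at the min and max values.
    centroids = [min(values), max(values)]
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[idx].append(v)
        # Recompute each centroid as the mean of its group.
        centroids = [sum(g) / len(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return groups

activity = [1, 2, 2, 3, 10, 11, 12, 13]  # hypothetical sessions per week
low, high = kmeans_1d(activity)
share = 100 * len(high) / len(activity)
print(f"High-activity group: {share:.0f}% of users")
# prints: High-activity group: 50% of users
```

The point isn’t the algorithm – it’s that the output sentence (“this group represents 50% of the users”) is understandable even to an audience that has never heard of k-means.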
Hi Zahra, I’d agree, though this can be a challenge for the analyst depending on the methods used and the audience to which those methods are presented. There could be some considerable education required during the presentation, or at least a quick primer provided in cases with very sophisticated analyses and relatively green audiences. There’s definitely a risk of losing them in the details, or alternately glossing over things too much. I believe the most compelling analyses are ones we can fundamentally grasp, though, so this might be baked into the others to some degree. Thanks for reading & commenting!
I had read this one a while ago: https://www.wired.com/story/pennsylvania-partisan-gerrymandering-experts/
Although the calculations and methods are pretty advanced, the basic principles of Pegden’s simulation are easily grasped. Chen’s would need some concrete/graphical examples to be understood more easily, I think. For Warshaw’s, I’d need to rewatch a popular-science explainer video on the Efficiency Gap before looking at his arguments.