
Building a thriving data visualization community

2016 July 21
tags: Community
by Ben Jones

There’s a verse in Proverbs of the Tanakh that goes like this:

As iron sharpens iron, so one person sharpens another. – Proverbs 27:17

Leveling Up

What does this verse mean? It means that when you have a good friend, that friend will push you and make you better and stronger. If you make a mistake, they’ll point it out to you in the right way so that you don’t make it again. If you do something well, they’ll pat you on the back, but they’ll also help you figure out how to do it even better next time. And vice versa.

It’s about leveling up.

That’s the value of the data visualization community, and why I believe the more voices that join, the better off we’ll all be. There’s a big qualifier to that statement, though. In order to reap the benefits of our interactions with each other, we must avoid two diametrically opposed cultural paradigms that are the antithesis of iron sharpening iron:

1. Avoid the Love Fest

A “love fest” may feel great at first (‘oh my gosh, they think I’m great!’), but it soon gets old, and the growth curve for those involved is severely hampered. This is where everyone just congratulates, retweets, likes, etc., everything everyone ever does. But because everything is so GREAT all the damn time, no one gets any better. So go ahead and join the love fest if you want. You may get a gig or two out of it, but you’ll never be any better than the day you joined. It’s not for me.

2. Stay Out of The Shark Tank

The “shark tank” is the opposite of the love fest, and is characterized by snarky attacks, arrogant snubs, and all-around competition and condescension. Success is a zero-sum game in the shark tank. In order to rise up in the hierarchy, you have to pull someone else down. And to stay at the top you have to step on any newcomer and reestablish your dominance. This is a culture of fear in which sharing your work is akin to putting raw meat into the water. Sorry, it’s also not for me.

So where are we now?

Here’s a tough statement that you may or may not agree with: the data viz community has been both a love fest and a shark tank over the past few years. I’ll turn the spotlight on myself and admit that I’ve contributed to each dysfunction at different times in my involvement.

There’s a pendulum effect at play – the culture can swing from one dysfunction to the other – but it’s up to us to strike the balance. Here are some tips:

6 Tips for Building Community

  1. It starts with caring. Does it matter to you whether other people get better over time, or do you just care about your own reputation? If you don’t get this one right, don’t bother with the other five on the list. Just please also exit stage left.
  2. Always create. The best way to have a useful opinion is to always be visualizing data. You’ll have fresh understanding of the inherent trade-offs and dilemmas, and you’ll have much better suggestions and ideas to deal with them.
  3. Try different tools and techniques. What we’re doing by building this community goes beyond any one company’s quarterly growth objectives. It’s about increasing the data literacy of our species on this planet. Yes, you’ll likely interact more with people who use the same tools as you. Make an effort to connect outside of those subgroups so that they don’t become silos.
  4. Seek feedback. After you create something, ask people to look at it and give you their thoughts. What was confusing? How could you increase the impact of the message? Listen to what they say and resist the urge to intervene and explain. And don’t make it an exercise in fishing for compliments.
  5. Give honest feedback, with tact. Tell people what you think. Don’t hold back. Just be tactful. Organize your feedback in terms of plusses and deltas – what did you like and what could have been even better? And ask them what they think about your suggestion – “do you think that would work?”, etc. A little humility goes a long way.
  6. Never fall into the trap of the expert. People who think of themselves as experts tend to stop listening. They become closed-minded. But there’s always so much to learn. I’m not advocating fake humility here. Sure, you have knowledge and skills that someone new can benefit from. But I guarantee you that every newbie has a few things to teach you, too.

An Example of Iron Sharpening Iron

A couple weeks ago, I got caught up in the whole soccer craze fueled by the Euro 2016 and Copa America tournaments, and I wanted to know who had scored the most goals in international play ever, so I created this simple scatterplot:

Unsolicited Feedback Done Well

Shortly after I published this viz, my Tableau colleague Florian Ramseger pinged me on Slack with a great suggestion:


I totally agree with Florian. This is a much better way to combine the slider and the size legend so that they actually work together in harmony instead of just being located close to one another. Try it out to see why it’s so much better:

Solicited Feedback On Demand

I also reached out to Alexander Mou via Twitter to ask him if he had any suggestions to make it better. Alexander regularly posts suggested improvements via Twitter on what he calls #TweakThursday. I’ve seen him do that for a few months now, and I often like his ideas, so I thought I’d request his input proactively. He not only created a souped-up version, he wrote about it here.

I was quite happy with both exchanges, and felt that I had some new ideas to improve my work. This is a good example of why I appreciate being part of a data visualization community. People out there are looking at what I’m doing and giving me their thoughts, and I have a whole army of advisors to weigh in when I ask for help.

That’s the kind of community I want to be a part of. It’s pretty simple.

Thanks for reading

Ben


A First Look at Google Data Studio

2016 July 2
by Ben Jones

{Note: for another example of a website stats dashboard made with Google Data Studio, see David Murphy’s Datasaurus Rex blog post.}

I’ve been playing around a little with Google Data Studio, the free version of Google’s data visualization product Data Studio 360. It’s in beta as of this writing, and it lets you create up to 5 reports from connections to Google sources (Google Analytics, Google Sheets, AdWords, YouTube). You can share these reports with others, who can either view or edit them depending on how you configure them, just as you would with Google Docs.

Now, I work for Tableau, but I’ve always written about different data viz tools on this site. I also teach data visualization theory at the University of Washington, where my students use a growing number of free data visualization tools like Tableau Public, Plotly, Quadrigram and R. Lisa Charlotte Rost recently wrote a great blog post comparing 12 of these tools, which I recommend you read. The number of tools is growing every year.

I like different things about each of these tools. They all have their unique strengths, and their respective drawbacks. We live in a time in which data literacy is on the rise, but there are still so many people who don’t know how to effectively work with data that the true competition in this space is data illiteracy. But this is just my own personal blog, and my own musings on a topic we all love. Nothing more, nothing less.

In playing around with Google Data Studio, I managed to create a visualization showing the health and wealth of countries using the Gapminder data set that Lisa had used for her tools review. After getting this first dashboard under my belt, I decided to connect to Google Analytics and tackle my website stats. The dashboard is embedded below, and you can compare it with a richly interactive version I created using Tableau Public a couple years ago here.


Fig 1: A dashboard created with Google Data Studio that shows my website stats

What I liked about it

I was pretty impressed with how easy it was to connect to my GA data and figure out the user interface. Creating views in the dashboard interface was pretty intuitive and downright fun. Try it yourself – pick the chart type, drag open a window and position it where you want on the display. Here’s a screenshot of the edit experience:


Fig 2. A screenshot of the Google Data Studio dashboard creation user interface

Formatting the charts themselves was a little trickier, but I got the hang of it before long. Click on a chart, edit the Data and Style options in the panel that opens on the right-hand side of the screen, and you’re good to go. Adding filters, images and text was also very straightforward. The hardest part for me to figure out was how to change a “Scorecard” call-out at the top from a Sum to an Average. Eventually I got it – I just had to click on the data field I wanted to modify in the Data pane on the right, then click “Create New Dimension” and edit the metadata table.

Allowing others to view and edit my work was also an intuitive experience, mostly because I’ve been using the exact same controls with Google Docs for a while now. So that’s efficient. Not much more to figure out there.

My colleague Dash Davidson pointed out that the dashboard works quite well when viewed on a phone:

Fig 3. A mock-up of what my dashboard looks like on an iPhone 5S


What I wanted to do but couldn’t

Creating a simple multi-view dashboard was easy and fun, but many times I wanted to go a step further and couldn’t. For example, I’d like to add “rich interactivity” – letting my readers click on one chart (say a bar in a bar chart, a country shape on the map, or a data point on the timeline) and have that selection filter or highlight the other charts in the view. You can’t do that, as far as I can tell. It could just be me, though. I don’t claim to be an expert user of this software – just a data viz guy who likes to tinker. So to me, Google Data Studio is a fabulous visualization tool, but not quite an analytics tool.

Another limitation is that there’s no embed option, again, as far as I can tell. I can email and send links to people with ease, but I had to put a screenshot in this blog post because I couldn’t figure out how to embed it like a YouTube video directly inline. So it seems to me that it’s not really for sharing broadly on the web as much as it is for sharing amongst a group of colleagues. But for the latter, it’s quite effective.

While customization of the views is pretty easy to do, it’s also somewhat limited. For example, I wanted to size the dots for each browser in the scatterplot at the bottom based on how much traffic came from each browser, but I couldn’t. So it’s a true scatterplot but not also a bubble chart.

Color options are also fairly limited – you can’t use color density to encode quantitative data in bars in bar charts or dots in scatterplots like you can in choropleth maps, for example. And you can’t do other little things with color, like change the color of an axis. The absence of these features makes the tool relatively simple to learn, but I often found myself unable to do what I wanted to do.

My Final Take

This is an easy-to-learn but still fairly limited visualization product for your Google data sources. Overall I really like the user interface, connecting to my data was painless, and the collaboration features are excellent. But while I could quickly visualize my data and add helpful filters, I couldn’t really drill down and explore the data in depth.

It did pass one important test though – I learned as much about my data as I did about the tool while playing with it.

Now to figure out what I can do to make this website more mobile-friendly so I can increase the low percentage of mobile readers…

Thanks for reading. Let me know if you’ve tried it out, and if so, what you think.

Ben


Sports Viz: The Top International Goal Scorers

2016 June 27
by Ben Jones

Hi all,

It has been a while since I posted here. This one’s just for fun.

There are two exciting men’s football (aka ‘soccer’) tournaments going on right now – Euro 2016 and Copa America. I was wondering which players in the history of the sport have the highest number of goals scored over the course of their international careers. The data was fairly easy to find on Wikipedia, and since I’m somewhat fond of scatterplots, I created this simple viz to help me understand who has been the most prolific at putting the ball into the back of the net:

Some notes about the project:

  • The viz features three filters – Confederation and Career Status selectors, as well as a Goals per match slider.
  • I annotated the player with the highest goal scoring rate (Poul Nielsen of Denmark) since the relative sizes of the flag shapes are much harder to compare than the x- and y-axis positions of the shapes.
  • I used a dual axis on the scatterplot to create a yellow border around the players who are still active. I also added a note in the bottom right explaining the meaning of the yellow outline.
  • In order to figure out if the player is still active, I created a calculated field called “Active?” that looks at the value in the field “Career End” and assigns a value of “Active” if it’s null, and a value of “Retired” if it’s not null.
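
For anyone curious what that last calculation looks like outside of Tableau, here’s a quick pandas sketch of the equivalent logic (the rows below are made-up stand-ins, not my actual Wikipedia extract):

```python
import pandas as pd

# Hypothetical stand-in for the Wikipedia table: a null "Career End"
# means the player is still active (NaN plays the role of Tableau's null).
players = pd.DataFrame({
    "Player": ["Cristiano Ronaldo", "Poul Nielsen", "Pelé"],
    "Career End": [pd.NA, 1925, 1971],
})

# Equivalent of the "Active?" calculated field:
# null Career End -> "Active", otherwise -> "Retired"
players["Active?"] = players["Career End"].isna().map(
    {True: "Active", False: "Retired"}
)
print(players)
```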

That’s about it. I enjoyed the process, and learned something new about a sport I love.

Thanks, I’d appreciate any feedback you might have, as always.
Ben


A First Look at Simple Chart-Builder Atlas

2016 May 12
by Ben Jones

Hi all,

Just a quick note in case you didn’t see the announcement that digital news publication Quartz is making their internal chart-making tool Atlas available to the public. I registered and was given access to this simple web-based chart maker, and here’s what I created:

It only took me about 5 minutes to get the hang of the web-based UI and create this bar chart. I tried a couple of more complicated data sets first, but ran up against a 12-column limit. I didn’t see how to sort the bars, so I went back to Excel, sorted the data table in the spreadsheet, and re-copied and pasted it into the window so the bars were sorted how I wanted them – by decreasing French fry score.
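
If you’d rather skip the Excel round-trip, a quick pandas sketch can do the sort and drop the result on the clipboard for pasting into Atlas (the table and numbers below are made-up stand-ins for the YouGov data):

```python
import pandas as pd

# Hypothetical stand-in for the survey table I pasted into Atlas.
fries = pd.DataFrame({
    "Chain": ["McDonald's", "Five Guys", "Wendy's", "Chick-fil-A"],
    "Best fries (%)": [22, 15, 11, 9],
})

# Sort descending so the bars render in decreasing French fry score,
# then copy the result to the clipboard, ready to paste into Atlas.
fries = fries.sort_values("Best fries (%)", ascending=False)
fries.to_clipboard(index=False)
```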

You can also export to SVG, the code is open source, and the chart looks great on a mobile device without any programming or configuration on my part. It just works. No filtering, interactivity or advanced analytics, very limited formatting and customization options, and no dashboards. Single charts are what this tool allows you to build, and what it does, it does well. I can see lots of people who want to make simple charts and graphs using this tool to crank out publication and mobile-ready views.

Anyone else play with it? What do you think?

Thanks,
Ben


How to Embed Google Trends in a Tableau Dashboard

2016 April 27
by Ben Jones

This blog post shows how to marry two free online data tools: Tableau Public and Google Trends. Why? Because you might want to quickly check how certain categories in your data have fared in search over time relative to one another. {Disclaimer: this tutorial will only be valid as long as Google keeps its Google Trends URL scheme the same. I learned the hard way with my Google Maps + Tableau tutorial that you just can’t bank on Google leaving things the same for very long.}

It’s in response to a question posted by Alexandra Samuel on Twitter:

I had taken a crack at using Google Trends data in Tableau a year and a half ago with an Ebola scare viz, but it involved a very manual process of downloading the CSV out of Google Trends. Hardly the web data connector experience Alexandra was looking for. Fellow Tableau geek Eric Peterson did some digging and found out that Google does not have an API for Google Trends:

So end of the road? Are we out of luck if we want to automatically pull search interest data into Tableau until Google makes an API for Google Trends? Not entirely. Here’s my solution (using fast food survey data in honor of Tableau Public’s Food Viz Month theme), and you’ll find a tutorial on how I built it below:

How to Add Google Trends Data to a Tableau Dashboard

The trick involves two key elements: a blank Web Page object on the dashboard, and a URL Dashboard Action that controls what it shows. Here’s how to put them together:

Step 1: Build a new Sheet

This part was totally straightforward. I created a basic Excel table out of YouGov survey results where respondents said which restaurant had the best burgers and which had the best fries. Then I dragged “Best beef burger” to the Columns shelf and “Best fries” to the Rows shelf, adding the fast food logos by dragging “Fast Food Chain” to Shape and making sure I had a folder with the logo png files inside my Documents/My Tableau Repository/Shapes folder. Easy.

Step 1: Create a new sheet

Step 2: Create a new dashboard and add a blank Web Page object

Once I had built my scatterplot, I created a new Dashboard, dragged the scatterplot sheet onto it, then dragged a new Web Page object from the section in the left panel, leaving the Edit URL field in the resulting dialog box blank and clicking OK:

Step 2: Add a blank Web Page object to the Dashboard

Step 3: Create a new URL Dashboard Action

Now that the Web Page object is out on the dashboard, I can control which website gets shown inside the box by creating a URL Dashboard Action. To do so, click Dashboard > Actions > Add Action > URL.

Step 4: Build the Google Trends URL in the URL Action dialog:

The trick is to copy and paste this URL into the Edit URL field:

https://www.google.com/trends/fetchComponent?hl=en-US&q=DIMENSION_FIELD_NAME&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=300

where instead of DIMENSION_FIELD_NAME you would put the data field that contains the values you want to compare in search interest. In our case, it’s the field Fast Food Chain surrounded by angle brackets. Notice that I also set the action to run on Select instead of Menu, and I made sure to allow for multiple select by checking the box at the bottom:
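
Tableau does the field substitution for you at click time, but if you’re curious what the finished URL looks like for a given selection, here’s a little Python sketch that builds one by hand. It assumes the fetchComponent scheme above still works and that multiple values are joined with commas (the delimiter is configurable in the URL Action dialog):

```python
from urllib.parse import urlencode

def trends_embed_url(terms, width=500, height=300):
    """Build a Google Trends fetchComponent URL like the one the
    URL Action produces (valid only as long as Google keeps this
    URL scheme around)."""
    params = {
        "hl": "en-US",
        "q": ",".join(terms),  # stand-in for Tableau's multi-select substitution
        "cid": "TIMESERIES_GRAPH_0",
        "export": 5,
        "w": width,
        "h": height,
    }
    return "https://www.google.com/trends/fetchComponent?" + urlencode(params)

print(trends_embed_url(["Five Guys", "In-N-Out Burger"]))
```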

Step 4: Build the URL Action

That’s it! A word of caution: Google Trends seems to have a search quota, so you might hit your limit like I did when creating this dashboard and trying different things, hitting the trends site from the same IP address over and over. The good news is that if you wait a couple of hours, you should be good to go again. At least that’s how it was for me.

Obviously this isn’t quite as awesome as accessing the actual raw search trends data directly within a Tableau workbook, but I think it’s a pretty good halfway solution. One limitation is that you can’t control the colors of the timelines in the Web Page object, which makes it tough to coordinate your dashboard colors. That’s one reason I used logo images instead of colors to differentiate between the restaurants in the scatterplot.

I think this tip could be taken to a whole different level by embedding Google Trends maps, or searching within countries, etc. All you’d need to do is figure out how Google Trends controls these parameters by looking at the URL that it creates when you do a search.

Okay, thanks for reading. I hope this was helpful to you. Let me know if you find a way to improve on it!

Ben


The Design of Everyday Visualizations

2016 April 11

I’ve been educated and inspired recently by the best-selling design classic “The Design of Everyday Things” by UX guru Don Norman. You really have to read the entire book, which applies to all types of objects that people design – from chairs to doors to software to organizational structures. It provides thoughtful and practical principles that guide designers to design all of those things well. By “well” he means “products that fit the needs and capabilities of people.” (p.218)

As I read it, it occurred to me that data visualizations are “everyday things” now, too. Even richly interactive ones viewed on tablets and phones. That has only become the case in the past half-decade or so. Yes, examples can be traced back to the early days of the internet, but the recent explosion of data, software tools and programming libraries has caused their proliferation.

And I found that point after point, principle after principle in Norman’s book applied directly to data visualization. I’d like to call out five points that struck me as particularly relevant to recent discussions in the field of data visualization.

1. Good visualizations are discoverable and understandable

Norman starts his book describing two important characteristics of all designed products:

  • Discoverability: Is it possible to even figure out what actions are possible and where and how to perform them?
  • Understanding: What does it all mean? How is the product supposed to be used? What do all the different controls and settings mean?

He talks about common things that are often anything but discoverable and understandable, such as faucets, doors and stovetops. One of my favorite quotes in the book is about faucets:

If you want the faucet to be pushed, make it look as if it should be pushed. p.150

Regarding doors, Vox published a great video on a particularly poorly designed door on the 10th floor of the Vox Media office. The video references and even includes interview footage with Don Norman himself. And it’s funny. You should watch it.

It occurred to me that the typical stovetop design snafu has a direct translation into the world of data visualization. To explain, let’s start with the problem with stovetops. Ever turn on the wrong burner? Why? Because you’re stupid? No. Because there are often poor mappings between the controls and the burners. The burners are often arranged in a two-by-two grid and the controls are often in a straight line, like this:

What does that have to do with data visualization? We often use similar controls – radio buttons, combo boxes, sliders, etc – to filter and highlight the marks in the view. When there are multiple views in a visualization (a dashboard), there is a similar opportunity to provide clear, or natural, mappings.

Norman gives the following advice for mappings:

  • Best mapping: Controls are mounted directly on the item to be controlled.
  • Second-best mapping: Controls are as close as possible to the object to be controlled.
  • Third-best mapping: Controls are arranged in the same spatial configuration as the objects to be controlled.

Often the software default places the controls on the right-hand side. Here’s my attempt to show these options on a generic data dashboard, where the four different views are labeled A, B, C and D, and the controls that change them are labeled according to the views they modify:


This is a relatively straightforward example, and the job of the designer of a more complex visualization is to make it similarly clear what can be done and how to do it. Designers use things like affordances, signifiers, constraints and mappings to make it obvious. Note that it takes a lot of effort to make the complex obvious.

2. Don’t blame people for getting confused or making errors

A fundamental principle that Norman drives home a number of times in the book is that human error usually isn’t the fault of humans, but rather of poorly designed systems. Here are two great quotes on the topic:

It is not possible to eliminate human error if it is thought of as a personal failure rather than as a sign of poor design of procedures or equipment. p.167

And again on the same page:

If the system lets you make the error, it is badly designed. And if the system induces you to make the error, it is really badly designed. When I turn on the wrong stove burner, it is not due to my lack of knowledge: it is due to poor mapping between controls and burners. p.167

Norman differentiates between two types of errors: slips and mistakes.

  • Slips are when you mean to do one thing, but you do another.
  • Mistakes are when you come up with the wrong goal or plan and then carry it out.

Both types of errors happen when people interact with data visualizations. In the world of mobile, slips are especially common – maybe I meant to tap that small icon at the edge of my phone screen, but the phone and app registered a tap on an adjacent icon instead.

Mistakes are also common. Maybe it made sense to me to filter to a subset of the data to get my answer, but in reality I was misleading myself by introducing a selection bias that wasn’t appropriate at all. If someone makes the wrong decision based on misinformation they took from your visualization, that’s your problem at least as much as it is theirs, if not more so.

How to make sure your readers avoid slips and mistakes? Build and test. Iterate. Watch people interact with your visualization. When they screw up, don’t blame them or step in and explain what they did wrong and why they should’ve known better. Write it down and go back to the drawing board. If the person who agreed to test your visualization made that error, don’t you think many more likely will? And you won’t be there to tell them all what they did wrong. Your only chance to fix the error is to prevent it.

3. Designing for pleasure and emotion is important

I’m a big believer in this principle. Norman states that “great designers make pleasurable experiences”:

Experience is critical for it determines how fondly people remember their interactions. Was the overall experience positive, or was it frustrating and confusing? p.10

How can an experience with a data visualization be pleasurable? In lots of ways. It can make it easy to understand something interesting or important about our world, it can employ good design techniques and artistic elements, it can surprise us with a clever or funny metaphor, or some combination of these and more.

What about emotion? The “e word” to which the analytical folks in our midst are allergic. Cognition gets a lot of play in the world of data visualization, but emotion does not. But these two horses of the chariot that is the human spirit are actually inextricably yoked:

Cognition and emotion cannot be separated. Cognitive thoughts lead to emotions: emotions drive cognitive thoughts. p.47

I also love the following quote:

Cognition attempts to make sense of the world: emotion assigns value…Cognition provides understanding: emotion provides value judgements. p.47

So let’s embrace emotions. Some data visualizations piss us off. Some crack us up. Some are just delightful to interact with. These elements of the experience should be part of the discourse in our field, and not ignored just because they don’t match the left-brained predisposition of the bulk of the so-called experts. If we take them into consideration, we’ll probably design better stuff.

4. Complexity is good. Confusion is bad.

There’s a trend in our field to move away from the big, complex dashboards of 2010 and toward “light-weight” and uber-simple individual graphs, and even GIFs. Why? A big part of the reason is that they work better on mobile. It’s true, and what we’ve learned in the past few years is that the complexity of those big dashboards isn’t always necessary.

This is a great development, and I’m all for it, but let’s just remember that there was often a great value to the rich interaction that is still possible on a larger screen. Instead of abandoning rich interactivity altogether, I believe we should be looking for new and innovative ways to give these advanced capabilities to readers on smaller devices. When those capabilities will help us achieve some goal, we’ll be good to go. We’re not there yet.

After all, it’s not the complexity of the detailed, filterable dashboard that’s the problem on the phone – it’s that we haven’t figured out how to give these capabilities to a reader using a phone yet, and the experience is confusing. Does this make sense to you?

I actually see this as a good thing. Our generation has the chance to figure this out for the generations to come. The growth of the numerical literacy of our population will be well worth the effort.

5. Absolute precision isn’t always necessary

I have to be honest. This one is my hot button. There’s a school of thought that says that the visualization type that gives the reader the ability to guess the true proportions of the thing visualized with the greatest accuracy is the only one that can be used. Some go so far as to declare it immoral to choose a visualization type that introduces any greater error than another (they all have some error).

I found this great visualization about visualizations in Tamara Munzner’s book Visualization Analysis and Design:


The problem with this line of reasoning is that absolute precision isn’t always necessary for the task at hand. Norman uses the example of converting temperature from Celsius to Fahrenheit. If all you need to do is figure out whether you need to wear a sweater when you go outside, a shortcut approximate conversion equation is GOOD ENOUGH. It doesn’t matter whether it’s 52, 55, 55.8, or 55.806. In all four cases, you’re wearing a light sweater.
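
To make the point concrete, here’s a quick sketch comparing the exact conversion with the common “double it and add 30” shortcut (my paraphrase of the kind of approximation Norman has in mind):

```python
def exact_f(celsius):
    # Exact conversion: F = C * 9/5 + 32
    return celsius * 9 / 5 + 32

def rough_f(celsius):
    # The "good enough" shortcut: double it and add 30
    return celsius * 2 + 30

for c in [10, 13, 20]:
    print(f"{c}C -> exact {exact_f(c):.1f}F, rough {rough_f(c)}F")
# Either way, at 13C you're grabbing a light sweater.
```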

And let me repeat the point: there are errors associated with every visualization type – we aren’t machines and perfect decoders of pixels or ink. Sometimes it’s okay that a general understanding is achieved.

And for goodness’ sake, if absolute precision is required, then use labels, or just show a table of exact values.

Wrapping it up

I hope this was helpful for you! I love doing this kind of thing – pulling lessons from other amazing writings and seeing how they apply to data visualization, which I see as the “catch-all” discipline. It’s part numeric, part editorial, part graphic. To do it well, we need to embrace the principles of good design. I’ve tried to outline a few here from a true expert who we should all be familiar with. If you do read Norman’s book, you’ll find that there are many more.

Many thanks to my colleague Jewel Loree for pointing me to this book way back when. I finally got around to reading it, and am glad I did.

Thanks,

Ben


Gun Homicides vs. Gun Suicides in the US

2016 January 28
by Ben Jones

Nelson Davis, Matt Chambers and Alex Duke have formed the Reviz Project which you can read more about here. Their first challenge was to visualize the relationship between gun homicides and gun suicides in the United States. Data is sourced from the CDC.

I took a quick pass at visualizing the ratio between suicides and homicides for each state and over time using a scatterplot (I’m a scatterplot junkie) and a timeline. Hover over each state circle in the scatterplot to filter the timeline below to show the trend for a chosen state, and use the slider at the bottom of the timeline to explore the relationship between these two variables for a particular year in the scatterplot:

The dashboards they created to visualize this same data are quite elaborate (you can find them on their blog). While this month’s data story is very sobering, it’s always fascinating to me how different people, starting with the same raw data, and – in the case of Nelson, Matt, Alex and me – the exact same tool (Tableau), can come up with very different results, and very different insights.

That’s why data visualization has such a strong social component to it.

Thanks for reading,
Ben


To Optimize or to Satisfice in Data Visualization?

2016 January 12
by Ben Jones

Q: In data visualization, is there a single “best” way to visualize data in a particular scenario and for a particular audience, or are there multiple “good enough” ways?

That’s the debate that has resurfaced on Stephen Few’s and Cole Nussbaumer’s blogs today.

  • In summary, Few says “Is there a best solution in a given situation? You bet there is.”
  • In contrast, Cole says “For me, though, it is possible to have multiple varying visuals that may be equally effective”.

Could Both be Right?

This is going to sound strange, but I think both are right, and there is room for both approaches in the field of data visualization. Let me explain.

Lucky for us, really smart people have been studying how to choose between a variety of alternatives for over a century now. Decision-making of this sort is the realm of Operations Research (also called “operational research”, “management science” and “decision science”). Another way of asking the lead-in question is:

Q: When choosing how to show data to a particular audience, should I keep looking until I find a single optimum solution, or should I stop as soon as I find one of many that achieves some minimum level of acceptability (also called the “acceptability threshold” or “aspiration level”)?

The former approach is called optimization, and the latter was given the name “satisficing” (a combination of the words satisfy and suffice) by Nobel laureate Herbert A. Simon in 1956.
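
If it helps to see the two strategies side by side, here’s a toy sketch – the candidate chart types and their payoff scores are entirely invented:

```python
# Hypothetical payoff score for each candidate chart design.
candidates = {"bar": 0.78, "scatter": 0.84, "slope": 0.71, "table": 0.65}
THRESHOLD = 0.75  # the "acceptability threshold" / aspiration level

def optimize(options):
    # Evaluate every alternative and keep the single best one.
    return max(options, key=options.get)

def satisfice(options):
    # Stop at the first alternative that clears the threshold.
    for name, payoff in options.items():
        if payoff >= THRESHOLD:
            return name
    return None  # nothing acceptable found

print(optimize(candidates))   # "scatter" - the best overall
print(satisfice(candidates))  # "bar" - the first one that's good enough
```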

So which approach should we take? Should we Optimize or Satisfice when visualizing data?

I believe there is room for both approaches. Which approach we take depends on three factors:

  1. Whether or not the decision problem is tractable
  2. Whether or not all of the information is available
  3. Whether or not we have time and resources to get the necessary information

But What is the “Payoff Function” for Data Visualizations?

This is a critical question, and I think it’s where some of the debate stems from. Part of the challenge in ranking alternative solutions to a data visualization problem is determining what variables go into the payoff function, and their relative weight or importance. The payoff function is how we compare alternatives. Which choice is better? Why is it better? How much better?

Few says that “we can judge the merits of a data visualization by its ability to make the information as easy to understand as possible.” By stating this, he seems to me to be proposing a particular payoff function: increased comprehensibility = increased payoff.

But is comprehensibility the only variable that matters (did our audience accurately and precisely understand the relative proportions?), or should other variables be factored in as well, such as attention (did our audience take notice?), impact (did they care?), aesthetics (did they find the visuals appealing?), memorability (did they remember the medium and/or the message some time into the future?) and behavior (did they take some desired action as a result)?

Here’s a visual that shows how I tend to think about measuring payoff, or success, of a particular solution with hypothetical scores (and yes, I’ve been accused of over-thinking things many times before):

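In code form, that way of measuring payoff might look something like this toy sketch, where payoff is a weighted sum of the variables above (the weights and scores are entirely hypothetical):

```python
# Hypothetical weights: what matters, and how much, for a given scenario.
weights = {
    "comprehensibility": 0.35,
    "attention": 0.15,
    "impact": 0.15,
    "aesthetics": 0.10,
    "memorability": 0.15,
    "behavior": 0.10,
}

def payoff(scores):
    # scores: 0-10 ratings for each variable in the payoff function
    return sum(weights[k] * scores[k] for k in weights)

design_a = {"comprehensibility": 9, "attention": 4, "impact": 5,
            "aesthetics": 6, "memorability": 4, "behavior": 5}
design_b = {"comprehensibility": 7, "attention": 8, "impact": 8,
            "aesthetics": 8, "memorability": 8, "behavior": 7}

# B can win overall despite scoring lower on comprehensibility.
print(payoff(design_a), payoff(design_b))  # roughly 6.2 and 7.55
```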

It’s pretty easy to conceive of situations – and I’d venture to say that most of us have experienced this first-hand – where a particular visualization type may have afforded increased precision of comparison, but that extra precision wasn’t necessary for the task at hand, and the visualization was inferior in some other respect that doomed our efforts to failure. Comprehensibility may be the single most important factor in data visualization, but I don’t agree that it’s the only factor we could potentially be concerned with. Not every data visualization scenario requires ultimate precision, just as engineers don’t specify the same tight tolerances for a $15 scooter as they do for a $450M space shuttle. Also, visualization types can make one type of comparison easier (say, part-to-whole) but another comparison more difficult (say, part-to-part).

Trade-Offs Abound

What seems clear, then, is that if we want to optimize for all of these variables (and likely others) for our particular scenario and audience, then we’ll need to do a lot of work, and it will take a lot of time. If the audience is narrowly defined (say, the board of directors of a specific nonprofit organization), then we simply can’t test all of the variables (such as behavior – what will they do?) ahead of time. We have to forge ahead with imperfect information, and use something called bounded rationality – the idea that decision-making involves inherent limitations in our knowledge, and we’ll have to pick something that is ‘good enough’.

And if we get the data at 9:30am and the meeting is at 4pm on the same day? Running a battery of tests often isn’t practical.

But what if we feel that optimization is critical in a particular case? We can start by simplifying things for ourselves, focusing on just one or two input variables, making some key assumptions about who our audience will be, what their state of mind will be when we present to them, and how their reactions will be similar to or different from the reactions of a test audience. We reduce the degrees of freedom and optimize a much simpler equation. I’m all for knowing which chart types are more comprehensible than others. In a pinch, this is really good information to have at our disposal.

There’s Room for Both Approaches

Simon noted in his Nobel laureate speech that “decision makers can satisfice either by finding optimum solutions for a simplified world, or by finding satisfactory solutions for a more realistic world. Neither approach, in general, dominates the other, and both have continued to co-exist in the world of management science.”

I believe both should co-exist in the world of data visualization, too. We’ll all be better off if people continue to test and find optimum visualizations for simplified and controlled scenarios in the lab, and we’ll be better off if people continue to forge ahead and create ‘good enough’ visualizations in the real world, taking into account a broader set of criteria and embracing the unknowns and messy uncertainties of communicating with other thinking and feeling human minds.

Thanks for reading my $0.02. I’d like to hear your thoughts.
Ben


When Memorability Matters: Another Practitioner’s View

2015 December 10
by Ben Jones

The following comments are in response to Stephen Few’s recent newsletter entitled “Information Visualization Research as Pseudo-Science”, in which he critiqued an academic paper by Borkin et al. entitled “Beyond Memorability: Visualization Recognition and Recall”. I’m not an academic researcher, so I will leave it to others in the field to respond to Few’s specific criticisms of the paper’s methods. My goal in this article is to respond to opinions Few voiced about memorability in data visualization.

I’d like to start by asking a few questions:

  • Does it matter whether a data visualization is memorable or not?
  • Should we, as data visualization practitioners, care about memorability?
  • Should we design our visualizations so that those who view them are more likely to remember them at a later point in time?
  • Is memorability a worthwhile area of study for those studying data visualization in academia?

In my opinion, and in my experience, the answer to each of these questions is ‘Yes’.

In Stephen Few’s recent newsletter entitled “Information Visualization Research as Pseudo-Science”, though, he put forward a differing opinion:

“Visualizations don’t need to be designed for memorability— they need to be designed for comprehension. For most visualizations, the comprehension that they provide need only last until the decision that it informs is made. Usually, that is only a matter of seconds.” – Stephen Few (emphasis his)

This statement helped me understand why Few and I disagree about memorability: we disagree about how data visualizations are used by groups of people. Simply put, I don’t believe data visualizations are “usually” followed by decisions “only a matter of seconds” later. That may be how a robot or a computer algorithm would approach decision-making, but it’s just not how groups of humans in organizations go about it.

How do groups of humans usually work with data visualizations, then? Well, analysts prepare dense packets of pre-reading materials, directors and VPs attend review meetings where they look at lots and LOTS of data and charts, sometimes they take copious notes, sometimes they zone out and check their smartphones, then they break for lunch, check their email, reconvene and consider different topics, only to have the final decision made at a totally different planning meeting or off-site weeks later.

Sound familiar? That’s a whole lot messier than question → visualization → decision in seconds. And that’s only one reason why memorability matters.

In my experience, the memorability of the overall message (of which the visualizations are a critical element) matters most when:

  1. Decisions won’t be made immediately
  2. The audience doesn’t care deeply about the topic
  3. The environment is already saturated in data and visualizations

To illustrate these three conditions, let me relate a personal story from my experience working with data and groups of decision makers. The specific details of the account have been altered to protect the innocent.

A Practitioner Wins Thanks to Memorability

One time I had the unenviable task of presenting the results of the launch of a product that was, shall we say, less than “top-of-mind” to the executives at a Fortune 500 company. Think “razor” in the razor-and-blades business model. Sales should just be a pull-through, so they didn’t pay much attention to it at all.

But what we were finding was that the relative neglect of this high-touch product was causing a lot of dissatisfaction, and our lack of attention to the details of the product offering was causing us to lose customers.

In preparing for the presentation, I created plenty of nice, Tufte-compliant charts and graphs, like this one (a generalized mock up), to show how the recently-launched product was doing in the marketplace:


A comprehensible but not particularly memorable chart

Do you notice the problem in the chart? That’s right, we didn’t launch a green SKU in Configuration B.

Why not? Tooling investment.

Who cares? Customers did. A lot of them. The nature of the product was such that customers couldn’t select between A & B. There were factors that pre-determined that for them.

Now I was scheduled to be the fifth presenter in a very long review meeting where many other topics would be discussed, and as I mentioned, this product just didn’t matter to the executives. My charts were going to get glossed over. If the executives gave me 10 seconds of attention on each chart, I’d have considered myself lucky. The way the situation was shaping up, I felt pretty sure that this product line’s issues weren’t going to be addressed as a result of my presentation.

So instead, I showed charts like this, with actual photographs of actual customers and their actual quotes:


The same chart made more memorable by the addition of a human’s face and their own words

The result was palpable.

They leaned in. They looked at the faces in the pictures. Actual customers. People that looked like their sons, their daughters, their mothers. They chuckled at the funny social media handles. They cared. For the first time in a long time, they actually cared about the razor. And they cared about the fact that customers just weren’t loving it.

A few weeks later, I received an email that the go-ahead had been given to resolve a number of problems with this product line, including the missing green SKU in Configuration B. The VP thanked me for showing the “human side” of the data in my presentation.

When the time came to make the decision, they opted to fund a product they hadn’t previously cared about, thanks to charts they couldn’t forget.

Memorable or Comprehensible, or Both?

Stephen Few made the statement that comprehensibility matters, but memorability doesn’t, when it comes to designing data visualizations. Well, the original charts in my real-life example above were definitely comprehensible. I changed them because they weren’t particularly memorable.

My original charts were in the bottom right quadrant of the 4-blocker below, and all I did was push them up to the top-right. Sure, sometimes, it’s not necessary to do so. Sometimes, though, it’s make-or-break:


Note that for scenarios where the audience members already deeply care about the data, comprehensibility itself will result in memorability. Adding photos of beautiful, smiling faces just isn’t necessary.

But let’s be honest. An audience made up of 100% of the key decision makers, waiting with bated breath for our next bland chart and handing over a blank check right there on the spot, just isn’t normal. It would be nice, sure, but how many times have you actually been in that situation? Far more often, you absolutely need them to remember your message, and having charts that draw them in and stay in their brains just isn’t a bad idea.

Sometimes There’s Just No Decision

So far I’ve written about data visualizations in the context of human decision-making. But many data visualizations don’t inform decisions at all. Decision support is but one of many possible purposes. Data visualizations can be created to merely inform, to educate, and yes, even to entertain. In those cases, design for memorability can be the difference between having someone share your work with others, and having them forget they ever saw it.

Few made the following comment about adding images to visualizations:

If I incorporate an image of a kitten into a data visualization, I can guarantee that a test subject would remember seeing that kitten if it is shown to her again a few minutes later. But how is that useful? Unless the visualization’s message is that kittens are cute and fun, nothing of consequence has been achieved. – Stephen Few

He answers the question himself quite well: images are useful if the visualization’s message is enhanced by the presence of the images.

Take my Edgar Allan Poe timeline for example:

Does the image of Poe add any value at all? How about the image of his signature? Are these components nothing more than “chartjunk” (Few mentioned to me in an email that he would not call the image of Poe “chartjunk” based on his 2011 writings on the subject), or do they actually perform a function?

I submit that they perform a vital function. The visualization shows the life works of one man as blocks stacked together in the years they were written. Works that were written in ink on paper by his own hand.

There’s no decision here. The visualization is simply intended to educate you. And it’s my opinion that your education takes on a whole different meaning – a whole different feeling – when you see Poe’s face and an artifact of his own penmanship.

And let’s be honest, the following version is pretty damn boring, you’d probably ignore it if you saw it in your Twitter feed, and it’s not nearly as memorable, is it?

I’d like to conclude by quoting from Stephen Few’s critique one final time:

The greatest tragedy of this research is that what makes a visualization memorable is actually of no consequence. – Stephen Few

I hope I’ve made it clear in this blog post why I think that memorability can actually be of great consequence in data visualization. But did you notice that in my comments above I used phrases like “in my experience” quite often, and that all I really did was relate an anecdote and state my opinion? My opinion does not amount to codified knowledge, and my experiences do not amount to rigorous research.

And that’s exactly why I would appreciate further attempts by academics to study what makes charts more or less memorable. I’m sure this task isn’t easy. Visualizations are but one piece of an overall message that can be delivered in myriad ways to a variety of audiences. For those who are studying this topic, do know that there are practitioners out there who are hoping that the insight you glean into this topic can help us all.

Thanks,
Ben


For Fun: Happy Fibonacci Day!

2015 November 23
by Ben Jones

It seems there’s a day for everything, right? National Cashew Day (you guessed it, that’s today), World Philosophy Day (nope, sorry – last Thursday). Heck, you can even register your own National Day of [fill in the blank].

So thanks to MIT’s Twitter account, I became aware that today is Fibonacci Day. Makes sense – November 23rd is 11/23, which are the first four numbers in the famous Fibonacci Sequence.

So, to earn my Math Nerd Card for 2015, I created the following dashboard that visualizes the first 100 numbers in the Fibonacci Sequence, starting with 1 instead of 0 (conventions differ on where the sequence starts):
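
And for the record, here’s a little Python version of the sequence shown in the viz (with the zero-based convention available as an option):

```python
def fibonacci(n, start=(1, 1)):
    """Return the first n Fibonacci numbers, starting 1, 1, 2, 3, ...
    Pass start=(0, 1) for the zero-based convention instead."""
    a, b = start
    out = []
    for _ in range(n):
        out.append(a)
        a, b = b, a + b
    return out

print(fibonacci(10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```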

Math + Data Nerds Unite: know of any other good math vizzes out there? Leave a comment!

Thanks for humoring me,
Ben