Tuesday, January 7, 2014

What's the Story, Morning Glory?

The tagline for this blog is "Use data to tell your story." In fact, that is also the title of a few presentations I've been doing in 2013 (and am scheduled for in 2014). Whether you use Excel, R stats, Google Apps, or other tools, finding the story in a data set is rarely straightforward. Sometimes, you feel like you are trying to divine meaning from a set of numeric entrails. It's often difficult to know where to begin...or, when you're finally at the end, to feel 100% confident that you have honored the best story.

It's important that you be the storyteller. Yes, I know, many of you work in districts where some sort of software will spit out reports for you. Maybe it's your gradebook or benchmark testing. Those reports can be very convenient, but I would like to remind you that they are a developer's idea of what you need---they may not provide the level of insight (or quality of visualization) that supports your work. I know they save time, but with so much riding on the decisions made from these, shouldn't they be all that they can be? Don't settle for this junk. Own your story.

Exhibit OMG: Typical DIBELS Report

But where to start? Whether it's the scores from your gradebook, the annual data dump from state testing, monthly fiscal updates from the business office, or a download from a benchmark testing site, we're all faced with the same question: Now what?

Here's the deal. Working with data is messy. I'm not going to give you any hard and fast rules here ("When you have x data, always do y!"), because frankly, they don't exist. However, we can look at some workflow ideas that will build your capacity to interact with your data. I promise that you won't have to reinvent the wheel every time. If you know you're going to get DIBELS data every two weeks...and report card scores every six...take the time to figure out your strategies once and then apply them consistently.

So, let's start with the same data set as the previous two posts. It has both categorical (ethnicity; staff and student) and longitudinal (2004 - 2013) data. You may work with a lot of data that has similar features. For example, a gradebook has both categories and time-bound information. So does a fiscal spreadsheet a school business manager might use. You can download the workbook for this post here.

Clean the Data
This idea needs a separate post, but for now, let's just acknowledge that you often get stuck with "dirty data" that you will have to spank into shape. Are the numbers formatted as numbers? Do you have missing data points...and is that okay? Is it organized and labeled, ready for use in charts? What about data quality---are the data valid and accurate? (This last idea is a post unto itself, which we will save for later.)
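
If you ever do this kind of checking outside a spreadsheet, the first two questions (are the numbers really numbers, and what's missing?) can be roughed out in a few lines of Python. This is only a sketch: the function name audit_column and the sample cells are made up for illustration, and real exports will have their own quirks.

```python
def audit_column(values):
    """Count numeric and missing cells, and collect text sitting where a number should be."""
    report = {"numeric": 0, "missing": 0, "not_numeric": []}
    for cell in values:
        # Treat None and blank strings as missing data points.
        if cell is None or str(cell).strip() == "":
            report["missing"] += 1
            continue
        try:
            # Strip common formatting (percent signs, thousands separators) before parsing.
            float(str(cell).replace("%", "").replace(",", "").strip())
            report["numeric"] += 1
        except ValueError:
            report["not_numeric"].append(cell)
    return report


# Hypothetical raw cells, as they might arrive from an export.
raw = ["3.5", "4.9%", "", "n/a", 87.6, None, "1,204"]
report = audit_column(raw)
print(report)
```

Whether a missing value is "okay" is still your call; the code only tells you where to look.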

Explore the Data
When you have a new-to-you data set, and it's all tidied up and ready for church, make a few different charts. Remember, the reason we make charts and graphs, instead of only using tables, is that we are reducing the cognitive demand while increasing the amount of understanding. By this, I mean that it would be very difficult, at best, to keep all the numbers in the table in your head while simultaneously interpreting the results. A picture really is worth 1000 numbers.

Which ones, you ask? Do you start with a line chart? A column/bar chart? A scatter plot? You may have to try more than one. I always recommend the Chart Chooser as a starting point.

It's not the only tool out there to help you think through what you might want to look at. You might also like the Visualization Options over at Many Eyes. Or the Classification of Chart Types over at the Excel Charts blog. If you really want a deep dive, bigger ideas than just the charts themselves, read Resonate by Nancy Duarte (it's free!). Find something that helps you think through what you want to show (e.g., part-to-whole, relationships...).

Get in there and make a couple of pictures. At this stage, it's okay if they're ugly---they won't be your final products. You just need to see what story to pull out.

We did this in previous posts. We looked at a column/bar chart using these data and a line chart. They're totes ugly. However, they do show us a couple of things. First of all, we can see that only one population of students has had consistent growth over time (Hispanic), another has had a significant decrease (White), and the rest have remained about the same. For staff, there is not as big a story, but it echoes what we see with the student data. The share of Hispanic teachers is increasing (3.5 to 4.9%), the share of White staff is decreasing (90.4 to 87.6%), while the rest remain steady.

So, now we have a better idea. We need something that highlights the two changes (or at least one of them for further discussion).


Tell the Tale, Nightingale
A lot of people think making a basic chart is the end. Even if your first charts show you what you need to see, be sure to clean them up. Beyond that, extend your thinking about the best way to show the data.

What about a "win-loss" chart for these data?



This is just a basic column/bar chart in Excel, except I've had it plot the overall change, instead of year-by-year. There may be times when we care about the data for in-between years---when we're trying to spot patterns in the fluctuations among groups. But perhaps it's better to just cut to the chase using a chart like this one. The drawback to a chart like this is that it doesn't give you a perspective on the proportions of each population as a part of the whole. Sure, White kids aren't as numerous as before, but what we can't see is that they're still close to 60% of the total.
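
The arithmetic behind a chart like this is just last year minus first year for each group. Here is a minimal Python sketch of that step, using only the two staff figures quoted earlier in this post; the function name overall_change is my own invention, and the chart itself was built in Excel, not with this code.

```python
def overall_change(series):
    """Return last value minus first value for each group, rounded to one decimal."""
    return {group: round(values[-1] - values[0], 1) for group, values in series.items()}


# Staff percentages quoted earlier in this post (first and last year only).
staff = {"Hispanic": [3.5, 4.9], "White": [90.4, 87.6]}
changes = overall_change(staff)

for group, delta in changes.items():
    direction = "up" if delta > 0 else "down" if delta < 0 else "flat"
    print(f"{group:10s} {delta:+.1f} points ({direction})")
```

Notice that the result is a change in percentage points, which is exactly why this view hides the part-to-whole picture: a drop of 2.8 points says nothing about the group still making up most of the staff.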

Or, what do we notice if we plot the data as small multiples?

This type of chart is a combo of several charts, allowing you to make comparisons among groups. (Jon Schwabish has a great tutorial for creating small multiples using Excel.) The big thing here is to keep the axes among all the charts the same and to line up your charts so it is easy to compare across groups. I could have also done this version as a column/bar chart or even separated students from staff. These charts could even be reduced further to sparkline form, and we'd still get the idea. I always find small multiples to be a very busy way to present data, but they do allow you to spot patterns---and, especially, common patterns---much more easily. For example, in every case except White, there is a greater percentage of students than staff for a particular ethnicity.
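
To see why the shared axis matters, here is a rough Python sketch that draws text sparklines for several groups on one common scale. Everything in it is an assumption for illustration: the group names and numbers are invented (they are not the data from this post), and the sparkline function is a hypothetical helper, not part of any library.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values, low, high):
    """Map each value onto one of eight bar heights using the given shared scale."""
    span = (high - low) or 1  # avoid dividing by zero when every value is identical
    return "".join(
        BARS[min(len(BARS) - 1, int((v - low) / span * len(BARS)))] for v in values
    )


# Hypothetical percentages for illustration only; these are not the post's data.
groups = {
    "Group A": [10, 12, 15, 19, 24],
    "Group B": [60, 58, 55, 51, 48],
    "Group C": [5, 5, 6, 5, 5],
}

# One scale for every panel, the same idea as keeping the axes identical.
all_values = [v for values in groups.values() for v in values]
low, high = min(all_values), max(all_values)

lines = {name: sparkline(values, low, high) for name, values in groups.items()}
for name, line in lines.items():
    print(f"{name}  {line}")
```

If each group were scaled to its own minimum and maximum instead, the small wobble in Group C would look as dramatic as the real movement in the other two, which is the comparison trap the shared axis protects you from.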

Which one is the "right" one? No hard and fast answers here. Like any story you tell, you need to consider your audience and purpose. Sometimes, the audience is just you, the teacher, trying to decide where you need to go with your instruction tomorrow. Other times, you are trying to build a case for a school board to allot money for a capital project. Or influence policy. But the students we serve deserve the best stories we can share. Take the time to develop the best one you can.

Bonus Round
Did you know about HelpMeViz? It's a place where you can both give and receive support for data visualization. Go have a look, offer your ideas, or seek feedback on a project of your own.

You might also enjoy accidental aRt, a tumblr devoted to visualizations that turned out a little more interesting than anticipated.

1 comment:

  1. Thanks for the post. Here is another free Chart Chooser option for your readers.
    http://labs.juiceanalytics.com/chartchooser/index.html
