Saturday, January 26, 2013

Getting It All Laid Out

One of the MOOCs I'm currently enrolled in is Alberto Cairo's Introduction to Infographics and Data Visualization. So far, it's familiar territory, which is nice. Some things are new to me, but I am not overwhelmed, as I would be if everything were new. It's a six-week course. We're starting the third week and have our first offline assignment.

Imagine that you report to me (your managing editor in a news publication). You wish to make a proposal for a visualization based on these numbers. How would you convince me that your idea is relevant? You will need to show me detailed sketches (made by hand or through a design program) to do that.

By "these numbers," he is referring to these data about the changing number of tenured faculty at U.S. universities. I've downloaded the data, but I'll need to do some more reading and digging before I'm ready to do something with them. I have to figure out the "So what?" before I draw up a presentation of the data.

In the meantime, I thought I'd share a bit about the process I use when building a data display...something I worked on this week.

Over the next month, I'll be sharing some data with groups of small districts. The data is not so much a report---they already have some of it in various forms---but something to explore in common. These groups of districts will be trying to identify some common ground as a starting point for some work together.

I started by finding all of the data I could about these districts: fiscal, staff, and student-level. Then I pared down the data sets by thinking about what the audience would want to see. Next, I got out my pencil and some scrap paper. It was time to draw some scenarios. I think that even if my digital tools weren't limited to Excel, I would still begin with an analog model. I like to list the types of things I want to show and then figure out how to arrange them. Mind you, this is only a starting point. I often find myself in the middle of developing something, only to find out that it didn't make sense after all.

Finally, I get knee deep in Excel. For this project, I ended up with three different displays: one that showed overall trends in student performance, another to dig deeper into performance in various subject areas, and a demographics overview for each district.

Here is the first one of the series. I am still struggling with whether to go with clustered columns instead. Such a display does make it easier to compare data over the years, but since I am including both regional and state level data for three different years, clustered columns get a little busy. I am leaning toward the version shown because it makes it easier to compare patterns between the region and state.
There are things I don't like about this display, so if you have some ideas about fixing it, I'm all ears. For example, I'm not convinced about using different colors for the bars in the top row, but since I'm not going with the clustered columns, I think the color helps make comparisons across the years. I really don't like the labels along the bottom set of charts. I suppose I could shorten them and then create some sort of legend that gives the real version. There are two choices a user can make in the interactive version of this chart. They can pick a subject area (reading, math, writing, science) and grade level (3 - 8)---so not all of the labels are as cumbersome as they are for reading.

The second display shows trends for graduating cohorts---although these students may be many years from walking across the stage. The purpose of this display is to look at performance for the same set of students. For example, how did a group of fifth graders score when they were in fourth and third grades? The current grade levels of students are in parentheses.
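
For anyone who wants to try the cohort idea outside of Excel, here is a minimal sketch in Python. It is not how I built the display. The records are made-up placeholder numbers, the function names are my own invention for illustration, and it assumes every student moves up exactly one grade per year and graduates after grade 12.

```python
from collections import defaultdict

# Hypothetical records: (test year, grade level, percent meeting standard).
# The numbers are made up purely for illustration.
records = [
    (2010, 3, 70.0),
    (2011, 4, 66.0),
    (2012, 5, 72.0),
    (2010, 4, 61.0),
    (2011, 5, 64.0),
    (2012, 6, 59.0),
]


def graduation_year(test_year, grade):
    """Year a class tested in `grade` would finish grade 12 (assumes on-time promotion)."""
    return test_year + (12 - grade)


def by_cohort(rows):
    """Group (year, grade, score) rows by graduating class, each sorted by test year."""
    cohorts = defaultdict(list)
    for year, grade, score in rows:
        cohorts[graduation_year(year, grade)].append((year, grade, score))
    return {cohort: sorted(points) for cohort, points in sorted(cohorts.items())}


cohort_trends = by_cohort(records)
for cohort, points in cohort_trends.items():
    # One line per graduating class: (grade, score) pairs in the order the tests were taken.
    print(cohort, [(grade, score) for _, grade, score in points])
```

Each printed line follows one group of students across grades, which is the same question the display asks: how did this year's fifth graders score when they were in fourth and third grades?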


It's a little busy. Mind you, you need a big monitor to view the spreadsheet all at once. I've tried to be as consistent as I can with the color scheme, headers, etc. I do think that the small size of the graphs is a bit misleading---some of those gaps are as much as 20 points. Users can hover the cursor over points to see the numbers associated with them.

Finally, here is a sample of the district overview.

It's the only one of the three that doesn't have comparisons, but in this case, it should be okay. It will serve as a reference when teachers from each district talk about their schools. I have some stacked bar charts here to help conserve space, but I've tried to keep the style more or less the same. I know that the more I fuss over the details, the less they will be in the way of others making sense of what is presented.

For me, these sorts of designs are a slow process. The last graphic took me most of a day to derive. Sometimes, graphs don't turn out in the way you think they will. Or they take more space. As hard as I try to be consistent with colors, labeling, and fonts, there is always something I miss. Sometimes, I build something only to find out that I don't need it. And, no doubt, the day after I share these, I will see two or three other ideas I wish I had incorporated. But I'm feeling pretty good at this point in the build.

Now, about those tenure data...guess I've procrastinated long enough...
