Key concepts in data visualization

I’m a big proponent of Edward Tufte and his information design principles. This post covers a few key concepts from Tufte’s writings that I have found useful in developing my own visualization techniques and strategies.

Small multiples

Small multiples is a concept that uses a series of small plots, repeated in succession with the same axes and scale, to show how conditions change over time. A series of small plots can be very powerful for presenting time series views of large data sets in a very small space. This is probably the concept I use most, typically for visualizing transactional data sets consisting of millions of data points. We can usually squeeze tens of millions of transactions onto a single page, presenting a full day of transactions in a very limited space.
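
As a rough illustration of the data preparation behind a page of small multiples, here is a minimal sketch that splits a day of transaction times into hourly panels, each with the same number of bins. The function name panel_counts, the one-hour panel size, and the assumption that timestamps are seconds since midnight are all mine, chosen only for the example.

```python
from collections import Counter

def panel_counts(timestamps, panel_seconds=3600, bins_per_panel=12):
    """Group event times into fixed-size panels with equal-width bins.

    Assumption: timestamps are non-negative seconds since midnight.
    Returns {panel_index: [count per bin]} for every panel that has data.
    """
    bin_seconds = panel_seconds / bins_per_panel
    counts = Counter()
    for t in timestamps:
        panel = int(t // panel_seconds)
        # Clamp to guard against floating-point rounding at the panel edge.
        bin_index = min(int((t % panel_seconds) // bin_seconds),
                        bins_per_panel - 1)
        counts[(panel, bin_index)] += 1
    panels = sorted({p for p, _ in counts})
    # Counter returns 0 for bins that never saw an event.
    return {p: [counts[(p, b)] for b in range(bins_per_panel)] for p in panels}

hourly = panel_counts([0, 10, 299, 300, 3600, 7199])
```

Each list in the result would then be drawn as one small plot. Keeping the bin count and the y-axis identical across panels is what lets the eye compare one hour against the next.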

Data density

One challenge that I enjoy working on is making more efficient use of space when representing a data set. This involves looking at the data density, or the number of data points represented per square inch in a plot. The goal is to maximize the amount of information presented to the reader, producing a rich visualization that summarizes large amounts of data efficiently.
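
The density figure itself is simple arithmetic. A small sketch, assuming the plot area is measured in inches (the function name data_density is hypothetical, not from any library):

```python
def data_density(n_points, width_in, height_in):
    """Return data points per square inch for a plot of the given size."""
    if width_in <= 0 or height_in <= 0:
        raise ValueError("plot dimensions must be positive")
    return n_points / (width_in * height_in)

# Illustrative numbers only: 1,000 points on a 10 x 5 inch plot.
density = data_density(1000, 10, 5)
```

Comparing this number before and after a redesign is a quick way to check whether a change actually put more data in front of the reader.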

Sparklines

Sparklines are a concept created by Tufte: small, word-sized graphics that are embedded in text or used in small multiples, so that the data sits directly in your field of view while you read a document or scan a table. The classic use case is to embed the sparkline directly in the paragraph that describes the data. This lets the reader see the data without moving their eyes to a different part of the page, which is far more descriptive and efficient.
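
One lightweight way to get a sparkline into plain text is to map each value onto the eight Unicode block characters. This is only a sketch of the idea, not Tufte's own rendering; the sparkline function below is a hypothetical helper.

```python
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Render a sequence of numbers as a one-line string of block characters."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no shape to show; use the lowest block throughout.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    # Scale each value to an index between 0 and 7, rounding down.
    return "".join(BLOCKS[int((v - lo) * top / span)] for v in values)

print("Requests per minute " + sparkline([0, 1, 2, 3, 4, 5, 6, 7]) + " over the last 8 minutes")
```

Because the output is an ordinary string, it can sit inside a sentence, a log line, or a table cell.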

Negative space

Sometimes we can say a lot by saying nothing. In a data visualization, we can show a lack of activity simply by leaving white space. For OLTP-style workloads, for instance, this can be quite useful for illustrating server failures. A time series plot of transactional activity may show white gaps where no data was recorded. These gaps are a great indicator of a problem on the system, and large volumes of logs can be screened quickly by scanning time series plots for small white gaps.
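
To make the idea concrete, here is a minimal text-only sketch that draws a strip of activity and leaves empty bins blank, so an outage shows up as a gap. The function name activity_strip and its parameters are assumptions for the example; timestamps are taken to be plain numbers in seconds.

```python
import math

def activity_strip(timestamps, start, end, bin_seconds):
    """Return one character per time bin: a block if any event fell in it,
    a space if the bin is empty."""
    n_bins = math.ceil((end - start) / bin_seconds)
    active = [False] * n_bins
    for t in timestamps:
        if start <= t < end:  # ignore events outside the plotted window
            active[int((t - start) // bin_seconds)] = True
    return "".join("█" if a else " " for a in active)

# Events at seconds 0, 1, 2, 10 and 11, drawn in 2-second bins.
strip = activity_strip([0, 1, 2, 10, 11], start=0, end=12, bin_seconds=2)
print("|" + strip + "|")
```

The blank run in the middle of the strip is the negative space: nothing is drawn, and that absence is the signal.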
