Anyone who has taken a photography class (or even a presentation class) has heard the old adage, “A picture is worth a thousand words.” It’s an observation about how people process information: sometimes a picture can convey meaning faster than a thousand words, but only when the producer and the viewer speak the same “language.” In the Big Data industry, visualization is the process of taking a lot of data and changing its context. This is often done in the hope that something interesting will reveal itself, either unveiling an effect or pattern that was otherwise hidden or highlighting something that was otherwise ignored.
There’s little doubt that this holds true in enterprise big data.
The Dashboard and Analytics: Visualization for Customers
Vehicles and instruments use dashboards and readouts to deliver much-needed information at a glance. Anyone who drives a car relies on a number of single-datum gauges: speed, engine heat, oil pressure, etc. Big Data visualization for customers is very similar in its approach, except that it tries to contextualize the data with a meta-analysis. The customer uses a product that watches an engine and analyzes the information pouring through it in order to provide insights that would otherwise be lost in the moment.
Charts that compare two or more types of data across a timeline can help us understand whether two sources are related, or let us make inferences about behavior within the underlying data. Keeping the engine as a metaphor, a long-term graph that charts engine heat against oil pressure might reveal the tell-tale signs of imminent failure if pressure spikes correlate with heat increases. For a webmaster, an increase in page load time coupled with an increase in traffic might lead to the discovery that a particular image-heavy page is becoming popular, allowing them to streamline that page so that the server load doesn’t adversely affect other traffic.
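To make the idea of two "related" sources concrete, here is a minimal sketch of the kind of check that could sit behind such a chart: a Pearson correlation between two series sampled at the same moments. The function name `pearson` and the engine readings are hypothetical, invented purely for illustration, and a real product would likely use a more robust analysis than this.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical readings taken at the same six moments (made-up values).
engine_heat = [90, 92, 95, 99, 104, 110]
oil_pressure = [40, 41, 43, 46, 50, 55]

r = pearson(engine_heat, oil_pressure)
print(f"heat vs. pressure correlation: {r:.2f}")
```

A value near 1 suggests the two gauges rise together, a value near -1 suggests one falls as the other rises, and a value near 0 suggests no linear relationship. Correlation alone does not say which one drives the other, which is part of why the chart still needs a human reading it.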
A dashboard filled with data visualizations, acting as an instrument panel, has become a common way to give an at-a-glance view of the current health of a system across a multitude of products. We’ve seen this put to use in the newly minted Android expanded stats dashboard, which gives developers an idea of who uses their software and how it gets distributed. It’s inherent in analytics products, such as Twitter Analytics, that are used by marketing and social-media companies to understand how much impact their campaigns are generating. Data centers and infrastructures need to collapse a multitude of data streams and metrics into a visual format in order to understand trends and make use of them, hence the emergence of real-time cloud analysis, which relies heavily on the ability to visualize its insights.
The Infographic: Visualization for Consumers
Sometime during the “information age,” both marketers and journalists learned that statistics are extremely boring, and extremely difficult to convey in text without a lot of follow-up and follow-through. Most of the time when a statistic is used, it’s thrown out there with a percentage attached, or just left as a number; but very big and very small numbers don’t mean anything to readers without some sort of context or comparison. So charts and graphics that visualize the comparison began to emerge. For example, dollar signs of varying heights might be used to convey the difference in price between two laptops.
Today, infographics are loosely organized into a type of visual information that puts words, numbers, and different types of visualization together with pretty pictures in order to convey a lot of information all in one go. They often attempt to convey this information while also telling a story with the imagery or attempting to highlight one particular aspect of their revelation.
Infographics are for consumers: an ecosystem of visualization methods designed to deliver a particular point. They include graphics that give a sense of scale, wed with cute pictures to drive a narrative, and maps (heat maps, color maps, icon maps) that show how populations cluster or disperse around particular ideas or geographic regions. These are used by journalists and educators alike to quickly convey information about large-scale events that draw in a lot of data from multiple sources: the CDC uses heat maps to track pandemics; the Red Cross might use one to pinpoint casualty zones; political maps display the locality and degree of influence a particular idea or trend may have on a population.
Most infographics are static: they snapshot a particular variety of data in order to paint a picture or make a point more visible. Examples include a graphic designed to deliver statistics about the safety of customer data in the cloud, using not just numbers and graphics but also memorable koans and messages about data security; a graphic visualizing the impact of the public vs. the private cloud by comparing costs, industry use, and other information designed to highlight the difference; and a graphic (albeit a multipage one) that compresses a vast amount of longitudinal data about how cloud computing and storage have changed and will change over a 5-year period.
Then there’s the next stage of information retrieval and visualization. It is a step beyond the rolling tickers, spinning numbers, and graphs and charts that would simply make most magazine readers dizzy: an interactive design that updates as changes are fed to it, while also conveying a particular message. One great example of this happens to come from politics: voter statistics. Last year, Viralheat came up with a big data exemplar of how to use social media statistics, coupled with locality data and an analysis of social traffic, to predict voting results and gauge the political atmosphere of the United States during the midterm elections. The visualization launched on the Huffington Post and delivered some interesting results.
Infographics show that visualization is a lot more than just presentation—it’s about communication.
Something’s Gotta Code: Visualization for Developers
Often, putting the data and the production directly in the hands of the consumer or customer can be a very good thing. Google has done just that with an amazingly ambitious project that crunches public data: Google Public Data Explorer. With this product, Google absorbed datasets from numerous public venues, which boiled down into databases, spreadsheets, locality data, social data, statistics, and more, and put them into a format that interested parties could then assemble into graphs, charts, and a multitude of other visualizations (or just export for their own use). It also gives visitors the hands-on capability to upload and publish their own datasets and compare them using Google’s large collection of analysis products.
As I mentioned earlier, visualization can be extremely important to educators and journalists, and this is especially true for sociology and anthropology, my chosen profession. Having products that provide hands-on ease of use to students, educators, and journalists, and that allow them to pull insights about people from otherwise inaccessible mountains of data, goes a long way. Not to be left in the dust by Google, Microsoft also released some visualization software with Visual Fusion 5.0. Geolocation is just another form of data that can be collected along with other factors, and especially in the case of events and people, location can be extremely important.
Anyone who has seen a science fiction (or even contemporary) movie that involved computers has seen a giant amount of flashy special-effects visualization magic on computer screens. Certainly, real life doesn’t work that way (yet). But when it comes to presenting a thesis about the extent of devastation from a natural disaster, following the wake of a buying trend among homeowners, or portraying the pattern of disease clusters, what matters is a clear graphic that lets the people who need to know understand now, rather than having to listen to a lecture on exactly how the underlying analysis works.