The Big Data Visualization Conundrum

24 Sep

Businesses have been analyzing and evaluating data for as long as we can remember, whether by combing through spreadsheets of financials or by using complex algorithms and applications. Today, the latter is more common, and data is a major part of daily business around the world. It can be the deciding factor in when to launch a new product, or the reasoning behind raising or lowering prices, among many other things. As businesses collect more and more data from more and more sources, often in real time, that data is becoming increasingly difficult to analyze, evaluate, and visualize quickly enough to support critical business decisions.

The buzzword these days in business intelligence and analytics is ‘Big Data’. Big Data refers to the amount of data being stored, the type of data being stored, and the speed at which it is obtained. Not only are people more interconnected now than ever, but so are their devices. It is important for certain businesses, such as Internet-based ones, to know what types of devices their customers are using, how often, and how they interact with those devices. On top of that, these businesses are collecting data from a number of different sources. Perhaps they push out a few blocks of content across different platforms, have several landing pages and ad sources, or retrieve data from social media outlets. This data is not only difficult to store, it is difficult to make sense of.

One main reason this data is difficult to visualize is its variety. Many businesses ask themselves, “How do we store this data? How will we analyze it?” Traditional relational databases, which are queried with SQL, have limitations in this regard, and often require a specialized skill set in order to visualize complex data, although it has been found that the increased need for analytics in general (not necessarily Big Data) has boosted the use of SQL. There are also NoSQL databases, which do not store data in a relational model and instead store and retrieve data using other models, such as documents or key-value pairs, that are better suited to the variety and speed at which that data is obtained.
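To make the contrast concrete, here is a minimal sketch in Python. The event records and field names are made up for illustration, and SQLite is used only as a convenient stand-in: the second table imitates a document store by keeping each record whole as JSON, which is not how any particular NoSQL product exposes its API.

```python
import json
import sqlite3

# Hypothetical events from three sources; the field names are illustrative only.
events = [
    {"source": "web", "page": "/pricing", "device": "mobile"},
    {"source": "social", "text": "Love the new release", "shares": 12},
    {"source": "sales", "customer_id": 42, "amount": 199.0},
]

conn = sqlite3.connect(":memory:")

# Relational style: a fixed set of columns, so only records with exactly
# these fields fit without changing the schema.
conn.execute("CREATE TABLE page_views (source TEXT, page TEXT, device TEXT)")
relational_rows = [e for e in events if set(e) == {"source", "page", "device"}]
conn.executemany(
    "INSERT INTO page_views VALUES (:source, :page, :device)", relational_rows
)

# Document style (imitated): each record is stored whole as JSON,
# whatever fields it happens to carry.
conn.execute("CREATE TABLE documents (body TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?)", [(json.dumps(e),) for e in events]
)

fixed_count = conn.execute("SELECT COUNT(*) FROM page_views").fetchone()[0]
doc_count = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
sources = sorted(
    json.loads(body)["source"]
    for (body,) in conn.execute("SELECT body FROM documents")
)
print(fixed_count, doc_count, sources)
```

Only one of the three records fits the fixed table, while all three can be stored as documents. The trade-off is that the structure then has to be interpreted when the data is read rather than when it is written.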

It is starting to sound as if Big Data may be more trouble than it’s worth. However, this couldn’t be further from the truth. The problem lies in the fact that many businesses just don’t have the proper tools, or don’t understand how to make sense of the complex data. This brings us to the three main quandaries of Big Data, or the three “V’s” as some call them: Variety, Velocity, and Visualization.

As we said earlier, there is a great deal of variety when it comes to Big Data. You’re now collecting data from many different sources, and it may come in a number of different forms. You might be trying to determine the emotional reaction of a customer based on where that person clicked on a page. Or you could be trying to determine whether your brand is getting a positive or negative reaction on social media. Taking it a step further, employees may store data in PDFs or other documents, the IT department stores log files and has automated data feeds for managing critical equipment, and the sales staff stores customer preferences and sales history. Marketing, sales, support, and IT all store data, and all of it is critical to decision makers in some way. The sheer volume of this data can be overwhelming, but make no mistake: it can make a huge difference in the success of a business.

On top of the pure volume and variety of data is the speed at which it is obtained. A Tweet can be re-tweeted hundreds of times in a matter of seconds. An article can be read by thousands in just a few minutes, and employees are storing more and more documents by the day. On one hand, it is difficult for companies to control the flow of data pouring in, as it often comes from outside sources. On the other hand, the fact that these companies can now receive this data in real time allows them to perform real-time analysis, something that has become a driving force for many companies around the world.

Finally, once you have found a place to store all the data that you’re constantly retrieving, you must find a way to present it so that people can understand it. In other words, you must visualize this data. This is perhaps the most difficult thing to accomplish. You could go with your basic instincts and comb through the data looking for trends, outliers, and the like. But a simple pie chart or line graph is no longer sufficient to fully understand what your data means. This is especially important because there are many different skill sets and needs throughout the organization. What makes sense to someone on the business intelligence side may not make sense to someone in sales, for instance. Finding a way to visualize this data in different ways, or in a way that is understandable by all, is the hardest part. Due to the granularity of Big Data, people must be able to delve in, find the information that is most important to them, and make sense of it.
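Part of that combing for outliers can be automated before anything is charted. The sketch below is one possible approach, not a prescribed method: it flags values that sit far from the median, using the median absolute deviation and a conventional cutoff of 3.5. The function name `find_outliers` and the daily page-view figures are hypothetical.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score measures distance from the median in units of the
    median absolute deviation (MAD), so one extreme value does not distort
    the baseline the way it would with a mean and standard deviation.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the values are identical; this method cannot rank them.
        return []
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical daily page views, with one unusual spike.
daily_views = [120, 130, 125, 128, 122, 127, 400]
spikes = find_outliers(daily_views)
print(spikes)
```

A check like this only narrows down where to look. Deciding whether the spike is a viral post, a tracking error, or a successful campaign still depends on presenting it in a form the relevant team can read.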

Luckily, there is an increasing number of options out there to store and analyze Big Data, as well as an increasing number of people who specialize in this field. In fact, it is Big Data itself that drives the creation of more robust and easy-to-use tools that can store, analyze, and visualize Big Data sets. Companies like IBM, Amazon, and Cloudera provide tools for analyzing Big Data, and there are a number of open source options such as Hadoop and MongoDB. In short, Big Data is getting bigger, in a good way. The more businesses collect and use Big Data, the more options there will be to analyze and visualize it. That’s great news for the business intelligence community.