March 9, 2014

An underlying theme in the ongoing discussion on Big Data is that there is simply too much data to make good use of it all. Still, getting or producing more data is often the right thing to do.

Information overload has been a major concern for years - ever since computers became common in the workplace, and especially since the internet arrived in the mid-90s. Research suggests that productivity suffers from this overload of (digital) information, as workers are overwhelmed by the amount of structured and unstructured data hitting them every single work day.

A lot of data is tucked away on servers or in databases without being used, and thus creates little or no business value. Data management policies are often concerned with keeping storage costs down or upholding legal obligations rather than turning information into a corporate asset.

So why amass even more data when it’s difficult enough to make use of what you’ve already got? In fact, there are many situations where getting more data will be helpful. A well-known example is data scientist Nate Silver accurately predicting the 2012 U.S. presidential election, while most traditional political pundits failed. With detailed, data-driven analyses, Silver was able to call the outcome of each state correctly and confidently predict the Obama victory.

The big internet players have long generated huge amounts of data on customer activities and sales transactions. Amazon started suggesting books you might like back in the 90s, based first on your previous purchases and then on other customers’ purchases. Now, Amazon wants to predict what you want to buy, and ship it to you before you place the order ("patent pending"…). Netflix’s movie recommendations are based on multi-categorization, user metrics, as well as likes and ratings. Google, of course, has stored more data than anyone, and uses it to predict what you’re about to search for, which ads to show you, how to rank search results, and much more. For Google, the "more data" approach is behind everything it does, from search algorithms to self-driving cars.

For a small business that didn’t start out with a net- or data-centric business model, it’s more difficult to take full advantage of the opportunities present today. The good news is that you don’t have to build the tools yourself. If you have an online business - a website or e-commerce solution - it’s easy to collect transactional data. With tools like Google Analytics you get a wide range of reports, and can customize them to your needs.

A potential problem for decision makers is conflicting data. If your sales data doesn't show any clear trends, you might be tempted to revert to gut feelings or anecdotal evidence to decide which customers or market segments to prioritize. Rather than acting on instinct, you should drill down further and generate new reports to get deeper insights. You might also want to set up additional data collection points to produce more data for analysis. This is normally easier and cheaper to do when your business is online, but it’s also doable at “real-life” points of sale. A shopping mall, for instance, may collect visitor data at each mall entrance in addition to sales data.
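As a rough illustration of what drilling down can mean in practice, here is a minimal Python sketch. The sales records, segment names and function names are all made up for the example; they are assumptions, not data or tooling from any real business. The sketch shows how a monthly total that looks flat can hide opposite trends in two customer segments.

```python
from collections import defaultdict

# Hypothetical sales records: (month, segment, revenue).
# All values are invented purely for illustration.
SALES = [
    ("2014-01", "retail", 100.0),
    ("2014-01", "wholesale", 200.0),
    ("2014-02", "retail", 150.0),
    ("2014-02", "wholesale", 150.0),
    ("2014-03", "retail", 200.0),
    ("2014-03", "wholesale", 100.0),
]


def total_by_month(records):
    """Top-level report: total revenue per month, all segments combined."""
    totals = defaultdict(float)
    for month, _segment, revenue in records:
        totals[month] += revenue
    return dict(totals)


def total_by_segment_and_month(records):
    """Drill-down report: revenue per month, split out by segment."""
    totals = defaultdict(lambda: defaultdict(float))
    for month, segment, revenue in records:
        totals[segment][month] += revenue
    return {segment: dict(months) for segment, months in totals.items()}


if __name__ == "__main__":
    print(total_by_month(SALES))
    print(total_by_segment_and_month(SALES))
```

In this invented data set the combined monthly totals are identical, while the per-segment view shows one segment growing and the other shrinking, which is the kind of insight a more detailed report can surface.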

Unstructured data in the form of texts and documents is often seen as the main culprit behind information overload. Organizations create large volumes of file-based information and store it on intranets, file servers or document management systems, never to be seen or used again.

Fortunately, there are ways to overcome this challenge too. For instance, employees should learn to “tag” and categorize documents properly before committing them to storage, so that the documents are easier to find later. Search engines can use metadata filters to provide more accurate and tailored results for various user scenarios. Thus, by creating more data - metadata - organizations will be able to get a bigger payback from their data repositories and document sets.
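To make the idea of metadata filters a little more concrete, here is a minimal Python sketch of a keyword search that is narrowed by tags and department. The Document fields, the sample documents and the search function are hypothetical and chosen only for illustration; a real search engine or document management system will have its own data model and API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A stored document plus the metadata an employee added when saving it."""
    title: str
    text: str
    tags: set = field(default_factory=set)  # free-form tags, e.g. {"pricing"}
    department: str = ""                     # owning department, if known


def search(documents, keyword, required_tags=(), department=None):
    """Return documents whose text contains the keyword and whose
    metadata matches every filter that was supplied."""
    keyword = keyword.lower()
    wanted_tags = set(required_tags)
    results = []
    for doc in documents:
        if keyword not in doc.text.lower():
            continue  # keyword does not occur in the text
        if not wanted_tags <= doc.tags:
            continue  # at least one required tag is missing
        if department is not None and doc.department != department:
            continue  # belongs to a different department
        results.append(doc)
    return results


# Hypothetical sample documents, invented for this example.
DOCS = [
    Document("Q1 price list", "Pricing for all products", {"pricing", "2014"}, "sales"),
    Document("Old price list", "Pricing for all products", {"pricing", "2012"}, "sales"),
    Document("Holiday schedule", "Office closed dates", {"hr"}, "admin"),
]

if __name__ == "__main__":
    # Keyword only: both price lists match.
    print([d.title for d in search(DOCS, "pricing")])
    # Keyword plus a tag filter: only the current price list matches.
    print([d.title for d in search(DOCS, "pricing", required_tags={"2014"})])
```

The keyword alone returns both the current and the outdated price list; adding a single tag filter narrows the result to the one a user most likely wants, which is the payback the extra metadata is meant to deliver.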

The emergence of online media and social media is yet another opportunity to add new data to decision-making processes. Competitive analysis, market research, and customer surveys may be enhanced and supported by data pulled from Twitter, Facebook, blogs, news feeds and search engines. Given the right tools, like TextOre.net, this additional data may be what you need to identify or explain market trends and make the appropriate decisions.

Therefore, in the age of Big Data, “more data” should be seen as a competitive advantage and not as a problem. When in doubt, getting more data is the right thing to do - at least if you have the mindset, tools and processes to handle it.

TextOre, Inc.