Why Big Data is so Difficult

Don E. Schultz
Marketing News
Key Takeaways
  • "Generally, we’re social scientists, not engineers. Perhaps with the exception of people who have been working in direct and database marketing, many of us were not trained to deal with massive data sets or sophisticated analytical tools."
  • "We’re also challenged, as marcom people, because Big Data never stops flowing in. It’s perplexing to think that you have your arms around a project, only to find more data coming in over the transom—and you can’t close the transom."
  • "Big Data is messy, it’s dirty and it’s sloppy. It doesn’t fit in the frameworks that we social scientists have developed and codified over time, so when some of it spills out, we’re generally at a loss as to what to do."

Having just returned from an academic conference, where much of the discussion revolved around “Big Data” (there were as many definitions of Big Data as there were participants, plus maybe a few extras), I began to realize why academics and practitioners alike have so much difficulty with the topic. It’s not just that it is new or even that it involves lots of numbers. It’s that Big Data simply confounds and confuses many marketing and communication people because it is so different. 

Generally, we’re social scientists, not engineers. Perhaps with the exception of people who have been working in direct and database marketing, many of us were not trained to deal with massive data sets or sophisticated analytical tools. That’s something that the IT people do for us. The primary tools that we’re accustomed to using have been focus groups, surveys, consumer panel data and the like. We get more than a few hundred responses and we think that we’ve unlocked the mysteries of the universe. Many of us have to learn a new set of tools to be able to move beyond Excel spreadsheets and crosstabs.   

We’re also challenged, as marcom people, because Big Data never stops flowing in. It’s perplexing to think that you have your arms around a project, only to find more data coming in over the transom—and you can’t close the transom. In most traditional marcom research projects, there’s a beginning and an end for data collection. Often, that’s driven by a planning cycle that’s broken down into financial quarters because that’s how budgets are allocated. We’ll do a bit of research in the third quarter because we’ve budgeted for it during that time period on the planning calendar. 

As advertising and PR people, and even as sales promotion and direct marketing folks, we’ve always worked with campaigns, activities with a “start” and “stop” date. We know how to plan and execute against that, but with Big Data, things just keep happening. The flood continues and the piles of data keep getting higher. Data never seems to stop unless the system crashes or all of the disks fill up. Moving from a campaign mentality to one of continuous activity in a dynamic system isn’t in most marketing communication people’s DNA.  

Another challenge is that Big Data is unstructured, which simply means that no one has sorted through the raw data and put it into nice, convenient categories that can be analyzed, and to which statistical tools can be applied. Big Data is messy, it’s dirty and it’s sloppy. It doesn’t fit in the frameworks that we social scientists have developed and codified over time, so when some of it spills out, we’re generally at a loss as to what to do. We’re continuously looking for significance—that is, something that either verifies our hypotheses or negates them. Many of us are “theory-bound”: Things are supposed to happen in a certain way and when they don’t, we’re at a loss. But then here comes more data, which either confounds us, or presents a new context for what we’re trying to manage and control.  

At the conference that I attended, one speaker raised the issue of “big and thick data,” or data that has several levels to it. It’s multidimensional, and most advertising and marketing communication researchers don’t know how to deal with it. Rather than having clear-cut categories for respondents, we end up with people who are users in more than one category and they switch back and forth, making them almost impossible to put into a single box in a PowerPoint presentation.  

There’s another problem with Big Data, too: It’s not just big and thick. It’s also long and broad. In other words, most of the real knowledge found in Big Data comes from longitudinal data, which is gathered and analyzed over time. That’s important because people are continuously changing, evolving and adapting. All of us go through life stages, but unless we look at people over time, it’s hard to understand when they started a certain life stage, where they’ve been and where they’re going. While we have some research tools such as time series analysis, for the most part, advertising and marcom researchers focus primarily on “here and now” snapshots. The world that we inhabit, though, is mostly a moving picture: how people behave over time and in different circumstances. That type of Big Data is messy, confusing and complex, and it is not suited to presentations in which we summarize the entire world in what amounts to a Twitter post.

The biggest challenge in dealing with Big Data, in my view, is that it’s all networked: There are very few clear-cut relationships where we can say that A leads to B and B leads to C, and so on. Yet most of our statistical tools are linear. Everything is assumed to follow a “normal curve” with a “normal distribution.” We have few ways of dealing with nonlinear systems, and even fewer tools to help in the analysis of Big Data’s networked and nested relationships.  

Big Data is thick and broad and long, and also bumpy and continuously changing. It requires a new set of tools and, perhaps, even a new set of researchers to make use of them. Trying to use today’s tools to understand tomorrow’s problems just isn’t going to work. We need to re-educate ourselves in “real time” or risk becoming obsolete. 


This article was originally published in the May 2014 issue of Marketing News.


Author Bio:

Don E. Schultz
Don E. Schultz is a professor (emeritus-in-service) of integrated marketing communications at Northwestern University in Evanston, Ill. schultz@northwestern.edu