Computerized Content Analysis of Online Data


Opportunities For Marketing Scholars And Practitioners, Special issue of European Journal of Marketing; Deadline 31 Jan 2019


At the end of 2016 it was estimated that 90% of the world’s data had been created in the last two years alone, at a rate of some 2.5 quintillion bytes of data a day (IBM Marketing Cloud, 2016). Moreover, the data growth rate is accelerating rapidly. A number of factors account for this, including the global proliferation of mobile devices, the advent of the Internet of Things (which means that technologies and not just humans are producing data), and of course, the rise of social media, which has made everyone a broadcaster. The bulk of data is also no longer numerical, and it is seldom neatly organized. Most of the data being produced today is in the form of unstructured text, and increasingly, video, audio and graphics. Making sense of it is no longer as simple as taking refuge in a database, a spreadsheet or a statistics package.

Consumers now produce data in many formats, including blogs, online interviews, text messages, emails, online reviews of products and services, and social media posts. These postings range in size from a 140-character tweet to a multi-page review of a restaurant on a travel website such as TripAdvisor. They vary in complexity from an inane “Love your hat LOL” in response to a friend’s picture on Facebook, to a multifaceted review of their hip replacement surgery by a patient on a medical website such as Medicinenet. Individually, most of these messages enlighten marketers very little. When aggregated, however, they can reveal patterns and provide insights that can inform researchers and managers significantly.

Content analysis, the common term used for a range of techniques for gathering and analyzing the content of a piece of text or document, has been used as a data exploration tool by social scientists for a long time. This content may include words, meanings, pictures, symbols, ideas, themes, or any message that can be communicated (Neuman, 2003). It also represents a range of methods for codifying the contents of a document into various themes or categories, depending on the criteria selected by the researcher (Weber, 1988). A plethora of research on, and using, content analysis in the marketing field has, for example, focused on searching for meaning in magazines (see Tse, Belk and Zhou, 1989; Gross and Sheth, 1989; Kolbe and Burnett, 1991; Kolbe and Albanese, 1996); television advertisements (Resnik and Stern, 1977; Dowling, 1980); and best-selling books (Harvey, 1953; Mullins and Kopelman, 1984). Some scholars (e.g. Bryman and Bell, 2003; Berelson, 1952) argue that content analysis is quantitative, while others (e.g. Miles and Huberman, 1994; Boyle, 1994; Tesch, 1990) make qualitative claims. A third camp sits on the fence, arguing that content analysis is dynamic in nature and that it can be both qualitative and quantitative (Marshall and Rossman, 1999; Cooper and Schindler, 2003; Krippendorff, 2004). Perhaps more significantly, until the mid-1990s, most content analysis in marketing research was conducted manually; for example, in her seminal work on male-female partner-seeking exchanges, Hirschman (1987) employed human coders to code hundreds of newspaper column “personals” ads.

The sheer volume, variability, veracity and velocity with which online data are produced today, and as a result are available to marketing researchers, make manual content analysis futile at best and, in most cases, impossible to conduct. As Humphreys and Jen-Hui Wang (2017, p. i) point out, “Researchers, consumers, and marketers swim in a sea of language, and more and more of that language is recorded in the form of text.” Fortunately, the recent past has seen not only a significant rise in the amount of unstructured textual data available to researchers, but also a noteworthy increase in the number and sophistication of tools available to perform text content analysis using computers. These range from software that relies on supplied or user-created dictionaries, such as WordStat (e.g. Pitt et al., 2007) and DICTION (e.g. Short and Palmer, 2003; 2008), to packages that produce graphic output that users then need to interpret, such as Leximancer (e.g. Campbell et al., 2011). Perhaps the most exciting and promising tools for automated content analysis are in the domain of artificial intelligence. Documents and even whole websites can be analyzed using tools such as Google’s Brain and Microsoft’s Azure. Currently, among the most advanced platforms is IBM’s Watson, a tool that has begun to achieve prominence in the marketing and management literatures. Amongst other applications, Watson can identify an author’s Big Five personality traits by reading a piece of text, as well as their consumer preferences and values. Watson can also uncover the emotions (Cabanac, 2002; Ekman, 1992) and sentiment expressed (Turney, 2002) in a piece of text or on an entire website (e.g. Treen et al., 2018).

Humphreys and Jen-Hui Wang (2017) contend that while automated text analysis cannot be used to explore all marketing phenomena, it is a useful tool for examining patterns in text that researchers would not be able to uncover without access to computers and powerful software. The wide range of software tools available to marketing scholars and practitioners today enables them to uncover and work with a spectrum of psychological and sociological constructs produced by organizations and players in both consumer and business-to-business markets.

While work that uses automated text analysis to explore marketing issues has begun to appear sporadically in scholarly journals, there is a need for a concerted concentration of work that will alert the broader academic marketing community to the potential of automated text analysis. This special section of EJM seeks to publish excellent recent work on automated text analysis in marketing from both theoretical/conceptual and empirical perspectives.

Possible topics might include, but are not limited to, the following:

  1. Theories of text and content analysis in marketing
  2. Comparisons of various approaches to automated content analysis in marketing
  3. Studies of various phenomena, such as the emotions, sentiment, personalities, values and preferences of consumers and other stakeholders, as uncovered by automated text analysis
  4. Studies that link the textual characteristics of websites, such as sentiment and emotions, to aspects of their performance (hits, search prominence, sales, etc.)
  5. Longitudinal analysis of textual data to identify and interpret broader trends over time
  6. Measurement issues when conducting automated text analysis, including sampling, reliability and validity
  7. The ethical research issues involved in gathering what is perceived by many to be “free” data in the form of personal blogs, product and service reviews, and posts on social media
  8. Studies that link textual data to other kinds of data, for example: text generated by organizations (websites, blog posts and policy statements) linked to published performance data and managerial actions; or text generated by consumers on social media linked to other measures obtained in face-to-face and online interviews, or to other independent actions
  9. The possibilities of using automated analysis to explore non-textual, non-numeric data, such as graphics, sound and video

The Co-Editors

Leyland Pitt, MCom, MBA, PhD, PhD is the Dennis F. Culver EMBA Alumni Chair of Business and Professor of Marketing in the Beedie School of Business, Simon Fraser University, Vancouver, Canada. He is the author of over 300 papers in peer-reviewed journals, and his work has been published in journals including Journal of the Academy of Marketing Science, California Management Review, Sloan Management Review, Information Systems Research, European Journal of Marketing, Journal of Advertising, and MIS Quarterly (for which he also served as Associate Editor). Currently he is Associate Editor of the Journal of Advertising Research and Business Horizons, and editor of the Journal of Wine Research. The special issue on Social Media and Creative Consumers that he co-edited with Pierre Berthon for Business Horizons in 2011 is the single issue of the journal with the highest average number of citations per paper.

Jan Kietzmann, BCom, MEC, PhD is a professor of Innovation and Entrepreneurship (I&E) and Management Information Systems (MIS) in the Gustavson School of Business, University of Victoria, Canada. His research interests combine organizational and social perspectives related to new and emerging technologies. Jan’s current research projects include such phenomena as social media, crowdsourcing, user-generated content, 3D printing, gamification and sharing economies. Jan has published primarily in MIS, Marketing and general business journals, including California Management Review, European Journal of Information Systems, Journal of Strategic Information Systems, Management Information Systems Quarterly Executive, Journal of Knowledge Management, Journal of Advertising Research, and Organization & Environment. Jan is perhaps best known for his award-winning 2011 Business Horizons article “Social media? Get serious! Understanding the functional building blocks of social media”, which has now been cited more than 3500 times.

Contact details:

Leyland Pitt:
Jan Kietzmann:

Submission Details:

Prior to submission, please review the EJM submission guidelines at and submit online following the instructions provided. Please ensure you select this issue from the drop-down menu when you submit.

Closing Date for Submissions: January 31, 2019


“10 Key Marketing Trends for 2017 and Ideas for Exceeding Customer Expectations”, 2016. IBM Marketing Cloud, IBM Corporation: IBM Watson Marketing.

Berelson, B. 1952. Content Analysis in Communication Research. Glencoe, IL: Free Press.

Boyle, J. S. 1994. “Style of Ethnography”, in Critical Issues in Qualitative Research Methods, Morse J.M. Ed., 159-185. Thousand Oaks, CA: Sage Publications.

Bryman, A. and Bell, E. 2003. Business Research Methods, New York, NY: Oxford University Press.

Cabanac, M. 2002. What is emotion? Behavioural Processes, 60, 2, 69–83.


Campbell, C.L., Pitt, L.F., Parent, M., and Berthon, P.R. 2011. Understanding Consumer Conversations Around Ads in a Web 2.0 World, Journal of Advertising, 40, 1, 87-102.

Cooper, D. R. and Schindler, P. S. 2003. Business Research Methods, New York, NY: McGraw-Hill.

Dowling, G. R. 1980. Information Content in U. S. and Australian Television Advertising, Journal of Marketing, 44(4): 34-37.

Ekman, P. 1992. An argument for basic emotions, Cognition & Emotion, 6, 3–4, 169–200.

Gross, B. L. and Sheth, J. N. 1989. Time-Oriented Advertising: A Content Analysis of United States Magazine Advertising, 1890-1988, Journal of Marketing, 53, 3, 76-83.

Harvey, J. 1953. The Content Characteristics of Best-Selling Novels, The Public Opinion Quarterly, 17(1): 91-114.

Hirschman, E.C. 1987. People as Products: Analysis of a Complex Marketing Exchange. Journal of Marketing, 51(1): 98-108.

Humphreys, A. and Jen-Hui Wang, R. 2017. Automated Text Analysis for Consumer Research. Journal of Consumer Research, (in print).

Kolbe, R. H. and Albanese, P. J. 1996. Man to Man: A Content Analysis of Sole-Male Images in Male-Audience Magazines. Journal of Advertising, 25(4): 1-20.

Kolbe, R.H. and Burnett, M.S. 1991. Content-analysis research: An examination of applications with directives for improving research reliability and objectivity. Journal of Consumer Research, 18(2): 243-250.

Krippendorff, K. 2004. Content Analysis: An Introduction to its Methodology, 2nd Ed. Thousand Oaks, CA: Sage Publications.

Marshall, C. and Rossman, G. B. 1999. Designing Qualitative Research, 3rd Ed. Thousand Oaks, CA: Sage Publications.

Miles, M. B. and Huberman, M. A. 1994. Qualitative Data Analysis: A Source Book of New Methods, 2nd Ed. Thousand Oaks, CA: Sage Publications.

Mullins, L. S. and Kopelman, R. E. 1984. The Best Seller as an Indicator of Societal Narcissism: Is There a Trend? Public Opinion Quarterly, 48(4): 720-730.

Neuman, W. L. 2003. Social Research Methods: Qualitative and Quantitative Approaches, 5th Ed. Boston, MA: Allyn and Bacon.

Pitt, L.F., Opoku, R., Hultman, M., Abratt, R., and Spyropoulou, S. 2007. What I Say About Myself: Communication of Brand Personality by African Countries Through Their Tourism Websites, Tourism Management, 28, 3, 835-844.

Resnik, A. and Stern, B. L. 1977. An Analysis of Information Content in Television Advertising. Journal of Marketing, 41(1): 50-53.

Short, J. C., and Palmer, T. B. 2003. Organizational performance referents: An empirical examination of their content and influences. Organizational Behavior and Human Decision Processes, 90(2): 209-224.

Short, J. C., and Palmer, T. B. 2008. The Application of DICTION to Content Analysis Research in Strategic Management. Organizational Research Methods, 11(4): 727-752.

Tesch, R. 1990. Qualitative Research: Analysis, Types and Software Tools. New York, NY: The Falmer Press.

Treen, E.R., Lord Ferguson, S.T., Pitt, C.S., and Vella, J. 2018. Exploring Emotions on Wine Websites: Finding Joy, Journal of Wine Research, 29, 1, 64-70.

Tse, D. K., Belk, R.W. and Zhou, N. 1989. Becoming a Consumer Society: A Longitudinal and Cross-Cultural Content Analysis of Print Ads from Hong Kong, the People’s Republic of China, and Taiwan. Journal of Consumer Research, 15(4): 457-472.

Turney, P. 2002. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 417–424). Philadelphia, PA.


Weber, R. P. 1988. Basic Content Analysis. University Paper Series on Quantitative Applications in the Social Sciences, Series 07-049. Beverly Hills, CA: Sage.