Melvin Wevers & Juliette Lonij
This paper discusses how computer vision can be combined with text mining and metadata to extract cultural-historical information from advertisements. The digitized newspaper collection of the National Library of the Netherlands offers a valuable set of advertisements for this purpose. Newspapers played a major role in the dissemination of advertisements. Roland Marchand argues that advertisements provide insight into the ideals and aspirations of past realities: they show the state of technology and the social functions of products, and they provide information on the society in which a product was sold (Marchand 1985). A large part of the meaning in advertisements was expressed in images. Extracting the advertisements from the corpus therefore makes it possible to analyze this visual information.
This paper reviews how we approached the extraction and analysis of advertisements. It also presents preliminary results showing how a combination of computational methods can be used to identify visual trends, and breakpoints in those trends, in newspaper advertisements. We treat these trends as a proxy for larger cultural-historical shifts in Dutch consumer society. Using text-mining techniques and metadata on the size and position of the advertisements, we clustered ads into different product groups. More specifically, for the text mining, we used TF-IDF to extract significant words from the ads, and entity linking to extract groups of products and their brands. For the metadata, we applied k-means clustering to features such as width, page number, and number of characters per square pixel.
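The two steps named above, TF-IDF weighting and k-means clustering on metadata, can be illustrated with a minimal sketch. It is not the pipeline used in this study: the field names (`width`, `height`, `page`, `chars`, `text`), the sample values, the smoothed IDF variant, the choice of k = 2, and the initialization from the first k points are all assumptions made for illustration. It uses only the Python standard library.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (TF times smoothed IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many ads does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights

def standardize(rows):
    """Rescale each column to zero mean and unit variance so no feature dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with a deterministic start (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical ads: pixel dimensions, page number, OCR character count, OCR text.
ads = [
    {"width": 300, "height": 200, "page": 2, "chars": 600, "text": "coffee coffee tea sale"},
    {"width": 1200, "height": 1600, "page": 12, "chars": 960, "text": "bicycle sale"},
    {"width": 320, "height": 220, "page": 3, "chars": 700, "text": "coffee sale"},
    {"width": 1150, "height": 1500, "page": 11, "chars": 900, "text": "car sale"},
]

weights = tf_idf([ad["text"] for ad in ads])
top_terms = [max(w, key=w.get) for w in weights]

# Features: width, page number, and characters per square pixel.
features = standardize([
    [ad["width"], ad["page"], ad["chars"] / (ad["width"] * ad["height"])]
    for ad in ads
])
labels = kmeans(features, k=2)
```

In this toy data, a word that appears in every ad ("sale") receives a lower weight than a word specific to one ad, and the small, text-dense ads on early pages end up in a different cluster from the large, sparse ads on later pages. In practice, a library implementation with a more robust initialization would be a more natural choice than this hand-written version.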
We then relied on TensorFlow with an ImageNet classifier, together with OpenCV, to extract objects and faces from the ads within these different categories (Fire and Schler 2015; Russakovsky et al. 2015). This paper presents preliminary results derived from a subset of the data and discusses how these results function as indicators of larger trends in the Dutch advertising landscape.
Fire, Michael, and Jonathan Schler. 2015. “Exploring Online Ad Images Using a Deep Convolutional Neural Network Approach.” arXiv Preprint arXiv:1509.00568. https://arxiv.org/abs/1509.00568.
Marchand, Roland. 1985. Advertising the American Dream: Making Way for Modernity, 1920-1940. Berkeley: University of California Press.
Russakovsky, Olga, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, et al. 2015. “ImageNet Large Scale Visual Recognition Challenge.” International Journal of Computer Vision 115 (3): 211–252.