Match, compare, classify, annotate: computer vision tools for the modern humanist

Giles Bergel

This paper will relate the experience of the University of Oxford’s Visual Geometry Group in making images computationally addressable for humanities research.

Humanities data is distinctive, and humanities applications are therefore demanding use cases for computer vision. The Visual Geometry Group has built a number of systems for humanists, variously implementing (i) visual search, in which an image is made retrievable; (ii) comparison, which assists the discovery of similarity and difference; (iii) classification, which applies a descriptive vocabulary to images; and (iv) annotation, in which images are further described for both computational and offline analysis. Methods range from bag-of-visual-words retrieval systems to neural-network-based classifiers. Applications have included the detection of reused printing surfaces; the retrieval of similar shapes of sculptures and ceramics; the annotation of complex medieval diagrams; the classification of large corpora of illustrations; and the temporal sequencing of copies of artworks. The paper will describe and demonstrate these and other tools.

The paper will also outline the author’s experience of preparing humanities materials and research questions for digital analysis. Trained as a book historian working on cheap printed broadside ballads, he uses computer vision to address historical, iconographic and forensic research questions. These include the identification of unique printers’ woodblocks; the detection of stylistic similarities; and the classification of images by place of origin or content. The paper will reflect on how to adopt computational methods in this area, including working with such interested parties as librarians and collections managers; printers and book-conservators; and art historians and literary scholars. How to engender and manage a conversation among these parties is a particular interest, and a necessity for successful collaboration.

Lastly, as computer vision tools make their way out of the lab and into the digital humanities toolbox, the paper will offer some thoughts on how humanists can make these tools truly their own, as co-creators as well as users. Are humanities materials merely a means of benchmarking computer vision tools, or can humanists improve those tools? Do the tools replace or supplement traditional methods? What special knowledge or priorities can humanists bring to the table? Does doing visual research computationally fundamentally change the nature of this work?


J. Crowley, A. Zisserman, The Art of Detection, Workshop on Computer Vision for Art Analysis, ECCV, 2016

J. Crowley, O. M. Parkhi, A. Zisserman, Face Painting: querying art with photos, British Machine Vision Conference, 2015

S. Chung, R. Arandjelović, G. Bergel, A. Franklin, A. Zisserman, Re-presentations of Art Collections, Workshop on Computer Vision for Art Analysis, ECCV, 2014

J. Crowley, A. Zisserman, In Search of Art, Workshop on Computer Vision for Art Analysis, ECCV, 2014

R. Arandjelović, A. Zisserman, Name that Sculpture, ACM International Conference on Multimedia Retrieval, 2012

G. Bergel, A. Franklin, M. Heaney, R. Arandjelović, A. Zisserman, D. Funke, Content-based Image Recognition on Printed Broadside Ballads: The Bodleian Libraries’ ImageMatch Tool, Proceedings of the IFLA World Library and Information Congress, 2013