By Mark Vareschi and Mattie Burkert
As a necessary precondition of large-scale digital humanities projects, texts, archival materials, and historical individuals must become data, a process that involves choices about collection, curation, and preparation. While scholars of media and digital culture make clear the mediated and constructed nature of data, practitioners of “distant reading” and related methods have been less inclined to offer a transparent account of their materials. In this essay, we model a theoretically rigorous approach to a new dataset of our own creation: a set of 1,421 playbills from eighteenth-century London. Tracing how categories operate over time on playbills, we find that the inclusion of genre is a more powerful mode of categorization for eighteenth-century theatrical publics than the inclusion of a named author. The case study of the generically and authorially indeterminate dramatic adaptations of Oroonoko, and of the shifting categories used to advertise them, reveals that eighteenth-century theatrical publics had an idiom, previously unrecognized by scholars, for talking about generic ambiguity and even for using it to market performances. Oroonoko and other plays that similarly challenged conventional generic and authorial categorization were often advertised as “a Play,” a seemingly empty label that is revealed to carry significance when these playbills are subjected to quantitative analysis. Throughout, we attend to the transformation of our archival artifacts into data objects, insisting that the knowledge claims we can make based on these playbills are enabled rather than hampered by our awareness of the highly mediated nature of the dataset. As such, our study demonstrates how a more reflective approach to humanities data collection opens up new interpretive terrain, one that takes advantage of the opportunities available at scale while maintaining the humanities’ commitment to ambiguity, mediation, and situatedness.