This project involves corpus development and text mining of the medieval Latin letters issued by the papal writing office of Rome between 858 and 882 CE. Three popes sat on the papal throne during this transformative quarter-century: Nicholas I, Hadrian II and John VIII. Together, they wielded a brand of papal authority and influence not seen since the foundational rule of Gregory I (590-604), and not to be matched again until the reform pontificate of Gregory VII (1073-1085). In their negotiations and struggles with the neighboring Byzantine and German empires, they established the authority of Rome within Europe and helped shape the very identity of Europe for centuries to come.
Nicholas I, Hadrian II, and John VIII recorded their thoughts, strategies, negotiations and vision in an extensive collection of 606 letters, available today on 840 printed and OCR’d pages at the digital site of the Monumenta Germaniae Historica (http://www.dmgh.de). Each of these letters is more or less accurately dated and addressed to a particular recipient in Europe or Byzantium. Each one issues an argument, injunction or complaint about a specific situation in a particular time and place. Most letters also cite earlier letters, papal decrees, or biblical passages to buttress their arguments. By early medieval standards, this is a treasure trove of intertextual activity in legal, theological and geographic realms.
Scholarly attention to this chapter of papal history has by and large focused either on a single pope’s letter collection, or on a specific episode from one of the three pontificates. By contrast, relatively little is known about the broad ideological contours of Nicholas I, Hadrian II and John VIII taken as a group. Studying the three popes together is an important and sorely needed intervention, for we know that the popes did not write their own letters; instead, they relied on a small group of letter writers, some of whom served under more than one pope. Therefore, to understand the origins and dynamics of their influential vision of European power and identity, we need to study their letters as a single corpus. A digital humanities approach holds great promise in this regard.
The proposed project will proceed through stages of corpus development, text mining, and data visualization. Each stage will require extensive collaboration with students and library staff.
In Stage 1 (corpus development), we will convert the existing corpus of printed papal letters into a collection of digital text files. To do this, we will split up the 840 pages of letters equally among the participants. These pages have already been OCR’d and checked by the MGH editors with high accuracy. So our work will involve two simpler steps: first, deleting the editorial apparatus (headers, footnotes, line numbers, etc.) that accompanies each OCR’d letter; and second, outfitting each letter with metadata tags recording date, issuing pope, recipient, subject, and location of recipient. I will make myself available over Skype and email for consultation during this process.
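The two Stage 1 steps could be sketched roughly as follows in Python. This is a minimal illustration under stated assumptions, not an existing project tool. It assumes that marginal line numbers survive OCR as bare numbers at the start or end of a line, and that metadata is stored as a plain-text header. The names `strip_line_numbers`, `LetterMetadata` and `tag_letter` are hypothetical, and the sample text and field values are placeholders rather than data from the corpus. Headers and footnotes would need their own rules once the real page layout has been inspected.

```python
import re
from dataclasses import dataclass, asdict

# Assumption: a marginal line number appears as a bare 1-3 digit number at
# the very start or very end of a line. Real pages may need other patterns,
# and any number that belongs to the letter itself must be checked by hand.
LINE_NUMBER = re.compile(r"^\s*\d{1,3}\s+|\s+\d{1,3}\s*$")


def strip_line_numbers(text: str) -> str:
    """Drop marginal line numbers; skip lines that hold only a number."""
    cleaned = []
    for line in text.splitlines():
        if line.strip().isdigit():
            continue  # the whole line is a stray number
        cleaned.append(LINE_NUMBER.sub("", line))
    return "\n".join(cleaned)


@dataclass
class LetterMetadata:
    """The five metadata fields named in Stage 1 (field names are hypothetical)."""
    date: str
    pope: str
    recipient: str
    subject: str
    recipient_location: str


def tag_letter(meta: LetterMetadata, body: str) -> str:
    """Prefix a cleaned letter with a simple 'key: value' header block."""
    header = "\n".join(f"{key}: {value}" for key, value in asdict(meta).items())
    return f"{header}\n\n{body}\n"


if __name__ == "__main__":
    # Invented sample lines, not a transcription from the edition.
    raw = "5 Nicolaus episcopus\nservus servorum Dei 10\n15\ndilecto filio"
    meta = LetterMetadata(
        date="PLACEHOLDER",
        pope="Nicholas I",
        recipient="PLACEHOLDER",
        subject="PLACEHOLDER",
        recipient_location="PLACEHOLDER",
    )
    print(tag_letter(meta, strip_line_numbers(raw)))
```

A plain-text header keeps each file readable without special software. A structured format such as XML or JSON would serve equally well if the collaborators prefer it.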
In Stage 2 (text mining), we will apply methods of digital text analysis and stylometry to trace and quantify the literary tastes, practices and strategies of the various authors behind the letters in the corpus. I would like to open this stage by brainstorming with the students and library staff the types of questions we can ask, and the computational steps needed to answer them. For example, text mining tools would help us identify favored and rare turns of phrase, formulas of address and closing, biblical citations and their uses, canon law citations and their uses, etc. Meanwhile, methods of stylometry would help us cluster the letters into authorial groups, and might even identify the authors themselves. Once we establish our questions, we will need to apply software to our corpus in order to trace these features and identify the clusters. I have a number of text analysis and stylometry scripts already prepared in the Wolfram Language (Mathematica) that I will make available to collaborators; much of their work will involve using these scripts to mine the corpus for questions of interest to us. I will encourage the students to work on programming skills by modifying these scripts or writing new ones, and will be happy to supervise their coding work.
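To make the two kinds of analysis concrete, here is a minimal Python sketch. It is not one of the Wolfram Language scripts mentioned above. It counts recurring word sequences, as a first pass at finding favored formulas, and compares letters by the relative frequency of a few common Latin function words. The function-word list, the use of cosine distance as a simple stand-in for more established stylometric measures, and all function names are assumptions made for illustration only.

```python
import re
from collections import Counter
from math import sqrt

# Assumption: a short, hand-picked list of Latin function words.
# A real study would choose and justify its own list.
FUNCTION_WORDS = ["et", "in", "non", "ut", "ad", "cum", "sed", "quod"]


def tokenize(text):
    """Lowercase a text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def ngram_counts(texts, n=3):
    """Count every sequence of n consecutive words across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


def profile(text, function_words=FUNCTION_WORDS):
    """Relative frequency of each function word within one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty texts
    return [counts[word] / total for word in function_words]


def cosine_distance(a, b):
    """1 minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


if __name__ == "__main__":
    # Invented sample strings, not quotations from the corpus.
    letters = [
        "salutem et apostolicam benedictionem et pacem in Domino",
        "salutem et apostolicam benedictionem non sine dolore",
    ]
    for phrase, count in ngram_counts(letters, n=3).most_common(3):
        print(" ".join(phrase), count)
    print(cosine_distance(profile(letters[0]), profile(letters[1])))
```

Pairwise distances of this kind can be passed to any standard clustering routine. Whether the resulting clusters correspond to individual letter writers is a question for the project to test, not something the sketch can settle.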
In Stage 3 (data visualization), we will consider the presentation and visualization of our results, e.g., dispersion plots of particular biblical citations across the corpus, matrices showing repetition and borrowing between letters, maps of letter recipients revealing the geographical perspectives of the popes, etc. I envision two publications emerging in this stage: a report on our findings in a scholarly journal, and a freely available database of geotagged papal letters at the Digital Atlas of Roman and Medieval Civilization (https://darmc.harvard.edu/data-availability).
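The data behind the first two kinds of figure could be prepared along the following lines. This Python sketch, again an illustration and not the project's own scripts, computes the positions at which a phrase occurs across letters kept in corpus order, which is the input to a dispersion plot, and a matrix counting the word sequences shared by each pair of letters. Treating shared three-word sequences as a rough proxy for borrowing is an assumption, and the function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def dispersion_offsets(letters, phrase):
    """Corpus-wide word offsets at which `phrase` begins.

    `letters` is a list of texts in corpus order (e.g. chronological);
    offsets count words from the start of the first letter.
    """
    target = tokenize(phrase)
    if not target:
        return []
    offsets, base = [], 0
    for text in letters:
        tokens = tokenize(text)
        for i in range(len(tokens) - len(target) + 1):
            if tokens[i:i + len(target)] == target:
                offsets.append(base + i)
        base += len(tokens)  # the next letter starts after this one
    return offsets


def shared_ngram_matrix(letters, n=3):
    """Matrix whose cell [i][j] counts distinct n-grams common to letters i and j."""
    ngram_sets = []
    for text in letters:
        tokens = tokenize(text)
        ngram_sets.append(
            {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        )
    return [[len(a & b) for b in ngram_sets] for a in ngram_sets]


if __name__ == "__main__":
    # Invented sample strings, not quotations from the corpus.
    letters = [
        "beati Petri apostoli auctoritate",
        "auctoritate beati Petri",
        "salutem et apostolicam benedictionem",
    ]
    print(dispersion_offsets(letters, "beati Petri"))
    for row in shared_ngram_matrix(letters, n=2):
        print(row)
```

The offsets can be drawn as tick marks along a horizontal axis, and the matrix as a heat map, with whichever plotting tool the collaborators prefer.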
I expect to complete Stage 1 (corpus development) and make significant progress on Stage 2 (text mining) between June and August 2018, and to complete the project in AY 2018-19. I expect no prior knowledge or experience from the students besides an open mind and an interest in medieval history, language and text mining. A basic working knowledge of Latin is preferred.