I apologize for my prolonged absence. The last few months have been very busy for me, but things at the Medieval Academy are slowing down a bit as we enter the summer. I’ll get back to writing about North American manuscript collections next time. Today, I want to take a detour from my road trip and drive around the neighborhood of best practices and Digital Humanities.
Two online resources that I use often and steer my students toward have disappeared in recent weeks. The Harry Ransom Center Fragment Project, an online repository of images and associated metadata for hundreds of manuscript fragments found in early bindings at the University of Texas at Austin, has been taken down because the contract of the staffer who was producing what was turning out to be a fruitful project was not renewed. The other, a very important and otherwise unpublished resource for identifying the origin of Books of Hours, went dark following the death of its Principal Investigator, Erik Drigsdahl. Both projects are partially retrievable through the Internet Archive’s Wayback Machine, the second more effectively than the first, but this situation demonstrates why the best DH projects take sustainability and long-term digital archiving seriously.
In March of 2014, when I wrote about early manuscripts in Texas, I held up the Harry Ransom Center at the University of Texas at Austin as a model of digitization, metadata structure, and crowd-sourced cataloguing:
In addition to digitizing its codices, the Ransom Center has been engaged for some time in an innovative crowd-sourcing cataloguing project, posting images of fragments recovered from bindings in their collection on a Flickr Photostream and asking the hivemind to help identify and catalogue them.
This project is a spectacularly successful example of how social media and web networks can be used to tap into the collective expertise of scholars worldwide. The image collection not only demonstrates the myriad ways manuscripts were recycled as binding waste but is also yet another example of why it is always worthwhile to conduct a survey of the early bindings in your Special Collections library. The Ransom Center undertook just such a survey and found a treasure trove of hundreds of fragments – some nearly 1,000 years old – hiding in the stacks.
I’m sorry to report that while the University’s codices are still accessible online, the Fragment Project’s Flickr site has been taken down because Micah Erwin, the staffer who spearheaded the project, did not have his contract renewed (more about that here). Personnel issues aside, the fact that the images are no longer online is a great loss to scholarship, to students, and to the manuscript community. This collection of images and associated metadata was a great example of why surveys of early bindings are worthwhile, regardless of how you may feel about the effectiveness of crowd-sourced cataloguing. Micah had made several important discoveries, finding Carolingian fragments, bits of early music and liturgy, excerpts from important texts, and medieval documents, all hiding in HRC’s early bindings. He had presented the project at conferences and symposia around the country, and scholars were just beginning to make use of the images and Micah’s careful metadata. I am hopeful that at the very least the University will decide to host the images and metadata in some format and that the hundreds of images have been archived.
With the death of Erik Drigsdahl in March 2015, the field of manuscript studies lost a great scholar who had produced an important body of work. One of his greatest contributions to the field was his online Book of Hours tutorial, an introduction to working with and understanding Books of Hours that included an updated and expanded list of the famed (and flawed) Falconer Madan Tests for Localization (F. Madan, “The Localization of Manuscripts,” in H. W. Carless Davis, ed., Essays in History Presented to Reginald Lane Poole (Oxford: Clarendon Press, 1927), pp. 5-29). The tests use the incipits of the antiphon and chapter reading at Prime and None in the Hours of the Virgin to help determine the locale for which a particular Book of Hours was made. Falconer Madan published dozens of such combinations; Drigsdahl expanded Madan’s list to include hundreds. Much of his work was otherwise unpublished, and with the expiration of the domain registration (chd.dk), Drigsdahl’s site is now offline. Fortunately, the pages were last backed up to the Internet Archive’s Wayback Machine on April 6 and can be accessed here, and although this is essentially a snapshot, it does preserve Erik’s research. Peter Kidd reports that he will soon be meeting with Erik’s executor in hopes of reviving the site on a different server.
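For readers who have not used the Madan tests, the idea is essentially a lookup table: four incipits in, a liturgical use out. Here is a minimal sketch in Python. The table holds only two commonly cited combinations (Rome and Paris) as an illustration, the function names are my own invention, and every entry should be checked against Madan's or Drigsdahl's lists before being relied upon.

```python
from typing import Optional

# Illustrative table only, NOT Madan's or Drigsdahl's full list.
# Key: incipits of (Prime antiphon, Prime chapter, None antiphon, None chapter).
# Verify each entry against the published lists before relying on it.
USES = {
    ("assumpta est", "quae est ista", "pulchra es", "in plateis"): "Rome",
    ("benedicta tu", "felix namque", "sicut lilium", "per te dei"): "Paris",
}


def normalize(incipit: str) -> str:
    """Lowercase and collapse whitespace so small transcription differences still match."""
    return " ".join(incipit.lower().split())


def localize(prime_antiphon: str, prime_chapter: str,
             none_antiphon: str, none_chapter: str) -> Optional[str]:
    """Return the use matching the four incipits, or None if the combination is not in the table."""
    key = tuple(
        normalize(text)
        for text in (prime_antiphon, prime_chapter, none_antiphon, none_chapter)
    )
    return USES.get(key)


if __name__ == "__main__":
    print(localize("Assumpta est", "Quae est ista", "Pulchra es", "In plateis"))
```

A real implementation would also have to cope with medieval spelling variants (que for quae, for instance) and with the many combinations shared by more than one locale, which is part of why the tests are described as flawed.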
These are cautionary tales of which anyone involved in Digital Humanities should take note. In the first case, lack of institutional commitment and support led to the demise of a very worthwhile project. The second case raises the question: what happens to our digital footprint after we die? The Internet Archive, through the Wayback Machine, preserves snapshots of sites for posterity, and we are all learning to save our work to The Cloud, but these are, in the great scheme of things, short-term solutions. Every Digital Humanities project must include plans for sustainability, long-term viability, and archiving as part of initial planning and strategy.
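One small, practical step is to check whether a site you depend on has been captured at all. The Internet Archive offers a public availability lookup that returns the closest snapshot to a given date as JSON. The sketch below assumes the endpoint and response shape as I understand them from the Archive's documentation (an "archived_snapshots" object containing a "closest" entry); the function names are hypothetical, and it is meant as a starting point rather than a finished tool.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Wayback Machine availability endpoint (assumed from the Archive's public docs).
API = "https://archive.org/wayback/available"


def availability_url(site: str, timestamp: Optional[str] = None) -> str:
    """Build the lookup URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": site}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Extract the snapshot URL from a decoded response, or None if nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(site: str, timestamp: Optional[str] = None) -> Optional[str]:
    """Query the service over the network and return the closest snapshot URL, if any."""
    with urlopen(availability_url(site, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling lookup("chd.dk", "20150406"), for example, asks for the capture of Drigsdahl's domain nearest to April 6, 2015. A result of None means the service reported no capture, which is exactly the situation worth knowing about before a site disappears.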
A recent study by Valerie Johnson and David Thomas of digital projects funded by the New Opportunities Fund in the UK found that of the 155 projects granted support between 1998 and 2003 (at a cost of £55 million), “twenty-five can no longer be found, while there have been no changes or enhancements to a further eighty-three. Of the 155, there are only thirty which have been enhanced or added to since the launch. So in less than ten years, sixteen per cent of resources have been lost and fifty-three per cent have, at best, stagnated.” (Valerie Johnson and David Thomas, “Digital Information: ‘Let a Hundred Flowers Bloom…’ Is Digital a Cultural Revolution?,” in The Sage Handbook of Historical Theory, edited by Nancy Partner and Sarah Foot (London: Sage Publications, 2013), 466-467). These are alarming statistics that are almost certainly representative of the larger problem confronting Digital Humanities.
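The percentages in that passage are easy to check against the raw counts. The few lines below use only the figures quoted above; the variable names are mine.

```python
# Counts as quoted from Johnson and Thomas.
TOTAL = 155
LOST = 25        # "can no longer be found"
UNCHANGED = 83   # "no changes or enhancements"
ENHANCED = 30    # "enhanced or added to since the launch"


def share(count: int, whole: int = TOTAL) -> float:
    """Percentage of the whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)


# Projects not covered by any of the three quoted counts.
unaccounted = TOTAL - (LOST + UNCHANGED + ENHANCED)

if __name__ == "__main__":
    print(share(LOST), share(UNCHANGED), share(ENHANCED), unaccounted)
```

The shares come to roughly 16.1, 53.5, and 19.4 per cent, which agrees with the "sixteen" and "fifty-three" in the quotation. The three counts add up to 138, so the quoted passage does not say what became of the remaining 17 projects.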
I have no idea what the “permanent” long-term digital archiving solution is going to turn out to be. We already know it isn’t the Cloud or tape or CDs or floppy disks or punch cards. Data migration and storage upgrades are a constant necessity, as technologies change at a faster and faster pace. The Internet Archive, which is a good model for best practices, recommends full data migration and storage upgrades every ten years. In the meantime, I’m saving this blog to my external drive, storing files in the Cloud, saving to PDF, and printing hardcopy for storage in files and binders. In the long run, I’m not so much worried about the survival of the rare books and manuscripts that are the primary subject of my blog. They’re fairly sturdy and have managed to survive fire and flood and rodents and pillaging and censorship and ocean voyages and centuries of use. Even fragments and single leaves, broken and cut by dealers and collectors, are long-term survivors, as the HRC Fragment Project demonstrated. Printed books on paper can survive for hundreds of years, and manuscripts handwritten on parchment are built to last for a millennium. It’s the digital word we have to worry about.