My recent projects as Editorial Assistant at the William Blake Archive have shared a mission: to ensure the consistency of the Archive’s text. My last project was to go through the Blake Archive Documents (BADs) and capitalize the C’s, P’s, and O’s in the words “Copy,” “Plate,” and “Object” (and their plurals) whenever they refer to specific copies, plates, or objects. My current project is to enter Bob Essick’s revisions to the lists of related works for each object, so that when the redesigned Archive is unveiled, it will have the most comprehensive, accurate, and consistent information possible.

For a body of work as vast as Blake’s, with its universe of versions, editions, copies, and drafts, one might think that consistency is a Sisyphean ideal, a task to be reset with every new editorial staff or version of the website. I’m reminded of the struggling librarians in the University Archives of Patrick Rothfuss’s Kingkiller Chronicle:

“Cataloging,” Wil said. “There have been many different systems over the years. Some masters prefer one, some prefer another.” He frowned. “Some create their own systems for organizing the books.”

I laughed. “You sound like they should be pilloried for it.”

“Perhaps,” Wil grumbled. “I would not weep over such a thing.”

Sim looked at him. “You can’t blame a master for trying to organize things in the best way possible.”

“I can,” Wilem said. “If the Archives were organized badly, it would be a uniform unpleasantness we could work with. But there have been so many different systems in the last fifty years. Books mislabeled. Titles mistranslated.”

He ran his hands through his hair, sounding suddenly weary. “And there are always new books coming in, needing to be cataloged. Always the lazy E’lir in Tombs who want us to fetch for them. It is like trying to dig a hole in the bottom of a river.”

As a digital project, however, the Blake Archive is not a river of texts sweeping inexorably forward and taking any hope of organization with it. Quite the opposite: by using GREP searching, or regular expressions, to find-and-replace bits of text in XML files, I can quickly implement changes across hundreds of documents. The hardest part is identifying the necessary patterns and developing formulae that will make only the desired changes in all the right places.
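To give a sense of what such a formula can look like, here is a minimal sketch in Python rather than in the Archive's actual GREP tooling. It assumes a simplified rule of my own invention: "copy," "plate," or "object" (or a plural) counts as a specific reference only when it is directly followed by a number or a single capital letter. The names SPECIFIC_REF and capitalize_specific_refs, and the sample sentence, are hypothetical.

```python
import re

# Hypothetical rule (an assumption, not the Archive's real pattern):
# the word is a specific reference when a number or a single capital
# letter follows it, as in "plate 4" or "copy B".
SPECIFIC_REF = re.compile(
    r"\b(copy|copies|plate|plates|object|objects)(?=\s+(?:\d+|[A-Z]\b))"
)


def capitalize_specific_refs(text: str) -> str:
    """Capitalize the matched word and leave everything else untouched."""
    return SPECIFIC_REF.sub(lambda m: m.group(1).capitalize(), text)


sample = "<p>See plate 4 of copy B; every other copy lacks this object.</p>"
print(capitalize_specific_refs(sample))
# <p>See Plate 4 of Copy B; every other copy lacks this object.</p>
```

The lookahead is what keeps the change narrow: generic uses such as "every other copy" are left alone, which is exactly the part that takes the most care to get right.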

When GREP searching isn’t possible because the edits are too substantial to be boiled down to formulae, I find myself solving a series of micro-problems with the help of my best digital friends: copy and paste. My current project, revising lists of works related to each of the Archive’s objects, would be a ridiculously long labor if I tried to edit one entry at a time, since some of our objects have hundreds of relationships. When faced with such data-mountains, I do what humans are best at: find and repeat patterns. For example, I’m asked to delete an entry for a relationship to Butlin 536 (the Butts set of Illustrations to Milton’s “Paradise Lost”) entitled Twelve Illustrations to Milton’s “Paradise Lost”: The Thomas Set. Rather than enter the next entries individually, I take a moment to recognize that the next eleven entries are an itemized listing of the “twelve illustrations” mentioned in the deleted entry, minus one. I copy the first into the XML file and add the appropriate HTML tags (italics and line breaks). I’m then able to copy-and-paste this finished entry ten more times. Since the only difference between them is the object number, the work goes quickly, although I must still be vigilant: often a single entry in a series will be housed at a different location, or created in a different medium. By combining my capacity for recognizing and deploying imperfect patterns with a little digital magic, a boulder of a project becomes manageable.
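The same copy-and-adjust routine could, in principle, be scripted. The following Python sketch is only an illustration: the relationship element, the field names, and all of the titles and collections are placeholders I have made up, not the Archive's actual XML schema or data. It generates one entry per object number from shared defaults and lets a single entry override its location or medium, which is the exception I otherwise have to watch for by eye.

```python
from xml.sax.saxutils import escape


def build_entries(title, object_numbers, defaults, exceptions=None):
    """Return one entry string per object number.

    `defaults` holds the fields shared by the whole series; `exceptions`
    maps an object number to the fields that differ for that one entry.
    The markup below is hypothetical, not the Archive's real schema.
    """
    exceptions = exceptions or {}
    entries = []
    for number in object_numbers:
        # Start from the shared values, then apply any per-object changes.
        fields = {**defaults, **exceptions.get(number, {})}
        entries.append(
            "<relationship><i>{title}</i>, object {number}<br/>"
            "{medium}, {location}</relationship>".format(
                title=escape(title),
                number=number,
                medium=escape(fields["medium"]),
                location=escape(fields["location"]),
            )
        )
    return entries


entries = build_entries(
    "Placeholder Series Title",
    range(2, 13),  # eleven entries: objects 2 through 12
    {"medium": "placeholder medium", "location": "Placeholder Collection"},
    exceptions={7: {"location": "Another Placeholder Collection"}},
)
print(len(entries))  # 11
print(entries[5])    # the one entry housed somewhere else
```

Even with a script like this, someone still has to notice which entry is the exception and say so; the pattern-spotting remains human work.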

These editing processes entail some puzzling, but their payoff is significant: not only do they keep the Archive’s text (that generated by both past and present staff) consistent, but they also keep the Archive accessible to its users. The ability to consistently style and organize data is one of the major advantages of the digitization of texts. When biological and digital skills are combined, consistency—even across generations—becomes possible.

Work Cited

Rothfuss, Patrick. The Wise Man’s Fear: The Kingkiller Chronicle: Day Two (p. 123). Penguin Group US, 2011. Kindle Edition.