By the Book: Catching Up with the XML Tag Set

This is not a call for reform. This is not an admission of guilt.

Moving deeper into Blake’s typographic works for the Archive has presented a number of new encoding questions, particularly about how to handle potentially “secondary” text on the page, such as printer’s marks, catchwords, page numbers, and titles.

The first kinds of questions we asked dealt with transcription display:

How would we handle different fonts/sizes/spacing?
Do we want to display “secondary text” with a specific color code?
How many different kinds of “secondary text” should we classify?

These are good questions to ask because a rich transcription can help users make sense of a manuscript. In the case of typographic works, it’s usually not a strain to read the text, but a rich transcription can help distinguish different kinds of texts at work on the typographic page.

We were, perhaps, getting ahead of ourselves.

Questions of transcription display are always implicitly questions about encoding. When we talk about a “color code” or text classification, what we’re really talking about, behind the transcription display, is XML markup (elements and their attributes) that makes editorial observations (or claims) about what a particular part of the manuscript is doing. OK, so what should we do about it?

Well, in the case of typographic features like catchwords, titles, page numbers, and printer’s marks, most of the answers were sitting and waiting for us in the existing XML tag set:

<texthead>
<textfoot>
<catchword>
<physnumber>

A few of these elements are self-explanatory. Element <physnumber> is used for tagging plate/object numbers that appear on the physical text. Most of these elements can also be specified further through attributes, such as <texthead type="title">.
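To make the link between encoding and display concrete, here is a minimal sketch in Python, not the Archive's actual tooling. Only the four element names and the type="title" attribute come from the tag set described above. The <page> wrapper, the <line> element, the sample text, and the "secondary-..." class names are all invented for illustration, and the real DTD may structure these elements differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. Only the four "secondary text" element names and
# type="title" come from the post; <page>, <line>, and all text are invented.
SAMPLE = """
<page>
  <texthead type="title">AN EXAMPLE TITLE</texthead>
  <physnumber>12</physnumber>
  <line>A line of primary text.</line>
  <catchword>The</catchword>
  <textfoot>An example printer's mark</textfoot>
</page>
"""

# The elements the post identifies as already existing in the tag set.
SECONDARY_TAGS = {"texthead", "textfoot", "catchword", "physnumber"}


def secondary_text(xml_source):
    """Return (tag, type attribute or None, text) for each secondary element,
    in document order."""
    root = ET.fromstring(xml_source)
    found = []
    for elem in root.iter():
        if elem.tag in SECONDARY_TAGS:
            text = (elem.text or "").strip()
            found.append((elem.tag, elem.get("type"), text))
    return found


def display_class(tag, type_attr):
    """Derive a display class name (an assumed naming scheme) from the
    encoding, so the display decision follows from the markup."""
    return f"secondary-{tag}" + (f"-{type_attr}" if type_attr else "")


if __name__ == "__main__":
    for tag, type_attr, text in secondary_text(SAMPLE):
        print(display_class(tag, type_attr), "|", text)
```

The point of the sketch is the direction of dependency: a color or style in the transcription display is looked up from the element and its attributes, so the classification question gets settled in the encoding first.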

After finding these elements, I had a sandbox session with the Archive’s DTD to see how they behaved in our XML editor. Here are a few screenshots:

When I presented these XML tags at our next meeting, most people were unaware of their existence. (Why would we have known? We’ve only just begun with typographic works.) It was a good learning experience, and it led to more productive, more specific versions of the questions we were asking about typographic features at the outset.

So if there’s a moral to the story, it’s to challenge group members of any research project to continually reacquaint themselves with existing standards. Especially with the content diversity in a project like the Blake Archive, specialization works well only for a very short time. New projects lead to new problems, but that doesn’t mean the whole wheel needs reinventing. Usually it means remembering where you left the toolbox.

—

by Eric Loy
