One of BAND’s long-term projects is what we’re calling the <choice> Tag Project. The ultimate goal of this project is to standardize our use of <choice> tags in textual transcriptions. Since three blog posts have already been written on this topic, I thought a good way to begin would be to read those posts and then look through recently transcribed BADs to get thorough exposure to all the different ways in which people are using <choice> tags.
Several things seem to be settled:
The purpose of <choice> tags is clear-cut: to make the Archive’s search function more user-friendly. We want users who search for any given word to be able to find every instance of that word in the Archive, including instances in which that word is abbreviated, misspelled, unconventionally spelled, or split across multiple lines or objects.
While it would be impossible to compile a comprehensive list of all the situations which would require a <choice> tag, there are several situations that seem to account for most of the <choice> tags in textual transcriptions. Here are some of the most common <choice> tag situations:
- Blake begins a word on one line or object and finishes it on the next.
- Blake ends a past-tense verb with “d” or “’d” instead of “ed.”
- Blake writes “thro” instead of “through.”
- Blake writes “anything,” “everything,” “anyone,” etc. as two words instead of one.
- Blake begins a word with the prefix “in” instead of “en.”
- Blake abbreviates a date.
- Blake abbreviates the name of a person (sometimes himself).
- Blake spells the name of a historical figure strangely.
- Blake uses an unconventional contraction.
- Blake places a hyphen in a word often seen without one.
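To make a couple of these situations concrete, here is a minimal sketch of what the tagging might look like and how a search could pick up both spellings. The transcription line is invented for illustration (it is not a reading from any Blake object), the <l> wrapper and the `search_forms` helper are assumptions, and nothing here is meant to describe how the Archive's search is actually implemented.

```python
import xml.etree.ElementTree as ET

# Invented line, for illustration only: one "d" past tense and one "thro".
LINE = (
    "<l>He "
    "<choice><orig>wanderd</orig><reg>wandered</reg></choice> "
    "<choice><orig>thro</orig><reg>through</reg></choice>"
    " the night</l>"
)

def search_forms(choice):
    """Return every spelling recorded inside one <choice> element."""
    return [child.text for child in choice if child.tag in ("orig", "reg", "corr")]

line = ET.fromstring(LINE)
# One list of spellings per <choice>, in document order.
forms = [search_forms(choice) for choice in line.iter("choice")]
print(forms)
```

Under these assumptions, a search for either "thro" or "through" would have something to match against in the same line.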
It is also well established that some situations call for more than one <reg></reg> per <choice> tag.
As Eric pointed out when he blogged on this topic, this is especially important to keep in mind when transcribing names of historical figures. For example, when adding in the standardized version of Albrecht Dürer’s name, it’s helpful for users if searching for the name either with or without the accent mark will take them to the right place.
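A small sketch of the Dürer case, assuming two <reg> elements inside a single <choice>. The <orig> spelling below is a placeholder rather than a checked reading, and the `matches` helper is hypothetical; it only shows why the second <reg> helps when the search compares strings exactly.

```python
import xml.etree.ElementTree as ET

# The <orig> text is a placeholder spelling, not taken from a specific object.
SNIPPET = (
    "<choice>"
    "<orig>Albert Durer</orig>"
    "<reg>Albrecht Dürer</reg>"
    "<reg>Albrecht Durer</reg>"
    "</choice>"
)

choice = ET.fromstring(SNIPPET)
regs = [reg.text for reg in choice.findall("reg")]

def matches(query, choice):
    """True if the query equals the original or any regularized form, ignoring case."""
    return query.casefold() in (child.text.casefold() for child in choice)

print(regs)
print(matches("albrecht durer", choice), matches("Albrecht Dürer", choice))
```

With only the accented <reg>, the unaccented query would fail an exact comparison like this one, which is the situation the extra <reg> is meant to cover.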
There are also some questions about how we use <choice> tags that, as far as I can tell, have not been conclusively answered.
- Anglicized spellings: In the earliest blog post on <choice> tags, Hardeep wrote that we don’t Americanize British spellings because if we did, we’d also have to Anglicize American spellings. This made a lot of sense, but would there actually be any American spellings in any of the works in the Archive? It also seems that some of the BADs transcribed since then have been providing American versions of British spellings.
- <reg></reg> versus <corr></corr>: In some BADs, I see the <orig></orig><reg></reg> format for every <choice> tag. In other BADs, some <choice> tags contain <corr></corr> instead of <reg></reg>. If we’re going to keep this distinction, what precisely will it be based on? I guess <corr></corr> could imply a misspelling instead of an outdated or less common way of spelling the word. But I can see this distinction getting blurry.
- The structure of split-word <choice> tags: Does <reg></reg> belong in both lines or just the first line? I’ve seen it done both ways.
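One way to think about the split-word question is to ask what a search would return under each structure. The sketch below is only a guess at the two layouts: the split word is invented, the <lg> and <l> wrappers are assumptions, and "just the first line" is modeled here as leaving the second half untagged, which may not match what transcribers have actually been doing.

```python
import xml.etree.ElementTree as ET

# Invented example: "mountain" split as "moun" / "tain" across two lines.
REG_IN_BOTH = """<lg>
<l>over the <choice><orig>moun</orig><reg>mountain</reg></choice></l>
<l><choice><orig>tain</orig><reg>mountain</reg></choice> they came</l>
</lg>"""

REG_IN_FIRST_ONLY = """<lg>
<l>over the <choice><orig>moun</orig><reg>mountain</reg></choice></l>
<l>tain they came</l>
</lg>"""

def lines_matching(xml_text, word):
    """Count the lines that contain a <reg> equal to the searched word."""
    root = ET.fromstring(xml_text)
    return sum(
        any(reg.text == word for reg in line.iter("reg"))
        for line in root.iter("l")
    )

both = lines_matching(REG_IN_BOTH, "mountain")
first_only = lines_matching(REG_IN_FIRST_ONLY, "mountain")
print(both, first_only)
```

In this toy setup the first layout reports the word on two lines and the second reports it on one, so the choice affects whether a single split word shows up as one search hit or two.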
When adding <choice> tags myself, I’ve learned not to be afraid to use Google when I need to figure out whether the spelling of a word I come across in a Blake object is the standard spelling. Taking a few extra moments to make sure we’re getting it right is another thing that should be standard practice with <choice> tags.