Decoding the Civil War: Phase 2, Two Work Flows, Your Choice

After a year of hard work by our volunteers on Decoding the Civil War, Phase 1, we are about ready to launch Phase 2, the marking of metadata within specific telegrams. There are two work flows to this task. The first work flow, Code Words, is marking the arbitraries, or code words, for those messages in code. These coded telegrams will then be fed into Phase 3, the final decoding of the telegrams. Having the marked arbitraries should make the process of decoding much faster, possibly aided by computer algorithms.

Example of highlighted code words (arbitraries).

The second work flow, Metadata, is a little more ambitious and complex, as we are asking our volunteers to work with individual telegrams, identifying specific metadata such as the sender, recipient, date sent, time received, etc. We are asking for metadata for a total of 10 fields; most telegrams have only a few, and rarely do they have all 10. What we wish to provide is simple metadata that will enable researchers to find all the telegrams to, say, Secretary of War Edwin M. Stanton, no matter whether they are in Ledger 2, or 6, or 22.

But did Phase 1 not enable full-text searching? Yes it did, and it is wonderful, but the transcriptions are accurate to the text as written in the ledger. Keeping with Stanton, if you typed “Stanton” into the search box, you would get only those pages where the transcribed text literally reads “Stanton.” But what if the telegram begins or ends with “EMS” or “Stantin” or “the Secretary of War”? The full-text search would miss those pages. Furthermore, such a search looks at the whole message and returns results for any mention of “Stanton,” including other people named Stanton or places named Stanton. What if you want to look for Stanton only as the recipient? A search limited to the “recipient” metadata field would enable that and give you the correct results.
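The difference between the two kinds of search can be sketched in a few lines of Python. This is only an illustration: the sample telegrams, the field names (`text`, `sender`, `recipient`), and both helper functions are hypothetical and are not the project's actual data or search code.

```python
# Hypothetical sample records; the texts and field names are illustrative
# assumptions, not the project's actual transcriptions or schema.
TELEGRAMS = [
    {"id": 1, "text": "Hon E M Stanton Secy of War I have the honor to report",
     "sender": "Grant", "recipient": "Stanton"},
    {"id": 2, "text": "EMS Enemy moving on the left",
     "sender": "Meade", "recipient": "EMS"},
    {"id": 3, "text": "Genl Grant Stanton directs that you hold the line",
     "sender": "Halleck", "recipient": "Grant"},
]


def full_text_search(records, term):
    """Return ids of records whose whole transcribed text contains the term."""
    term = term.lower()
    return [r["id"] for r in records if term in r["text"].lower()]


def field_search(records, field, term):
    """Return ids of records where the term appears in one metadata field."""
    term = term.lower()
    return [r["id"] for r in records if term in r.get(field, "").lower()]


# Full text: finds telegram 3 (which only mentions Stanton) and misses 2.
print(full_text_search(TELEGRAMS, "Stanton"))           # [1, 3]
# Recipient field: drops telegram 3, because Stanton is not its recipient.
print(field_search(TELEGRAMS, "recipient", "Stanton"))  # [1]
```

Note that the field search still misses telegram 2, where the recipient is written as “EMS.” Restricting the search to a field removes the false hits, but the variant spellings need the standardization described next.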

Example of highlighted metadata fields.

To aid in that search we will take the metadata tagged by the volunteers in Phase 2 and standardize the terms. So, continuing with Stanton, if a name is written as “EMS” and is tagged as a sender or recipient, we will be able to take the consensus term and edit it to the standardized form of “Stanton, Edwin M. (Edwin McMasters), 1814-1869.” Once all the telegrams are tagged and the fields edited, a search for “Stanton, Edwin M. (Edwin McMasters), 1814-1869” in the “Recipient” field will return only those telegrams sent to Stanton, not from or about him, and it will return them whether they are addressed to him as “EMS,” “Stanton,” or “Stantin.”
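As a rough sketch of that two-step process, the Python below first picks a consensus term from several volunteers' tags and then maps it to the standardized form. Everything here is an assumption made for illustration: the simple majority vote, the lookup table, the telegram numbers, and the volunteer tags are hypothetical, and the project's real consensus and editing steps may work quite differently.

```python
from collections import Counter

STANTON = "Stanton, Edwin M. (Edwin McMasters), 1814-1869"

# Hypothetical lookup from variants seen in the ledgers to a standardized form.
# In practice this mapping would be built up by editors, not hard-coded.
STANDARD_FORMS = {
    "ems": STANTON,
    "stanton": STANTON,
    "stantin": STANTON,
}


def consensus(tags):
    """Pick the most common volunteer tag (simple majority; an assumption)."""
    counts = Counter(t.strip().lower() for t in tags if t.strip())
    return counts.most_common(1)[0][0] if counts else None


def standardize(term):
    """Map a consensus term to its standardized form, if one is known."""
    return STANDARD_FORMS.get(term, term)


# Hypothetical recipient tags from three volunteers per telegram.
volunteer_tags = {
    101: ["EMS", "EMS", "E M S"],
    102: ["Stantin", "Stantin", "Stanton"],
    103: ["Grant", "Grant", "Grant"],
}

recipients = {tid: standardize(consensus(tags))
              for tid, tags in volunteer_tags.items()}

# One search on the standardized form now finds both spellings.
print([tid for tid, name in recipients.items() if name == STANTON])  # [101, 102]
```

Terms without an entry in the lookup, such as “grant” here, pass through unchanged, so unknown names remain searchable until an editor standardizes them.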

The tagging of individual telegrams in the Phase 2 Metadata work flow will eventually enable specific searches across the almost 16,000 telegrams, letting users look for individuals, places, or dates in specific fields. The tagging of code words (arbitraries) in the Code Words work flow will help round out this project with the final decoding of encoded telegrams. An incredibly useful archive has been made available in Phase 1 of Decoding the Civil War. Help us leverage and categorize that hard-earned knowledge in Phase 2 to aid in the discovery of the American Civil War.


The Beta Test site for Phase 2 is here. The original site can be seen here.
