Encoding Guidelines

Attention: These guidelines are for internal use only.

Version: Created 1999. Updated December 2004 by Natasha Smith and Elizabeth Wright. Special thanks to Marisa Ramírez.

Scope: These guidelines are written for graduate Research Assistants working on the digital initiative Documenting the American South. DocSouth Research Assistants should use these guidelines to work with texts encoded in SGML and prepared by our outsourcing company, Apex Data Services. Research Assistants will receive in-house training before using these guidelines. Please share your comments with Natasha Smith, Digitization Librarian.

Brief Overview

We use Standard Generalized Markup Language (SGML) to encode all original materials that we publish on the web. We follow the rules and guidelines for encoding in SGML provided by the Text Encoding Initiative (TEI). For more background, please see "A Gentle Introduction to SGML" by C. M. Sperberg-McQueen and Lou Burnard, available at http://www.tei-c.org/Papers/gentleguide.pdf.

We use SGML to divide the text into its constituent parts. SGML-encoded texts have a hierarchical structure that is based on an analysis of the text's content. The formatting of an original text should not drive the SGML encoding, but rather the meaning and structure of a text should determine the encoding.

SGML, like many markup languages, uses Elements and Attributes. An element, also referred to as a tag, surrounds a block of text and describes the function of that text. An attribute further defines an element. These guidelines explain which elements to use for common textual content and explain which attributes must be assigned to certain elements.
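
As a small illustration of the terms above, the fragment below shows an element that carries one attribute. It is only a sketch: the division type and the bracketed placeholder text are illustrative and are not taken from any particular DocSouth file.

```sgml
<div1 type="chapter">
<head>CHAPTER 1</head>
<p>[The text of a paragraph]</p>
</div1>
```

Here <div1>, <head>, and <p> are elements, and type="chapter" is an attribute that further defines the <div1> element.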

Opening the File

We use SoftQuad Author/Editor 3.5 to edit TEI SGML files. Open Author/Editor from Programs in the Start menu. Use File>Open to select a file to work on. Check the bottom right corner of the program window to make sure it says "Rules Checking: On." If rules checking is not on, go to "Special" and choose "Turn Rules Checking On."

Also, you should see the SGML tags surrounding the text in the window. If the tags do not show up, go to "View" and choose "Show Tags."

Author/Editor offers many keyboard shortcuts and other features that can make editing more efficient.

Checking SGML Structure

Our outsourcing company encodes all texts in SGML according to DocSouth specifications. When we receive the marked-up file, we check it carefully to verify it has been encoded correctly. The bulk of the work has already been done, but often the file needs further markup and some corrections.

Make sure you have the original book or a full copy of the original in front of you while you are encoding. The first step is to look through the book and analyze its structure. For encoding purposes, we interpret the book as a hierarchical structure. A book often has three large sections: Front, Body, and Back. A document must include a Body, but the Front and Back are not always present. Within these large sections, we nest smaller sections, like chapters and paragraphs.

The Front section includes all preliminary material before the real content begins. The Front may include the title page, table of contents, list of illustrations, dedication, introduction, or prologue.

Look carefully at the book and make sure you can tell where the Front section begins and ends.

Now, examine the end of the book. The Back section may include an index, epilogue, appendix, afterword, or other added material that appears after the main portion of the text.

The main content of the book makes up the Body section. In a novel, this would be all the chapters of the novel. In a report, this would include the main articles, letters, tables, and data that make up the report.

After you have analyzed the original book, check the encoded text. The <front> tag should surround all the material that belongs in the Front section. If you disagree with the way the text has been divided, add or subtract encoded text to the Front. All encoded text should be in the Front, Body, or Back sections. If you move something out of the Front section, you have to move it into the Body section. There are several ways to move text in and out of larger sections. For assistance see Lisa or Natasha.

Go to the end of the file and check that the Back section has been surrounded by the <back> tag correctly. The <body> tag should surround all the rest of the text between the Front and Back.
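
The overall arrangement described above can be sketched as follows. This is a bare outline with bracketed placeholders standing in for the real content; a given file may have no Front or Back section.

```sgml
<teiHeader> . . . </teiHeader>
<text>
<front>[preliminary material, if any]</front>
<body>[the main content of the book]</body>
<back>[index, appendix, or other end material, if any]</back>
</text>
```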

Checking Division Structure

The text within the Front, Body, and Back sections is divided into Divisions. Everything except the <titlePage> section should be surrounded by a <div1> tag. <div1> is then divided into smaller sections starting with <div2>. <div2> sections may include <div3>, <div3> sections may include <div4>, and so on, as needed. Smaller divisions are nested within larger divisions. Division tags should reflect the structure of the original document. Does the document have illustrations, a table of contents, a list of illustrations, or an index? Is the text composed of chapters, letters, poems, diary entries, or short stories? Are chapters divided into subchapters? Does a poem or letter appear within a chapter? Are poems or short stories grouped according to theme or author?

Front Section Divisions

Division tags should be assigned to correspond with the natural divisions of the text. In the front portion of the text, the title page and title page verso are surrounded by the <titlePage> tag. All other sections, such as the table of contents, list of illustrations, dedication, introduction, prologue, etc., should each be surrounded by a <div1> tag.

Please note that the front matter should also contain separate divisions (<div1>) for each image that will be included in the digitized version of the text. Examples of these images would include: cover, spine, frontispiece, title page, and title page verso. For more information see the section about figures.

Remember that the tag for the last division within the front section should be placed directly before the closing </front> tag. In this way, tags are nested within each other. For example:

Example: Common Front Divisions

<teiHeader> . . . </teiHeader>
<text>
<front>
<div1>
<p>
<figure>
<p>[Cover Image]</p>
</figure>
</p>
</div1>
<div1>
<p>
<figure>
<p>[Frontispiece Image]</p>
</figure>
</p>
</div1>
<div1>
<p>
<figure>
<p>[Title Page Image]</p>
</figure>
</p>
</div1>
<titlePage>[This tag surrounds the transcription of the title page]</titlePage>
<div1>[This division surrounds the entire Table of Contents]</div1>
<div1>[This division surrounds the entire List of Illustrations]</div1>
<div1>[This division surrounds the entire Introduction]</div1>
</front>
[The Body and Back sections go here]
</text>

Body Section Divisions

Divisions within the Body of the text are often more complicated than divisions within the Front and Back. To decide how to assign divisions to the text body, examine the first page of the text's main body (look for the opening <body> tag). Does the document's title appear on this page in addition to chapter, short story, section, or poem titles or designations? If so, the entire section within the <body> tag needs to be surrounded by a <div1> tag, and each chapter, poem, or other subdivision should be surrounded by <div2>. If not, each chapter, short story, poem, or main section will be surrounded by the <div1> tag, unless it is more logical to divide those sections in a different manner. For example, if a collection of poems features selections by several authors that are grouped according to author, each author's section of poems should be surrounded by <div1> and each individual poem by <div2>.

Example: Document title on first page of the <body>

<text>
<body>
<div1>
<head>Title of Document</head>
<div2>
<head>Chapter Title goes here</head>
<p>Text of Chapter 1 goes here</p>
</div2>
<div2>Chapter 2 goes here </div2>
<div2>Chapter 3 goes here </div2>
etc...
</div1>
</body>
</text>

Example: Document with no title on first page of the <body>.

<text>
<body>
<div1>Chapter 1 goes here</div1>
<div1>Chapter 2 goes here</div1>
[ etc...]
</body>
</text>

Example: Document with poems by one author and no title on first page.

<text>
<body>
<div1>[poem 1 goes here]</div1>
<div1>poem 2 goes here</div1>
[ etc...]
</body>
</text>

Example: Document with poems by several authors, grouped according to author, and title on first page.

<text>
<body>
<div1>
<head>document title</head>
<div2>
<head> poems by author 1</head>
<div3>poem 1 by author 1 goes here</div3>
<div3>poem 2 by author 1 goes here</div3>
</div2>
<div2>
<head>poems by author 2</head>
<div3>poem 1 by author 2 goes here</div3>
<div3>poem 2 by author 2 goes here</div3>
</div2>
</div1>
</body>
</text>

Example: Document with poems by several authors not grouped by author, no title on first page.

<text>
<body>
<div1>poem 1 by author 1 goes here </div1>
<div1>poem 1 by author 2 goes here </div1>
<div1>poem 1 by author 3 goes here </div1>
<div1>poem 2 by author 1 goes here </div1>
<div1>poem 2 by author 3 goes here </div1>
<div1>poem 3 by author 1 goes here </div1>
etc...
</body>
</text>

Note: If two or more poems by one author appear in succession at some point in the above example (even though this is not the general pattern in the document itself), it would be permissible to surround that section with <div1> and each of the poems in that section with <div2> (or <div2> and <div3> if a title appears on the first page).

Example: Diary entries in a text divided into chapters, no title on first page of <body>.

<text>
<body>
<div1>
<head>Chapter 1</head>
<div2>diary entry 1 goes here</div2>
<div2>diary entry 2 goes here</div2>
</div1>
<div1>
<head>Chapter 2</head>
<div2>diary entry 1 goes here</div2>
<div2>diary entry 2 goes here</div2>
<div2>diary entry 3 goes here</div2>
</div1>
etc...
</body>
</text>

Example: Diary entries in a text not divided into chapters, with no title on first page of <body>.

<text>
<body>
<div1>diary entry 1 goes here</div1>
<div1>diary entry 2 goes here</div1>
<div1>diary entry 3 goes here</div1>
etc...
</body>
</text>

Please Note: In many cases, a letter, poem, or other item may appear within a chapter, diary entry, or other division. If the item is a poem or quotation that appears before the beginning of the chapter text, it is tagged as an epigraph. However, if a letter, poem, quotation, etc., appears in the middle of a document's chapter, it is surrounded with the <q> tag. See the Quotation section for further information.

Please Note: See also the section on Figures for a description of how to encode illustrations and other graphic materials that appear within the text.

Back Section Divisions

Divisions for the back section (if one is present) follow much the same pattern as those for the front section. Each index, epilogue, or other section should be surrounded by <div1>.

Example: A typical Back section.

<back>
<div1>an epilogue goes here</div1>
<div1>index goes here</div1>
</back>

Assigning Attributes for Divisions

Each division must be assigned an appropriate "type=" attribute. Our outsourcing company usually assigns all "type=" attributes, but you should double-check the attributes for the divisions as you review the file. Sometimes you will create new divisions and you must remember to assign the "type=" attribute.

Begin by placing the cursor after the <div> tag. Press the F6 key, or select Markup from the Author/Editor toolbar and then Edit Attributes from the menu. On the screen that appears, type a description of the division contents beside the "type=" line. Typical descriptions, as used in the examples in these guidelines, include "introduction", "contents", "chapter", "poem", and "diary entry".

When the division contains an image, the description should be specific to that image. For example, the division created for the image of the cover should be called "cover image". The division created for the frontispiece should be called "frontispiece image", etc. The description should correspond to whatever materials are contained within the <div> tag. You will assign these descriptions for all division levels (e.g., <div1>, <div2>, <div3>, etc.). After typing in the description, click on the Apply button, or hit Enter.
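
To show how the descriptions look once they are assigned, the fragments below pair a few divisions with "type=" values that already appear in the examples in these guidelines. They are a sketch only: the fragments come from different parts of a file and are shown together for comparison, and the bracketed text is placeholder content.

```sgml
<div1 type="cover image">[This division surrounds the cover image]</div1>
<div1 type="contents">[This division surrounds the table of contents]</div1>
<div1 type="chapter">[This division surrounds a chapter]
<div2 type="diary entry">[This division surrounds a diary entry within the chapter]</div2>
</div1>
```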

Page Breaks

According to DocSouth convention, the page break tag <pb> precedes the text of the relevant page, independently of where this page number appears in the original text. Our outsourcing company inserts all page breaks and assigns <pb> attributes. Check several page breaks to ensure that the page numbers have been assigned correctly.

The <pb> tag should be assigned attributes for "id=" and "n=". To view <pb> attributes, place the cursor within the tag and press F6. The correct "id=" should be the letter "p" plus the page number. The "n=" should be just the page number. For the title page verso, the "id=" should be "pverso" and the "n=" should be "verso".

Example: Attributes for a <pb> tag at the top of Page 34.

<pb id="p34" n="34">

Example: For page iii in the introduction to a book.

<pb id="piii" n="iii">

Example: For the title page verso of a book.

<pb id="pverso" n="verso">

Please Note: In cases where a division begins at the top of a page, it is important that the <pb> tag is placed within the <div> tag, not before.

Example: A <pb> tag with attributes where a division begins on that page.

CORRECT: <div1 type="chapter"> <pb id="p260" n="260">
INCORRECT: <pb id="p260" n="260"> <div1 type="chapter">

Encoding Title Pages and Title Page Versos

One of the most common sections you will encode is the transcription of a title page and title page verso. The title page transcription should appear in the file after all the front matter images and any other transcribed pages that precede it. Title pages vary widely in style, appearance, and the amount of information they contain. Depending on the information available, make sure the appropriate tags surround the information. The following tags should be used for encoding title pages:

<titlePage> used to surround the title page and title page verso, if present. Do not put a <div> tag around the <titlePage> tag.

<docTitle> used to surround one or more <titlePart> tags.

<titlePart> used to surround the title of the document as it appears on the title page. A title page always includes a "main" <titlePart>, and sometimes also a "subtitle". Always define the "type=" attribute for <titlePart>. Use two <titlePart> tags in cases where the author clearly indicates that there is a main title and a lesser title. Look for punctuation or syntax that indicates a clear split between a main title and a description of the book. Semicolons, colons, and the phrase "or," are common indicators that a second title follows. Many DocSouth books have very long titles; please enclose the entire title as it appears on the title page within the <titlePart> tag. Consult with Lisa or Natasha as needed.

<byline> used to surround the word "by" or the phrase "written by" if this word or phrase appears on its own line in the title page. If the word "by" appears inline with the author's name, use <docAuthor>.

<docAuthor> used to surround just the name of the author or editor of the document; sometimes used with the <byline> tag.

<epigraph> used to surround any quote or verse (anonymous or attributed) that may appear on the title page or title page verso. (See also guidelines for encoding epigraphs.)

<docEdition> used to surround the document's edition statement found on the title page or title page verso.

<docImprint> used to surround the imprint statement (place and date of publication, name of the publisher).

<docDate> used to surround the date of the document.

<pubPlace> used to surround the location of the publisher.

<publisher> used to surround the name of the document's publisher.

Example: A brief title page.

<titlePage>
<docTitle>
<titlePart type="main"> MY OWN LIFE </titlePart>
<titlePart type="subtitle"> or, A DESERTED WIFE </titlePart>
</docTitle>
<byline> by </byline>
<docAuthor> Mrs. I. M. BEARD </docAuthor>
<docEdition> FIFTH EDITION </docEdition>
<docEdition> (Copyrighted) </docEdition>
</titlePage>

Example: A fuller title page without a title page verso.

<titlePage>
<docTitle>
<titlePart type="main">A VIRGINIAN VILLAGE
<lb>
AND OTHER PAPERS
<lb>
TOGETHER WITH SOME AUTOBIOGRAPHICAL NOTES </titlePart>
</docTitle>
<byline>by</byline>
<docAuthor>E. S. Nadal</docAuthor>
<docImprint>
<pubPlace>New York</pubPlace>
<publisher>THE MACMILLAN COMPANY </publisher>
<docDate>1917 </docDate>
</docImprint>
</titlePage>

Example: A fuller title page with a title page verso.

<titlePage>
<docTitle>
<titlePart type="main">A Confederate Girl's Diary </titlePart>
</docTitle>
<byline> by </byline>
<docAuthor>Sarah Morgan Dawson </docAuthor>
<docEdition>WITH AN INTRODUCTION BY WARRINGTON DAWSON AND WITH ILLUSTRATIONS</docEdition>
<docImprint>
<pubPlace>Boston and New York</pubPlace>
<publisher>Houghton and Mifflin Company</publisher>
<publisher>The Riverside Press Cambridge</publisher>
<docDate>1913</docDate>
</docImprint>
<pb id="pverso" n="verso">
<docImprint>
<docDate>Copyright, 1913, by Warrington Dawson
<lb>
All rights reserved
<lb>
Published September 1913 </docDate>
</docImprint>
</titlePage>

Line Breaks

The <lb> tag is used to place line breaks within a set of tags when the preservation of the appearance of the original is important. Line breaks are most often used on title pages to retain the appearance of the original. Do not use line breaks between tags.

Example: A title page with a long title using line breaks and two <titlePart> tags.

<titlePage>
<docTitle>
<titlePart type="main">Clotel;</titlePart>
<titlePart type="subtitle"> or,
<lb>
The President's Daughter:
<lb>
A Narrative of Slave Life
<lb>
in
<lb>
the United States.
<lb>
By
<lb>
William Wells Brown,
<lb>
A Fugitive Slave, Author of "Three Years in Europe."
<lb>
With a Sketch of the Author's Life. </titlePart>
</docTitle>
[The rest of the encoded title page (publisher, docdate, etc.)]
</titlePage>

Please note: In the example above, William Wells Brown's name is included as part of the title because a part of the title ("With a Sketch of the Author's Life.") follows his name. Consult with Lisa or Natasha as needed.

The Head Tag: Chapter Headings, Figure Headings, Section Headings, etc.

In addition to the main title, many documents feature chapter or section headings. These headings should be surrounded by the <head> tag. No attributes need to be assigned for headings.

Example: Front section with an introduction and table of contents both with headings.

<front>
<div1 type="introduction">[This division surrounds the entire introduction]
<head>INTRODUCTION</head>
<p> [The first paragraph of the introduction]</p>
</div1>
<div1 type="contents">[This division surrounds the entire table of contents]
<head>TABLE OF CONTENTS </head>
<list>
<item>Chapter 1: The Beginning . . . . . page 12</item>
<item>Chapter 24: The Golden Years . . . . . page 245</item>
</list>
</div1>
</front>

Example: First page of a document with the document title and chapter designation present.

<body>
<div1 type="text">[This division surrounds the entire body]
<head>Tales From the Life of a King </head>
<div2 type="chapter">[This division surrounds the entire first chapter]
<head>CHAPTER 1</head>
<p> [The first paragraph of the chapter.]</p>
</div2>
<div2 type="chapter"> [This division surrounds the entire second chapter]
<head>CHAPTER 2</head>
</div2>
etc...
</div1>
</body>

Example: First page of a document with only chapter title and designation present.

<body>
<div1 type="chapter">[This division surrounds the entire chapter]
<head>CHAPTER 1</head>
<head>The Beginning Years</head>
<p>[the text of the first paragraph is surrounded by the p tag.] </p>
</div1>
[remaining chapters...]
</body>

Example: Head tags in a book of poems where each poem is surrounded by <div1>.

<body>
<div1 type="poem">[This division surrounds the first poem]
<head>Title of first poem</head>
[the text of the poem follows this head]
</div1>
<div1 type="poem">[This division surrounds the second poem]
<head>Title of second poem </head>
[the text of the second poem follows this head]
</div1>
etc...
</body>

Example: Diary entry in a document divided into chapters with no document title on first page.

<body>
<div1 type="chapter">[This division surrounds chapter 1]
<head>CHAPTER 1</head>
<div2 type="diary entry">[This division surrounds the diary entry]
<head>November 1, 1864, Montgomery, Alabama </head>
[the text of the first diary entry of this chapter follows this head]
</div2>
[remaining entries in chapter will each be surrounded by a <div2>]
</div1>
</body>

Paragraphs

The <p> tag is the most common element in most books. Use <p> to surround each paragraph. No attributes need to be assigned for paragraph tags.

Please Note: When rules checking is on, you cannot surround a paragraph with the <p> tag unless the section in which the paragraph is located (chapter, diary entry, etc.) is first surrounded by a division tag.

Example: A chapter of text.

<div1 type="chapter">[This division surrounds the chapter]
<head> CHAPTER # </head>
<p> first paragraph of the chapter</p>
<p> second paragraph of the chapter</p>
<p> last paragraph of the chapter </p>
</div1>

Poems and Songs

Lines of poetry are not surrounded by the paragraph tag, but instead have their own group of tags. Verse (even if it is only one line) should be surrounded by the line group <lg> tag.

Each <lg> must be assigned an attribute for "type=". The values used in the examples below are "poem" and "stanza".

Once the section/stanza has been surrounded by the <lg> tag, each line of poetry needs to be surrounded by the line tag <l>, even if the line of poetry physically takes up more than one line in the original. The <l> surrounds a complete line of verse in the poem or song, and does not reflect the visual appearance of the verse on the page. No attributes need to be assigned for the <l> tag.

Example: A simple, one-stanza poem.

<lg type="poem"> [This tag surrounds a simple poem].
<head> title of poem </head>
<l> first line of poem</l>
<l> second line of poem</l>
<l> third line of poem</l>
<l> fourth line of poem</l>
</lg>

To accommodate many stanzas within a longer poem, you can nest line group elements for each stanza within the line group for the entire poem.

Example: A many-stanza poem.

<lg type="poem"> [This tag surrounds the whole poem.]
<head> title of poem </head>
<lg type="stanza"> [This tag surrounds the first stanza of the poem.]
<l> [first line of this stanza]</l>
<l> [second line of this stanza]</l>
<l> [third line of this stanza]</l>
<l> [fourth line of this stanza]</l>
</lg>
<lg type="stanza"> [This tag surrounds the second stanza of the poem.]
<l> [first line of this stanza]</l>
<l> [second line of this stanza]</l>
<l> [third line of this stanza]</l>
<l> [fourth line of this stanza]</l>
</lg>
</lg>

Please note: When one line or a few lines of verse are included in a paragraph of prose, surround the verse with the <q> tag, place an <lg> element inside the <q>, and surround each line with the <l> tag. For more explanation, see the section on quotations.
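
The nesting described in the note above might look like the sketch below. The bracketed text is placeholder content, and the "type=" value on <lg> is an assumption made for illustration; check the section on quotations for the values and any <q> attributes that DocSouth actually requires.

```sgml
<p>[the prose that comes before the verse]
<q>
<lg type="poem">
<l>[first quoted line of verse]</l>
<l>[second quoted line of verse]</l>
</lg>
</q>
[the prose that comes after the verse]</p>
```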

Please Note: Sometimes the poem is very long or the structure of the document makes it impossible to surround the entire poem in a line group. In these cases, assign <lg> to each stanza only. See Lisa or Natasha with any questions.

Dramatic Works

Occasionally, you may encounter texts that use the conventions of a printed drama. Examples include an entire work that is a drama, a book of collected works that contains one or more dramas, or a book that excerpts a drama or presents some of its dialog in dramatic style. In any of these cases, you will need to use a specific set of tags to mark up the textual elements of drama. Most dramas include acts and scenes, stage directions, speaker headings, and speeches. Speeches may be in verse or prose.

Use <div1>, etc., to divide a dramatic work into acts and scenes. Use the <head> tag to surround the text that announces the act and scene.

Use <stage> to surround any stage directions or descriptions. These elements usually appear in italics in the original text.

Use <sp> to surround an entire speech made by a character, including the speaker's name at the beginning. Use <speaker> to surround the name of the person who is speaking at the beginning of the speech. If the character's speech is in verse, surround each line of verse with <l>. If the speech is in prose, surround each paragraph with <p>.

Example: A drama in verse.

<sp>
<speaker>
<hi rend="italics">Pol.</hi>
</speaker>
<l> Thou wilt not fight with me didst say, Sir Count?</l>
<l>Shall I be baffled thus?—now this is well;</l>
<l>Didst say thou
<hi rend="italics">darest</hi>
not? Ha!</l>
</sp>
<stage>
<hi rend="italics">(while she speaks, a monk enters her apartment, and approaches unobserved.)</hi>
</stage>
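
For comparison, a speech in prose could be sketched as follows. This is an illustration only: the "type=" value on the division and all of the bracketed text are placeholders, not taken from a DocSouth file.

```sgml
<div1 type="act">
<head>[The text that announces the act and scene]</head>
<stage>[a stage direction]</stage>
<sp>
<speaker>[Name of the character]</speaker>
<p>[first paragraph of the speech]</p>
<p>[second paragraph of the speech]</p>
</sp>
</div1>
```

The only difference from the verse example is that each paragraph of the speech is surrounded by <p> rather than each line by <l>.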

Please Note: Oral history interviews also use the <sp> and <speaker> tags. See the section on Oral History Interviews below.

Figures (Illustrations, photographs, and other images from the original book or document.)

We provide images of all graphic materials that appear in the original. At the beginning of the text, this includes any front matter images, such as images of the cover, title page, title page verso, etc. Images from the original are scanned in-house and saved to the shared DocSouth drive (the i:\ drive). In the SGML file, you will encode a reference to the image file.

Each illustration is surrounded by the <figure> tag and assigned attributes. Each <figure> tag must be within a <p> tag.

Front matter images. All front matter images should be encoded directly after the teiHeader and before the transcriptions of any of the preliminary pages such as the dedication or title page. Each front matter image should be placed within its own <div1> tag (assign the appropriate "type=" attribute). Within the <div1> tag, insert a <p> tag. Nested within the <p> tag, insert the <figure> tag. Within the <figure> tag, insert the <p> tag for the caption.

Captions. You must include a caption for every image. Captions should be encoded within a <p> tag. Nest the <p> tag within the <figure> tag. If the original does not have a caption, please provide one in square brackets. You should capitalize the first letter of every significant word when you provide your own caption. Captions for front matter images follow the pattern used in the examples in these guidelines, such as "[Cover Image]", "[Frontispiece Image]", and "[Title Page Image]".

For small drawings or figures that appear at the end of a chapter or section, supply the caption "[Vignette]". For most other illustrations, supply the caption "[Illustration]".

Captions and Headings. All images in any book or other item should have a caption. If the original has no caption, please supply an appropriate caption in square brackets. The caption is encoded within a <p> tag inside the <figure> tag. Occasionally, an image will have a heading. A heading is a description or designation that comes above the image. Headings are also encoded within the <figure> tag using the <head> tag.

Example: A few front matter images.

<div1 type="cover image">
<p>
<figure id="cover" entity="twaincv">
<p>[Cover Image]</p>
</figure>
</p>
</div1>
<div1 type="frontispiece image">
<p>
<figure id="frontis" entity="twainfp">
<p>The author in 1875.<lb>[Frontispiece Image]</p>
</figure>
</p>
</div1>

Example: A figure with a heading.

<p>
<figure id="ill3" entity="georg56">
<head>Map of Spain</head>
</figure>
</p>

Images between chapters. Sometimes the illustrations for a book will appear in between chapters or other section breaks. For these images, the <figure> will be in its own division.

Example: A figure that appears between chapters, where each chapter is a <div2>.

<div2 type="chapter">
<head>Chapter 3</head>
<p> text of a paragraph</p>
<p> text of the last paragraph of chapter 3</p>
</div2>
<div2 type="illustration">
<p>
<figure id="ill12" entity="davis75">
<p> Benji could stand the guilt no longer.</p>
</figure>
</p>
</div2>
<div2 type="chapter">
<pb id="p100" n="100">
<head>Chapter 4</head>
<p>[The text of the first paragraph of this chapter]</p>
[the rest of the chapter]
</div2>

Images within chapters. If the image is placed in the middle of a paragraph of text, insert the <figure> tag within that paragraph. If the image appears between paragraphs, first insert a <p> tag, then insert the <figure> tag within the <p>.

Example: An image in the middle of a paragraph within a chapter.

<div1 type="chapter">
<head> Chapter 3 </head>
<p>[the first portion of text of a paragraph]
<figure id="ill2" entity="jones123">
<p> Confederate Belles of 1851.</p>
</figure>
[the remaining portion of paragraph] </p>
[the rest of the encoded chapter]
</div1>
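
The other case described above, an image that falls between two paragraphs, could be sketched like this. The id, the entity name, and the bracketed text are hypothetical placeholders.

```sgml
<div1 type="chapter">
<head> Chapter 3 </head>
<p>[the text of one paragraph]</p>
<p>
<figure id="ill3" entity="jones124">
<p>[Illustration]</p>
</figure>
</p>
<p>[the text of the next paragraph]</p>
[the rest of the encoded chapter]
</div1>
```

Here the <figure> sits inside a <p> of its own rather than inside either neighboring paragraph.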

Example: A supplied caption for a drawing at the end of a book.

<p>
<figure id="ill7" entity="smith273">
<p>[Vignette]</p>
</figure>
</p>

If the image has a title (something that appears above the image), it should be surrounded by a <head> tag.

Create entities for images. Figure tags must be assigned values for the id and entity attributes. First, create an entity for each image associated with the document. Go to Entities>Edit Text Entities. In the Edit Text Entity screen, select the line that says #DEFAULT. For the name, type the name of the file without the extension. Look at the images on the i:\ drive to learn their filenames. For the content, type the full name of the image file (e.g., smithcv.jpg). Click the New button to add that image to the entity list. Please Note: Do not use upper case in assigning entities.

Assign attributes for each figure tag. With your cursor inside the <figure> tag, press F6 to edit the attributes for that tag. Each figure tag must be assigned a unique id and the correct entity must be selected from the drop down menu. All front images have a special "id=" according to the type of image. The following should be used for front images: cover, spine, frontis, title, verso. For the back cover, please assign 'id="back"'. If there is more than one title page or more than one frontispiece, see Lisa or Natasha.

All other images should be assigned an "id=" beginning with "ill" and followed by a number that reflects the order of the images in the book. The first illustration of a text should be assigned 'id="ill1"'. For the 24th illustration of a text, the id would be "ill24".

To assign a value to the entity attribute, select the entity that you've created for the illustration from the drop down list provided.

Encoding Lists

If a text includes a list, use the <list> tag to surround the entire list. Surround each item in the list with the <item> tag. If the list

Example: A basic list.

<div1 type="employees">
<head>North Carolina State Employees for 1921</head>
<p> In 1921, the employees of the state were very active. [the rest of the introductory paragraph goes here]</p>
<list>
<head>State Employees</head>
<item>[person's name here]</item>
<item>[person's name here]</item>
<item>[person's name here]</item>
[etc....]
</list>
</div1>

Tables

Encoding tables is very similar to the process of encoding lists. A basic table is arranged with a title on top, a series of columns running down, and rows running across.

Apex sometimes confuses lists and tables, encoding one where the other is appropriate. Be alert for this type of mistake, and consult Lisa or Natasha with questions.

The table is divided into <row>(s) and the rows are divided into <cell>(s). The <table> tag must be inside a <p> tag. Make sure each row has a consistent number of cells. Even if a cell is empty, you must include an empty <cell></cell> in that row. For more complex tables, see Lisa or Natasha.

There are two attributes that must be assigned for tables. In most cases, Apex has correctly assigned them. Do spot-checks to make sure the attributes are okay. If they are not, consult with Lisa or Natasha. The <table> tag should be assigned attributes for "rows=" and "cols=", where "rows=" is followed by the number of rows in the table and "cols=" is followed by the number of columns in the table.

The <row> and <cell> elements must be assigned the attribute for "role=". The default value for these elements is "data". For a row across the top of the table that serves to label the data that comes below, the attribute will be 'role="label"'. For all the cells in a left-hand column that serves to label the information appearing to the right, assign each cell the attribute 'role="label"'.

Example: A simple table.

<div1 type="table">
<head>TAXATION.</head>
<p>
<table rows="13" cols="10">
<row role="label">
<cell role="data">States.</cell>
<cell role="data">Value of all Real Estate, includ'g Town Lots.</cell>
<cell role="data">Value of Town l'ts Alone</cell>
<cell role="data">Number of Negroes.</cell>
<cell role="data">Value of Negroes.</cell>
<cell role="data">Capit'l Invested in Trade, Mdze., Etc.</cell>
<cell role="data">Bank Capital.</cell>
<cell role="data">Railroad and Other Stocks.</cell>
<cell role="data">Money at Interest.</cell>
<cell role="data">Total.</cell>
</row>
<row role="data">
<cell role="label">Alabama,. . . . .</cell>
<cell role="data">143,765,708</cell>
<cell role="data">
<ref rend="sc" id="ref1" target="n1" n="1">§</ref>
</cell>
<cell role="data">435,473</cell>
<cell role="data">261,283,800</cell>
<cell role="data">41,362 517</cell>
<cell role="data">5,000,000</cell>
<cell role="data">20,975,639</cell>
<cell role="data">22,578,370</cell>
<cell role="data">494,966,034</cell>
</row>
<row role="data">
<cell role="label">Arkansas,. . . . .</cell>
<cell role="data">68 662 395</cell>
<cell role="data">5,227,689</cell>
<cell role="data">109,065</cell>
<cell role="data">65,439,000</cell>
<cell role="data">2,864,659</cell>
<cell role="data"> </cell>
<cell role="data">142,000</cell>
<cell role="data">1,334,6
<gap extent="1 character" reason="illegible">
1</cell>
<cell role="data">138,442 085</cell>
</row>
</table>
<note rend="sc" id="n1" place="foot" n="1">
<p>§ No means of separating the value of Town Lots from the aggregate value of “Real Estate, including Town Lots.”</p>
</note>
</p>
</div1>

There are many exceptions to the basic table. As you can see in the above example, footnotes must be placed at the end of the table and not inline. It is very important to make sure that all of the appropriate cells line up. This means accounting for empty cells and counting across. For a table, every row must have the same number of cells or the information can become distorted.

If a table runs on for multiple pages, you must break the table at the end of each page and restart it on the next page. If the column header or table title is repeated on the next page in the original, repeat it in the next page in the file. If the table title is not repeated, please provide the title in square brackets at the top of the next page. See Lisa or Natasha with any questions. You will need to change the attributes for the <table> tag if you break up tables as they were encoded by our outsourcing company.

Many tables have information that is not simply in a grid. They may have subdivided column headers, larger blank areas or columns that cease to have information, totals for certain columns but not others, etc. Each of these cases must be evaluated individually. In some cases it will be necessary to forgo the encoded table and settle for an image, but do not give up too easily. Please ask questions.

Quotations

Quotations that do not occur inline in the text and that are set off typographically in some way should be encoded within the <q> element.
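
Example: A quotation set off from the surrounding paragraphs. This is a sketch with placeholder text; it assumes the quoted passage is prose, so its paragraphs are surrounded by <p> tags inside the <q>.

<p>[The paragraph that introduces the quotation goes here]</p>
<q>
<p>[The text of the set-off quotation goes here]</p>
</q>
<p>[The paragraph that follows the quotation goes here]</p>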

Italics

Once a section of text has been surrounded by a division tag and either paragraph or line/line group tags, it may be necessary to assign additional tags to highlight special features within the surrounded text. Frequently the text of a document contains words or phrases that appear in italics. These italicized words and phrases need to be surrounded by the highlight tag, <hi>. For the <hi> tag you must assign the rend attribute as italics.

Example: Sentence with the word "she" in italics.

<p>The man on the corner thought that
<hi rend="italics">she</hi>
was the one who had taken his newspaper...</p>

Bold

When boldface type occurs within the document title or chapter titles, the <head> tag is sufficient. However, words occasionally appear in bold type within the text of paragraphs, tables of contents, lists and/or poetry. In these cases, surround the word(s) with the <hi> tag and assign "bold" as the "rend=" attribute.

Example: Sentence with words "cats" and "birds" in bold type.

<p> The girl remembered that her list of spelling words included
<hi rend="bold">cats</hi>
and also
<hi rend="bold">birds</hi>.
Remaining sentences in paragraph...</p>

Milestones

It is common to see a string of asterisks or periods in the middle of poems, chapters, and other formal sections of a text. These informal divisions are called milestones, and are encoded using the <milestone> tag. The <milestone> tag must be assigned attributes for "unit=" and "n=". Always assign the attribute unit="typography". For "n=", type in the way the milestone looks in the original, using the exact number of periods or stars with or without spaces according to the way it appears in the original. Like <pb> and <lb>, the milestone tag should be empty—no characters can appear between the opening and closing tags of the milestone element.
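
Example: A row of five asterisks, separated by spaces, between two paragraphs. This is a sketch with placeholder text. The tag is written here without a closing tag, in the same way that the empty <gap> tag appears in the table example above.

<p>[The paragraph before the row of asterisks goes here]</p>
<milestone unit="typography" n="* * * * *">
<p>[The paragraph after the row of asterisks goes here]</p>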

Foreign words and phrases

Like words in bold type and italics, foreign words and phrases need to be assigned special tags. You must use your own judgment about whether words or phrases should be encoded as foreign words. Be judicious: not every common Latin or French phrase needs to be marked up. The purpose of the <foreign> tag is to acknowledge foreign words and phrases that may not be known to the general public. Different texts will require different uses of the foreign tag. Your goal is to alert the reader to the use of foreign languages without overusing the tag to the point that it loses its significance. If you begin working with a text that contains a lot of foreign words, consult with Lisa or Natasha about what to mark up and what to leave unmarked.

You must assign a Lang IDREF for each <foreign> tag. In the attributes dialog box, fill in a three letter code based on which language is used. Consult the Appendix for a list of language codes that includes the ISO 639 three-letter code and USMARC Code List for Languages (Library of Congress 2003) also available at http://www.loc.gov/marc/languages/langhome.html. Frequently used foreign language codes include:

Example: A sentence with the words "couteau de chasse"

<p> The negroes, meanwhile, still roped in pairs, had returned in safety to the wagon, and had been set free from their villein bonds by the ready
<foreign lang="fre">
<hi rend="italics">couteau de chasse</hi>
</foreign>
of Lieutenant Frampton. [Remaining sentences in paragraph...]</p>

Letters

There are two types of letters you can encounter when encoding: (1) letters that are quoted within a chapter or other division and (2) letters that are presented as their own divisions or sections. If the letter is quoted within a chapter or other division, you will need to encode it as a quotation.

The only attribute that should be assigned for letters is the designation "letter" next to type for the <q> and <div> tags.

Example: A letter quoted within a chapter.

<div1 type="chapter">
<head>Chapter 2</head>
<p>It was sometime that night that a letter was slipped under Mary's door, it read:
<q type="letter">
<text>
<body>
<div1 type="letter">
<opener>
<salute>Dear Mary,</salute>
</opener>
<p>[The first paragraph of the letter goes here]</p>
</div1>
</body>
</text>
</q>
</p>
[the remainder of the chapter]
</div1>

Please note that this is different from when you are dealing with a collection of letters, or a chapter that includes only letters. For these situations, you will not need to encode the letter as a quotation. Each letter must be its own division within the chapter or division.
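
Example: A chapter that consists only of letters. This is a sketch with placeholder text; it assumes the letters sit one level below the chapter, so they are encoded as <div2> inside the <div1>. No <q> tag is used.

<div1 type="chapter">
<head>Chapter 3</head>
<div2 type="letter">
<opener>
<salute>Dear Mary,</salute>
</opener>
<p>[The text of the first letter goes here]</p>
</div2>
<div2 type="letter">
<opener>
<salute>Dear Edward,</salute>
</opener>
<p>[The text of the second letter goes here]</p>
</div2>
</div1>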

Once a letter has been surrounded by a division (either as a div1 within a quotation or as a separate div within a chapter), you will use the following tags to encode the letter:

<opener> surrounds the opening elements of a letter including the date and/or place the letter was written and a salutation or greeting that appears at the beginning of the letter.

<closer> surrounds the closing elements of a letter including any date, location, salutation, or signature that appears at the close of a letter.

<dateline> surrounds the date, place, and/or time in which the letter was written (can appear at the beginning or end of a letter).

<date> surrounds the date of the letter. This element must be surrounded by the <dateline> tag.

<name> surrounds a place name within the dateline. If the dateline includes the place from which the letter writer is writing, or the location of the addressee, surround the place name with the <name> tag. Assign the <name> tag 'type="place"'.

<salute> surrounds any salutation or greeting that opens or closes a letter and is not part of a paragraph within the letter.

<p> surrounds each paragraph (even if only a word or sentence long) that occurs within the letter.

<signed> surrounds the signature of the one who has written the letter.

Example: A simple letter included within a division.

<div1 type="chapter">
<head>Chapter Two</head>
<p> [This tag surrounds a paragraph in the chapter]</p>
<q type="letter">
<text>
<body>
<div1 type="letter">
<opener>
<dateline>
<date>March 23, 1864</date>
<name type="place">Boston, Mass.</name>
</dateline>
<salute>Dear Mary, </salute>
</opener>
<p> first paragraph of the text of the letter </p>
<p> last paragraph of the letter </p>
<closer>
<salute>Yours forever, </salute>
<signed>Edward Smith </signed>
</closer>
</div1>
</body>
</text>
</q>
<p>[this tag surrounds the paragraph in the chapter that follows the letter]</p>
[the rest of the chapter ...]
</div1>

Example: A letter in its own division in which the salutation appears as part of the first paragraph and a dateline appears in the closer.

<div1 type="letter">
<p>Dear Joseph, it has been a long time since...</p>
<p> text of concluding paragraph </p>
<closer>
<salute> Your loving sister,</salute>
<signed> Emma </signed>
<dateline>
<name type="place">Richmond, Virginia</name>
</dateline>
</closer>
</div1>

Footnotes

Working with footnotes in electronic texts is different from working with footnotes in the print environment. From an encoding perspective, there are two parts to each footnote: the reference and the note. The reference is the superscript number or symbol within the main text on the page that alerts the reader that there is a pertinent note. The note is the text at the bottom of the page, beginning with the number or symbol and then a citation or some further information. To facilitate reading online, we move the note to the point of reference within the text. For example, an asterisk after the second sentence on the page points the reader to a note at the bottom of the page. In the encoded text, the note would be moved so that it appears right after the second sentence.

All numbers or symbols within the text that are references to footnotes must be encoded with the <ref> tag. Assign the following attributes for the <ref> tag: id, rend, target idrefs, n.

The text of the footnote, including whatever number or symbol identifies it, must be surrounded by the <note> tag. Assign the following attributes for the <note> tag: id, rend, anchored, place, target idrefs, n.

Please note: When placing one or more <p>s inside of a <note> element, make sure that there are NO spaces between the <p> tags, as well as between <note> and <p>.

Example: This is correct.

<note><p>This is the first paragraph.</p><p>This is the second paragraph.</p></note>

Example: This is NOT correct.

<note> <p>This is the first paragraph.</p> <p>This is the second paragraph.</p> </note>

Assigning Attributes for the <ref> and <note> Tags for Footnotes

As with figures, we will assign a unique "id=" for every <ref> and <note> in the text. The "id=" for each note should begin with the letter "n" and be followed by the note's ordinal number. The id for each reference will begin with the word "ref", followed by the ref's ordinal number. Thus, the first note in a book will have the 'id="n1"' and the 130th note will have the 'id="n130"'. The first reference in a book will have the 'id="ref1"' and the 130th reference will have the 'id="ref130"'.

ID. The note's "id=" is an assigned attribute and does not relate to whatever number or symbol appears as a label for the note. In this manner, if a footnote found on page 150 of a text is the 25th note in the entire text, it will be note id="n25" even if it is labeled in the text with a superscript 1. In order to ensure that a footnote is assigned the correct number, it is important to keep track of all footnotes as you go through a text. It is a good idea to keep a running total on a notepad as you encode to make sure that you assign the correct number and do not assign the same number to two notes.

REND. For rend, enter sc. In addition, caps and bold are also occasionally used. Consult Lisa or Natasha if you are not sure.

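TARGET id refs. This attribute links the <ref> and the <note> to each other. For each ref, the target should be the id of the note that is associated with it, and for each note, the target should be the id of its ref. Thus, for <ref id="ref23"> the target will be "n23", and for <note id="n23"> the target will be "ref23". In the encoded file the attribute appears simply as "target=".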

N. Enter the same number you used for the unique id as the n attribute. If your attributes for id and target id refs are ref23 and n23, then you will enter 23 next to n.

ANCHORED (for <note> only). Make sure "yes" is selected.

PLACE (for <note> only). For footnotes, enter "foot".

Example: A <ref> and <note> pair with attributes assigned.

<p>At that time, people still believed the world was flat
<ref rend="sc" id="ref23" target="n23" n="23">1</ref>
and they were very afraid of traveling.
<note rend="sc" id="n23" anchored="yes" place="foot" target="ref23" n="23">
<p>1 It was Columbus's famous expedition to the new world that ended the belief that the world was flat.</p>
</note>
[The rest of the paragraph.]</p>

Marginal Notes

One or more marginal notes should be placed BEFORE the relevant paragraph or section. Each note should be surrounded by a separate <note> tag.

The <note> attributes are different for marginal notes. Assign id, rend, and n the same as for footnotes. Next to anchored choose no. Next to place, enter the word margin.
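
Example: A marginal note placed before the paragraph it belongs to. This is a sketch with placeholder text; the id and n values assume that this is the 24th note in the book, and rend="sc" is carried over from the footnote instructions.

<note id="n24" rend="sc" anchored="no" place="margin" n="24"><p>[The text of the marginal note goes here]</p></note>
<p>[The paragraph that the marginal note accompanies goes here]</p>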

Epigraphs

In general, an epigraph is a quotation (anonymous or attributed) that appears at the start of a section or chapter, or on a title page. Be careful not to confuse an epigraph with an argument. If you are unsure, consult with your colleagues, Lisa, or Natasha.

There are two different kinds of epigraphs—those that cite an author, and those that do not. For those with no author, the structure is fairly simple.

Example: A prose epigraph.

<epigraph>
<p>[prose goes here ...]</p>
</epigraph>

Example: A verse epigraph.

<epigraph>
<lg type="verse">
<l>[a line of poetry goes here]</l>
</lg>
</epigraph>

For epigraphs that cite an author, the structure is more formal. The <epigraph> element will generally contain a <q> element and a <bibl> tag for encoding a bibliographic reference.

Example: An attributed epigraph at the beginning of a chapter

<div1 type="chapter">
<head> Chapter 2 </head>
<epigraph>
<q>
<lg type="verse">
<l> Tiger! Tiger! burning bright </l>
<l>In the forests of the night,</l>
<l>What immortal hand or eye</l>
<l>Dare frame thy fearful symmetry?</l>
</lg>
<bibl>William Blake</bibl>
</q>
</epigraph>
text of chapter goes here
</div1>

Arguments

An argument usually appears at the beginning of a chapter and summarizes what occurs in that chapter. When using the <argument> tag, you should also surround paragraphs with a <p> tag. This will allow you to assign tags such as <hi> and <foreign> within the <argument> tag. Attributes do not need to be assigned for epigraphs or arguments.

Example: A chapter with an argument

<div1 type="chapter">
<head> Chapter 2 </head>
<argument>
<p>The young man meets his benefactor and enrolls in school. </p>
</argument>
the text of the chapter goes here
</div1>

Use of the <sic> tag

If the text contains mistakes such as incorrect spellings, missing or incorrect punctuation, pagination or other errors, DocSouth generally marks up such errors without correcting them. To mark up errors use the <sic> and <corr> tags.

The <sic> tag is used to surround errors found within the text (especially spelling errors). By surrounding the error, the encoder alerts readers to a mistake attributable to the author or publisher. In this way, the reader will be aware that the mistake is part of the original text and not an error made by the typist or encoder. To encode using the <sic> tag, highlight the error and insert the element <sic>.

Once the error has been surrounded by the <sic> tag, assign the attribute for "corr=". The attribute for "corr=" is the correct spelling of the word. For example, if you have surrounded the word "buisy" with the <sic> tag, the correct spelling "busy" will be entered next to "corr=" on the attributes screen. In this way, the error remains in the main text, but the correction is listed on the attributes screen.

Example: The <sic> tag.

<p> The winter weather of
<sic corr="1885">1985</sic>
was
<sic corr="terrible">terible</sic>
and we had to take care of
<sic corr="Laura's">Lauras</sic>
chickens to keep them from freezing...</p>

Always run spellcheck in Author/Editor when you have completed reviewing the file. Spellcheck in A/E is helpful, but far from perfect. A/E is often too sensitive and highlights words that actually are spelled correctly. Consult online dictionaries, including the Oxford English Dictionary, available through the Library's home page by clicking "Articles & More." Correct spellings for many words have changed over time, and it is not necessary to mark up words that were correctly spelled in the author's time. In addition, if a word is consistently misspelled, it may not be necessary to use the <sic> tag.

Use of the <corr> tag

The <corr> tag works like the <sic> tag only in reverse. With the <corr> tag, you locate an error, make the correction to the text and then surround it with the <corr> tag. The <corr> tag is preferable for missing elements like punctuation. For other mistakes, please use the <sic> tag.

You must assign the "sic=" attribute for the <corr> tag. For example, if you have a sentence that is missing an end punctuation mark, you can add the punctuation mark and surround it with the <corr> tag. In this instance, assign 'sic="[no punctuation]"'.

Example: The <corr> tag.

Original:

The carriage was running out of control with no way to stop it and avoid disaster. "What can we do" Sally shouted. Hal didn't know but he hoped they would be rescued by someone?

Correction:

<p>The carriage was running out of control with no way to stop it and avoid disaster. "What can we do
<corr sic="[no punctuation]">?</corr>
" Sally shouted. Hal didn't know but he hoped they would be rescued by someone
<corr sic="?">.</corr>
</p>

Encoding Tables of Contents

Tables of contents are usually included in the front section of a text and should be surrounded by the <front> tag. The entire table of contents should be surrounded by the <div1> tag with a 'type="contents"' as the attribute. The heading of the table of contents should then be surrounded by the <head> tag. No attributes need to be assigned for the <head> tag. The list of contents should be encoded as a <list>. Surround each entry in the table of contents with the <item> tag.

Each item in the list will include a page number or range of page numbers. Each page number in the table of contents is encoded as a reference. Surround each page number with the <ref> tag. The only attribute that is assigned for the <ref> tag in the table of contents is "target=". In the attributes screen, next to the word target, type in the id of the page that is being referenced. For example, if the second chapter of a book by Smith begins on page 28, the target would be p28. If instead of a single page number, a range of pages is listed, use the number of the first page. As a result, a book by Smith that lists chapter three on pages 43-56 would have a target of p43.

Example: Table of Contents.

<div1 type="contents">
<head>Table of Contents </head>
<list>
<item>Chapter 1 - The Early Years.....
<ref target="p1">1</ref>
</item>
<item>Chapter 2 - The School Years.....
<ref target="p12">12-23</ref>
</item>
<item>Chapter 3 - Adulthood.....
<ref target="p24">24</ref>
</item>
</list>
</div1>

Lists of Illustrations

Encoding a list of illustrations is almost identical to encoding a table of contents. For the attribute "type=", assign "list of illustrations". In the <ref> tag, assign the figure id for the "target=" attribute, e.g., ill1, ill2.

Example: List of Illustrations.

<div1 type="list of illustrations">
<head>List of Illustrations </head>
<list>
<item>Gen. Robert Edward Lee ....
<ref target="ill1">90</ref>
</item>
<item>Gen. Thomas T. Munford ....
<ref target="ill2">100</ref>
</item>
</list>
</div1>

Indices

Indices are surrounded by <div1> tags with the attribute 'type="index"'. The page numbers will NOT need to be encoded, because of the full-text searching capacity provided by web browsers. If Apex has already encoded the references, spot-check to make sure they were encoded correctly, and leave the ref tags. For an example of an index, see /southlit/greenfact/menu.html.
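
Example: The beginning of an index. This is a sketch only: the entries are invented, and encoding the entries as a <list> of <item> tags is an assumption carried over from the table of contents instructions. Notice that the page numbers are left as plain text, without <ref> tags.

<div1 type="index">
<head>Index</head>
<list>
<item>Alabama, 12, 45</item>
<item>Arkansas, 13</item>
</list>
</div1>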

Entities

Left and right double quotation marks, left and right single quotation marks, ampersands, emdashes, and other special characters must be encoded as entities. First, make sure they are in the list of text entities and that they are defined. To define entities entered by Apex, go to Entities>Edit Text Entities. If you want to define the entity ldquo, select that from the menu and below make sure the NAME is ldquo. In the third box, CONTENT, type in &ldquo;. Click the CHANGE button. This will enable A/E to validate this document. If you click NEW instead of CHANGE, A/E will not let you overwrite the entity.

Left and right single quotation marks should also be encoded. Do NOT use entities for apostrophes or accent marks used in writing dialects.

Tip: To find entities use the Find>Find and Replace command. In the find box, type the entity with an ampersand and a semicolon. For example, if you want to find a left double quotation mark, type &ldquo;.

Special characters are frequently encountered during encoding, especially in foreign words and phrases. These include letters with diacritics such as accent marks, tildes, circumflexes, and umlauts, and they must receive special attention during encoding. Apex usually inserts all these entities, but spot-check some of them to make sure they are correct. You will need to define the entities for all these characters.
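
Example: A sentence that uses entities for double quotation marks, an ampersand, an em dash, and an accented letter. This is a sketch with an invented sentence; it assumes the standard entity names amp and eacute in addition to the names listed in the editorial declarations of the teiHeader (ldquo, rdquo, mdash). Notice that the apostrophe in "didn't" is typed directly and is not an entity.

<p>&ldquo;I didn't expect to see you at Smith &amp; Sons,&rdquo; she said&mdash;and then she turned to greet Ren&eacute;e.</p>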

The teiHeader

At DocSouth, we use a similar template for all teiHeaders, but each project has its own slightly different template. To fill out the teiHeader, you will need to consult the Library's online catalog and the title page of the original book or document. Please fill out the teiHeader to the best of your ability, and ask your colleagues for assistance when you have questions.

The teiHeader is a collection of information at the beginning of the encoded text that tells about the electronic text and the original text it was created from, i.e. metadata. The teiHeader has several sections, and it takes practice to fill it out correctly. It is often best to review the teiHeader after you have reviewed all the other encoding for a document because by then you will be more familiar with the text.

The first section of the teiHeader is the <fileDesc>, which includes the <sourceDesc>. The first part of the <fileDesc> describes the electronic edition and the <sourceDesc> describes the original text. The remaining sections of the teiHeader describe the way the book was digitized, the editorial decisions that were made in the process of digitizing the book, the activities that were done in the digitization process, and the cataloging information, including languages.

Example: A teiHeader.

<tei.2>
<teiHeader type="True and Candid Compositions">[each project will have its own type. it should be assigned correctly by Apex.]
<fileDesc>
<titleStmt>
<title>
[Fill in this title tag with the title of the document using the DocSouth capitalization rules. Use the title that appears on the title page of the document. Do not use a period at the end of the title. Keep the colon after the title; notice that the colon is followed by the phrase "Electronic Edition," which is part of the title of the DocSouth TEI edition of the text.]:
Electronic Edition. </title>
<author>[Insert the author's name here. Use the name as it appears in the online catalog. It should include the author's dates, if available. This is referred to as the authority name.]</author>
<funder>Funding from the State Library of North Carolina supported the electronic publication of this title. [This statement will vary from project to project, but should be filled in already when you get the file.]</funder>
<respStmt>
<resp>Text transcribed by</resp>
<name>Apex Data Services, Inc. [Unless you are encoding the file from scratch, always put Apex as the transcriber.]</name>
</respStmt>
<respStmt>
<resp>Images scanned by</resp>
<name>[Ask Risa or Lisa for this information]</name>
</respStmt>
<respStmt>
<resp>Text encoded by </resp>
<name id="NS">Apex Data Services, Inc., [Insert your name here], and Elizabeth S. Wright</name>
</respStmt>
</titleStmt>
<editionStmt>
<edition>First edition, [All the files you are working on will be first edition. This edition statement refers to the electronic edition published by DocSouth.]
<date>2005 [Make sure this date is the year that you are doing the encoding.]</date>
</edition>
</editionStmt>
<extent>ca. [Insert the filesize in K]K</extent>
<publicationStmt>
<publisher>University Library, UNC-Chapel Hill</publisher>
<pubPlace>University of North Carolina at Chapel Hill, </pubPlace>
<date>2004. [Make sure this is the year that you are doing the encoding.]</date>
<availability>
<p>© This work is the property of the University of North Carolina at Chapel Hill. It may be used freely by individuals for research, teaching and personal use as long as this statement of availability is included in the text.</p>
</availability>
</publicationStmt>
<sourceDesc>
<biblFull>
<titleStmt>
<title type="title page"> [Fill in the first title tag in the "Source Description" section with the title exactly as it appears on the title page, using DocSouth capitalization rules. This title should match the title at the very top of the teiHeader.]</title>
<title type="cover">[You may add as many titles as necessary to reflect the different versions of the document's titles. A typical additional title is the "cover title" which is usually a shorter or slightly different title from that which appears on the title page. There is no need to include a title if it exactly matches what is on the title page. See Lisa or Natasha if you are uncertain about including a title.]</title>
<author>[Insert the Author's name exactly as it appears on the title page, following DocSouth capitalization rules.]</author>
</titleStmt>
<extent>1 p.</extent>
<publicationStmt>
<pubPlace>[Insert the publication place as it is printed on the title page. Use square brackets if the publication place is printed somewhere other than the title page. Refer to the online catalog record for guidance. Use DocSouth capitalization rules.]</pubPlace>
<publisher>[Insert the publisher's name inside this tag, following the same rules as for the pubPlace.]</publisher>
<date>[Insert the publication date following the same rules as for pubPlace.]</date>
<authority></authority>
</publicationStmt>
<notesStmt>
<note>Call number [Insert the call number of the book we used. Please take the catalog number from the spreadsheet on the i: drive to ensure accuracy. Often UNC holds multiple copies of the book with different call numbers.] ([Insert the full name of the collection or library that owns the book. Often this will be the North Carolina Collection or the Rare Book Collection, but please consult the spreadsheet.], University of North Carolina at Chapel Hill)</note>
</notesStmt>
</biblFull>
</sourceDesc>
</fileDesc>
<encodingDesc>
<projectDesc>
<p>The electronic edition is a part of the UNC-Chapel Hill digitization project,
<hi rend="italics">Documenting the American South.</hi>
</p>
</projectDesc>

<editorialDecl> [The following paragraph tags are a set of typical editorial declarations. You will need to add to or delete from these declarations according to the different features of the text. You must review these after you have finished the rest of the encoding.]


<p>The text has been entered using double-keying and verified against the original. </p>
<p>The text has been encoded using the recommendations for Level 4 of the TEI in Libraries Guidelines.</p>
<p>Original grammar, punctuation, and spelling have been preserved. Encountered typographical errors have been preserved, and appear in red type.</p>
<p>Any hyphens occurring in line breaks have been removed, and the trailing part of a word has been joined to the preceding line.</p>
<p>All quotation marks, em dashes and ampersand have been transcribed as entity references.</p>
<p>All double right and left quotation marks are encoded as &rdquo; and &ldquo; respectively.</p>
<p>All single right and left quotation marks are encoded as &rsquo; and &lsquo; respectively.</p>
<p>All em dashes are encoded as &mdash;</p>
<p>Indentation in lines has not been preserved.</p>
<p>Spell-check and verification made against printed text using Author/Editor (SoftQuad) and Microsoft Word spell check programs.</p>
</editorialDecl>

<classDecl>
<taxonomy id="LCSH">
<bibl>
<title>Library of Congress Subject Headings</title>
</bibl>
</taxonomy>
</classDecl>
</encodingDesc>

<profileDesc>
<langUsage>[If you used any foreign tags in the document add the language identification for each of those languages. Use the language id as discussed in the section about foreign tags.]
<language id="eng">English</language>
<language id="lat">Latin</language>
<language id="">[Insert any other language you encountered in its own language tag.]</language>
</langUsage>

<textClass>
<keywords scheme="LCSH">[The Library of Congress Subject Headings will be filled by our catalogers, later.]
<list>
<item><!-- LC headings go here --></item>
</list>
</keywords>
</textClass>
</profileDesc>

<revisionDesc>
<change>
<date>2005-, </date>
<respStmt>
<name>Celine Noel and Wanda Gunther </name>
<resp></resp>
</respStmt>
<item> revised TEIHeader and created catalog record for the electronic edition.</item>
</change>
<change>
<date>2004-11-22, </date>
<respStmt>
<name> Elizabeth S. Wright, </name>
<resp></resp>
</respStmt>
<item>finished TEI-conformant encoding and final proofing.</item>
</change>
<change>
<date>[Enter the year, month and day that you finished the file (yyyy-mm-dd,).]</date>
<respStmt>
<name>[Enter your name here.]</name>
<resp></resp>
</respStmt>
<item> finished TEI/SGML encoding.</item>
</change>
<change>
<date>[Enter the date the batch was returned from Apex. Ask Risa or Lisa if you are not sure.]</date>
<respStmt>
<name>Apex Data Services, Inc.</name>
<resp></resp>
</respStmt>
<item> finished transcribing the text.</item>
</change>
</revisionDesc>
</teiHeader>
[the front, body, and back sections follow the teiHeader]
</tei.2>

Groups of Texts

Sometimes a book will include more than one title. See, for example, Proceedings of the Bible Convention of the Confederate States of America, available at http://docsouth.unc.edu/imls/biblconv/menu.html. This pamphlet includes the "Proceedings..." as well as "The Word of God...", a sermon given at the convention. The sermon begins after page 24 of the "Proceedings..." and has its own title page and its own pagination. Although these items are bound together and are related, each also has a stand-alone structure. We encode these separate items using the <group> tag, within which we can place multiple instances of the <text> tag. Both "Proceedings..." and "The Word of God..." are encoded as their own <text>.

To encode a book as a group of texts, surround the entire file, starting immediately after the TEIHeader, with a <text> tag, as is usual practice. Place all the front materials that relate to the entire book in a <front> tag immediately within the first <text>. Examples of front materials for entire books are: front images (such as the cover, frontispiece, etc.), a preface to the full edition, and similar items.

After the shared front matter, surround the rest of the volume, excluding any shared back material, with <group>. Within the <group> tag, surround each separate work with its own <text> tag. Inside each <text> tag you can assign <front>, <body>, and <back> sections to that work. This is especially helpful when the works included in the book have different title pages.

Example: How to encode a group of texts.

<teiHeader> [In the TEIHeader include all the descriptive information about the whole title]
<text> [This tag surrounds the entire book. For the example of the "Proceedings..." used above, this text tag surrounds the cover image, both the "Proceedings..." and "The Word of God...", and the back cover.]
<front> All shared front matter for the book. [For the example this includes only the image of the cover.] </front>
<group> [The <group> tag is a container for multiple <text> tags.]
<text>This tag surrounds the entire first text. It must include a <body> and may include both <front> and <back> sections.
<front>Front matter for the first text only. [For the example, this includes the image of the title page, and the <titlePage> section, and the introduction.]</front>
<body>All the main content for the first text is surrounded by this <body> tag, and further divided into divisions. [For the example, this includes the main content of the "Proceedings..."]</body>
</text>
<text>This tag surrounds the second text. [For the example, the second text is the sermon, "The Word of God..."; it contains a <front> and a <body>.]
<front>This tag surrounds the front matter that pertains only to the second text. [For the example, the FRONT for the second text includes the image of the second title page, the transcription of the title page, and the prefatory letter.] </front>
<body> This tag surrounds the main content of the second text. [In the example, the BODY surrounds the main content of the sermon.] </body>
<back>This tag surrounds any back matter that pertains only to the second text. [In the example, there is no BACK tag for the second text.]</back>
</text>
</group>
<back>This tag surrounds back matter for the entire volume (index, list of illustrations, etc.) </back>
</text>

Proofreading

Check to see that all left and right single and double quotation marks have been correctly labeled and that no quotation marks are missing (quotation marks are frequently omitted in the scanning process and may not have been discovered during previous proofreadings). Use the Find function to check that each left double quotation mark has a matching right double quotation mark. If not, consult with Lisa or Natasha about how to fix it.
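The quotation mark check above can also be run outside Author/Editor as a rough first pass. The following is a minimal sketch, not part of the DocSouth workflow: it assumes the file has been saved as plain SGML text with one paragraph per line, and the function name is hypothetical. It only flags lines for review, because a quotation that continues across several paragraphs can legitimately open without closing in the same paragraph.

```python
# Entity names follow the editorial declaration in the TEIHeader template.
LEFT, RIGHT = "&ldquo;", "&rdquo;"

def unbalanced_quote_lines(sgml_text):
    """Return (line_number, left_count, right_count) for lines whose counts differ.

    Assumption: each paragraph sits on a single line of the exported file.
    """
    report = []
    for number, line in enumerate(sgml_text.splitlines(), start=1):
        lefts, rights = line.count(LEFT), line.count(RIGHT)
        if lefts != rights:
            report.append((number, lefts, rights))
    return report
```

Each reported line should still be compared against the page image before any quotation mark is added or removed.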

Double-check that all images have been assigned a figure tag; all footnotes have a circular reference; all items in a list are tagged as items; all items in tables of contents, lists of illustrations, and indexes have the appropriate reference tags and attributes assigned; and all poetry and verse is surrounded by the appropriate tags.

Once you have thoroughly checked the text, perform a spell check by going to Edit on the main toolbar and selecting Check Spelling. A thorough spell check is essential because the Author/Editor spell check often picks up mistakes that were missed earlier.

Occasionally you may work with a text that has a lot of typos in it that were introduced by Apex. Always spot-check for typos by proofreading a few different pages in their entirety. If you find persistent errors, please alert Lisa or Natasha. It may be necessary to proofread the document more carefully in these cases.

Search the text for <unclear> and [UNK]. Our outsourcing company uses this tag and phrase to mark things they could not transcribe. We try to remove all of these and replace them with the text transcription or with the <gap> tag when necessary.

Search the text for the prime character (`), also known as the grave accent or backtick (on the keyboard it shares a key with the tilde (~)). This character is often mistakenly inserted by Apex. Most of the time you will replace it with an apostrophe or a left single quotation mark.
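The two searches above (for <unclear> and [UNK], and for the stray (`) character) can be combined into one pass. This is a hedged sketch that assumes the file is available as plain SGML text; the function and label names are hypothetical, and the script only reports locations. Deciding on the replacement is still done by hand against the original.

```python
import re

# Patterns named in the proofreading steps above.
MARKERS = {
    "unclear": re.compile(r"<unclear\b", re.IGNORECASE),
    "UNK": re.compile(r"\[UNK\]"),
    "backtick": re.compile(r"`"),
}

def find_markers(sgml_text):
    """Return (line_number, label) for every line containing a leftover marker."""
    hits = []
    for number, line in enumerate(sgml_text.splitlines(), start=1):
        for label, pattern in MARKERS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```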

Remove all empty head tags. While doing so, watch for headings that contain italics, because Apex often missed the italics in headings.

Remove all instances of the <seg> tag.
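For files that can be processed as plain SGML text, the two cleanup steps above might be sketched as follows. This is an illustration under stated assumptions rather than an established DocSouth tool: the function name is hypothetical, and it assumes that removing the <seg> tag means dropping the start and end tags while keeping the text they enclose.

```python
import re

# A <head> element with nothing but whitespace between its tags.
EMPTY_HEAD = re.compile(r"<head\b[^>]*>\s*</head>", re.IGNORECASE)
# Any <seg ...> start tag or </seg> end tag (the enclosed text is kept).
SEG_TAG = re.compile(r"</?seg\b[^>]*>", re.IGNORECASE)

def strip_empty_heads_and_seg(sgml_text):
    """Delete empty <head> elements, then unwrap every <seg> element."""
    without_heads = EMPTY_HEAD.sub("", sgml_text)
    return SEG_TAG.sub("", without_heads)
```

Validate the document again after any automated change of this kind, since removing elements can affect the surrounding structure.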

In lists and other parts of the text, you may find dot leaders in the original. All strings of periods and hyphens in original works should be transcribed as five periods with a space in between each period: ". . . . .".
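The dot leader rule above can be expressed as a single substitution. The sketch below is illustrative only. The minimum run length of four is an assumption chosen so that a three-period ellipsis in running text is left alone; adjust it to match the text at hand. The function name is hypothetical.

```python
import re

# Four or more periods (optionally separated by spaces or tabs),
# or four or more hyphens in a row. The threshold of four is an assumption.
LEADER = re.compile(r"\.(?:[ \t]*\.){3,}|-{4,}")

def normalize_leaders(line):
    """Replace each leader run with five periods separated by single spaces."""
    return LEADER.sub(". . . . .", line)
```

For example, `normalize_leaders("Chapter I ........ 5")` yields `Chapter I . . . . . 5`.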

Validating the Document

Once you have added the TEIHeader and done the final proofreading, you can validate the document to find errors you might have missed. To validate the document, go to Special>Validate Document (or hit the F9 key).

If the document is not valid, the program will take you to the areas that need corrections. Examine these areas, make the appropriate corrections, and continue the validation process. If you do not know how to make a given correction, consult with your colleagues, Lisa, or Natasha. After you have successfully validated the document, save the file as an Author/Editor file. Append the phrase "-done" to the end of the filename, before the ".ae" extension.
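The naming convention can be illustrated with a short sketch. The filename used here is hypothetical, and the helper is only a way of showing where "-done" falls relative to the extension.

```python
from pathlib import Path

def done_filename(path):
    """Insert "-done" between the base name and the extension."""
    p = Path(path)
    return p.with_name(p.stem + "-done" + p.suffix)
```

For a file named `glasgow.ae`, this gives `glasgow-done.ae`.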

Note: You may validate a section of your file by highlighting it and choosing Special>Validate Selection.

Turning Off the Rules

Rules checking is like a safety net: it keeps you from making any structural mistakes. However, when you begin editing a text in Author/Editor, you may need to turn rules checking off to perform certain edits. If Author/Editor gives you a warning that you will need to turn off rules checking, think through what you want to do and then turn off the rules. To turn off the rules, go to Special>Turn Rules Checking On/Off. Turn the rules on again as soon as you have fixed the problem. If you have trouble turning the rules back on, check with Natasha or Lisa.

Keyboard Shortcuts in Author/Editor

When using Markup>Insert Element, which provides you with a long list of options to choose from, you can either scroll through the list to find the one you want or hit the key on the keyboard that corresponds to the first letter of the item you are looking for. For example, if you are on the Insert Element dialog box and you want to select the <pb> tag, hit the p key on the keyboard and the screen will take you to the selections that begin with the letter p. If you type pb it will take you to the first tag beginning with p followed by b. You can type in as many letters as you want to move quickly to the correct tag.

There are many other useful tricks to speed up your work in Author/Editor. When you're ready to accelerate, see Lisa.

Additional Resources

These guidelines cover the majority of encoding practices used in the DocSouth digital initiative. For questions or further analysis, please make use of the following resources:

Oral History Interviews

Currently, DocSouth is working with the Southern Oral History Program and the Manuscripts Department to digitize several oral history interviews. These interviews are encoded according to the following guidelines.