From Investigation to Implementation

Building a Program for the Large-Scale Digitization of Manuscripts

Selection & Development

Selection Process

The digitization and online presentation of manuscript collections comprising millions of documents present staggering challenges of scale. The SHC staff believes these challenges will best be addressed by applying archival theory and practice in the digital environment. In digitizing the collection, the SHC staff will employ the archival principle of provenance: organizing and maintaining the individual collections based on the origins of the materials, rather than piecing together new collections of selected documents based on topics, geography, chronology, or other characteristics. Under this model, for example, if the topic to be digitized were "Slavery in North Carolina," the staff would digitize entire collections in the antebellum plantation holdings that may pertain in part to slavery in North Carolina, as opposed to digitizing a selection of individual manuscripts documenting slavery in North Carolina pulled from these different collections. This structure mirrors the physical organization of the SHC's holdings in the Wilson Library, where items remain within their collection of origin.

Feedback from graduate students and scholars

In exploring the Archives of American Art's website, graduate students in the August 2007 focus group discovered that portions of the collection had not been digitized, but this did not become apparent until they accessed the box or folder level. They thought that an explanation for omissions should have been included up front in the collection's finding aid.

Scholars who attended the Southern Sources workshop echoed these sentiments. One said, "I do not like sites that select 'important' documents." Another stated emphatically that even if it's a box of tiny receipts, "do not sample. Digitize it all." One scholar observed:

"I don't think you should skip over [collections] simply because you have to omit portions of it. After all, the extent of any collection is determined by arbitrary factors, and in most cases the remaining portions of the collection will be useful on their own. But you should definitely explain why you have had to leave certain sections out. This would be especially important to researchers who might want to come in person to fill in gaps."

In the Exit Survey following the workshop, only a handful of the scholars (4 of 21) said that the SHC should prioritize digitizing specific genres within collections; the other 17 preferred that complete collections be digitized. Asked about collections that cannot be fully reproduced due to copyright, privacy, or other restrictions, the majority of scholars (14 of 21) responded that the SHC should not prioritize those collections that can be digitized in their entirety.

In summary, this feedback indicated that, when possible, the SHC should digitize complete collections; but when some content needs to be omitted from digitization of a collection, that collection should remain in the digitization queue, and the finding aid should include reasons for any omissions.