Marina Georgieva presented a poster at Digital Preservation 2018. Please read on for a closer look at her work, one of the many great offerings from this year’s event. For more posters, please visit osf.io/view/ndsa2018/.

Marina holds a Master’s Degree in Library Science with an Information Technology concentration from the University of Wisconsin – Milwaukee. She is currently Visiting Digital Collections Librarian at the University of Nevada – Las Vegas. Her passion is large-scale digitization with cutting-edge technologies. Her research interests include project management in large-scale digitization and approaches for achieving higher digitization efficiency, such as staffing and training and the development of workflows, procedures and guidelines. Marina is also involved in metadata and authority work as well as metadata remediation projects.

The digital librarian: the liaison between digital collections and digital preservation

Overview

At UNLV Libraries, the role of the Digital Collections Librarian goes beyond the traditional routine tasks of digitization, metadata management, project management, workflow development and team management. Digital Collections Librarians serve as links between digitization and digital preservation and do everything in between to draft sustainable digital preservation workflows alongside their colleagues in the Special Collections Technical Services Department. Technical Services Librarians are responsible for the preservation of born-digital archival materials, whereas Digital Collections Librarians act as information architects directly engaged in preparing master files of in-house and outsourced reformatted materials for digital preservation. In recent years, the UNLV Libraries Digital Collections Department has completed numerous large-scale digitization projects that yielded hundreds of thousands of new archival digital objects that require long-term preservation.
Currently, all of these archival files are stored on a server referred to as ‘The Digital Vault’. One of the invisible, often overlooked, yet very important roles of the Digital Librarian is to verify that all images from completed digitization projects are properly organized in meaningful, easy-to-navigate directories and that all files are in the appropriate file format. It is common practice for folder directories (created and organized during the actual process of digitization) to remain intact and be moved to the Digital Vault for long-term storage in their original order. There they are merged into the existing collection-appropriate folders or, if necessary, a new folder is created.

Additionally, UNLV Digital Collections has thousands of images from legacy collections stored in the Digital Vault. All of these digital objects live on the Digital Collections website, but some of the archival master folders have redundant data; others are saved in inappropriate file formats, and still others have non-normalized file naming. In recent years, there has been an effort to clean up and restructure these legacy folders in order to make the archival files easily discoverable and to optimize the storage space before the content of the Digital Vault is migrated to a new, more robust system (UNLV Special Collections and Archives is currently building an instance of Islandora CLAW that will back up files in Amazon Glacier).

The role

The role of the UNLV Libraries Digital Librarian that relates directly to digital preservation is outlined in the poster presented at the 2018 NDSA DigiPres Forum. Here we will just briefly touch upon a few of the major responsibilities:

File naming conventions

For current digitization projects, file naming has been normalized and follows a structured, logical scheme that depends on the type of collection being digitized.
During the process of preparing collections for digitization, the librarian analyzes the content, makes decisions regarding the grouping of the digital objects and assigns collection-level and item-level digital identifiers. To achieve consistency and logical arrangement, the digital librarian maintains and updates spreadsheets with assigned and available digital identifiers. For example, if the collection consists of archival photographic materials, the assigned digital collection alias will be ‘PHO’ with sequential numeric identifiers. These identifiers logically follow the structure and numbering of all previously digitized photo collections.

As mentioned earlier, most newly digitized collections remain in the original directory structure that was developed during the scanning process. The digital librarian ensures that the file naming at the directory level and at the file level is accurate and that the data set is ready to be moved to the Digital Vault. Digital librarians often need to manage identifiers beyond those that identify archival structure (collection, folder) and those that identify the intellectual unit (item), so that the identifiers accurately reflect the structure of the materials. They therefore also need to create a third type of identifier, which may cover multiple image files that make up a single digital object; for example, the back and front of a printed item or multiple items on a page in a scrapbook.

Legacy collections are more challenging and sometimes need cleanup because their file naming may be inconsistent. Depending on the project, the digital librarian may decide to keep the file structure intact or to rearrange the folders in a more normalized way that follows current preservation practices.

Decisions on archival file formats

UNLV Libraries Digital Collections has chosen the TIFF file format for long-term preservation of archival master files.
TIFF is the preferred format for in-house digitized reflective materials and transparencies. The file format for digitized periodicals may vary depending on the project. In-house digitized periodicals and newspaper clippings are preserved in TIFF, just like photographs and films, while periodicals digitized as part of the National Digital Newspaper Program are stored in the original Library of Congress approved data sets. These data sets include newspaper pages in JP2, PDF and TIFF formats along with the accompanying metadata encoded in XML using the METS and ALTO schemas.

Legacy collections may contain files in JPG format. This usually applies to collections accessioned as already digitized materials. They usually remain in this format because UNLV Libraries Special Collections does not hold the original materials, and it is therefore impossible to re-digitize the items in the proper archival format.

Building directories in the Digital Vault

Current digitization and digital preservation efforts follow ...
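
The file naming and file format checks described in the sections above lend themselves to a small audit script. The sketch below is only an illustration, not UNLV's actual tooling. The name pattern (a three-letter alias such as ‘PHO’, a six-digit item number and an optional three-digit part number for multi-image objects) is a hypothetical convention invented for this example, as are the function names and sample paths.

```python
import re
from pathlib import PurePosixPath

# Hypothetical naming pattern (an assumption, not UNLV's real convention):
# <3-letter alias><6-digit item number>[_<3-digit part number>]
NAME_PATTERN = re.compile(r"(?P<alias>[A-Z]{3})(?P<item>\d{6})(?:_(?P<part>\d{3}))?")

# Extensions treated as archival masters in this sketch (TIFF, per the post).
ARCHIVAL_EXTENSIONS = {".tif", ".tiff"}


def audit_file(path):
    """Return a list of issues for one file path (empty if none found).

    Issues are flags for human review, not errors: legacy JPG files,
    for instance, may legitimately stay in their original format.
    """
    p = PurePosixPath(path)
    problems = []
    if p.suffix.lower() not in ARCHIVAL_EXTENSIONS:
        problems.append(f"non-archival format: {p.suffix or '(none)'}")
    if NAME_PATTERN.fullmatch(p.stem) is None:
        problems.append("file name does not match the assumed pattern")
    return problems


def audit_paths(paths):
    """Map each flagged path to its issues; clean files are omitted."""
    report = {}
    for path in paths:
        problems = audit_file(path)
        if problems:
            report[path] = problems
    return report


def group_by_object(paths):
    """Group image files sharing an alias and item number into one object.

    This mirrors the 'third type' of identifier: several image files
    (for example front and back) that make up a single digital object.
    Files that do not match the assumed pattern are skipped.
    """
    groups = {}
    for path in paths:
        match = NAME_PATTERN.fullmatch(PurePosixPath(path).stem)
        if match:
            key = f"{match['alias']}{match['item']}"
            groups.setdefault(key, []).append(path)
    return groups


if __name__ == "__main__":
    # Made-up sample paths for demonstration only.
    sample = [
        "PHO/PHO000123_001.tif",
        "PHO/PHO000123_002.tif",
        "legacy/scan 12.jpg",
    ]
    print(audit_paths(sample))
    print(group_by_object(sample))
```

To run a check like this against a real directory tree, the list of paths could be gathered with `pathlib.Path(root).rglob("*")`. The pattern and the set of accepted extensions would need to be replaced with the institution's own rules.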
The post Marina Georgieva on the liaison between digital collections and digital preservation appeared first on DLF.