High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data

Please use this identifier to cite or link to this item: http://hdl.handle.net/1853/28412

Title: High-Throughput Workflow for Computer-Assisted Human Parsing of Biological Specimen Label Data
Author: Amin, Aliasgar ; Arsiwala, Zainab ; Best, Jason ; Huang, Jane Q. ; McCotter, Melody ; Moen, William E. ; Neill, Amanda
Abstract: Hundreds of thousands of specimens in herbaria and natural history museums worldwide are potential candidates for digitization, making them more accessible to researchers. An herbarium contains collections of preserved plant specimens created for scientific use. Herbarium specimens are ideal natural history objects for digitization, as the plants are pressed flat and dried, and mounted on individual sheets of paper, creating a nearly two-dimensional object. Building digital repositories of herbarium specimens can increase use and exposure of the collections while simultaneously reducing physical handling. As important as the digitized specimens are, the data contained on the associated specimen labels provide critical information about each specimen (e.g., scientific name and geographic location). The volume and heterogeneity of these printed label data present challenges in transforming them into meaningful digital form to support research. The Apiary Project is addressing these challenges by exploring and developing transformation processes in a systematic workflow that yields high-quality machine-processable label data in a cost- and time-efficient manner. The University of North Texas's Texas Center for Digital Knowledge (TxCDK) and the Botanical Research Institute of Texas (BRIT), with funding from an Institute of Museum and Library Services National Leadership Grant, are conducting fundamental research with the goal of identifying how human intelligence can be combined with machine processes for effective and efficient transformation of specimen label information. The results of this research will yield a new workflow model for effective and efficient label data transformation, correction, and enhancement.
Description: 4th International Conference on Open Repositories. This presentation was part of the session: Conference Posters.
Type: Proceedings
URI: http://hdl.handle.net/1853/28412
Date: 2009-05
Contributor: Botanical Research Institute of Texas ; University of North Texas ; Texas Center for Digital Knowledge
Relation: OR09. Conference Posters
Publisher: Georgia Institute of Technology
Subject: Digital libraries ; Digital repositories ; Botanical specimens

All materials in SMARTech are protected under U.S. Copyright Law and all rights are reserved, unless otherwise specifically indicated on or in the materials.

Files in this item

File               Size     Format   Description
176-669-1-PB.docx  15.61Kb  Unknown  MS Word Extended Abstract
176-670-1-PB.pdf   29.94Kb  PDF      PDF Extended Abstract
