Exploring Disciplinary Metadata and Documentation Practices to Support Data Reuse Dataset
Whether to comply with funding agency requirements or to share freely with others, researchers increasingly deposit data into repositories for long-term preservation and access. In 2010, the Georgia Tech Library first rolled out its research data services, eventually establishing a data archiving service where researchers could deposit small, final datasets into our institutional repository, SMARTech. As the rate of data deposit increases and the Library accepts research data from a wider array of disciplines, we want to ensure that deposited research data are adequately described and documented. Because datasets, unlike publications, are rarely self-describing or uniformly structured, additional metadata is necessary to ensure that the data can be used in the future. Like many of our peers, Georgia Tech now asks data depositors to provide a “README” file with their deposit in order to capture this additional metadata. This is particularly important because our repository currently supports only Dublin Core metadata, which cannot hold the full breadth of metadata needed for most datasets. We provide depositors with a “README” template that offers guidance on the types of supplemental metadata the repository hopes to capture. However, we have noticed that the generic, one-size-fits-all template does not adequately meet the needs of our community. For some researchers, the template does not address vital pieces of documentation; for others, it includes too many fields that are not relevant to their dataset. While recognizing that our patrons’ individual needs will continue to vary widely even within a single discipline, we sought to create more specialized “README” templates, based on discipline and data type, to better accommodate disciplinary differences.
For example, a biologist preparing a dataset for deposit would receive a template designed with biologists and standard forms of biological data in mind, drawing on metadata standards such as Darwin Core or Ecological Metadata Language. This template would differ from the one given to a materials scientist, which would have metadata fields specific to materials engineering. To create these specialized metadata templates, a group of librarians with diverse but complementary skills and experiences convened to explore differences in metadata creation and use. The project team included subject librarians, the research data librarian, and the repository metadata librarian. Through a combination of document analysis, interviews with researchers, and exploration of existing standards, the Library has begun to determine the level of specialized, structured metadata that can be collected and indexed in the repository, as well as the amounts and forms of supplemental information that will need to be captured in a “README.” The datasets below were collected in support of this project.