DesignSafe: The Platform for Natural Hazards Engineering Data

Published on July 15, 2019

 

Maria Esteva, Research Associate/Data Archivist, University of Texas, Austin
Texas Advanced Computing Center / Natural Hazards Engineering Research Infrastructure

NSF-funded cyberinfrastructure has become the premier source for NHE data

Natural hazards researchers may recall a time when experimental, simulation and field research data sets were treated like footnotes to published papers: useful, but definitely secondary.

Today, thanks to advances in technology and the crusade to reuse results, data sets can be as valuable as the papers themselves — requiring their own permanent landing page on the web and a Digital Object Identifier, or DOI. In fact, more and more journals require authors to reference the datasets they generate through a citation, including a DOI.

THE DOI FOR CAREER-BUILDING

In her career as a data curator, Maria Esteva has witnessed the growing importance of publishing data. In her work with the NSF-funded NHERI award and its DesignSafe cyberinfrastructure, she notes that funding agencies are placing greater value on the published research products that applicants list in their resumes, making data DOIs increasingly valuable for career advancement.

“Simply put, a published data set is evidence of your work,” Esteva says. “When researchers reference a data DOI in their papers, they can get more citations. Plus, they can market and distribute their data easily, by using the DOI in social media or when communicating with colleagues.” In her capacity working with natural hazards researchers, she also emphasizes the educational value of data publishing.

“Through DesignSafe we make natural hazards engineering data available for the world to see,” she says. “DesignSafe’s Data Depot is where your peers can come to find, preview, and reuse data, so that they can enhance their knowledge about what’s being done in NHERI’s experimental facilities.”

IMPORTANCE OF PUBLISHING DATA

For instance, Erica Fischer, assistant professor of civil and construction engineering at Oregon State University, does damage reconnaissance after natural disasters. She co-chairs the Earthquake Engineering Research Institute (EERI) Virtual Earthquake Reconnaissance Team (VERT), which shares all its data products on DesignSafe.

Fischer says sharing that data via DesignSafe allows reconnaissance teams to improve the quality of their work based on empirical evidence of knowledge gaps and then benchmark that research against real-life situations.

“After a disaster, VERT publishes a report on DesignSafe within 48 hours. We provide a high-level overview detailing the performance of community infrastructure — including hospitals, transportation networks, applicable codes, geotechnical failures and many other topics,” Fischer says. “By sharing this data with the DesignSafe community, we know we are providing the information that researchers and institutions can use for making informed decisions on how to respond.”

CURATING NATURAL HAZARDS DATA

Using DesignSafe’s interfaces, researchers can manage the entire lifecycle of their data, Esteva explains. In one place, they can carry out computational research and curate, explore and reuse their own data or data from other publications.

For researchers, curating data entails tasks such as selecting, categorizing, describing and relating data using one of five data models suited to their research method. Once a data set is curated, the DesignSafe interface guides researchers through the publishing steps: reviewing the data and its description, assigning authorship, licensing the data and requesting a DOI.

Natural hazards research can produce enormous and complex data sets. The publication landing page, reached through the DOI, provides different ways to access and navigate such data, including a tree representation showing the different data and documentation components of a research project and how they relate.

CURATING RECONNAISSANCE DATA

OSU’s Fischer confirms the importance of providing shared data. “Just as real disaster scenarios cannot be timed or planned, laboratory testing is expensive and time consuming,” she says.

Having access to numerical models is necessary for researchers to perform simulations of laboratory tests or community-level performance, Fischer says. Without that data to benchmark numerical modeling techniques, simulations are simply not possible. “Sharing of laboratory testing data is crucial; it allows other researchers to build off of laboratory tests to investigate the influence of other parameters.”

A FLEXIBLE DATA PLATFORM

Esteva and the DesignSafe team enable researchers to curate data in a way that is not overly complex, but still sophisticated enough to produce consistent, clear and accurate representations of their work.

“Within DesignSafe, researchers have a lot of freedom to publish their data as they see appropriate,” Esteva explains. “Users can arrange the data in different ways and decide what files they want to publish to the world. But there is some structure to it, which helps make the data reusable.”

An experimental project in the Data Depot. From the landing page (above), users can explore the files within each of the dataset components. The tree view (below) provides the complete structure of the dataset.