Specifications and Tooling

Research Object Crate

RO-Crate is a community effort to establish a lightweight approach to packaging research data with their metadata. It is based on schema.org annotations in JSON-LD, and aims to make best practice in formal metadata description accessible and practical in a wide variety of situations, from an individual researcher working with a folder of data to large data-intensive computational research environments.

RO-Crate is the marriage of Research Objects with DataCrate. It aims to build on their respective strengths, but also to draw on lessons learned from those projects and similar research data packaging efforts. For more details, see RO-Crate background.

The RO-Crate specification details how to capture a set of files and resources as a dataset with associated metadata – including contextual entities like people, organizations, publishers, funding, licensing, provenance, workflows, geographical places, subjects and repositories.
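As a rough illustration of what such a description can look like, the sketch below builds a minimal `ro-crate-metadata.json` with a root dataset, its files and one contextual entity (a person). It assumes the RO-Crate 1.1 layout (a JSON-LD `@graph` holding a metadata descriptor and a root `Dataset`); the helper names, the `#author` identifier and the example values are hypothetical, and the specification itself remains the authority on required properties.

```python
import json
import tempfile
from pathlib import Path


def build_crate_metadata(name, description, files, author_name):
    """Return a minimal RO-Crate style metadata document as a dict.

    Assumes the RO-Crate 1.1 layout; helper and argument names are
    illustrative only.
    """
    author_id = "#author"  # hypothetical local identifier for the person
    graph = [
        {
            # Metadata descriptor: points at the root dataset it describes.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: the folder being packaged.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "author": {"@id": author_id},
            "hasPart": [{"@id": f} for f in files],
        },
        # Contextual entity referenced from the root dataset.
        {"@id": author_id, "@type": "Person", "name": author_name},
    ]
    # One data entity per packaged file.
    graph.extend({"@id": f, "@type": "File"} for f in files)
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


def write_crate(folder, metadata):
    """Write the metadata file into the folder and return its path."""
    path = Path(folder) / "ro-crate-metadata.json"
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        meta = build_crate_metadata(
            "Example dataset", "A folder of example data", ["data.csv"], "Jane Doe"
        )
        print(write_crate(folder, meta).name)
```

In practice the RO-Crate tools and libraries mentioned below are likely a better starting point than hand-written JSON; the sketch only shows the shape of the metadata.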

A growing list of RO-Crate tools and libraries simplifies the creation and consumption of RO-Crates, including the graphical interface Describo.

Join the RO-Crate community to help shape the specification or get help with using it!

Research Object Model specifications

RO core

  • ro overview - Overview of the Research Object Model vocabularies
  • ro ontology - The core concepts of Research Objects (identity, aggregation and annotation) are captured in the core ro model, which is specified by the ro ontology under the namespace http://purl.org/wf4ever/ro#.
  • ro-bundle - Research Object Bundle, a zip-based serialization of the Research Object model based upon the Adobe Universal Container Format (UCF).
  • bagit-ro - A profile for Research Objects as BagIt archives, which can be serialized as zip, tar or tar.gz. It is the basis for BDBag (Big Data Bags) developed by FAIR Research.
  • LDP4ROs - Alignment between the Research Object model and the W3C Linked Data Platform (LDP).
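To make the zip-based ro-bundle serialization listed above more concrete, here is a minimal sketch using only the Python standard library. It assumes, based on a reading of the RO Bundle specification and the UCF convention, that the archive starts with an uncompressed `mimetype` entry and carries its manifest at `.ro/manifest.json`; the manifest field names and the `build_bundle` helper are assumptions to be checked against the specification, not its definitive form.

```python
import io
import json
import zipfile

# Media type assumed for RO Bundles; verify against the ro-bundle specification.
BUNDLE_MIMETYPE = "application/vnd.wf4ever.robundle+zip"


def build_bundle(resources):
    """Build an RO Bundle style zip archive in memory.

    `resources` maps paths inside the bundle to their content as bytes.
    The helper name and manifest layout are illustrative assumptions.
    """
    manifest = {
        "@context": ["https://w3id.org/bundle/context"],
        "id": "/",
        # Each aggregated resource is listed by its path within the bundle.
        "aggregates": [{"uri": "/" + path} for path in sorted(resources)],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # UCF-style containers put the mimetype first and store it uncompressed.
        archive.writestr("mimetype", BUNDLE_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        archive.writestr(
            ".ro/manifest.json",
            json.dumps(manifest, indent=2),
            compress_type=zipfile.ZIP_DEFLATED,
        )
        for path in sorted(resources):
            archive.writestr(path, resources[path], compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_bundle({"README.txt": b"hello", "data/results.csv": b"a,b\n1,2\n"})
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```

The libraries listed under RO Model tooling (for example the RO bundle API or ruby-ro-bundle) handle these details and the richer manifest metadata for real use.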

RO extensions

We know of the following ontology extensions to the core Research Object model:

  • roevo model describing the evolution of a Research Object and its aggregated resources, which is also based on the W3C PROV Ontology recommendation
  • wfprov provenance model describing the execution of a scientific workflow, which is based on the latest W3C PROV Ontology recommendation.
  • wfdesc model describing scientific workflow protocols to facilitate interpretation and reuse of scientific workflows
  • wf4ever model describing common workflow service types and properties
  • RO-Opt ontology is designed for representing optimizations done to workflows and their provenance
  • roterms vocabulary defines terms useful for typing and annotation of resources in a Research Object, e.g. Hypothesis, ExampleRun, technicalContact, exampleValue
  • MINIM minimum information model for defining checklists for Research Objects. A Minim model defines a list of MUST/SHOULD/MAY requirements, associated with rules that express how to satisfy the requirement, e.g. by requiring certain resources to exist in the RO, or a more detailed query that must be fulfilled in its annotations.
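The checklist idea behind MINIM can be sketched in a few lines. The example below is a hypothetical simplification, not the Minim vocabulary itself: requirements carry a MUST/SHOULD/MAY level and a rule, the only rule shown is "a given resource exists in the RO", and the verdict labels ("fails", "minimal", "nominal", "full") are illustrative names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Requirement:
    level: str                         # "MUST", "SHOULD" or "MAY"
    description: str
    rule: Callable[[Set[str]], bool]   # returns True when the requirement is satisfied


def requires_resource(path: str) -> Callable[[Set[str]], bool]:
    """Rule: the given resource must be aggregated by the Research Object."""
    return lambda resources: path in resources


def evaluate(checklist: List[Requirement], resources: Set[str]) -> Tuple[str, Dict[str, List[str]]]:
    """Evaluate a checklist and return (verdict, unsatisfied requirements by level)."""
    failed: Dict[str, List[str]] = {"MUST": [], "SHOULD": [], "MAY": []}
    for requirement in checklist:
        if not requirement.rule(resources):
            failed[requirement.level].append(requirement.description)
    if failed["MUST"]:
        verdict = "fails"      # at least one MUST is unsatisfied
    elif failed["SHOULD"]:
        verdict = "minimal"    # all MUSTs hold, some SHOULD does not
    elif failed["MAY"]:
        verdict = "nominal"    # all MUSTs and SHOULDs hold
    else:
        verdict = "full"       # every requirement holds
    return verdict, failed


if __name__ == "__main__":
    checklist = [
        Requirement("MUST", "workflow definition present", requires_resource("workflow.t2flow")),
        Requirement("SHOULD", "README present", requires_resource("README.txt")),
        Requirement("MAY", "example run present", requires_resource("runs/example.zip")),
    ]
    print(evaluate(checklist, {"workflow.t2flow", "README.txt"}))
```

A real Minim model expresses such requirements and rules as RDF and can also use queries over the RO's annotations, which this sketch leaves out.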

RO Model tooling

  • ROHub - a web application for creating, sharing and inspecting Research Objects.
  • bdbag - a Python library and command-line tool for creating and manipulating bagit-ro archives
  • ro-python - a Python library and command-line tool to create, modify and inspect research object directories and RO Bundles, based on RO Manager
  • RO Manager - A git-like command-line tool that can be used to create research objects in your local file directory.
  • RO bundle API - A Java library that can be used to generate and inspect zip-based Research Object Bundle archives and their metadata.
  • ruby-ro-bundle - a Ruby library for creating/inspecting Research Object Bundles and their metadata.
  • Combine archive conversion - A tool for converting a COMBINE archive into an RO bundle, and vice versa. COMBINE archives can be browsed and modified using CombineArchiveWeb.
  • Latex2RO - A simple tool designed to help create Research Objects (ROs) from LaTeX papers. Given a LaTeX file, the RO creator extracts its title and metadata and uses them to partially fill a structured HTML page annotated in RDFa.
  • LDP4RO - An LDP4J extension for creating, accessing and browsing Research Objects. See also the LDP4RO demo.
  • ro-show - a web application for viewing Research Objects (under development)

RO Model tutorials

Research Object Ontologies and Vocabularies Primer introduces the Research Object model and ontology, using an example of a Workflow Research Object described as Linked Data.

Research Object Tutorials provide a step-by-step example of creating a Research Object bundle, explaining the bundle structure and manifest.