Modern Research Data Portals

Data portals provide a way for scientists to search for, discover, access, download, analyze, and publish scientific data. They are incredibly valuable for large collaborations, research groups, and indeed for entire fields.

Historically, a science data portal consisted of little more than a web server, a database, and some storage. A web browser provided the graphical interface for free, and the resulting capability was far better than the previous state of the art, which was command-line File Transfer Protocol (FTP). However, as data sets grew in both total size and number of data objects, the legacy data portal model began to fall behind in scalability and performance.

The Modern Research Data Portal (MRDP) is a design pattern that uses the Science DMZ model and Data Transfer Nodes (DTNs) to scale up the data transfer functionality of a data portal. When the data portal gives the user references to data objects, those references point to a well-configured DTN (or DTN cluster) in a Science DMZ rather than to the portal's own web server. Transfers typically go through a data transfer platform that can perform job management, fault recovery, and other modern functions. A paper describing the MRDP design pattern, written in collaboration with members of the Globus team, was published in PeerJ in 2017.
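The separation described above can be illustrated with a minimal Python sketch. It is not taken from the MRDP paper or from any particular portal. The host name `dtn.example.org`, the `CATALOG` contents, and the functions `data_reference` and `search` are all hypothetical, and a plain HTTPS URL stands in for whatever reference a real data transfer platform would issue. The point is only that the portal answers searches from its metadata catalog while the references it returns point at the DTN.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical DTN endpoint in the Science DMZ (not a real host).
DTN_BASE_URL = "https://dtn.example.org"


@dataclass(frozen=True)
class DataObject:
    """Metadata the portal's database keeps about one data object."""
    object_id: str
    dtn_path: str      # location of the bytes on the DTN's storage
    size_bytes: int


# Stand-in for the portal's database: metadata only, no file contents.
CATALOG = {
    "run42": DataObject("run42", "/datasets/run 42/output.h5", 7_500_000_000),
    "run43": DataObject("run43", "/datasets/run43/output.h5", 8_100_000_000),
}


def data_reference(object_id: str) -> str:
    """Return a reference that points at the DTN, not at the portal server."""
    obj = CATALOG[object_id]
    # quote() percent-encodes unsafe characters and leaves "/" intact.
    return DTN_BASE_URL + quote(obj.dtn_path)


def search(prefix: str) -> list[dict]:
    """Answer a search from the catalog; each hit carries a DTN reference."""
    return [
        {
            "id": obj.object_id,
            "size_bytes": obj.size_bytes,
            "reference": data_reference(obj.object_id),
        }
        for obj in CATALOG.values()
        if obj.object_id.startswith(prefix)
    ]
```

In this sketch the portal's web server never handles the data itself. A production portal would more likely submit a transfer job to its data transfer platform than hand out a bare URL, so that job management and fault recovery are handled for the user.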