Science DMZ National Oceanic and Atmospheric Administration
The National Oceanic and Atmospheric Administration (NOAA) in Boulder houses the Earth System Research Lab, which supports a "reforecasting" project. The initiative involves rerunning several decades of historical weather forecasts with a single, current version of NOAA's Global Ensemble Forecast System (GEFS). One advantage of a long reforecast dataset is that model forecast errors can be diagnosed from the past forecasts and corrected, thereby dramatically increasing forecast skill, especially for relatively rare events and longer-lead forecasts.
In 2010, the NOAA team received an allocation of 14.5 million processor hours at NERSC to perform this work. In all, the 1984--2012 historical GEFS dataset totaled over 800 TB, stored on the NERSC HPSS archival system. Of the 800 TB at NERSC, the NOAA team sought to bring about 170 TB back to NOAA Boulder for further processing and to make it more readily available to other researchers.
When the NOAA team tried to use an FTP server located behind NOAA's firewall for the transfers, they discovered that data trickled in at about 1--2 MB/s.
Working with ESnet and NERSC, the NOAA team used the Science DMZ design pattern to set up a new, dedicated data transfer node enabled with Globus Online, creating a data path unencumbered by legacy firewalls. The team immediately saw throughput increase nearly 200-fold: it was able to transfer 273 files with a total size of 239.5 GB in just over 10 minutes, or approximately 395 MB/s.
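The figures quoted above can be cross-checked with a short calculation. The following is a minimal sketch, not part of the original case study: it assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and, for the 170 TB projection, assumes that the quoted rates could be sustained for the entire transfer, which the source does not state. The function name `transfer_days` is hypothetical.

```python
# Assumption: decimal (SI) units; the source does not say which convention it uses.
MB = 1e6
GB = 1e9
TB = 1e12
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move total_bytes at a constant rate (hypothetical helper)."""
    return total_bytes / rate_bytes_per_s / SECONDS_PER_DAY


# Sample transfer reported in the text: 239.5 GB at about 395 MB/s.
sample_seconds = (239.5 * GB) / (395 * MB)
sample_minutes = sample_seconds / 60

# Speedup relative to the upper end of the 1--2 MB/s firewall-limited rate.
speedup_vs_2mb = (395 * MB) / (2 * MB)

# Projection (assumes a sustained rate): the 170 TB the team wanted in Boulder.
days_behind_firewall = transfer_days(170 * TB, 2 * MB)
days_with_dmz = transfer_days(170 * TB, 395 * MB)

print(f"sample transfer: {sample_minutes:.1f} minutes")
print(f"speedup vs 2 MB/s: {speedup_vs_2mb:.1f}x")
print(f"170 TB at 2 MB/s: {days_behind_firewall:.0f} days")
print(f"170 TB at 395 MB/s: {days_with_dmz:.1f} days")
```

Under these assumptions, the sample transfer works out to a little over 10 minutes and the speedup to just under 200 times, consistent with the text. The projection suggests the scale of the problem: at 2 MB/s the 170 TB subset would take well over two years to move, whereas at 395 MB/s it would take on the order of five days.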