FAQ for Case Study Authors
ESnet is collecting networking requirements from its constituents to facilitate planning for capacity, technology and services. Because the funding and construction cycles for upgrades to national network infrastructure can take several years, we must plan for changes several years in advance.
ESnet is asking for requirements input from all major stakeholders. These include, but are not limited to, the following:
- DOE Office of Science Program Managers
- National Laboratory CIOs
- Operations Directors and Research Directors at the National Laboratories
- Authoritative representatives from major national and international research programs (e.g. Fusion, LHC)
- Key research or technical staff at National Laboratories, Supercomputer Centers, and the like
- ESnet Site Coordinators
- Key individuals at non-DOE institutions with significant DOE collaborators
Questions about the Case Studies
A science case study serves to help collect networking requirements for a major research endeavor within the DOE Office of Science. Therefore, people with authoritative knowledge of the science, or the ways in which the science makes use of the network, should provide input to the case study. The goal is to build an accurate network-centric picture of the science and how it is done.
ESnet is interested in collecting the information necessary to accurately predict the future needs of its customers. This includes (of course) raw bandwidth numbers. However, it also includes the ways in which the network will be used. Many scientists don't know exactly how their research, experiments and collaborators will make use of the network - that's OK. Case study authors should try to describe the things that their research needs from the network (e.g. movement of data sets of a given size between facilities, remote instrument control, real-time collaboration using video conferencing, etc.). If we need clarification, we'll ask for it.
ESnet is interested in three different time horizons for the information provided in the survey:
- Now, or in the next 12 months (near term)
- 2-5 years from now (medium term)
- 5+ years (long term)
These three time horizons provide insight into future needs, growth rate, upcoming significant events, etc.
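As a rough illustration of how estimates for two time horizons imply a growth rate, the following Python sketch computes a compound annual growth rate from a near-term and a future data volume. The function name and the sample volumes are hypothetical and are not taken from any ESnet case study; authors are not expected to perform this calculation themselves.

```python
def annual_growth_rate(volume_now: float, volume_future: float, years: float) -> float:
    """Return the compound annual growth rate implied by two volume estimates.

    volume_now and volume_future must use the same unit (e.g. terabytes per year).
    A return value of 1.0 means the volume doubles every year.
    """
    if volume_now <= 0 or volume_future <= 0 or years <= 0:
        raise ValueError("volumes and years must be positive")
    return (volume_future / volume_now) ** (1.0 / years) - 1.0


# Hypothetical estimates: 100 TB/year now, 1600 TB/year four years from now.
growth = annual_growth_rate(100.0, 1600.0, 4.0)
print(f"Implied growth: {growth:.0%} per year")
```

Even rough estimates for each horizon are enough to make this kind of trend visible.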
No. ESnet is not asking that networking requirements be submitted as detailed network diagrams, routing policies, and so on. Rather, case study authors should try to explain the science that they are doing, the ways in which their tools use the network, how much data they produce and consume, what services they need from the network, etc.
For example, if a scientist's research involves moving experimental data sets from an instrument site (e.g. Fermilab) to a supercomputer center (e.g. the ORNL Leadership Facility), ESnet would like to know the size of the data sets, the time envelope for the transfers (e.g. must be done in 12 hours, 1 week duty cycle, etc.) and how the data sets are likely to increase in size over time. Other useful information includes how often the transfers occur, when other computing facilities (e.g. NERSC, ANL) might be added to the workflow, and any other activity that involves moving data, results, visualization, or control messages over the network.
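To show why the data set size and the time envelope matter together, here is a minimal Python sketch that converts the two into an average sustained rate. The function name and the 50 TB figure are assumptions chosen for illustration; the calculation uses decimal terabytes and ignores protocol overhead, retransmissions, and storage limits, so a real transfer would need headroom beyond this number.

```python
def required_gbps(dataset_terabytes: float, window_hours: float) -> float:
    """Average rate in gigabits per second needed to move a data set within a time window.

    Uses decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits per second.
    """
    if dataset_terabytes < 0:
        raise ValueError("dataset_terabytes must not be negative")
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    bits = dataset_terabytes * 1e12 * 8      # terabytes -> bits
    seconds = window_hours * 3600            # hours -> seconds
    return bits / seconds / 1e9              # bits per second -> Gbps


# Hypothetical case: a 50 TB data set that must arrive within 12 hours.
rate = required_gbps(50, 12)
print(f"Average sustained rate: {rate:.2f} Gbps")
```

Doubling the data set size or halving the time window doubles the required rate, which is why both figures, and how they are expected to change, are useful in a case study.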
If clarification is needed, ESnet will ask for more information. However, the key idea is that ESnet is trying to elicit networking requirements from the user community from the users' perspective rather than asking physicists, chemists and materials scientists to become network experts in order to communicate their needs.
In the case studies and elsewhere, we refer to networking requirements that come from "Instruments and Facilities." In this context, we consider the data-generating components of experimental apparatus to be Instruments. Therefore, "Instruments" is a broad category composed of such things as detectors at particle accelerators (e.g. STAR on the RHIC accelerator or the CMS detector on the LHC), confocal microscopes, data servers from satellite downlinks, telescopes, tokamaks, and so on. The term "Facilities" refers to Supercomputer Centers (e.g. NERSC and the Leadership Facilities), other institutions that post-process instrument data, laboratory clusters that run simulations, etc. One could argue that many sites are both Instruments and Facilities - the goal is not to give them separate definitions, but to include the types of "scientific hardware" that these definitions collectively represent in a common category. The Instruments and Facilities represent the raw data sources and sinks that drive large network traffic volumes, and thus their capabilities and requirements inform the requirements for the networks that serve them.
The "Process of Science" refers to the process used by researchers for knowledge discovery. In the context of a case study, it includes information about the way in which scientists and researchers interact with and use the resources of the Instruments and Facilities. In many cases, an instrument might have a given data rate, but the way in which that instrument is used by scientists might give it a network traffic profile that is not adequately described by raw data volumes. It is important to learn how the scientists use their tools, so as to better understand how the scientists and their tools will use the network.