
Inder Monga

Executive Director, ESnet; Division Director, Scientific Networking
Phone: +1 510 486 6531

Indermohan (Inder) S. Monga is the Director of the Scientific Networking Division at Lawrence Berkeley National Laboratory (Berkeley Lab) and Executive Director of the Energy Sciences Network (ESnet), the Department of Energy’s high-performance network user facility. Optimized for large-scale science, ESnet connects and provides services to more than 50 DOE research sites, including the entire National Laboratory system, its supercomputing facilities, and its major scientific instruments, and peers with 271 research and commercial networks worldwide. In addition to managing ESnet, Inder works to advance the science of networking for collaborative and distributed research applications and contributes to ongoing research projects tackling quantum networking and the programmability, analytics, and quality of experience driving convergence between the application layer and the network. Prior to joining Berkeley Lab, Inder worked in network engineering for Wellfleet Communications, Bay Networks, and Nortel. The holder of 25 patents, he received a B.S. in electrical/electronics engineering from the Indian Institute of Technology Kanpur, India, and a master’s in computer engineering from Boston University.

Journal Articles

Mariam Kiran, Scott Campbell, Fatema Bannat Wala, Nick Buraglio, Inder Monga, “Machine learning-based analysis of COVID-19 pandemic impact on US research networks”, ACM SIGCOMM Computer Communication Review, December 3, 2021,

This study explores how fallout from the changing public health policy around COVID-19 has changed how researchers access and process their science experiments. Using a combination of techniques from statistical analysis and machine learning, we conduct a retrospective analysis of historical network data for a period around the stay-at-home orders that took place in March 2020. Our analysis takes data from the entire ESnet infrastructure to explore DOE high-performance computing (HPC) resources at OLCF, ALCF, and NERSC, as well as User sites such as PNNL and JLAB. We look at detecting and quantifying changes in site activity using a combination of t-Distributed Stochastic Neighbor Embedding (t-SNE) and decision tree analysis. Our findings bring insights into the working patterns and impact on data volume movements, particularly during late-night hours and weekends.
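
A minimal, self-contained sketch (written for this page, not the authors' code) of the kind of analysis the abstract describes: embedding per-site daily traffic features with t-SNE and fitting a shallow decision tree to see which features separate days before and after the stay-at-home orders. The data below is synthetic and purely illustrative; the real study used ESnet operational measurements.

    # Illustrative only -- synthetic data, not ESnet measurements.
    import numpy as np
    from sklearn.manifold import TSNE
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    # Hypothetical daily feature vectors per site:
    # [bytes_in, bytes_out, flows, active_hours]
    pre_lockdown = rng.normal(loc=[5.0, 4.0, 3.0, 9.0], scale=0.5, size=(120, 4))
    post_lockdown = rng.normal(loc=[6.5, 3.0, 3.5, 14.0], scale=0.5, size=(120, 4))
    X = np.vstack([pre_lockdown, post_lockdown])
    y = np.array([0] * 120 + [1] * 120)  # 0 = before stay-at-home orders, 1 = after

    # Embed the daily vectors in 2-D to visualize clustering of "before" vs "after" days.
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    print("t-SNE embedding shape:", embedding.shape)

    # A shallow decision tree then indicates which features drive the separation.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print("feature importances:", tree.feature_importances_)
    print("training accuracy:", tree.score(X, y))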

Inder Monga, Chin Guok, John MacAuley, Alex Sim, Harvey Newman, Justas Balcas, Phil DeMar, Linda Winkler, Tom Lehman, Xi Yang, “Software-Defined Network for End-to-end Networked Science at the Exascale”, Future Generation Computer Systems, April 13, 2020,

Domain science applications and workflow processes are currently forced to view the network as an opaque infrastructure into which they inject data and hope that it emerges at the destination with an acceptable Quality of Experience. There is little ability for applications to interact with the network to exchange information, negotiate performance parameters, discover expected performance metrics, or receive status/troubleshooting information in real time. The work presented here is motivated by a vision for a new smart network and smart application ecosystem that will provide a more deterministic and interactive environment for domain science workflows. The Software-Defined Network for End-to-end Networked Science at Exascale (SENSE) system includes a model-based architecture, implementation, and deployment which enables automated end-to-end network service instantiation across administrative domains. An intent-based interface allows applications to express their high-level service requirements, while an intelligent orchestrator and resource control systems allow for custom tailoring of scalability and real-time responsiveness based on individual application and infrastructure operator requirements. This allows the science applications to manage the network as a first-class schedulable resource, as is the current practice for instruments, compute, and storage systems. Deployment and experiments on production networks and testbeds have validated SENSE functions and performance. Emulation-based testing verified the scalability needed to support research and education infrastructures. Key contributions of this work include an architecture definition, reference implementation, and deployment. This provides the basis for further innovation of smart network services to accelerate scientific discovery in the era of big data, cloud computing, machine learning and artificial intelligence.
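
To make the "intent" idea concrete, here is a hypothetical sketch (not the SENSE API or implementation) of the kind of translation an intent-based interface performs: a high-level request such as "move this data volume by a deadline" is converted into concrete provisioning parameters that an orchestrator could act on. All names, fields, and numbers are illustrative assumptions.

    # Hypothetical illustration of intent translation -- not the SENSE interface.
    from dataclasses import dataclass

    @dataclass
    class ServiceRequest:
        src: str
        dst: str
        gbps: float
        start: str
        end: str

    def translate_intent(intent: dict) -> ServiceRequest:
        """Turn a high-level 'move N TB by a deadline' intent into a bandwidth reservation."""
        volume_tb = intent["volume_tb"]
        hours = intent["deadline_hours"]
        # Required rate in Gb/s, with a fudge factor for protocol overhead and retries.
        gbps = (volume_tb * 8 * 1000) / (hours * 3600) * 1.2
        return ServiceRequest(intent["source"], intent["destination"],
                              round(gbps, 1), "now", f"+{hours}h")

    req = translate_intent({
        "source": "site-lhc-tier1",   # hypothetical endpoint names
        "destination": "hpc-center",
        "volume_tb": 500,
        "deadline_hours": 24,
    })
    print(req)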

Marco Ruffini, Kasandra Pillay, Chongjin Xie, Lei Shi, Dale Smith, Inder Monga, Xinsheng Wang, and Jun Shan Wey, “Connected OFCity Challenge: Addressing the Digital Divide in the Developing World”, Journal of Optical Communications and Networking, June 20, 2019, 11:354-361,

Jonathan B. Ajo-Franklin, Shan Dou, Nathaniel J. Lindsey, Inder Monga, Chris Tracy, Michelle Robertson, Veronica Rodriguez Tribaldos, Craig Ulrich, Barry Freifeld, Thomas Daley and Xiaoye Li, “Distributed Acoustic Sensing Using Dark Fiber for Near-Surface Characterization and Broadband Seismic Event Detection”, Nature, February 4, 2019,

Inder Monga, Prabhat, “Big-Data Science: Infrastructure Impact”, Proceedings of the Indian National Science Academy, June 15, 2018,

The nature of science is changing dramatically, from a single researcher at a university laboratory working with graduate students to distributed multi-researcher consortiums spanning universities and research labs that tackle large scientific problems. In addition, experimentalists and theorists are collaborating with each other by designing experiments to prove the proposed theories. The ‘Big Data’ being produced by these large experiments has to be verified against simulations run on High Performance Computing (HPC) resources.

The trends above are pointing towards

  1. Geographically dispersed experiments (and associated communities) that require data being moved across multiple sites. Appropriate mechanisms and tools need to be employed to move, store and archive datasets from such experiments.

  2. Convergence of simulation (requiring High Performance Computing) and Big Data Analytics (requiring advanced on-site data management techniques) into a small number of High Performance Computing centers. Such centers are key for consolidating software and hardware infrastructure efforts, and achieving broad impact across numerous scientific domains.

The trends indicate that for modern science and scientific discovery, infrastructure support for handling both large scientific data as well as high-performance computing is extremely important. In addition, given the distributed nature of research and big-team science, it is important to build infrastructure, both hardware and software, that enables sharing across institutions, researchers, students, industry and academia. This is the only way that a nation can maximize the research capabilities of its citizens while maximizing the use of its investments in compute, storage, network and experimental infrastructure.

This chapter introduces infrastructure requirements of High-Performance Computing and Networking with examples drawn from NERSC and ESnet, two large Department of Energy facilities at Lawrence Berkeley National Laboratory, CA, USA, that exemplify some of the qualities needed for future Research & Education infrastructure.

RK Shyamasundar, Prabhat Prabhat, Vipin Chaudhary, Ashwin Gumaste, Inder Monga, Vishwas Patil, Ankur Narang, “Computing for Science, Engineering and Society: Challenges, Requirements, and Strategic Roadmap”, Proceedings of the Indian National Science Academy, June 15, 2018,

Ilya Baldin, Tilman Wolf, Inder Monga, Tom Lehman, “The Future of CISE Distributed Research Infrastructure”, ACM SIGCOMM Computer Communication Review, May 1, 2018,

Shared research infrastructure that is globally distributed and widely accessible has been a hallmark of the networking community. We present a vision for a future mid-scale distributed research infrastructure aimed at enabling new types of discoveries. The “lessons learned” from constructing and operating the Global Environment for Network Innovations (GENI) infrastructure are the basis for our attempt to project future concepts and solutions. Our aim is to engage the community to contribute new ideas and to inform funding agencies about future research directions.

M Kiran, E Pouyoul, A Mercian, B Tierney, C Guok, I Monga, “Enabling intent to configure scientific networks for high performance demands”, Future Generation Computer Systems, August 2, 2017,

Kim Roberts, Qunbi Zhuge, Inder Monga, Sebastien Gareau, and Charles Laperle, “Beyond 100 Gb/s: Capacity, Flexibility, and Network Optimization”, Journal of Optical Communication Network, April 1, 2017, Volume 9,

In this paper, we discuss building blocks that enable the exploitation of optical capacities beyond 100 Gb/s. Optical networks will benefit from more flexibility and agility in their network elements, especially from coherent transceivers. To achieve capacities of 400 Gb/s and more, coherent transceivers will operate at higher symbol rates. This will be made possible with higher bandwidth components using new electro-optic technologies implemented with indium phosphide and silicon photonics. Digital signal processing will benefit from new algorithms. Multi-dimensional modulation, of which some formats are already in existence in current flexible coherent transceivers, will provide improved tolerance to noise and fiber nonlinearities. Constellation shaping will further improve these tolerances while allowing a finer granularity in the selection of capacity. Frequency-division multiplexing will also provide improved tolerance to the nonlinear characteristics of fibers. Algorithms with reduced computation complexity will allow the implementation, at speed, of direct pre-compensation of nonlinear propagation effects. Advancement in forward error correction will shrink the performance gap with Shannon’s limit. At the network control and management level, new tools are being developed to achieve a more efficient utilization of networks. This will also allow for network virtualization, orchestration, and management. Finally, FlexEthernet and FlexOTN will be put in place to allow network operators to optimize capacity in their optical transport networks without manual changes to the client hardware.

Ashwin Gumaste, Tamal Das, Kandarp Khandwala, and Inder Monga, “Network Hardware Virtualization for Application Provisioning in Core Networks”, IEEE Communications Magazine, February 1, 2017,

Service providers and vendors are moving toward a network virtualized core, whereby multiple applications would be treated on their own merit in programmable hardware. Such a network would have the advantage of being customized for user requirements and allow provisioning of next generation services that are built specifically to meet user needs. In this article, we articulate the impact of network virtualization on networks that provide customized services and how a provider’s business can grow with network virtualization. We outline a decision map that allows mapping of applications with technology that is supported in network-virtualization-oriented equipment. Analogies to the world of virtual machines and generic virtualization show that hardware supporting network virtualization will facilitate new customer needs while optimizing the provider network from the cost and performance perspectives. A key conclusion of the article is that growth would yield sizable revenue when providers plan ahead in terms of supporting network-virtualization-oriented technology in their networks. To be precise, providers have to incorporate into their growth plans network elements capable of new service deployments while protecting network neutrality. A simulation study validates our NV-induced model.

Peter Hinrich, P Grosso, Inder Monga, “Collaborative Research Using eScience Infrastructure and High Speed Networks”, Future Generation Computer Systems, April 2, 2015,

Neal Charbonneau, Vinod M. Vokkarane, Chin Guok, Inder Monga, “Advance Reservation Frameworks in Hybrid IP-WDM Networks”, IEEE Communications Magazine, May 9, 2011, 59, Issue:132-139,

Tom Lehman, Xi Yang, Nasir Ghani, Feng Gu, Chin Guok, Inder Monga, and Brian Tierney, “Multilayer Networks: An Architecture Framework”, IEEE Communications Magazine, May 9, 2011,

Inder Monga, Chin Guok, William E. Johnston, and Brian Tierney, “Hybrid Networks: Lessons Learned and Future Challenges Based on ESnet4 Experience”, IEEE Communications Magazine, May 1, 2011,

Conference Papers

Lloyd Brown, Emily Marx, Dev Bali, Emmanuel Amaro, Debnil Sur, Ezra Kissel, Inder Monga, Ethan Katz-Bassett, Arvind Krishnamurthy, James McCauley, Tejas Narechania, Aurojit Panda, Scott Shenker, “An Architecture For Edge Networking Services”, ACM SIGCOMM '24: Proceedings of the ACM SIGCOMM 2024 Conference, August 4, 2024, 645-660,

Inder Monga, Erhan Saglamyurek, Ezra Kissel, Hartmut Haffner, Wenji Wu, “QUANT-NET: A testbed for quantum networking research over deployed fiber”, SIGCOMM QuNet'23, ACM, September 10, 2023, QuNet'23:31-37,

Zhongfen Deng, Kesheng Wu, Alex Sim, Inder Monga, Chin Guok, et al, “Analyzing Transatlantic Network Traffic over Scientific Data Caches”, 6th ACM International Workshop on ​System and Network Telemetry and Analysis, July 31, 2023,

Large scientific collaborations often share huge volumes of data around the world. Consequently, a significant amount of network bandwidth is needed for data replication and data access. Users in the same region may share resources as well as data, especially when they are working on related topics with similar datasets. In this work, we study the network traffic patterns and resource utilization for scientific data caches connecting European networks to the US. We explore the efficiency of resource utilization, especially for network traffic which consists mostly of transatlantic data transfers, and the potential for having more caching node deployments. Our study shows that these data caches reduced network traffic volume by 97% during the study period. This demonstrates that such caching nodes are effective in reducing wide-area network traffic.

Caitlin Sim, Kesheng Wu, Alex Sim, Inder Monga, Chin Guok, et al, “Predicting Resource Utilization Trends with Southern California Petabyte Scale Cache”, 26th International Conference on Computing in High Energy & Nuclear Physics, May 8, 2023,

A large community of high-energy physicists shares data all around the world, making it necessary to ship a large number of files over wide-area networks. Regional disk caches such as the Southern California Petabyte Scale Cache have been deployed to reduce the data access latency. We observe that about 94% of the requested data volume was served from this cache, without remote transfers, between Sep. 2022 and July 2023. In this paper, we show the predictability of the resource utilization by exploring the trends of recent cache usage. The time series based prediction is made with a machine learning approach, and the prediction errors are small relative to the variation in the input data. This work helps in understanding the characteristics of the resource utilization and planning for additional deployments of caches in the future.

Caitlin Sim, Kesheng Wu, Alex Sim, Inder Monga, Chin Guok, et al, “Predicting Resource Usage Trends with Southern California Petabyte Scale Cache”, 26th International Conference on Computing in High Energy & Nuclear Physics, May 8, 2023,

Caitlin Sim, Kesheng Wu, Alex Sim, Inder Monga, Chin Guok, “Effectiveness and predictability of in-network storage cache for Scientific Workflows”, IEEE International Conference on Computing, Networking and Communication, February 20, 2023,

Large scientific collaborations often have multiple scientists accessing the same set of files while doing different analyses, which creates repeated accesses to the large amounts of shared data located far away. These data accesses have long latency due to distance and occupy the limited bandwidth available over the wide-area network. To reduce the wide-area network traffic and the data access latency, regional data storage caches have been installed as a new networking service. To study the effectiveness of such a cache system in scientific applications, we examine the Southern California Petabyte Scale Cache for a high-energy physics experiment. By examining about 3TB of operational logs, we show that this cache removed 67.6% of file requests from the wide-area network and reduced the traffic volume on the wide-area network by 12.3TB (or 35.4%) on an average day. The reduction in the traffic volume (35.4%) is less than the reduction in file counts (67.6%) because the larger files are less likely to be reused. Due to this difference in data access patterns, the cache system has implemented a policy to avoid evicting smaller files when processing larger files. We also build a machine learning model to study the predictability of the cache behavior. Tests show that this model is able to accurately predict the cache accesses, cache misses, and network throughput, making the model useful for future studies on resource provisioning and planning.
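
The gap between the request-level reduction (67.6%) and the byte-level reduction (35.4%) follows from large files being reused less often than small ones. A tiny worked example (synthetic numbers, not the actual cache logs) shows how the two hit rates can diverge:

    # Synthetic illustration of request hit rate vs. byte hit rate -- not the real logs.
    records = [
        # (file_size_GB, served_from_cache)
        (0.5, True), (0.5, True), (0.5, True), (0.5, False),
        (50.0, False), (50.0, True), (50.0, False),
    ]

    total_reqs = len(records)
    cached_reqs = sum(1 for _, hit in records if hit)
    total_bytes = sum(size for size, _ in records)
    cached_bytes = sum(size for size, hit in records if hit)

    print(f"request hit rate: {cached_reqs / total_reqs:.1%}")   # high: small files are reused often
    print(f"byte hit rate:    {cached_bytes / total_bytes:.1%}") # lower: large files mostly miss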


Wenji Wu, Liang Zhang, Qiming Lu, Phil DeMar, Robert Illingworth, Joe Mambretti, Se-Young Yu, Jim Hao Chen, Inder Monga, Xi Yang, Tom Lehman, Chin Guok, John MacAuley, “ROBIN (RuciO/BIgData Express/SENSE) A Next-Generation High-Performance Data Service Platform”, 2020 IEEE/ACM Innovating the Network for Data-Intensive Science (INDIS), IEEE/ACM, December 31, 2020,


Verónica Rodríguez Tribaldos, Shan Dou, Nate Lindsey, Inder Monga, Chris Tracy, Jonathan Blair Ajo-Franklin, “Monitoring Aquifers Using Relative Seismic Velocity Changes Recorded with Fiber-optic DAS”, AGU Meeting, December 10, 2019,

Qiming Lu, Liang Zhang, Sajith Sasidharan, Wenji Wu, Phil DeMar, Chin Guok, John Macauley, Inder Monga, Se-Young Yu, Jim Hao Chen, Joe Mambretti, Jin Kim, Seo-Young Noh, Xi Yang, Tom Lehman, Gary Liu, “BigData Express: Toward Schedulable, Predictable, and High-Performance Data Transfer”, 2018 IEEE/ACM Innovating the Network for Data-Intensive Science (INDIS), IEEE/ACM, February 24, 2019,

Inder Monga, Chin Guok, John Macauley, Alex Sim, Harvey Newman, Justas Balcas, Phil DeMar, Linda Winkler, Xi Yang, Tom Lehman, “SDN for End-to-end Networked Science at the Exascale (SENSE)”, INDIS Workshop SC18, November 11, 2018,

The Software-defined network for End-to-end Networked Science at Exascale (SENSE) research project is building smart network services to accelerate scientific discovery in the era of ‘big data’ driven by Exascale, cloud computing, machine learning and AI. The project’s architecture, models, and demonstrated prototype define the mechanisms needed to dynamically build end-to-end virtual guaranteed networks across administrative domains, with no manual intervention. In addition, a highly intuitive ‘intent’ based interface, as defined by the project, allows applications to express their high-level service requirements, and an intelligent, scalable model-based software orchestrator converts that intent into appropriate network services, configured across multiple types of devices. The significance of these capabilities is the ability for science applications to manage the network as a first-class schedulable resource akin to instruments, compute, and storage, to enable well-defined and highly tuned complex workflows that require close coupling of resources spread across a vast geographic footprint such as those used in science domains like high-energy physics and basic energy sciences.


A Mercian, M Kiran, E Pouyoul, B Tierney, I Monga, “INDIRA:‘Application Intent’ network assistant to configure SDN-based high performance scientific networks”, Optical Fiber Communication Conference, July 1, 2017,

M Kiran, E Pouyoul, A Mercian, B Tierney, C Guok, I Monga, “Enabling Intent to Configure Scientific Networks for High Performance Demands”, 3rd International Workshop on Innovating the Network for Data Intensive Science (INDIS) 2016, SC16, November 10, 2016,

Mariam Kiran, Peter Murphy, Inder Monga, Jon Dugan, Sartaj Baveja, “Lambda Architecture for Cost-effective Batch and Speed Big Data processing”, First Workshop on Data-Centric Infrastructure for Big Data Science (DIBS), October 29, 2015,

This paper presents an implementation of the lambda architecture design pattern to construct a data-handling backend on Amazon EC2 that provides high-throughput services for dense and intense data demands while minimizing the cost of network maintenance. This paper combines ideas from database management, cost models, query management and cloud computing to present a general architecture that could be applied in any given scenario where affordable online data processing of Big Datasets is needed. The results are presented with a case study of processing router sensor data from the current ESnet network as a working example of the approach. The results showcase a reduction in cost and argue the benefits of performing online analysis and anomaly detection for sensor data.
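
As a rough sketch of the lambda-architecture pattern the paper applies (hypothetical code, not the paper's EC2 implementation): a batch layer periodically recomputes views over the full archive, a speed layer maintains cheap incremental views over recent data, and a query merges the two.

    # Hypothetical illustration of the lambda architecture -- not the paper's system.
    from collections import Counter

    historical_events = [("router1", 10), ("router2", 7), ("router1", 3)]   # archived sensor data
    recent_events = [("router1", 2), ("router3", 5)]                        # not yet in the batch view

    def batch_view(events):
        """Batch layer: in a real system this view is precomputed periodically over all archived data."""
        view = Counter()
        for router, value in events:
            view[router] += value
        return view

    def speed_view(events):
        """Speed layer: cheap incremental view over data arriving since the last batch run."""
        view = Counter()
        for router, value in events:
            view[router] += value
        return view

    def query(router):
        """Serving layer: merge batch and speed views at query time."""
        return batch_view(historical_events)[router] + speed_view(recent_events)[router]

    print(query("router1"))   # 15 = 13 from the batch view + 2 from the speed layer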

Adrian Lara, Byrav Ramamurthy, Eric Pouyoul, Inder Monga, “WAN Virtualization and Dynamic End-to-End Bandwidth Provisioning Using SDN”, Optical Fiber Communication Conference 2015, March 22, 2015,

We evaluate a WAN-virtualization framework in terms of delay and scalability and demonstrate that adding a virtual layer between the physical topology and the end-user brings significant advantages and tolerable delays.

Karel van der Veldt, Inder Monga, Jon Dugan, Cees de Laat, Paola Grosso, “Carbon-aware path provisioning for NRENs”, International Green Computing Conference, November 3, 2014,


National Research and Education Networks (NRENs) are becoming keener to provide information on the energy consumption of their equipment. However, only a few NRENs are trying to use the available information to reduce power consumption and/or carbon footprint. We set out to study the impact that deploying energy-aware networking devices may have in terms of CO2 emissions, taking the ESnet network as a use case. We defined a model that can be used to select paths that lead to a lower impact on the CO2 footprint of the network. We implemented a simulation of the ESnet network using our model to investigate the CO2 footprint under different traffic conditions. Our results suggest that NRENs such as ESnet could reduce their network’s environmental impact if they deployed energy-aware hardware combined with path setup tailored to reducing the carbon footprint. This could be achieved by modification of the current path provisioning systems used in the NREN community.
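
A toy sketch of the underlying idea (illustrative only, not the paper's model): annotate each link with an estimated CO2 cost and let the path computation minimize carbon footprint rather than hop count. The topology and per-link numbers below are made up.

    # Illustrative carbon-aware path selection -- hypothetical topology and CO2 values.
    import networkx as nx

    g = nx.Graph()
    # (u, v, grams of CO2 per GB transferred over this link -- made-up numbers)
    g.add_edge("SITE-A", "HUB-1", co2=12.0)
    g.add_edge("HUB-1", "SITE-B", co2=15.0)
    g.add_edge("SITE-A", "HUB-2", co2=5.0)    # hub powered partly by renewables
    g.add_edge("HUB-2", "HUB-3", co2=4.0)
    g.add_edge("HUB-3", "SITE-B", co2=6.0)

    hop_path = nx.shortest_path(g, "SITE-A", "SITE-B")                    # fewest hops
    carbon_path = nx.shortest_path(g, "SITE-A", "SITE-B", weight="co2")   # lowest estimated CO2

    print("fewest hops:", hop_path)      # ['SITE-A', 'HUB-1', 'SITE-B']
    print("lowest CO2:", carbon_path)    # ['SITE-A', 'HUB-2', 'HUB-3', 'SITE-B']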


Henrique Rodriguez, Inder Monga, Abhinava Sadasivarao, Sharfuddin Sayed, Chin Guok, Eric Pouyoul, Chris Liou, Tajana Rosing, “Traffic Optimization in Multi-Layered WANs using SDN”, 22nd Annual Symposium on High-Performance Interconnects, Best Student Paper Award, August 27, 2014,

Wide area networks (WAN) forward traffic through a mix of packet and optical data planes, composed of a variety of devices from different vendors. Multiple forwarding technologies and encapsulation methods are used for each data plane (e.g. IP, MPLS, ATM, SONET, Wavelength Switching). Despite the standards that have been defined, the control planes of these devices are usually not interoperable, and different technologies are used to manage each forwarding segment independently (e.g. OpenFlow, TL-1, GMPLS). The result is a lack of coordination between layers and inefficient resource usage. In this paper we discuss the design and implementation of a system that uses unmodified OpenFlow to optimize network utilization across layers, enabling practical bandwidth virtualization. We discuss strategies for scalable traffic monitoring and for minimizing losses on route updates across layers. We explore two use cases that benefit from multi-layer bandwidth on demand provisioning. A prototype of the system was built using a traditional circuit reservation application and an unmodified SDN controller, and its evaluation was performed on a multi-vendor testbed.

http://blog.infinera.com/2014/09/05/henrique-rodrigues-wins-best-student-paper-at-ieee-hot-interconnects-for-infinerabrocadeesnet-multi-layer-sdn-demo/

http://esnetupdates.wordpress.com/2014/09/05/esnet-student-assistant-henrique-rodrigues-wins-best-student-paper-award-at-hot-interconnects/


Malathi Veeraraghavan, Inder Monga, “Broadening the scope of optical circuit networks”, International Conference On Optical Network Design and Modeling, Stockholm, Sweden, May 22, 2014,


Advances in optical communications and switching technologies are enabling energy-efficient, flexible, higher-utilization network operations. To take full advantage of these capabilities, the scope of optical circuit networks can be increased in both the vertical and horizontal directions. In the vertical direction, some of the existing Internet applications, transport-layer protocols, and application programming interfaces need to be redesigned and new ones invented to leverage the high-bandwidth, low-latency capabilities of optical circuit networks. In the horizontal direction, inter-domain control and management protocols are required to create a global-scale interconnection of optical circuit-switched networks.


Abhinava Sadasivarao, Sharfuddin Syed, Chris Liou, Ping Pan, Andrew Lake, Chin Guok, Inder Monga, “Open Transport Switch - A Software Defined Networking Architecture for Transport Networks”, August 17, 2013,


There have been a lot of proposals to unify the control and management of packet and circuit networks but none have been deployed widely. In this paper, we propose a simple programmable architecture that abstracts a core transport node into a programmable virtual switch, that meshes well with the software-defined network paradigm while leveraging the OpenFlow protocol for control. A demonstration use-case of an OpenFlow-enabled optical virtual switch implementation managing a small optical transport network for big-data applications is described. With appropriate extensions to OpenFlow, we discuss how the programmability and flexibility SDN brings to packet-optical backbone networks will be substantial in solving some of the complex multi-vendor, multi-layer, multi-domain issues service providers face today.


Baris Aksanli, Jagannathan Venkatesh, Tajana Rosing, Inder Monga, “A Comprehensive Approach to Reduce the Energy Cost of Network of Datacenters”, International Symposium on Computers and Communications, Best Student Paper award, July 7, 2013,

Best Student Paper

Several studies have proposed job migration over the wide area network (WAN) to reduce the energy of networks of datacenters by taking advantage of different electricity prices and load demands. Each study focuses on only a small subset of network parameters and thus their results may have large errors. For example,  datacenters usually have long-term power contracts instead of paying market prices. However, previous work neglects these contracts, thus overestimating the energy savings by 2.3x. We present a comprehensive approach to minimize the energy cost of networks of datacenters by modeling performance of the workloads, power contracts, local renewable energy sources, different routing options for WAN and future router technologies. Our method can reduce the energy cost of datacenters by up to 28%, while reducing the error in the energy cost estimation by 2.6x.

Inder Monga, Eric Pouyoul, Chin Guok, “Software Defined Networking for big-data science (paper)”, SuperComputing 2012, November 11, 2012,


University campuses, Supercomputer centers and R&E networks are challenged to architect, build and support IT infrastructure to deal effectively with the data deluge facing most science disciplines. Hybrid network architecture, multi-domain bandwidth reservations, performance monitoring and GLIF Open Lightpath Exchanges (GOLE) are examples of network architectures that have been proposed, championed and implemented successfully to meet the needs of science. Most recently, Science DMZ, a campus design pattern that bypasses traditional performance hotspots in typical campus network implementation, has been gaining momentum. In this paper and corresponding demonstration, we build upon the SC11 SCinet Research Sandbox demonstrator with Software-Defined networking to explore new architectural approaches. A virtual switch network abstraction is explored, that when combined with software-defined networking concepts provides the science users a simple, adaptable network framework to meet their upcoming application requirements. 


Jon Dugan, Gopal Vaswani, Gregory Bell, Inder Monga, “The MyESnet Portal: Making the Network Visible”, TERENA 2012 Conference, May 22, 2012,


ESnet provides a platform for moving large data sets and accelerating worldwide scientific collaboration. It provides high-bandwidth, reliable connections that link scientists at national laboratories, universities and other research institutions, enabling them to collaborate on some of the world's most important scientific challenges including renewable energy sources, climate science, and the origins of the universe.

ESnet has embarked on a major project to provide substantial visibility into the inner-workings of the network by aggregating diverse data sources, exposing them via web services, and visualizing them with user-centered interfaces. The portal’s strategy is driven by understanding the needs and requirements of ESnet’s user community and carefully providing interfaces to the data to meet those needs. The 'MyESnet Portal' allows users to monitor, troubleshoot, and understand the real time operations of the network and its associated services.

This paper will describe the MyESnet portal and the process of developing it. The data for the portal comes from a wide variety of sources: homegrown systems, commercial products, and even peer networks. Some visualizations from the portal are presented, highlighting interesting and unusual cases such as power consumption and flow data. Developing effective user interfaces is an iterative process. When a new feature is released, users are both interviewed and observed using the site. This process yielded valuable insights into what is important to the users and what other features and services they may want. Open source tools were used to build the portal, and the pros and cons of these tools are discussed.


Baris Aksanli, Tajana Rosing, Inder Monga, “Benefits of Green Energy and Proportionality in High Speed Wide Area Networks Connecting Data Centers”, Design, Automation and Test in Europe (DATE), March 5, 2012,

Many companies deploy multiple data centers across the globe to satisfy the dramatically increased computational demand. Wide area connectivity between such geographically distributed data centers plays an important role in ensuring quality of service and, as bandwidths increase to 100Gbps and beyond, serves as an efficient way to dynamically distribute the computation. The energy cost of data transmission is dominated by the router power consumption, which is unfortunately not energy proportional. In this paper we not only quantify the performance benefits of leveraging the network to run more jobs, but also analyze its energy impact. We compare the benefits of redesigning routers to be more energy efficient to those obtained by leveraging locally available green energy as a complement to the brown energy supply. Furthermore, we design novel green energy aware routing policies for wide area traffic and compare them to a state-of-the-art shortest path routing algorithm. Our results indicate that using energy proportional routers powered in part by green energy along with our new routing algorithm results in a 10x improvement in per router energy efficiency with a 36% average increase in the number of jobs completed.


Book Chapters

Baris Aksanli, Jagannath Venkatesh, Inder Monga, Tajana Rosing, “Renewable Energy Prediction for Improved Utilization and Efficiency in Data Centers and Backbone Networks”, (May 30, 2016)

The book at hand gives an overview of the state of the art research in Computational Sustainability as well as case studies of different application scenarios. This covers topics such as renewable energy supply, energy storage and e-mobility, efficiency in data centers and networks, sustainable food and water supply, sustainable health, industrial production and quality, etc. The book describes computational methods and possible application scenarios.

Presentation/Talks

Inder Monga, FABRIC: integration of bits, bytes, and xPUs, JET meeting, March 17, 2020,

Presenting NSF-funded FABRIC project to the JET community

Inder Monga, Chin Guok, SDN for End-to-End Networking at Exascale, February 16, 2016,

Traditionally, WAN and campus networks and services have evolved independently from each other. For example, MPLS traffic engineering and VPN technologies have been targeted towards the WAN, while the LAN (or last mile) implementations have not incorporated that functionality. These restrictions have resulted in dissonance between services offered in the WAN vs. the LAN. While OSCARS/NSI virtual circuits are widely deployed in the WAN, they typically only run from site boundary to site boundary, and require painful phone calls, manual configuration, and resource allocation decisions for last mile extension. Such inconsistencies in campus infrastructures, all the way from the campus edge to the data-transfer hosts, often lead to unpredictable application performance. New architectures such as the Science DMZ have been successful in reducing this variance, but the Science DMZ is not designed or able to solve the end-to-end orchestration problem. With the advent of SDN, the R&E community has an opportunity to genuinely orchestrate end-to-end services - and not just from a network perspective, but also from an end-host perspective. In addition, with SDN, the opportunity exists to create a broader set of custom intelligent services that are targeted towards specific science application use-cases. This proposal describes an advanced deployment of SDN equipment and the creation of a comprehensive SDN software platform that will help bring together the missing end-to-end story.

Inder Monga, Plenary Keynote - "Design Patterns: Scaling up eResearch", Web Site, February 9, 2016,

Inder Monga, Network Operating Systems and Intent APIs for SDN Applications, Technology Exchange Conference, October 6, 2015,

Philosophy of Network Operating Systems and Intent APIs

Inder Monga, ICN roadmaps for the next 2 years, 2nd ACM Conference on Information-Centric Networking (ICN 2015), October 1, 2015,

Panelists: Paul Mankiewich (Cisco), Luca Muscariello (Orange), Inder Monga (ESnet), Ignacio Solis (PARC), GQ Wang(Huawei), Jeff Burke (UCLA)

Inder Monga, Science Data and the NDN paradigm, NDN Community Meeting (NDNcomm 2015): Architecture, Applications, and Collaboration, September 28, 2015,

Abhinava Sadasivarao, Sharfuddin Syed, Ping Pan, Chris Liou, Andy Lake, Chin Guok, Inder Monga, Open Transport Switch: A Software Defined Networking Architecture for Transport Networks, Workshop, August 16, 2013,

Presentation at HotSDN Workshop as part of SIGCOMM 2013

Inder Monga, Network Abstractions: The first step towards a programmable WAN, TIP 2013, January 14, 2013,

University campuses, Supercomputer centers and R&E networks are challenged to architect, build and support IT infrastructure to deal effectively with the data deluge facing most science disciplines. Hybrid network architecture, multi-domain bandwidth reservations, performance monitoring and GLIF Open Lightpath Exchanges (GOLE) are examples of network architectures that have been proposed, championed and implemented successfully to meet the needs of science. This talk explores a new "one virtual switch" abstraction, leveraging software-defined networking and OpenFlow concepts, that provides the science users a simple, adaptable network framework to meet their future application requirements. The talk will include the high-level design, covering the use of OpenFlow and OSCARS, as well as implementation details from the demonstration planned for SuperComputing.

Inder Monga, Introduction to Bandwidth on Demand to LHCONE, LHCONE Point-to-point Service Workshop, December 13, 2012,

Introducing Bandwidth on Demand concepts to the application community of CMS and ATLAS experiments.

Inder Monga, Software Defined Networking for big-data science, Worldwide LHC Grid meeting, December 2012,

Inder Monga, Eric Pouyoul, Chin Guok, Software Defined Networking for big-data science, SuperComputing 2012, November 15, 2012,


The emerging era of “Big Science” demands the highest possible network performance. End-to-end circuit automation and workflow-driven customization are two essential capabilities needed for networks to scale to meet this challenge. This demonstration showcases how combining software-defined networking techniques with virtual circuits capabilities can transform the network into a dynamic, customer-configurable virtual switch. In doing so, users are able to rapidly customize network capabilities to meet their unique workflows with little to no configuration effort. The demo also highlights how the network can be automated to support multiple collaborations in parallel.


Inder Monga, Programmable Information Highway, November 11, 2012,


Suggested Panel Questions:

- What do you envision will have a dramatic impact on the future of networking and data management? What research challenges do you expect in achieving your vision?

- Do we need to re-engineer existing tools and middleware software? Elaborate on network management middleware in terms of virtual circuits, performance monitoring, and diagnosis tools.

- How do current applications match increasing data sizes and enhancements in network infrastructure? Please list a few network-aware applications. What is the scope of networking in the application domain?

- Resource management and scheduling problems are gaining importance due to current developments in utility computing and high interest in Cloud infrastructure. Explain your vision. What sort of algorithms/mechanisms will practically be used in the future?

- What are the main issues in designing/modelling cutting edge dynamic networks for large-scale data processing? What sort of performance problems do you expect?

- What necessary steps do we need to take to benefit from next-generation high-bandwidth networks? Do you think there will be radical changes such as novel APIs or new network stacks?


I. Monga, E. Pouyoul, C. Guok, Software-Defined Networking for Big-Data Science – Architectural Models from Campus to the WAN, SC12: IEEE HPC, November 2012,

Inder Monga, Software-defined networking (SDN) and OpenFlow: Hot topics in networking, Masters Class at CMU, NASA Ames, October 2012,

Paola Grosso, Inder Monga, Cees de Laat, GreenSONAR, GLIF, October 12, 2012,

Inder Monga, Bill St. Arnaud, Erik-Jan Bos, Defining GLIF Architecture Task Force, GLIF, October 11, 2012,

12th Annual LambdaGrid Workshop in Chicago

Inder Monga, Network Service Interface: Concepts and Architecture, I2 Fall Member Meeting, September 2012,

Inder Monga, Architecting and Operating Energy-Efficient Networks, September 10, 2012,

The presentation outlines the network energy efficiency challenges, the growth of network traffic and the simulation use-case to build next-generation energy-efficient network designs.

Inder Monga, Eric Pouyoul, Chin Guok, Eli Dart, SDN for Science Networks, Summer Joint Techs 2012, July 17, 2012,

Inder Monga, Marching Towards …a Net-Zero Network, WIN2012 Conference, July 2012,

Inder Monga, A Data-Intensive Network Substrate for eResearch, eScience Workshop, July 2012,

Inder Monga, Energy Efficiency starts with measurement, Greentouch Meeting, June 2012,

Inder Monga, ESnet Update: Networks and Research, JGNx and NTT, June 2012,

C. Guok, I. Monga, IDCP and NSI: Lessons Learned, Deployments and Gap Analysis, OGF 34, March 2012,

Inder Monga, Enabling Science at 100G, ON*Vector Conference, February 2012,

Inder Monga, John MacAuley, GLIF NSI Implementation Task Force Presentation, Winter GLIF Tech Meeting at Baton Rouge, LA, January 26, 2012,

Chaitanya S. K. Vadrevu, Massimo Tornatore, Chin P. Guok, Inder Monga, A Heuristic for Combined Protection of IP Services and Wavelength Services in Optical WDM Networks, IEEE ANTS 2010, December 2010,

C. Guok, I. Monga, Composable Network Service Framework, ESCC, February 2010,

Reports

Inder Monga, Chin Guok, Arjun Shankar, “Federated IRI Science Testbed (FIRST): A Concept Note”, DOE Office of Science, December 7, 2023, LBNL LBNL-2001553

The Department of Energy’s (DOE’s) vision for an Integrated Research Infrastructure (IRI) is to empower researchers to smoothly and securely meld the DOE’s world-class user facilities and research infrastructure in novel ways in order to radically accelerate discovery and innovation. Performant IRI arises through the continuous interoperability of research workflows with compute, storage, and networking infrastructure, fulfilling researchers’ quests to gain insight from observational and experimental data. Decades of successful research, pilot projects, and demonstrations point to the extraordinary promise of IRI but also indicate the intertwined technological, policy, and sociological hurdles it presents. Creating, developing, and stewarding the conditions for seamless interoperability of DOE research infrastructure, with clear value propositions to stakeholders to opt into an IRI ecosystem, will be the next big step.

The IRI testbed will tie together experimental and observational instruments, ASCR compute facilities for large-scale analysis, and edge computing for data reduction and filtering using the Energy Sciences Network (ESnet), the high-performance network and DOE user facility. The testbed will provide pre-production capabilities that are beyond a demonstration of technology.

Governance, funding, and resource allocation are beyond the scope of this document: it seeks to provide a high-level view of potential benefits, focus areas, and the working groups whose formation would further define the testbed’s design, activities, and goals.


Eli Dart, Jason Zurawski, Carol Hawk, Benjamin Brown, Inder Monga, “ESnet Requirements Review Program Through the IRI Lens”, LBNL, October 16, 2023, LBNL 2001552

The Department of Energy (DOE) ensures America’s security and prosperity by addressing its energy, environmental, and nuclear challenges through transformative science and technology solutions. The DOE’s Office of Science (SC) delivers groundbreaking scientific discoveries and major scientific tools that transform our understanding of nature and advance the energy, economic, and national security of the United States. The SC’s programs advance DOE mission science across a wide range of disciplines and have developed the research infrastructure needed to remain at the forefront of scientific discovery.

The DOE SC’s world-class research infrastructure — exemplified by the 28 SC scientific user facilities — provides the research community with premier observational, experimental, computational, and network capabilities. Each user facility is designed to provide unique capabilities to advance core DOE mission science for its sponsor SC program and to stimulate a rich discovery and innovation ecosystem.

Research communities gather and flourish around each user facility, bringing together diverse perspectives. A hallmark of many facilities is the large population of students, postdoctoral researchers, and early-career scientists who contribute as full-fledged users. These facility staff and users collaborate over years to devise new approaches to utilizing the user facility’s core capabilities. The history of the SC user facilities has many examples of wildly inventive researchers challenging operational orthodoxy to pioneer new vistas of discovery; for example, the use of the synchrotron X-ray light sources for study of proteins and other large biological molecules. This continual reinvention of the practice of science — as users and staff forge novel approaches expressed in research workflows — unlocks new discoveries and propels scientific progress.

Within this research ecosystem, the high performance computing (HPC) and networking user facilities stewarded by SC’s Advanced Scientific Computing Research (ASCR) program play a dynamic cross-cutting role, enabling complex workflows demanding high performance data, networking, and computing solutions. The DOE SC’s three HPC user facilities and the Energy Sciences Network (ESnet) high-performance research network serve all of the SC’s programs as well as the global research community. Argonne Leadership Computing Facility (ALCF), the National Energy Research Scientific Computing Center (NERSC), and Oak Ridge Leadership Computing Facility (OLCF) conceive, build, and provide access to a range of supercomputing, advanced computing, and large-scale data-infrastructure platforms, while ESnet interconnects DOE SC research infrastructure and enables seamless exchange of scientific data. All four facilities operate testbeds to expand the frontiers of computing and networking research. Together, the ASCR facilities enterprise seeks to understand and meet the needs and requirements across SC and DOE domain science programs and priority efforts, highlighted by the formal requirements reviews (RRs) methodology.

In recent years, the research communities around the SC user facilities have begun experimenting with and demanding solutions integrated with HPC and data infrastructure. This rise of integrated-science approaches is documented in many community and high-level government reports. At the dawn of the era of exascale science and the acceleration of artificial intelligence (AI) innovation, there is a broad need for integrated computational, data, and networking solutions.

In response to these drivers, DOE has developed a vision for an Integrated Research Infrastructure (IRI): To empower researchers to meld DOE’s world-class research tools, infrastructure, and user facilities seamlessly and securely in novel ways to radically accelerate discovery and innovation.

The IRI vision is fundamentally about establishing new data-management and computational paradigms within which DOE SC user facilities and their research communities work together to improve existing capabilities and create new possibilities by building bridges across traditional silos. Implementation of IRI solutions will give researchers simple and powerful tools with which to implement multi-facility research data workflows.

In 2022, SC leadership directed the Advanced Scientific Computing Research (ASCR) program to conduct the Integrated Research Infrastructure Architecture Blueprint Activity (IRI ABA) to produce a reference framework to inform a coordinated, SC-wide strategy for IRI. This activity convened the SC science programs and more than 150 DOE national laboratory experts from all 28 SC user facilities across 13 national laboratories to consider the technological, policy, and sociological challenges to implementing IRI.

Through a series of cross-cutting sprint exercises facilitated by the IRI ABA leadership group and peer facilitators, participants produced an IRI Framework based on the IRI Vision and comprising:

  • IRI Science Patterns spanning DOE science domains;
  • IRI Practice Areas needed for implementation;
  • IRI blueprints that connect Patterns and Practice Areas;
  • Overarching principles for realizing the DOE-wide IRI ecosystem.

The resulting IRI framework and blueprints provide the conceptual foundations to move forward with organized, coordinated DOE implementation efforts. The next step is to identify urgencies and ripe areas for focused efforts that uplift multiple communities.

Upon completion of the IRI ABA framework, ESnet applied the IRI Science Patterns lens and undertook a meta-analysis of ESnet’s Requirements Reviews (RRs), the core strategic planning documents that animate the multiyear partnerships between ESnet and five of the DOE SC programs. Between 2019 and 2023, ESnet completed a new round of RRs with the following SC programs: Nuclear Physics (2019-20), High Energy Physics (2020-21), Fusion Energy Sciences (2021-22), Basic Energy Sciences (2021-22), and Biological and Environmental Research (2022-23). Together these ESnet RRs provide a rich trove of insights into opportunities for immediate IRI progress and investment.

Our meta-analysis of 74 high-priority case studies reveals that:

  • There are a significant number of research workflows spanning materials science, fusion energy, nuclear physics, and biological science that have a similar structure. Creation of common software components to improve these workflows’ performance and scalability will benefit researchers in all of these areas.
  • There is broad opportunity to accelerate scientific productivity and scientific output across DOE facilities by integrating them with each other and with high performance computing and networking.
  • The ESnet RRs’ blending of retrospective and prospective insight affirms that the IRI patterns are persistent across time and likely to persist into the future, offering value as a basis for analysis and strategic planning going forward.


Nicholas A Peters, Warren P Grice, Prem Kumar, Thomas Chapuran, Saikat Guha, Scott Hamilton, Inder Monga, Raymond Newell, Andrei Nomerotski, Don Towsley, Ben Yoo, “Quantum Networks for Open Science (QNOS) Workshop”, DOE Technical Report, April 1, 2019,

Others

Inder Monga, Liang Zhang, Yufeng Xin, Designing Quantum Routers for Quantum Internet, ASCR Basic Research Needs in Quantum Computing and Networking Workshop, July 11, 2023,

Yufeng Xin, Inder Monga, Liang Zhang, Hybrid Quantum Networks: Modeling and Optimization, ASCR Basic Research Needs in Quantum Computing and Networking Workshop, July 11, 2023,

ANL – Linda Winkler, Kate Keahey, Caltech – Harvey Newman, Ramiro Voicu, FNAL – Phil DeMar, LBNL/ESnet – Chin Guok, John MacAuley, LBNL/NERSC – Jason Hick, UMD/MAX – Tom Lehman, Xi Yang, Alberto Jimenez, SENSE: SDN for End-to-end Networked Science at the Exascale, August 1, 2015,

Funded Project from DOE