
Making data democratic | @theU

This story originally appeared on the engineering school website.

The National Science Foundation (NSF) has awarded $5.6 million to a team of researchers led by School of Computing professor Valerio Pascucci, who is also director of the Center for Extreme Data Management, Analysis and Visualization, to build the infrastructure needed to connect large-scale experimental and computing facilities and broaden participation in data-driven science.

With the pilot project, called the National Science Data Fabric (NSDF), the team will deploy the first infrastructure capable of bridging the gap between the sources of massive scientific data – including state-of-the-art physics labs generating mountains of data – and the computing resources capable of processing those results: high-speed network connectivity via Internet2 and a wide range of high-performance computing facilities and commercial cloud resources across the country.

The figure shows the sites involved in the proposed NSDF test bed, including the top five development sites and three Minority-Serving Institutions (MSIs), each with its own campus computing environment, along with the Texas Advanced Computing Center (TACC) and the Massachusetts Green High Performance Computing Center (MGHPCC). Data sources include the Cornell High Energy Synchrotron Source (CHESS), the IceCube facility, and the XENON dark matter experiment. The sites are connected by a high-speed backbone network provided by Internet2 and interact with OSG StashCaches and other resources.

“By democratizing access to data for large-scale scientific research, NSDF will reduce the cost of entry for scientists who wish to participate in cutting-edge research,” said Pascucci. “Piloting this innovative cyberinfrastructure component will connect a much larger community of researchers to massive data sources that only a few can manage today.”

Pascucci says that technological advancements that benefit society will require a cyberinfrastructure that allows high-speed access to data generated by large experimental facilities across the world. He points to several data-generation centers that will benefit from the project: the IceCube neutrino detector at the South Pole, the XENONnT dark matter detector in Italy, and the Cornell High Energy Synchrotron Source (CHESS).

Students will be able to work with data streamed directly from high-energy synchrotron sources, he added, and institutions will be able to innovate in education, workforce training, and research. Pascucci predicts that scientific discovery will accelerate once the infrastructure is in place.

Pascucci’s team includes Co-Principal Investigators (PIs) Frank Würthwein, acting director of the San Diego Supercomputer Center; Michela Taufer, professor at the University of Tennessee, Knoxville; Alex Szalay, professor of physics, astronomy, and computer science at Johns Hopkins University; and John Allison, professor of materials science and engineering at the University of Michigan, Ann Arbor. The team will partner with NSF-funded efforts, such as FABRIC and the Open Science Grid (OSG), and industry partners, including IBM Cloud and Google Cloud.

“The National Science Data Fabric is an effort to transform the end-to-end data management lifecycle with advances in storage and network connections; deployment of scalable tools for data processing, analysis, and visualization; and support for security and reliability,” said Würthwein.


