Inaugural Workshop Showcases Use of Commercial Cloud for Research

Published September 7, 2021

Shava Smallen (SDSC) demonstrates the account management dashboard of the CloudBank portal during the opening session of RRoCCET21. Credit: UC San Diego

By Kimberly Mann Bruch

Organized by the San Diego Supercomputer Center (SDSC), in association with the University of Washington (UW), UC Berkeley, UC San Diego and the West Big Data Innovation Hub, the inaugural Research Running on Cloud Compute and Emerging Technologies workshop (RRoCCET21) recently brought together participants from universities, commercial cloud providers, federal agencies and the private sector.

RRoCCET21 was intended for researchers seeking to gain access to the expanded research capabilities provided by CloudBank, an NSF-funded cloud access initiative. The methods-focused event featured solution paths including tractable implementation plans, which were informed by case studies ranging from wildfire maps to COVID-19 and social media information.

“Our goal was to inspire the community to use the cloud for research and education, and to show them how,” said Ed Lazowska, CloudBank co-principal investigator and RRoCCET21 co-organizer, who is a professor in UW’s Paul G. Allen School of Computer Science and Engineering. “Presenters shared how cloud adoption in their domain enabled them to move their research computing ahead – providing a glimpse of what is possible with cloud computing.”

The workshop started with an introduction to CloudBank by Rob Fatland of UW and included a CloudBank portal demonstration by co-principal investigator Shava Smallen of SDSC, an overview of CloudBank training and user support by Naomi Alterman of UW and an overview of running classes in the cloud using the Berkeley Data Stack by Eric Van Dusen of UC Berkeley. Participants gained an understanding of how CloudBank can ease many of the pain points of using the cloud through efficient, multi-cloud account provisioning, usage monitoring, spending alerts and other functions. With these barriers removed, researchers and educators can focus on their science and teaching rather than cloud administration.

“RRoCCET21 was a first-of-its-kind in bringing together researchers, cloud providers and research computing facilitators to discuss best practices regarding public cloud usage across a wide range of domains,” said Alterman. “The workshop was a great success, with a wide variety of participants who all brought such enthusiasm and experience to the table, and to amplify their voices we’ve made recordings of all the talks publicly available on CloudBank’s website.”

The workshop featured speakers from a wide variety of institutions, from research universities like MIT, to industry research teams at companies like Google, to international science organizations like CERN. For instance, Raghu Kancherla and Fahad Khan of the University of Central Florida discussed their ground-breaking work modeling the physics of supercritical carbon dioxide, which can be used by power plants to more efficiently capture energy from turbines. Their work involves running large computational fluid dynamics (CFD) simulations, which can be turbo-charged when run in the cloud rather than on the compute resources available locally. Their full talk is available on CloudBank’s website.

CloudBank user Vanessa Frias-Martinez (University of Maryland) described her group's public transit monitoring system, BALTO, which uses cloud infrastructure and community outreach to understand and improve the quality of transit solutions in Baltimore. This project takes the same technical infrastructure used to build private transit products like Lyft and Waze and uses it to improve the quality of public transit for everyone. Her full talk is available on CloudBank’s website.

Niema Moshiri (UC San Diego) presented his group's use of the cloud to sequence the genomes of SARS-CoV-2, the virus that causes COVID-19, as samples come in from virus testing centers. The scalable nature of the cloud allows them to cost-effectively track the spread of viral mutations in real time, providing on-the-ground, emerging information about developments such as the Delta variant. His full talk is available on CloudBank’s website.

Another presentation featured Satra Ghosh (MIT, Harvard Medical School), who discussed use of the cloud to host a massive collection of data on the human brain in his group's DANDI archive. Ghosh described how the DANDI archive takes advantage of the inherently distributed and fluid nature of cloud resources to facilitate the sharing of data among neuroscientists across institutions around the world. His full talk is available on CloudBank’s website.

“This year’s inaugural RRoCCET21 workshop was absolutely a ton of fun, for several reasons. First, we showed the value of the commercial cloud as a research computing platform. We got to enjoy learning about the presenters' projects – which were amazing – and we coordinated with them over the value of their talks to the conference attendees,” said Rob Fatland, Director of Cloud and Data Solutions at UW and a member of the RRoCCET21 organizing committee.  “The underlying message is simple enough: successful migration to the cloud means having a good sense of what you are getting yourself into. This means understanding the benefits of tapping into a cloud provider's unlimited computing resources, as well as understanding the necessary time investment to learn to use those resources effectively. And now, based on the positive feedback, we're looking forward to hosting RRoCCET22 next year with an in-person component that will facilitate networking, collaboration, and knowledge sharing.”

CloudBank is supported by the National Science Foundation (award no. 1925001).

About SDSC 

The San Diego Supercomputer Center (SDSC) is a leader and pioneer in high-performance and data-intensive computing, providing cyberinfrastructure resources, services and expertise to the national research community, academia and industry. Located on the UC San Diego campus, SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from astrophysics and earth sciences to disease research and drug discovery. SDSC's newest National Science Foundation-funded supercomputer, Expanse, supports SDSC's theme of "Computing without Boundaries" with a data-centric architecture, commercial cloud integration and state-of-the-art GPUs for incorporating experimental facilities and edge computing.

About the University of Washington eScience Institute

Established in 2008, the University of Washington eScience Institute was one of the first campus units nationally and internationally focused on advancing data-intensive scientific discovery through the close coupling of methodology and applications research and education.

About Berkeley’s Division of Computing, Data Science, and Society

Berkeley’s Division of Computing, Data Science, and Society connects computing, statistics, ethics, the humanities, and social and natural sciences to accelerate breakthrough education and research across scientific and technological frontiers. Berkeley’s Data Science Education Program pioneered the use of the cloud at scale, providing thousands of students easy access to computational resources and serving as a model for universities across the world.