Published January 25, 2022
The Sherlock Division at the San Diego Supercomputer Center (SDSC), the University of California Office of the President’s (UCOP) risk and technology delivery services groups and Kwartile partnered to successfully re-architect and migrate the UCOP Risk Services Data Management System (RDMS 1.0) from an on-premise, Hadoop-based platform to a serverless, data lake platform in the Amazon Web Services (AWS) Cloud (RDMS 2.0).
The 18-month production deployment of RDMS 2.0 was the result of a strong, dedicated collaboration among SDSC’s Sherlock, UCOP Risk Technology Services, UCOP Technology Delivery Services (TDS) and Kwartile, which enabled the team to meet its goal efficiently and recently deliver the Cloud-based RDMS 2.0.
The initial RDMS 1.0 service was conceptualized in 2015 and hosted within Sherlock’s secure enclave at SDSC as a Hadoop-based data platform. As the project and its needs evolved, the natural progression of RDMS 1.0 was to refactor it to a commercial Cloud to allow for the adoption and integration of new Cloud-based technologies and services that would modernize the data platform, yield significant cost savings, enhance security and improve scalability. Driven by these goals, the team decided to undertake a Proof of Value (POV) effort that validated the feasibility and benefits of the technical approach while securing the necessary buy-in from the stakeholders. This was followed by a longer, more detailed project engagement to perform the full migration of the current platform to the new Cloud-based solution.
“This was an excellent collaboration and a well-coordinated effort between the various teams supporting the RDMS transition from Sherlock’s on-premise tenant to its AWS cloud enclave. Due to license renewal constraints, the project was completed within an accelerated timeline. This project is a win-win for the executive sponsor, resulting in both the modernization of the data platform and cost savings achieved through eliminating licensing and other operational costs,” said Nilofeur Samuel, director of Risk Technology Services at UCOP.
Sherlock and its partners’ overall objective for RDMS 2.0 was to create a well-architected solution that focused on delivering value to customers while addressing the following key attributes:
Adopt a Cloud-native, serverless data management stack that leverages the high availability and performance of AWS Cloud including:
A defense-in-depth strategy was employed to secure data including:
The migration from RDMS 1.0 to RDMS 2.0 is projected to save the program approximately $2M over the next five years. These cost savings are primarily achieved by reducing licensing costs, eliminating large capital investment in physical hardware and realizing efficiencies in staffing resulting from the move from on-premise to the Cloud. Specifically:
“As custodians of systemwide data for the university, it is incumbent upon us to continuously explore options for managing data securely, more economically and with greater flexibility and scalability. In recent years, the offerings by commercial cloud service providers, such as AWS, have become viable options for managing data that are congruent with the aforementioned tenets of our mission. Through a strong partnership between Sherlock, Kwartile and UCOP Technology Delivery Services, we were able to leverage the core competencies of each team toward a successful implementation of a modern, cloud-based and highly secure data management platform that could serve as an all-encompassing, strategic and forward-looking approach to data management,” said Hooman Pejman, data architect at UCOP. “In my view, the key to our success was our collective diligence in exploring, identifying, selecting and orchestrating the appropriate services offered by AWS, based on a serverless architecture and a pay-as-you-go model.”
Kwartile’s data engineering solutions provided automated tools for data and metadata migration and helped update and optimize the data curation jobs to run on AWS cloud-native services. “These migration tools provided a comparison report of source and target, which improved data quality and significantly reduced data validation time. Projects of this nature are complex and would not have been successful without the proper collaboration and technology expertise provided by Sherlock and UCOP teams. This was a true team effort from everyone involved, and an outcome of our long-standing partnership,” said Krishna Katikaneni from Kwartile.
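As a rough illustration of what a source-versus-target comparison report might involve, the sketch below compares row counts and an order-independent checksum for each table. It is not Kwartile’s tooling, which the article does not describe; the table names, sample rows and the functions `table_checksum` and `compare` are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TableReport:
    table: str
    source_rows: int
    target_rows: int
    checksums_match: bool

    @property
    def ok(self) -> bool:
        # A table passes only if both the counts and the content agree.
        return self.source_rows == self.target_rows and self.checksums_match


def table_checksum(rows):
    """Order-independent checksum: hash each row, then hash the sorted digests."""
    digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    return hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()


def compare(source, target):
    """Compare two {table_name: [rows]} snapshots, one report per table."""
    reports = []
    for table in sorted(set(source) | set(target)):
        src_rows = source.get(table, [])
        tgt_rows = target.get(table, [])
        reports.append(
            TableReport(
                table=table,
                source_rows=len(src_rows),
                target_rows=len(tgt_rows),
                checksums_match=table_checksum(src_rows) == table_checksum(tgt_rows),
            )
        )
    return reports


# Hypothetical sample data: "claims" was migrated intact (row order differs),
# while "policies" is missing a row in the target.
source = {"claims": [(1, "open"), (2, "closed")], "policies": [(10, "auto")]}
target = {"claims": [(2, "closed"), (1, "open")], "policies": []}

reports = compare(source, target)
for report in reports:
    status = "OK" if report.ok else "MISMATCH"
    print(f"{report.table}: source={report.source_rows} "
          f"target={report.target_rows} {status}")
```

Sorting the per-row digests before combining them means a table that arrives in a different row order still passes, which matters when source and target engines return rows in different orders.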
According to Sandeep Chandra, executive director of Sherlock Cloud at SDSC, while the individual Cloud services are reliable, the real work is in the orchestration and configuration of these services, which are sufficiently complex that no human could correctly and reliably maintain their state. “Sherlock provided a platform that allowed the team to define infrastructure as code with automated deployments based on changes to a shared code base, including manual approval processes as gatekeepers to control configuration changes. This assures the solution is repeatable, auditable, can be rolled back to a previous state and can easily adapt to frequent incremental change,” he explained. “This reusable infrastructure-as-code paradigm allows Sherlock to use the same building blocks and processes adapted and customized to the specific needs of different projects across various engagements.”
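The properties Chandra lists (declared state, approval gates, an audit trail and rollback) can be pictured with a toy model like the one below. It is only a sketch of the general idea and not Sherlock’s platform; the `Deployment` class, the setting names and the reviewer names are hypothetical, and a real deployment would rely on dedicated infrastructure-as-code and pipeline tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    history: list = field(default_factory=list)    # applied states, oldest first
    audit_log: list = field(default_factory=list)  # (action, approver, changes)

    @property
    def current(self):
        return self.history[-1] if self.history else {}

    def plan(self, desired):
        """Return {setting: (old, new)} for every setting that would change."""
        current = self.current
        keys = sorted(set(current) | set(desired))
        return {
            k: (current.get(k), desired.get(k))
            for k in keys
            if current.get(k) != desired.get(k)
        }

    def apply(self, desired, approved_by=None):
        """Apply the desired state, but only with a named approver."""
        changes = self.plan(desired)
        if not changes:
            return False  # nothing to do, so applying twice is harmless
        if approved_by is None:
            raise PermissionError("configuration change requires manual approval")
        self.history.append(dict(desired))
        self.audit_log.append(("apply", approved_by, changes))
        return True

    def rollback(self, approved_by):
        """Return to the previously applied state."""
        if len(self.history) < 2:
            raise RuntimeError("no previous state to roll back to")
        self.history.pop()
        self.audit_log.append(("rollback", approved_by, None))


deployment = Deployment()
v1 = {"storage_encryption": "enabled", "log_retention_days": 90}
v2 = dict(v1, log_retention_days=365)

try:
    deployment.apply(v1)  # no approver: the gate blocks the change
    rejected_without_approval = False
except PermissionError:
    rejected_without_approval = True

deployment.apply(v1, approved_by="reviewer-a")
deployment.apply(v2, approved_by="reviewer-b")
deployment.rollback(approved_by="reviewer-a")  # back to v1
```

Because every applied state is kept in `history` and every action is recorded in `audit_log`, the same declared configuration can be replayed, reviewed or reverted, which is the repeatability the quote refers to.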
About SDSC’s Sherlock Division
SDSC’s Sherlock Division focuses on providing innovative, secure information technology and data services for academia, and state and federal government agencies. It is an SDSC Center of Excellence for secure HIPAA- and FISMA-compliant managed Cloud hosting, and recently added NIST CUI- and CSF-compliant managed Cloud hosting to its offerings. Launched under the brand Sherlock, its major services – Cloud, Compliance, Cybersecurity, and Data Lab – provide a secure foundation for a wide range of research and data collection initiatives. The Sherlock Division supports a variety of entities including the Centers for Medicare and Medicaid Services (CMS), National Institutes of Health (NIH), and University of California Systems. For more information please visit the Sherlock website.
About SDSC
SDSC, located at UC San Diego, is considered a leader in data-intensive computing and cyberinfrastructure, providing resources, services and expertise to the national research community, including industry and academia. Cyberinfrastructure refers to an accessible, integrated network of computer-based resources and expertise, focused on accelerating scientific inquiry and discovery. SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from earth sciences and biology to astrophysics, bioinformatics and health IT.