SDSC offers complete data science solutions in a breadth of specialities via training, service contracts, and joint research collaborations.
SDSC keeps close tabs on the latest advances in compute hardware, memory, storage, and networking, along with the latest techniques to manage data and computation. We benchmark systems to look deeply into how they work, where they run into bottlenecks, and how to improve their performance.
Responding to the dramatic growth in data acquisition that has been critical to scientific progress and knowledge, SDSC's Advanced Cyberinfrastructure Development (ACID) Lab uses Big Data technologies and platforms to provide efficient access to invaluable data, and High-Performance Computing systems to meet the processing demands of generating key insights and promoting scientific breakthroughs.
Data integration can be challenging enough, but responding to the complexities of integration across sources hosted outside of one’s environment can be daunting. SDSC experts are versed in best-of-breed techniques for integrating virtual data sources along with local data sets. Many of our data scientists are domain experts who can suggest additional datasets for increased data value.
SDSC excels at integrating data, especially big or “messy” data. We understand that data arrives in many states, from very trusted to unmonitored crowdsourced data. Some data is generated for different uses and may not have factored in the metadata needs of your current project. Researchers may have images, text and streaming data to integrate. SDSC experts specialize in all aspects of data integration, including ontology building.
Whether one is new to managing data or has a current project that could benefit from optimization and better design, SDSC experts are eager to help. We specialize in building effective, flexible data schemas and architectures based on modeling with test data. We can generate test data, especially for projects where testing with live data is suboptimal, e.g. data governed by compliance regimes. SDSC can also monitor current projects over time and suggest incremental improvements for continuous optimization.
Graph Analytics is a rapidly developing area of research in which a combination of graph-theoretic, statistical, and database techniques is applied to model, store, retrieve, and perform analyses on graph-structured data. These techniques enable researchers to understand the structure of a network and how it changes under different conditions, find paths between pairs of entities that satisfy different constraints, identify clusters or closely interacting subgroups inside a graph, or find subgraphs that are similar to a given pattern.
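As a rough illustration of two of the tasks named above, path finding and subgroup detection, here is a minimal Python sketch using only the standard library. The toy network, the entity names, and the function names are all hypothetical, and connected components are used only as a crude stand-in for clustering; this is not a description of SDSC's tooling.

```python
from collections import deque

# Hypothetical toy network: undirected edges between made-up entities.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]


def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns a fewest-hops path or None."""
    if source not in adjacency or target not in adjacency:
        return None
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def connected_components(adjacency):
    """Group nodes into sets that are mutually reachable."""
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    print(shortest_path(graph, "a", "d"))
    print(connected_components(graph))
```

Real graph analytics work would typically add edge weights, path constraints, and a proper community-detection method on top of a graph database or library; the sketch only shows the shape of the questions being asked.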
Leveraging extensive experience built over years of collaborating with data scientists to create scalable data science platforms, SDSC Research Data Services offers a complete suite of data science services, from infrastructure hosting and complex data storage, to FAIR (Findable, Accessible, Interoperable, Reusable) data consulting, all of which enables data scientists to focus on their science.
SDSC's Spatial Information Systems Lab conducts research and develops technologies and infrastructure that enable users to access, integrate, and manage spatial information. Application domains range from hydrology and environmental sciences, to neuroscience.
Gathering data is easy. In fact, it is so easy that the volume of data gathered is exceeding our capacity to validate, analyze, visualize, store, and curate it. Yet many of our critical scientific problems can be solved only by harnessing this data. SDSC's Predictive Analytics Center of Excellence (PACE) nurtures a rich collaborative learning environment to cultivate a national community of data scientists that embodies innovation through diversity of thought. Predicting future trends and behaviors, from the epic to the everyday, allows for proactive, knowledge-driven decisions.
SDSC data science experts specialize in storing, integrating, and analyzing all data types, including time series data. SDSC has aided researchers in areas such as eHealth, climate sensing, and Smart Cities. Such data sources pose additional challenges because of their large volumes, and SDSC excels at computing at scale.