
NPACI DATA DISCOVERY

Proteins, Dendrites, Neurons, and Brains


Thanks to initiatives by individual scientists and programs such as the Human Brain Project and the Protein Data Bank, vast quantities of neuroscience information are becoming available in the form of databases and Web resources: everything from molecular structures and 3-D cellular information through anatomical studies and movies of functional brain imaging. "No neuroscientist needs to be reminded of the information explosion," said Maryann Martone, a researcher at UCSD's National Center for Microscopy and Imaging Research (NCMIR). "How can we integrate data from various research disciplines, with different experimental approaches, concerning completely different species of organisms, and at widely varying levels of resolution, into a unified body of knowledge that we can navigate and mine for new insights?"

Figure 1. Where the Proteins Are

Using AxioMap, a tool for Windows clients, SDSC's Ilya Zaslavsky created this visualization of KIND query results: a neuron image with color-coded overlays that represent protein localization information from multiple sources.

A solution to this dilemma comes from the intersection of NPACI's Data-Intensive Computing Environments (DICE) and Neuroscience thrusts. NPACI researchers at SDSC and UCSD have developed a prototype information mediator system called KIND as part of their effort to federate neuroscience databases. The project will create a knowledge retrieval environment in which a scientist can query the mediator, retrieve information from across a number of information sources, and use the results to perform custom analyses and data mining.

Scientific information from many sources is now accessible almost instantly via the Internet, but these sources typically use different interfaces and export their data in incompatible formats. The brute-force approach of trying to create a single, enormous database that encompasses every possible attribute of nervous systems just isn't feasible. A more realistic way is to federate (that is, cross-reference) separate data sources, taking their differences into account. Mediator systems assist users by seeking information of interest and providing integrated views of the data they want.

Wrapper-mediator systems use the Extensible Markup Language (XML) as a view-definition language to both model and interchange information across incompatible data sources. An XML data wrapper associated with each source exports an XML view of the data. The mediator selects the outputs of independent sources, restructures and merges them, and provides an integrated XML view of the information.
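
The wrapper-mediator pattern described above can be illustrated with a minimal Python sketch. It is not the DICE or KIND implementation: the two in-memory "sources," their field names, the element names, and the functions `wrap_protein_source`, `wrap_morphology_source`, and `mediate` are all hypothetical, chosen only to show each wrapper exporting an XML view and a mediator selecting, restructuring, and merging those views.

```python
import xml.etree.ElementTree as ET

# Hypothetical source 1: rows as they might come from a relational table.
PROTEIN_ROWS = [
    {"protein": "calbindin", "cell_type": "Purkinje cell", "compartment": "dendrite"},
    {"protein": "parvalbumin", "cell_type": "basket cell", "compartment": "soma"},
]

# Hypothetical source 2: records that use different field names.
MORPHOLOGY_RECORDS = [
    {"neuron": "Purkinje cell", "region": "cerebellum", "spine_density": "high"},
]


def wrap_protein_source(rows):
    """Wrapper: export an XML view of the protein-localization source."""
    root = ET.Element("protein_source")
    for row in rows:
        ET.SubElement(root, "localization", attrib=row)
    return root


def wrap_morphology_source(records):
    """Wrapper: export an XML view of the morphology source."""
    root = ET.Element("morphology_source")
    for rec in records:
        ET.SubElement(root, "neuron", attrib={
            "name": rec["neuron"],
            "region": rec["region"],
            "spine_density": rec["spine_density"],
        })
    return root


def mediate(protein_view, morphology_view, cell_type):
    """Mediator: select matching elements from each XML view,
    restructure them, and merge them into one integrated XML view."""
    result = ET.Element("cell", attrib={"type": cell_type})
    for neuron in morphology_view.findall("neuron"):
        if neuron.get("name") == cell_type:
            ET.SubElement(result, "morphology", attrib={
                "region": neuron.get("region"),
                "spine_density": neuron.get("spine_density"),
            })
    for loc in protein_view.findall("localization"):
        if loc.get("cell_type") == cell_type:
            ET.SubElement(result, "protein", attrib={
                "name": loc.get("protein"),
                "compartment": loc.get("compartment"),
            })
    return result


integrated = mediate(
    wrap_protein_source(PROTEIN_ROWS),
    wrap_morphology_source(MORPHOLOGY_RECORDS),
    "Purkinje cell",
)
print(ET.tostring(integrated, encoding="unicode"))
```

In this sketch the mediator never touches the sources' native formats; it sees only the XML views, which is what would let a new source be added by writing one more wrapper.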

NPACI's DICE thrust has developed wrappers and mediators for a variety of information sources, including relational databases, geographical information systems, and Web sites, and is applying mediator systems for distributed digital libraries, government agencies, and scientific data collections.


CASTING SEMANTIC NETS

"Some of our recent work has been driven by the need to integrate the scientific databases of the Neuroscience Workbench, where source data comes from different 'semantic worlds' that might share few or no attributes," said DICE researcher Amarnath Gupta. Gupta and Bertram Ludaescher at SDSC, collaborating with Martone and NCMIR Director Mark Ellisman, have developed KIND, a mediator that extends current approaches by incorporating semantic models of information sources.

"More traditional mediators work with sources that share the same domain of discourse, but here we need to integrate across different domains like neuroanatomy, protein properties, and ion-currents in nerves," Ludaescher said. "A novel feature of our architecture is the use of domain maps-semantic nets of terms and relationships." The domain map draws upon anatomical relations between brain regions and their cellular and subcellular components.

A first prototype establishing the viability of the approach is operational. The researchers developed their mediation-based approach using NCMIR's 3-D Cell Centered Database, which contains information about neuronal structure and protein distribution. Using XML wrappers around additional sources, NCMIR researchers have also accessed other Web-based neuroscience resources, including the SenseLab database at Yale, Synapse Web at Boston University, and the EF-Hand Calcium-Binding Proteins Data Library at Vanderbilt University.


A NEUROSCIENCE DISCOVERY ENVIRONMENT

"Our objective is to develop a discovery environment for neuroscientists," Martone said. The environment is intended to deal with cellular and subcellular morphology, molecular distributions in cellular and subcellular structures, and physiological responses that reveal functional properties of single cells and structures as well as cellular and subcellular environments. "With the KIND mediator we'll be able to perform ad hoc queries and compare properties across resolution levels, experimental conditions, cell populations, and species populations."

"This brings us closer to bridging the gap between experimental and scientific disciplines and toward a unified model of nervous systems," Martone said. -MG

www.npaci.edu/DICE/Neuro


RESEARCHERS

Amarnath Gupta,
Bertram Ludaescher,
Ilya Zaslavsky
SDSC

Maryann Martone,
Mark Ellisman
UCSD