For questions please contact:
IMPORTANT! Submissions are no longer being accepted.
Dan Graham, Teradata, Director
The Data Renaissance: Critical Big Data Technologies
Who knew Kryder’s Law outruns Moore’s Law? Disk areal density doubles every 18 months, faster than silicon transistor density, which doubles every 24. The result: petabyte clusters are becoming commonplace, which has spawned dozens of new technologies to manage and distill the data. In this session we will tour the state of the art in data management, especially data architectures and the mining of large data sets. We will cover trends in SIMD vector processing, self-service data wrangling, the now famous data lakes, SQL-on-Hadoop, analytic graph engines, hot and cold data, data curation, and late-binding schemas. Real-world examples of large-scale sensor data, eCommerce, and healthcare diagnosis will illustrate key points in the data renaissance. Last, we ask ourselves: what’s next?
Dan Graham has over 30 years of experience in IT. He joined Teradata Corporation in 1989, where he was the senior product manager for the DBC/1012 parallel database computer. He then joined IBM, where he wrote product plans and launched the RS/6000 SP parallel server, and later became Strategy Executive for IBM's Global Business Intelligence Solutions. As Enterprise Systems General Manager at Teradata, Dan was responsible for strategy, go-to-market success, and competitive differentiation for the Active Enterprise Data Warehouse platform. He currently leads Teradata's technical marketing activities.
Michael Carey, UC Irvine, Bren Professor
AsterixDB: A Counter but Intuitive Approach to Big Data Management
We are living in the Big Data era, and we are witnessing a shift in the role of data management systems. Rather than “just” being the systems of record at the heart of traditional enterprises, modern Big Data management systems must model, capture, track, and react to the current state of the world. Doing so requires the ingestion of event data, arriving from a variety of devices, as well as enabling query access to the history of captured data over time. These requirements span a variety of scientific disciplines, including the handling of data produced by a variety of sensors in health care, environmental monitoring applications, traffic monitoring, dynamic social network data, and many other domains.
AsterixDB is an open source Big Data Management System (BDMS) with a feature set that’s very different from those of other platforms in today's Big Data ecosystem. The system was initially co-developed by UC Irvine and UC Riverside, starting in 2009 and leading eventually to its first beta release in mid-2013. It has recently moved to Apache, where AsterixDB is now an active incubating project. Many of the system’s key design decisions relate to the aforementioned shift. This talk will first briefly review AsterixDB’s data model, query language, and scale-out architecture. It will then examine a number of counter-cultural aspects of the AsterixDB system, including where its data lives, its runtime architecture, its approach to streaming data, its view of transactions, and its features for handling time-based data.
Michael J. Carey is a Bren Professor of Information and Computer Sciences at UC Irvine. Before joining UCI in 2008, Carey worked at BEA Systems for seven years and led the development of BEA's AquaLogic Data Services Platform product for virtual data integration. He also spent a dozen years teaching at the University of Wisconsin-Madison, five years at the IBM Almaden Research Center working on object-relational databases, and a year and a half at e-commerce platform startup Propel Software during the infamous 2000-2001 Internet bubble. Carey is an ACM Fellow, a member of the National Academy of Engineering, and a recipient of the ACM SIGMOD E.F. Codd Innovations Award. His current interests all center around data-intensive computing and scalable data management (a.k.a. Big Data).
This year we are encouraging submissions consisting of case studies related to the Special Topic including but not limited to:
Additional submission topics of interest may include but are not limited to the following as they relate to scientific and statistical data management: