CFP: Data Management in the Cloud 2013

I am co-organizing, together with Ashraf Aboulnaga, the Second Workshop on Data Management in the Cloud, co-located with ICDE 2013 in Brisbane, Australia. We expect it to be a highly interactive forum for both practitioners and academics interested in data management and cloud computing, and we welcome both novel research papers and industry experience papers.

The CFP and details are at http://db.uwaterloo.ca/dmc2013/; the high-level idea of the workshop is below:

Cloud computing has emerged as a promising computing and business model. By decoupling the management of the infrastructure (cloud providers) from its use (cloud tenants), and by allowing massive infrastructures to be shared, cloud computing delivers unprecedented economic and scalability benefits to existing applications and enables many new scenarios. This comes at the cost of increased complexity in managing a highly multi-tenant infrastructure and of limited visibility and access, which pose new questions about attribution, pricing, isolation, scalability, fault tolerance, load balancing, etc. This is particularly challenging for stateful, data-intensive applications.

This unique combination of opportunities and challenges has attracted much attention from both academia and industry. The DMC workshop aims to bring together researchers and practitioners in cloud computing and data management systems to discuss the research issues at the intersection of these two areas, and to draw more attention from the broader data management and systems research communities to this new and highly promising field.


See you in Australia!!!

2012: Benchmarking OLTP/Web Databases in the Cloud: The OLTP-Bench Framework

I will be giving the keynote at CloudDB 2012 on relational database benchmarking. Below you can find the abstract of the invited paper.

Benchmarking is a key activity in building and tuning data management systems, but the lack of reference workloads and of a common platform makes it a time-consuming and painful task. The need for such a tool is heightened by the advent of cloud computing, with its pay-per-use cost models, shared multi-tenant infrastructures, and lack of control over system configuration. Benchmarking is the only avenue for users to validate the quality of service they receive and to optimize their deployments for performance and resource utilization.

In this talk, we present our experience in building several ad-hoc benchmarking infrastructures for various research projects targeting several OLTP DBMSs, ranging from traditional relational databases to main-memory distributed systems and cloud-based scalable architectures. We also discuss our struggle to build meaningful micro-benchmarks and to gather workloads representative of real-world applications to stress-test our systems. This experience motivates the OLTP-Bench project, a "batteries-included" benchmarking infrastructure designed for and tested on several relational DBMSs and cloud-based database-as-a-service (DBaaS) offerings. OLTP-Bench can control transaction rate, mixture, and workload skew dynamically during the execution of an experiment, allowing the user to simulate a multitude of practical scenarios that are typically hard to test (e.g., time-evolving access skew). Moreover, the infrastructure provides an easy way to monitor performance and resource consumption of the database under test. We also introduce the ten included workloads, derived from synthetic micro-benchmarks, popular benchmarks, and real-world applications, and show how they can be used to investigate various performance and resource-consumption characteristics of a data management system. We showcase the effectiveness of our benchmarking infrastructure and the usefulness of the selected workloads by reporting sample results from hundreds of side-by-side comparisons on popular DBMSs and DBaaS offerings.

See http://oltpbenchmark.com for the source code of our benchmark.
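To give a concrete flavor of the dynamic workload control described above, here is a minimal Python sketch of a rate-controlled driver whose target throughput and transaction mixture change per phase. It is illustrative only: the phase values and the execute stub are invented, and the real OLTP-Bench is a Java framework driven by configuration files, not this API.

import random
import time

# Hypothetical phases: (duration in seconds, target rate in txn/s,
# {transaction name: probability}); values are made up for illustration.
PHASES = [
    (5, 50,  {"NewOrder": 0.5, "Payment": 0.5}),
    (5, 200, {"NewOrder": 0.9, "Payment": 0.1}),   # shift both rate and mixture
]

def run_phase(duration_s, rate, mixture, execute):
    # Fire transactions open-loop at a fixed rate with a weighted mixture.
    txns, weights = zip(*mixture.items())
    interval = 1.0 / rate
    deadline = time.time() + duration_s
    next_fire = time.time()
    while time.time() < deadline:
        next_fire += interval
        execute(random.choices(txns, weights)[0])   # pick a txn type by weight
        time.sleep(max(0.0, next_fire - time.time()))

if __name__ == "__main__":
    for duration, rate, mix in PHASES:
        run_phase(duration, rate, mix, execute=lambda txn: None)  # stub executor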

2012: Lookup Tables: Fine-Grained Partitioning for Distributed Databases

The standard way to achieve linear scaling in a distributed OLTP DBMS is to horizontally partition data across several nodes. Ideally, this partitioning results in each query being executed at just one node, avoiding the overheads of distributed transactions and allowing nodes to be added without increasing the amount of required coordination. For some applications, simple strategies, such as hashing on the primary key, provide this property. Unfortunately, for many applications, including social networking and order fulfillment, many-to-many relationships cause simple strategies to produce a large fraction of distributed queries. What is needed instead is a fine-grained partitioning, where related individual tuples (e.g., cliques of friends) are co-located in the same partition. Maintaining such a fine-grained partitioning requires the database to store a large amount of metadata about which partition each tuple resides in. We call such metadata a lookup table, and we present the design of a data distribution layer that efficiently stores these tables and maintains them in the presence of inserts, deletes, and updates. We show that such tables can provide scalability for several difficult-to-partition database workloads, including Wikipedia, Twitter, and TPC-E. Our implementation provides 40% to 300% better performance on these workloads than either simple range or hash partitioning, and it shows greater potential for further scale-out.
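As an illustration of the core idea (not the paper's implementation; the class and method names below are invented), a lookup-table router keeps per-tuple partition metadata and falls back to hashing for unmapped keys, so related tuples can be co-located at insert time:

class LookupTableRouter:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.location = {}   # tuple key -> partition id (the lookup table)

    def partition_for(self, key):
        # Fall back to hash partitioning for keys without an explicit entry.
        return self.location.get(key, hash(key) % self.num_partitions)

    def insert(self, key, near=None):
        # Co-locate a new tuple with a related one when a hint is given.
        if near is not None:
            self.location[key] = self.partition_for(near)
        else:
            self.location[key] = hash(key) % self.num_partitions

    def delete(self, key):
        self.location.pop(key, None)

router = LookupTableRouter(num_partitions=4)
router.insert("alice")
router.insert("bob", near="alice")   # friends land on the same partition
assert router.partition_for("bob") == router.partition_for("alice")

A real deployment would of course need a stable hash function and a compact, persistent, update-friendly representation of the table, which is exactly the engineering challenge the paper addresses.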


@inproceedings{tatarowicz2012lookup,
  author    = {Aubrey Tatarowicz and Carlo Curino and Evan P. C. Jones and Sam Madden},
  title     = {Lookup Tables: Fine-Grained Partitioning for Distributed Databases},
  booktitle = {ICDE},
  year      = {2012},
  pages     = {102-113},
  ee        = {http://doi.ieeecomputersociety.org/10.1109/ICDE.2012.26},
}

2012: How Clean Is Your Sandbox? – Towards a Unified Theoretical Framework for Incremental Bidirectional Transformations

Abstract. Bidirectional transformations (bx) constitute an emerging mechanism for maintaining the consistency of interdependent sources of information in software systems. Researchers from many different communities have recently investigated the use of bx to solve a large variety of problems, including relational view update, schema evolution, data exchange, database migration, and model co-evolution, just to name a few. Each community leveraged and extended different theoretical frameworks and tailored their use for specific sub-problems. Unfortunately, the question of how these approaches actually relate to and differ from each other remains unanswered. This question should be addressed to reduce replicated efforts among and even within communities, enabling more effective collaboration and fostering cross-fertilization. To effectively move forward, a systematization of these many theories and systems is now required. This paper constitutes a first, humble yet concrete step towards a unified theoretical framework for a tractable and relevant subset of bx approaches and tools. It identifies, characterizes, and compares tools that allow the incremental definition of bidirectional mappings between software artifacts. Identifying similarities between such tools yields the possibility of developing practical tools with wide-ranging applicability; identifying differences allows for potential new research directions, applying the strengths of one tool to another whose strengths lie elsewhere.
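For readers unfamiliar with bx, the asymmetric lens is one of the simplest formalisms in this space: a pair of functions that extract a view from a source and push an edited view back, subject to round-trip laws. The sketch below is a generic textbook-style example in Python, not taken from the paper:

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Lens:
    get: Callable[[Any], Any]        # extract a view from the source
    put: Callable[[Any, Any], Any]   # push an edited view back into the source

# Example lens: a view exposing only the 'name' field of a record.
name_lens = Lens(
    get=lambda src: src["name"],
    put=lambda src, view: {**src, "name": view},
)

src = {"name": "sandbox", "year": 2012}
assert name_lens.put(src, name_lens.get(src)) == src             # GetPut law
assert name_lens.get(name_lens.put(src, "clean")) == "clean"     # PutGet law

Much of the diversity the paper surveys comes from how different communities relax or strengthen such laws, and from how the backward direction is made incremental.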

@inproceedings{terwilliger2012howclean,
  author    = {James F. Terwilliger and Anthony Cleve and Carlo Curino},
  title     = {How Clean Is Your Sandbox? - Towards a Unified Theoretical Framework for Incremental Bidirectional Transformations},
  booktitle = {ICMT},
  year      = {2012},
  pages     = {1-23},
  ee        = {http://dx.doi.org/10.1007/978-3-642-30476-7_1},
}

2012: Mobius: unified messaging and data serving for mobile apps

Mobile application development is challenging for several reasons: intermittent and limited network connectivity, tight power constraints, server-side scalability concerns, and a number of fault-tolerance issues. Developers handcraft complex solutions that include client-side caching, conflict resolution, disconnection tolerance, and backend database sharding. To simplify mobile app development, we present Mobius, a system that addresses the messaging and data management challenges of mobile application development. Mobius introduces MUD (Messaging Unified with Data). MUD presents the programming abstraction of a logical table of data that spans devices and clouds. Applications using Mobius can asynchronously read from/write to MUD tables, and also receive notifications when tables change via continuous queries on the tables. The system combines dynamic client-side caching (with intelligent policies chosen on the server-side, based on usage patterns across multiple applications), notification services, flexible query processing, and a scalable and highly available cloud storage system. We present an initial prototype to demonstrate the feasibility of our design. Even in our initial prototype, remote read and write latency overhead is less than 52% when compared to a hand-tuned solution. Our dynamic caching reduces the number of messages by a factor of 4 to 8.5 when compared to fixed strategies, thus reducing latency, bandwidth, power, and server load costs, while also reducing data staleness.
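To make the MUD abstraction more concrete, here is a toy in-memory sketch of a logical table with writes, predicate reads, and continuous-query notifications. The class and method names are invented for illustration; the actual Mobius system spans device-side caches and a cloud storage tier, which this sketch does not attempt to model:

import threading

class MudTable:
    # Toy stand-in for a MUD logical table (illustrative only).
    def __init__(self, name):
        self.name = name
        self.rows = []
        self.subscribers = []    # (predicate, callback) continuous queries
        self.lock = threading.Lock()

    def write(self, row):
        # Mobius buffers writes client-side and syncs them to the cloud;
        # here we just apply locally and fire matching continuous queries.
        with self.lock:
            self.rows.append(row)
            matches = [cb for pred, cb in self.subscribers if pred(row)]
        for cb in matches:
            cb(row)

    def read(self, predicate):
        with self.lock:
            return [r for r in self.rows if predicate(r)]

    def subscribe(self, predicate, callback):
        with self.lock:
            self.subscribers.append((predicate, callback))

inbox = MudTable("messages")
inbox.subscribe(lambda r: r["to"] == "alice",
                lambda r: print("notify alice:", r["body"]))
inbox.write({"to": "alice", "body": "hello"})   # triggers the notification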


@inproceedings{chun2012mobius,
  author    = {Byung-Gon Chun and Carlo Curino and Russell Sears and Alexander Shraer and Samuel Madden and Raghu Ramakrishnan},
  title     = {Mobius: unified messaging and data serving for mobile apps},
  booktitle = {MobiSys},
  year      = {2012},
  pages     = {141-154},
  ee        = {http://doi.acm.org/10.1145/2307636.2307650},
}

2012: Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems

The advent of affordable, shared-nothing computing systems portends a new class of parallel database management systems (DBMSs) for on-line transaction processing (OLTP) applications that scale without sacrificing ACID guarantees [7, 9]. The performance of these DBMSs is predicated on the existence of an optimal database design that is tailored to the unique characteristics of OLTP workloads. Deriving such designs for modern DBMSs is difficult, especially for enterprise-class OLTP systems, since they impose extra challenges: the use of stored procedures, the need for load balancing in the presence of time-varying skew, complex schemas, and deployments with a larger number of partitions.

To this end, we present a novel approach to automatically partitioning databases for enterprise-class OLTP systems that significantly extends the state of the art by: (1) minimizing the number of distributed transactions while concurrently mitigating the effects of temporal skew in both the data distribution and accesses, (2) extending the design space to include replicated secondary indexes, (3) organically handling stored-procedure routing, and (4) scaling to significant schema complexity, data size, and number of partitions. This effort builds on two key technical contributions: an analytical cost model that can be used to quickly estimate the relative coordination cost and skew for a given workload and a candidate database design, and an informed exploration of the huge solution space based on large-neighborhood search. To evaluate our methods, we integrated our database design tool with a high-performance, parallel, main-memory DBMS and compared our methods against both popular heuristics and a state-of-the-art research prototype [17]. Using a diverse set of benchmarks, we show that our approach improves throughput by up to a factor of 16 over these other approaches.
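To sketch how these two ingredients fit together, the toy Python below pairs a crude cost proxy (distributed-transaction count plus a skew penalty) with a large-neighborhood-search loop that repeatedly relocates a random subset of tables and keeps improvements. All data structures and weights are invented for illustration and are far simpler than the paper's cost model and search:

import random

def cost(design, workload, num_partitions=4):
    # Toy proxy: distributed-txn count plus a load-skew penalty.
    # design: {table: partitioning column};
    # workload: list of txns, each {table: column used to access it}.
    # A txn is "distributed" if any table it touches is not partitioned
    # on the column the txn uses to access it.
    distributed = sum(any(design[t] != col for t, col in txn.items())
                      for txn in workload)
    load = [0] * num_partitions
    for txn in workload:
        load[hash(frozenset(txn.items())) % num_partitions] += 1
    avg = sum(load) / num_partitions
    skew = max(load) / avg if avg else 1.0
    return distributed + skew     # the real model weighs these terms carefully

def lns(design, workload, columns, rounds=200):
    # Large-neighborhood search: perturb a random subset of tables at once,
    # keep the candidate whenever it improves the estimated cost.
    best, best_cost = dict(design), cost(design, workload)
    for _ in range(rounds):
        cand = dict(best)
        for t in random.sample(list(cand), k=random.randint(1, len(cand))):
            cand[t] = random.choice(columns[t])   # relocate a "neighborhood"
        c = cost(cand, workload)
        if c < best_cost:
            best, best_cost = cand, c
    return best

columns = {"orders": ["id", "cust_id"], "items": ["id", "order_id"]}
workload = [{"orders": "cust_id", "items": "order_id"}] * 10
print(lns({"orders": "id", "items": "id"}, workload, columns))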

@inproceedings{pavlo2012skew,
  author    = {Andrew Pavlo and Carlo Curino and Stanley B. Zdonik},
  title     = {Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems},
  booktitle = {SIGMOD Conference},
  year      = {2012},
  pages     = {61-72},
  ee        = {http://doi.acm.org/10.1145/2213836.2213844},
}

2011: No bits left behind

This is a fun vision paper I wrote with Eugene Wu and Sam Madden on how we should be more careful about leaving no bits behind (in the memory hierarchy); it suggests a few ideas/improvements we can make to DBMSs today.

ABSTRACT

One of the key tenets of database system design is making efficient use of storage and memory resources. However, existing database system implementations are actually extremely wasteful of such resources; for example, most systems leave a great deal of empty space in tuples, index pages, and data pages, and spend many CPU cycles reading cold records from disk that are never used. In this paper, we identify a number of such sources of waste, and present a series of techniques that limit this waste (e.g., forcing better memory locality for hot data and using empty space in index pages to cache popular tuples) without substantially complicating interfaces or system design. We show that these techniques effectively reduce memory requirements for real scenarios from the Wikipedia database (by up to 17.8×) while increasing query performance (by up to 8×).

You can find the paper here: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper23.pdf
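As a taste of one of the ideas, here is a tiny illustrative sketch (invented names, not our actual implementation) of using the free space in an index page to cache hot tuples, so lookups on popular keys can skip the heap fetch entirely:

class IndexPage:
    def __init__(self, capacity_bytes=8192):
        self.entries = {}            # key -> record id (pointer into the heap)
        self.cached = {}             # key -> tuple bytes, kept in free space
        self.free = capacity_bytes

    def maybe_cache(self, key, tuple_bytes):
        # Opportunistically stash a popular tuple if the page has room.
        if len(tuple_bytes) <= self.free:
            self.cached[key] = tuple_bytes
            self.free -= len(tuple_bytes)

    def lookup(self, key, heap_fetch):
        if key in self.cached:
            return self.cached[key]           # served from the index page
        return heap_fetch(self.entries[key])  # otherwise pay the heap fetch

page = IndexPage()
page.entries["hot-key"] = 42                  # rid of the tuple in the heap
page.maybe_cache("hot-key", b"popular tuple payload")
print(page.lookup("hot-key", heap_fetch=lambda rid: b"(fetched from heap)"))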

@inproceedings{wu2011nobits,
  author    = {Eugene Wu and Carlo Curino and Samuel Madden},
  title     = {No bits left behind},
  booktitle = {CIDR},
  year      = {2011},
  pages     = {187-190},
  ee        = {http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper23.pdf},
}