2011: “RelationalCloud: a Database Service for the cloud”

Conference: CIDR 2011

Authors:

Carlo Curino, Evan P. C. Jones, Raluca Ada Popa, Nirmesh Malviya, Eugene Wu, Sam Madden, Hari Balakrishnan, Nickolai Zeldovich.

 

ABSTRACT: This paper introduces a new transactional “database-as-a-service” (DBaaS) called Relational Cloud. A DBaaS promises to move much of the operational burden of provisioning, configuration, scaling, performance tuning, backup, privacy, and access control from the database users to the service operator, offering lower overall costs to users. Early DBaaS efforts include Amazon RDS and Microsoft SQL Azure, which are promising in terms of establish- ing the market need for such a service, but which do not address three important challenges: efficient multi-tenancy, elastic scalability, and database privacy. We argue that these three challenges must be overcome before outsourcing database software and management becomes attractive to many users, and cost-effective for service providers. The key technical features of Relational Cloud include: (1) a workload-aware approach to multi-tenancy that identifies the workloads that can be co-located on a database server, achieving higher consolidation and better performance than existing approaches; (2) the use of a graph-based data partitioning algorithm to achieve near-linear elastic scale-out even for complex transactional workloads; and (3) an adjustable security scheme that enables SQL queries to run over encrypted data, including ordering operations, aggregates, and joins. An underlying theme in the design of the components of Relational Cloud is the notion of workload awareness: by monitoring query patterns and data accesses, the sys- tem obtains information useful for various optimization and security functions, reducing the configuration effort for users and operators.

 

p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 9.1px Helvetica} span.s1 {font: 9.0px Helvetica} span.s2 {font: 8.9px Helvetica}

2011: “Workload-aware Database Monitoring and Consolidation”

Conference: SIGMOD 2011

Authors:

Carlo Curino, Evan P. C. Jones, Sam Madden, Hari Balakrishnan

 

ABSTRACT:

In most enterprises, databases are deployed on dedicated database servers. Often, these servers are underutilized much of the time. For example, in traces from almost 200 production servers from different organizations, we see an average CPU utilization of less than 4%. This unused capacity can be harnessed to consolidate multiple databases on fewer machines, reducing hardware and operational costs. Virtual machine (VM) technology is one popular way to approach this problem. However, as we demonstrate in this paper, VMs fail to adequately support database consolidation, because databases place a unique and challenging set of demands on hardware resources, which are not well-suited to the assumptions made by VM-based consolidation.

Our system for database consolidation, named Kairos, uses novel techniques to measure the hardware requirements of database workloads, as well as models to predict the combined resource utilization of those workloads. We formalize the consolidation problem as a non-linear optimization program, aiming to minimize the number of servers and balance load, while achieving near-zero performance degradation. We compare Kairos against virtual machines, showing up to a factor of 12× higher throughput on a TPC-C-like benchmark. We also tested the effectiveness of our approach on real-world data collected from production servers at Wikia.com, Wikipedia, Second Life, and our institution, showing absolute consolidation ratios ranging between 5.5:1 and 17:1.