cloud computing – Carlo Curino

I am co-organizing, with Ashraf Aboulnaga, the Second Workshop on Data Management in the Cloud, co-located with ICDE (Brisbane, Australia). We expect it to be a highly interactive forum for both practitioners and academics interested in data management and cloud computing, and we welcome both novel research and industry experience papers.

CFP and details at: http://db.uwaterloo.ca/dmc2013/. Below is the high-level idea of the workshop:

Cloud computing has emerged as a promising computing and business model. By decoupling the management of the infrastructure (cloud providers) from its use (cloud tenants), and by allowing the sharing of massive infrastructures, cloud computing delivers unprecedented economic and scalability benefits for existing applications and enables many new scenarios. This comes at the cost of increased complexity in managing a highly multi-tenant infrastructure and of limited visibility and access, which raise new questions on attribution, pricing, isolation, scalability, fault-tolerance, load balancing, etc. These questions are particularly challenging for stateful, data-intensive applications.

This unique combination of opportunities and challenges has attracted much attention from both academia and industry. The DMC workshop aims to bring together researchers and practitioners in cloud computing and data management systems to discuss the research issues at the intersection of these two areas, and to draw more attention from the larger data management and systems research communities to this new and highly promising field.

See you in Australia!!!

The advent of cloud computing and hosted software as a service is creating a novel market for data management. Cloud-based DB services are starting to appear, and have the potential to attract customers from very diverse sectors of the market, from small businesses aiming to reduce their total cost of ownership (perfectly suited for multi-tenancy solutions) to very large enterprises (seeking high-profile solutions that span potentially thousands of machines and can absorb unexpected bursts of traffic). At the same time, the database community is producing new architectures and systems that tackle various classes of typical DB workloads in novel ways. These novel, dedicated approaches often outshine traditional DBMSs, which chase an unreachable “one-size-fits-all” dream. This increase in the number of available data management products is exacerbating the problem of selecting, deploying, and tuning in-house solutions for data management.

In this context we are building a cloud-based database architecture that enables the implementation of DB-as-a-service. We envision a database provider that (i) provides the illusion of infinite resources, by continuously meeting user expectations under evolving workloads, and (ii) minimizes the operational costs associated with each user by amortizing administrative costs across users and developing software techniques to automate the management of many databases running inside a single data center. Thus, the traditional provisioning problem (what resources to allocate for a single database) becomes an optimization problem, where a large user base, multiple DBMS engines, and a very large data center provide an unprecedented opportunity to exploit economies of scale, smart load balancing, higher power efficiency, and principled overselling.
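To make the shift from per-database provisioning to a data-center-wide optimization concrete, here is a minimal sketch of one possible placement heuristic. It is an illustration only, not the method used in our prototype: it assumes each database can be summarized by a single estimated peak load, that all servers are identical, and that overselling can be modeled as a simple multiplier on server capacity. The names Server, place_tenants, and oversell are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One machine in the data center (hypothetical model)."""
    capacity: float
    load: float = 0.0
    tenants: list = field(default_factory=list)


def place_tenants(demands, capacity, oversell=1.0):
    """Assign tenant databases to servers with first-fit decreasing.

    demands  -- dict mapping tenant name to estimated peak load
    capacity -- load a single server can sustain
    oversell -- assumed overselling factor; values above 1.0 let the sum of
                estimated peaks on a server exceed its capacity, betting
                that tenants rarely peak at the same time
    """
    limit = capacity * oversell
    servers = []
    # Place the largest tenants first; they are the hardest to fit.
    for name, demand in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        if demand > limit:
            raise ValueError(f"tenant {name!r} does not fit on any server")
        # Reuse the first server with enough room, else open a new one.
        for server in servers:
            if server.load + demand <= limit:
                target = server
                break
        else:
            target = Server(capacity)
            servers.append(target)
        target.tenants.append(name)
        target.load += demand
    return servers


if __name__ == "__main__":
    # Made-up loads, expressed as a percentage of one server.
    demands = {"a": 60, "b": 50, "c": 40, "d": 30, "e": 20}
    for factor in (1.0, 2.0):
        used = place_tenants(demands, capacity=100, oversell=factor)
        print(factor, [s.tenants for s in used])
```

With the made-up loads above, the five databases fit on two servers without overselling and on one server with a factor of 2.0. A real provisioning system would have to account for several resources at once (CPU, memory, disk I/O), for workloads that change over time, and for the different DBMS engines available.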

The “relational cloud” infrastructure has several strong points for database users: (i) predictable and lower costs, proportional to the quality of service and actual workloads, (ii) reduced technical complexity, thanks to a unified access interface and the delegation of DB tuning and administration, and (iii) elasticity and scalability, providing the perception of virtually infinite resources ready at hand. To achieve this, we are working on harnessing many recent technological advances in data management by efficiently exploiting multiple DBMS engines (targeting different types of workloads) in a self-balancing solution that optimizes the assignment of the resources of very large data centers to serve potentially thousands of users with very diverse needs. Among the critical research challenges to be faced, we have identified: (i) automatic database partitioning, (ii) multi-db, multi-engine workload balancing, (iii) scalable multi-tenancy strategies, (iv) high-profile distributed transaction support, and (v) automated replication and backup. These and many others are the key ingredients of the RelationalCloud (http://relationalcloud.com) prototype currently under development.