VSDB: Very Small Data Base

 

This was one of my first topics of interest. For my Bachelor thesis I developed, under the guidance of Cristiana Bolchini and Letizia Tanca, the PoLiDBMS tool discussed below.

The use of handheld devices, such as Smart Cards, Personal Digital Assistants (PDAs), Palm PCs and Cell Phones, to store data locally and to issue transactions against both local and remote data from Information Systems has been widely discussed in recent years.

The features that portable devices require in order to manage data are, in some respects, similar to those found in Embedded Database systems, and range from very simple file system functions to a full set of database management capabilities, including some ACID transaction properties. Databases for very small devices, henceforth called Very Small Data Bases (VSDB), are useful in various circumstances:

The personal (micro)information system, the so-called citizen’s card, which records administrative personal data like driver’s license and car information, passport, judicial register, etc.;

The personal medical record, reporting the owner’s clinical history complete with all the past clinical tests and diagnoses; this is most useful with patients suffering from some form of physical handicap or needing some critical treatment like periodical dialysis;

The traveling salesman database, i.e. the “clients portfolio”, storing visit schedule and purchase orders along with the interesting information about each client’s particular needs;

The personal travel database, recording all the travel (e.g. touristic) information considered interesting by the device owner.

 

In this scenario, a twofold research project began a couple of years ago to tackle the problems of (a) efficiently managing data stored locally on devices with limited resources (the VSDB DataBase Management System) and (b) designing and selecting the portion of data to be held (the VSDB Design Methodology).

The two aspects of the problem are closely related, and the research first investigated the opportunity to define new physical and logical data structures that exploit the technological characteristics of the digital mobile devices hosting the DBMS and the data. On top of these ad-hoc data structures, a Portable Light DBMS has been designed to process queries and manage the stored data, and a prototype (codename: PoLiDBMS) is currently available.

On the other hand, we also worked on the design methodology to define the DB to be stored locally on the device and be readily available to the user.

More specifically, the methodology we propose for Very Small Data Base design is based on the classical three levels of the ANSI-SPARC model, sharing many issues with the methodologies for distributed/federated database design. However, three main differences w.r.t. the traditional design methodologies are introduced: first, since most interesting microdevices are portable, the main mobility issues are to be considered along with data distribution; second, context awareness is included in the data design issues to allow a full exploitation of context sensitive application functionalities; third, the peculiarities of the storage device(s) must be taken into account from the early steps, thus a logistic phase is added after the usual conceptual and logical phases, which supports the designer in the physical design task by taking into account the logistic aspects of data storage.

By examining these three aspects together we delineate the “VSDB ambient”, which is the set of personal and environmental characteristics determining the portion of data that must be stored on the portable device.
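As a rough illustration of how an ambient could drive the choice of the data kept on the device, the following minimal Python sketch filters a set of records by context attributes and by a storage budget. The names (`Record`, `Ambient`, `select_for_device`), the attributes (role, location) and the smallest-first greedy selection are all assumptions made for this example; they are not taken from the VSDB methodology or from PoLiDBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    role: str         # hypothetical: who the record is relevant for
    location: str     # hypothetical: where the record is relevant
    size_bytes: int


@dataclass(frozen=True)
class Ambient:
    role: str             # personal characteristic (assumed)
    location: str         # environmental characteristic (assumed)
    capacity_bytes: int   # storage available on the device (assumed)


def select_for_device(records, ambient):
    """Return the records matching the ambient that fit in the device storage.

    Matching records are taken smallest-first until the next one would not
    fit (a simple greedy choice, used here only for illustration).
    """
    matching = [r for r in records
                if r.role == ambient.role and r.location == ambient.location]
    selected, used = [], 0
    for record in sorted(matching, key=lambda r: r.size_bytes):
        if used + record.size_bytes > ambient.capacity_bytes:
            break
        selected.append(record)
        used += record.size_bytes
    return selected


# Made-up sample data for the "traveling salesman" scenario.
records = [
    Record("client-1", "salesman", "Milano", 400),
    Record("client-2", "salesman", "Milano", 700),
    Record("client-3", "salesman", "Roma", 300),
    Record("patient-1", "doctor", "Milano", 500),
]
ambient = Ambient(role="salesman", location="Milano", capacity_bytes=1000)
chosen = select_for_device(records, ambient)
print([r.key for r in chosen])
```

With this sample data only the first Milano client fits in the assumed 1000-byte budget; a larger `capacity_bytes` would let both Milano clients through, while records for other roles or locations are never selected.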

About me

My Personal Interests
I play keyboards (piano, synth and samplers) in a Rock-Pop band named Moksha. I love sports: I’ve practiced water polo and judo at a competitive level. During summer, beach volleyball becomes a must, together with kite-surfing and canoeing. I act in the small drama company “La Corte Dei Miracoli”. I’m currently part of the non-profit organization “Ingegneria Senza Frontiere”. I’m currently studying modern dance (Hip-Hop and Jazz).

Context-ADDICT

One of my main research topics is Context-Aware Data Tailoring. The goal of this research is to exploit contextual information, properly captured by a model we are developing, to filter information coming from heterogeneous data sources. The result will be a common semantic view over the relevant portion of the available data. The system we are developing is named Context-ADDICT (Context-Aware Data Design Integration Customization and Tailoring); more details can be found here.

A presentation of Context-ADDICT is available here.

Curriculum Vitae

 

PERSONAL DETAILS

Name: Carlo Aldo Curino

Present Citizenship: Italian

E-mail(s): curino@mit.edu, carlo.curino@polimi.it, carlo@curino.us

Homepage: http://carlo.curino.us/

RESEARCH INTERESTS:

  • Seeking to develop novel systems and techniques for data and knowledge management. My recent research interests include:  Schema Evolution, Data Migration, Query/Update Rewriting, Data Integration,  Partitioning, Database as a Service, and Cloud Computing.

EDUCATION

  • 2009 PhD at Politecnico di Milano, GPA 4.0/4.0 Thesis: “Panta Rhei: Database Evolution and Integration from Practice to Vision”, advisors: Prof. Letizia Tanca (Polimi) and Prof. Carlo Zaniolo (UCLA)
  • 2006 Master in Computer Science at University of Illinois at Chicago (UIC), GPA 4.0/4.0, Thesis title: “Context integration for mobile data design”, advisor: Prof. Clement Yu.
  • 2005 Laurea Specialistica (Master of Science) in Ingegneria Informatica (Computer Science) at Politecnico di Milano 110/110 cum laude. Thesis title: “Context integration for mobile data design”, advisor: Prof. Letizia Tanca
  • 2003 Laurea di Primo Livello in Ingegneria Informatica (Computer Science Bachelor) at Politecnico di Milano with the degree of 106/110. Thesis title: “Design and Prototype Development of a DBMS for Portable devices”, advisor: Prof. Letizia Tanca.

RESEARCH EXPERIENCE

  • 2012-present Senior Research Scientist at Microsoft, Mountain View
  • 2011-2012 Research Scientist at Yahoo! Research, Santa Clara
  • 2009-2011 Post-Doc Associate at CSAIL, Massachusetts Institute of Technology (MIT), working on databases and cloud computing in collaboration with Prof. Sam Madden (MIT) and Prof. Hari Balakrishnan (MIT).
  • 2007-2009 Visiting Researcher at University of California, Los Angeles (UCLA), working on schema evolution and temporal databases collaborating with Prof. Carlo Zaniolo (UCLA), and Prof. Alin Deutsch (UCSD).

TEACHING EXPERIENCE

  • Fall 2010 Primary Lecturer at CSAIL-MIT for 6.830/6.814 Database Systems (course taught in collaboration with Michael Stonebraker)
  • 2006-2007 Teaching Assistant at Politecnico di Milano teaching the (Bachelor Level) course “Informatica 2” (network programming and HW architectures).
  • 2006-2007 Teaching Assistant at Politecnico di Milano teaching, in English, the (Master Level) course “Technologies for Information Systems” (data integration)
  • 2005-2006 Laboratory Lecturer at Politecnico di Milano, teaching “Software Engineering in Java”.
  • 2003-2005 Laboratory Assistant at Politecnico di Milano, teaching “C Programming”.
  • 2001-2003 Private Teaching experience (high school students on Math and Physics).

WORK EXPERIENCE

  • 2003-2005 Consultant (Java Programming and Linux Administration) for Forma.Service srl ( Via Spoleto 4 20100 Milano, Italy).
  • 2002-2005 Consultant (Programming and Linux Administration) for Verbano Informatica di Paolo Garlassi ( C.so Roma, 65 28883 Gravellona Toce, VB, Italy).
  • 2002-2003 Development of small office utilities and deployment of Linux-based solutions.
  • 1998-2001 part-time Website Designer.

PUBLICATIONS

Books:

  • 2010 “Panta Rhei: Database Evolution and Integration from Practice to Vision”, Carlo A. Curino, publisher LAP Lambert Academic Publishing AG & Co.  ISBN 978-3-8383-3721-0

Journals:

  • 2012 “Automating the Database Schema Evolution Process” Carlo Curino, Hyun Jin Moon, Alin Deutsch, and Carlo Zaniolo, VLDB Journal “Best of VLDB 2011”
  • 2009 “And what can Context do for Data?” Cristiana Bolchini, Carlo A. Curino, Giorgio Orsi, Elisa Quintarelli, Rosalba Rossato, Fabio A. Schreiber, Letizia Tanca accepted for publication in the Communications of the ACM
  • 2009 “Context Information for Knowledge Reshaping” Cristiana Bolchini, Carlo A. Curino, Elisa Quintarelli, Fabio A. Schreiber, Letizia Tanca, International Journal of Web Engineering and Technology (IJWET) Topic on “Web-based Knowledge Representation and Management”
  • 2007 “A Data-oriented Survey of Context Models” Cristiana Bolchini, Carlo A. Curino, Elisa Quintarelli, Fabio A. Schreiber, Letizia Tanca, SIGMOD Record, Vol. 34, Num. 4
  • 2005 “Mobile Data Collection in Sensor Networks: The TinyLime Middleware”, Carlo A. Curino, Matteo Giani, Marco Giorgetta, Alessandro Giusti, Gian Pietro Picco, Amy L. Murphy. Special Issue of Pervasive and Mobile Computing Journal (PerCom Journal) on “Security in Wireless Mobile Computing Systems”, vol. 4, no. 1, pp. 446-469, Elsevier

Conferences:

  • 2013: “Performance and Resource Modeling in Highly-Concurrent OLTP Workloads”, Barzan Mozafari, Carlo Curino, Alekh Jindal, Sam Madden (SIGMOD)
  • 2013: “DBSeer: Resource and Performance Prediction for Building a Next Generation Database Cloud” Barzan Mozafari, Carlo Curino, Sam Madden (CIDR)
  • 2012: “Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems”, Andrew Pavlo, Carlo Curino, Stan Zdonik (SIGMOD)
  • 2012: “Lookup Tables: Fine-Grained Partitioning for Distributed Databases” Aubrey Tatarowicz, Carlo Curino, Evan Jones, Sam Madden, (ICDE)
  • 2011: “Workload-aware Database Monitoring and Consolidation” Carlo Curino, Evan P. C. Jones, Sam Madden, Hari Balakrishnan, accepted to the International Conference on Management of Data (SIGMOD)
  • 2011:  “RelationalCloud: a Database Service for the cloud” Carlo Curino, Evan P. C. Jones, Raluca Ada Popa, Nirmesh Malviya, Eugene Wu, Sam Madden, Hari Balakrishnan, Nickolai Zeldovich. Conference on Innovative Database Research (CIDR)
  • 2011:  “No bits left behind” Eugene Wu, Carlo Curino, Sam Madden. (short paper) Conference on Innovative Database Research (CIDR)
  • 2011: “Update Rewriting and Integrity Constraint Maintenance in a Schema Evolution Support System: PRISM++”, Carlo Curino, Hyun J. Moon, Alin Deutsch, Carlo Zaniolo, accepted for publication at: Proceedings of Very Large Data Base (PVLDB)
  • 2010: “Schism: a Workload-Driven Approach to Database Replication and Partitioning”, Carlo Curino, Yang Zhang, Evan Jones, Sam Madden, accepted for publication at: Proceedings of Very Large Data Base (PVLDB)
  • 2010: “Scalable Architecture and Query Optimization for Transaction-time DBs with Evolving Schemas”, Hyun J. Moon, Carlo Curino, Carlo Zaniolo,  International Conference on Management of Data 2010 (SIGMOD)
  • 2009: “Accessing and Documenting Relational Databases through OWL ontologies”, Carlo Curino, Giorgio Orsi, Emanuele Panigati and Letizia Tanca, Flexible Query Answer Systems (FQAS)
  • 2008 “Graceful database schema evolution: the prism workbench” Carlo A. Curino, Hyun J. Moon, and Carlo Zaniolo. Very Large Data Base (PVLDB)
  • 2008 “Managing and querying transaction-time databases under schema evolution” Hyun J. Moon, Carlo A. Curino, Alin Deutsch, C.-Y. Hou, and Carlo Zaniolo. Very Large Data Base (PVLDB)
  • 2008 “Schema Evolution in Wikipedia: toward a Web Information System Benchmark” Carlo A. Curino, Hyun J. Moon, Letizia Tanca, Carlo Zaniolo, International Conference on Enterprise Information Systems (ICEIS)
  • 2008 “The Shining embedded system design methodology based on self dynamic reconfigurable architectures”, Carlo A. Curino, Vincenzo Rana, Marco Domenico Santambrogio, Francesco Redaelli, Donatella Sciuto, at “The 13th Asia and South Pacific Design Automation Conference” (ASP-DAC)
  • 2007 “X-SOM: Ontology Mapping and Inconsistency Resolution” Carlo A. Curino, Giorgio Orsi, Letizia Tanca, Poster at European Semantic Web Conference (ESWC)
  • 2006 “Context Integration for Mobile Data Tailoring” (Extended Abstract) Cristiana Bolchini, Carlo A. Curino, Fabio A. Schreiber, Letizia Tanca, Proceedings of the Italian Symposium on Advanced Database Systems (SEBD)
  • 2006 “Context integration for mobile data tailoring” Cristiana Bolchini, Carlo A. Curino, Fabio A. Schreiber, Letizia Tanca, Mobile Data Management (MDM)
  • 2005 “TinyLIME: Bridging Mobile and Sensor Networks through Middleware”, Carlo A. Curino, Matteo Giani, Marco Giorgetta, Alessandro Giusti, Gian Pietro Picco, Amy L. Murphy. IEEE Int. Conf. on Pervasive Computing and Communications. (PerCom)
  • 2005 “Mining Officially Unrecognized Side effects of Drugs by Combining Web Search and Machine Learning”, Carlo A. Curino, Bruce Lambert, Patricia M. West, Yuanyuan Liu, Clement Yu, ACM Conference on Information and Knowledge Management (CIKM)
  • 2004 “PoLiDBMS: Design and Prototype implementation of a DBMS for Portable Devices” C. Bolchini, C. Curino, M. Giorgetta, A. Giusti, A. Miele, F. A. Schreiber, L. Tanca. Proceedings of the Twelfth Italian Symposium on Advanced Database Systems, (SEBD)

Demos:

  • 2009 “PRIMA: Archiving and Querying Historical Data with Evolving Schemas” Hyun J. Moon, Carlo A. Curino,  MyungWon Ham, Carlo Zaniolo, accepted as demo at International Conference on Management of Data ’09 (SIGMOD)
  • 2009 “The PRISM Workbench: Database Schema Evolution Without Tears” Carlo A. Curino, Hyun J. Moon, MyungWon Ham, Carlo Zaniolo, accepted as demo at International Conference on Data Engineering ’09 (ICDE)
  • 2007 “CADD: The Context-ADDICT Designer tool for context modeling and data tailoring” Cristiana Bolchini, Carlo A. Curino, Giorgio Orsi, Elisa Quintarelli, Fabio A. Schreiber, Letizia Tanca, Demo Paper at Mobile Data Management (MDM)

Workshops:

  • 2012: “Benchmarking OLTP/Web Databases in the Cloud: the OLTP-Bench Framework”, Carlo Curino, Djellel Difallah, Andrew Pavlo, Phil Cudre-Mauroux (CloudDB)
  • 2010: “RelationalCloud: The case for a database service” Carlo Curino, Evan Jones, Yang Zhang, Eugene Wu, Sam Madden. New England Database Summit (NEDS)
  • 2009 “Automating Database Schema Evolution in Information System Upgrades” Carlo A. Curino, Hyun J. Moon, Carlo Zaniolo, Hot Software Upgrade (HotSWUp) workshop co-located with OOPSLA.
  • 2008 “Improving search and navigation by combining Ontologies and Social Tags”, Silvia Bindelli, Claudio Criscione, Carlo A. Curino, Mauro L. Drago, Davide Eynard, Giorgio Orsi, OTM Workshop: Ambient Data Integration (ADI)
  • 2008 “Managing the History of Metadata in support for DB Archiving and Schema Evolution”, Carlo A. Curino, Hyun J. Moon, Carlo Zaniolo, ER International Workshop on Evolution and Change in Data Management (ECDM)
  • 2008 “Research meets Education: DRESD, a virtuous circle”, Carlo A. Curino, Marco D. Santambrogio, Donatella Sciuto, European Workshop on Microelectronics Education (EWME)
  • 2008 “Information Systems Integration and Evolution: Ontologies at Rescue”, Carlo A. Curino, Letizia Tanca, Carlo Zaniolo International Workshop on Semantic Technologies for System Maintenance (STSM)
  • 2007 “X-SOM: A Flexible Ontology Mapper” Carlo A. Curino, Giorgio Orsi, Letizia Tanca, DEXA Workshop on Semantic Web Architectures For Enterprises (SWAE)
  • 2007 “Context-aware views for mobile users” Cristiana Bolchini, Carlo A. Curino, Giorgio Orsi, Elisa Quintarelli, Rosalba Rossato, Fabio A. Schreiber and Letizia Tanca, (Extended Abstract) 10th DELOS Thematic Workshop on Personalized Access, Profile Management, and Context Awareness in Digital Libraries (PERSDL)
  • 2007 “X-SOM results for OAEI 2007” Carlo A. Curino, Giorgio Orsi, Letizia Tanca, ISWC Workshop on Ontology Matching (OM)
  • 2007 “Emergent Semantics and Cooperation in MultiKnowledge Environments: the ESTEEM Architecture.” Esteem Team. In Proc. of the VLDB Int. Workshop on Semantic Data and Service Integration (SDSI), Vienna, Austria
  • 2006 “Ontology-Based Information Tailoring” Carlo A. Curino, Elisa Quintarelli, Letizia Tanca, ICDE Workshop InterDB

 

Internal Reports:

  • 2008 “Architecture and Optimization of Transaction-time DBs with Evolving Schemas (Extended Version)”. Hyun J. Moon, Carlo A. Curino, Carlo Zaniolo, UCLA CSD Technical Report 2008-08
  • 2008 “Managing and querying transaction-time databases under schema evolution” H. J. Moon, C. A. Curino, A. Deutsch, C.-Y. Hou, and C. Zaniolo. In UCLA CSD Technical Report 080007, March 2008.
  • 2008 “Schema Evolution in Wikipedia: toward a Web Information System Benchmark (Extended Version)” Carlo A. Curino, Hyun J. Moon, Letizia Tanca, Carlo Zaniolo, UCLA CSD Technical Report 080006, Feb. 2008
  • 2008 “Graceful database schema evolution: the prism workbench” Carlo A. Curino, Hyun J. Moon, and Carlo Zaniolo. In UCLA CSD Technical Report 080008, March 2008.
  • 2007 “The Shining embedded system design methodology based on self dynamic reconfigurable architectures”, Carlo A. Curino, Alessio Montone, Vincenzo Rana, Francesco Redaelli, Marco D. Santambrogio, Donatella Sciuto, Politecnico di Milano Internal Report 2007.44
  • 2007 “Java Based Hardware Languages: Integration in a hardware design environment for reconfigurable systems”, Carlo A. Curino, Politecnico di Milano Internal Report 2007.45
  • 2006 “Context-ADDICT” Cristiana Bolchini, Carlo A. Curino, Elisa Quintarelli, Fabio A. Schreiber, Letizia Tanca, Politecnico di Milano Internal Report 2006.05
  • 2003 “MIPS implementation of some very small database data structures” Carlo A. Curino, Matteo Giani, Marco Giorgetta, Alessandro Giusti, Marco Trincavelli, Politecnico di Milano Internal Report 2003.45
  • 2003 “Portable Light DBMS: PoLiDBMS White Paper” Carlo A. Curino, Marco Giorgetta, Alessandro Giusti, Antonio Miele, Politecnico di Milano Internal Report 2003.46

RELEASED SOFTWARE

  • 2009 Schema Evolution Toolsuite: an analysis tool for statistical analysis of long schema evolution histories, LGPL license
  • 2008 PRISM: a system for graceful schema evolution. Demo on-line http://yellowstone.cs.ucla.edu/schema-evolution/index.php/Prism
  • 2008 PRIMA: temporal query on archives under schema evolution. Demo on-line http://yellowstone.cs.ucla.edu/demo/prima/index.html
  • 2008 microJena: porting of the Jena ontology API for mobile phones, J2ME. Part of the Jena Contrib package, and available for download at: http://jena.sourceforge.net/contrib/contributions.html, LGPL license
  • 2008 HMM: a history metadata manager, a collection of tools to assist the schema evolution process. Contact me to obtain the tool.
  • 2008 TagOnto: an ontology-based system to integrate tag-centric websites. Demo on-line http://kid.dei.polimi.it/tagonto/, GPL license
  • 2007 CADDTool: A tool to support context-aware data design. Contact me for a copy of the tool.
  • 2006 DrugSearch: Neural network based tool to retrieve and filter drugs side effects from the web. Contact me for a copy of the tool, GPL license
  • 2005 TinyLime: Integration of Sensor Networks in the Lime middleware, available at http://lime.sourceforge.net, LGPL license
  • 2004 PoLiDBMS: database management system for small devices, available at http://prometeo.elet.polimi.it/, GPL license

AWARDS/SCHOLARSHIPS

  • 2005-2008 Governmental Ph.D. three years scholarship “Informatica Avanzata Multimediale Distribuita” (Advanced Multimedia Distributed Computer Science).
  • 2007 PERSDL 2007 student bursary
  • 2006 DASI winter school PhD Student travel grant
  • 2005 Accenture Best Master Thesis Award, a 2,500 euros prize to the best engineering thesis of 2005 at Politecnico di Milano, offered by Accenture Ltd.
  • 2003-2005 Politecnico di Milano Academic Merit Fee Waiver

INVITED TALK / KEYNOTE

 

  • 2012: Keynote at CloudDB 2012 “Benchmarking OLTP/Web Databases in the Cloud: the OLTP-Bench Framework”

 

PROFESSIONAL ACTIVITIES

  • 2012 Program Committee for WebDB 2012
  • 2012 Program Committee for ESWC 2012
  • 2011 Program Committee for DMC 2012
  • 2011 Reviewer for Communications of ACM
  • 2011 Reviewer for TKDE
  • 2011 Reviewer for VLDB Journal
  • 2011 Program Committee member for PVLDB 2012
  • 2011 Workshop Organizer for the Data Lifecycle (DaLi) workshop held at ICDE 2011
  • 2010 PC member for Hot topic on SoftWare Upgrade (HotSWUp)
  • 2009 Reviewer for TPLP Journal, Special Issue on “Logic Programming in Databases: from Datalog to Semantic Web Rules”
  • 2009 Program Committee member for the 20th Database and Expert Systems Applications (DEXA)
  • 2009 Reviewer for SIGMOD 2009
  • 2008 Program Committee member for the 20th Database and Expert Systems Applications (DEXA) Conference 2009
  • 2008 Reviewer for SDM 09, SAC 09, DaWak 08
  • 2007 Reviewer for the Data and Knowledge Engineering Journal (DKE), SEBD 08 conference

LANGUAGES

  • ITALIAN (native)
  • ENGLISH (fluent)
  • SPANISH (basic)

REFERENCES

Available upon request.


Homepage

 

Welcome to my homepage. Following the links you can find information about my research activity, but also about teaching and personal interests.

Random thoughts, things not to be said, suggestions to the world, and general arrogant ego-intensive kind of stuff are available in my blog.