Authors: Marco Vogt, Alexander Stiemer, Heiko Schuldt
Cloud providers are increasingly confronted with diverse and heterogeneous requirements that their customers impose on the management of data. First, these requirements stem from service-level agreements that specify a desired degree of availability and a guaranteed latency. As a consequence, Cloud providers replicate data across data centers or availability zones and/or partition data and place it close to the location of their customers. Second, the workload at each Cloud data center or availability zone is diverse and may change significantly over time, e.g., an OLTP workload during regular business hours and OLAP analyses overnight. To address this, polystore and multistore databases have recently been introduced, as they are intrinsically able to cope with such mixed and varying workloads. While the problem of heterogeneous requirements on data management in the Cloud is addressed either at the global level, by replicating and partitioning data across data centers, or at the local level, by providing polystore systems within a Cloud data center, there is no integrated solution that leverages the benefits of both approaches. In this paper, we present the Polypheny-DB vision of a distributed polystore system that seamlessly combines replication and partitioning with local polystores and that is able to dynamically adapt all parts of the system when the workload changes. We present the basic building blocks for both parts of the system and discuss open challenges towards the implementation of the Polypheny-DB vision.