{"id":2847,"date":"2026-03-25T14:42:21","date_gmt":"2026-03-25T13:42:21","guid":{"rendered":"https:\/\/marcovogt.de\/?p=2847"},"modified":"2026-04-13T10:54:52","modified_gmt":"2026-04-13T08:54:52","slug":"polypipe-mergingdata-pipelines-and-multi-model-databases","status":"publish","type":"post","link":"https:\/\/marcovogt.de\/index.php\/2026\/03\/25\/polypipe-mergingdata-pipelines-and-multi-model-databases\/","title":{"rendered":"PolyPipe: Merging Data Pipelines and Multi-Model Databases"},"content":{"rendered":"\n<p><strong>Authors:<\/strong> David Lengweiler, Tobias Weber, Heiko Schuldt, Marco Vogt<\/p>\n\n\n\n<p>Modern data is characterized by its high volume and inherent heterogeneity, primarily managed by systems tailored to three distinct modeling paradigms: the relational model, which enforces strict schema and high structural integrity; the document model, which offers schema flexibility for semi-structured data; and the graph model, which prioritizes modeling complex relationships between entities. While the database industry is trending toward multi-model systems that incorporate features from all paradigms, data management practices still lag behind. Data scientists rely on manual, multi-stage and labor-intensive workflows to integrate disparate data sources. This process forces users to switch tools, results in high data shipping costs, and forfeits database-level optimizations and structural guarantees, leading to complex, brittle and non-reusable &#8220;one-off&#8221; solutions.<br>We argue that embedding data pipelines directly into a multi-model database offers significant benefits, including streamlining, simplification, and improved maintainability, by utilizing declarative, database-native operators.<\/p>\n\n\n\n<p>This paper presents PolyPipe, an extension to the Polypheny multi-model database system. 
PolyPipe integrates data pipeline functionality as a first-class citizen, allowing the construction of complex pipelines using a hybrid of database and classical operators within a single system.<\/p>\n\n\n\n<p><strong>Link:<\/strong> <a href=\"https:\/\/openproceedings.org\/2026\/conf\/edbt\/paper-311.pdf\">https:\/\/openproceedings.org\/2026\/conf\/edbt\/paper-311.pdf<\/a><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Authors: David Lengweiler, Tobias Weber, Heiko Schuldt, Marco Vogt Modern data is characterized by its high volume and inherent heterogeneity, primarily&#8230; <\/p>\n<div class=\"readmore\"><a href=\"https:\/\/marcovogt.de\/index.php\/2026\/03\/25\/polypipe-mergingdata-pipelines-and-multi-model-databases\/\" class=\"lnk\">Read more<\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"elementor_theme","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[22],"tags":[],"class_list":["post-2847","post","type-post","status-publish","format-standard","hentry","category-publication"],"acf":[],"_links":{"self":[{"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/posts\/2847","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/comments?post=2847"}],"version-history":[{"count":4,"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/posts\/2847\/revisions"}],"predecessor-version":[{"id":2890,"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/posts\/2847\/revisions\/2890"}],"wp:attachment":[{"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/media?parent=2847
"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/categories?post=2847"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/marcovogt.de\/index.php\/wp-json\/wp\/v2\/tags?post=2847"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}