Trick or Treat: Centralized Data Lake vs Decentralized Data Mesh
Abstract
Over the last few years, growth in the volume of processed data and an increasing need for fast product release cycles have created bottlenecks in information and knowledge flows within large organizations. Recent research has attempted to resolve these issues from several perspectives, ranging from data platform architectures to storage technologies. In this position paper, we start by comparing the well-established approaches to designing analytical data platforms and review the problems inherent to them, namely the centralization of storage and ownership. We then analyze the principles of the data mesh proposal and examine its unresolved challenges, such as metadata centralization. We further consider the business domain dependencies and the platform architecture of our running example. The final section presents our vision for solving the identified metadata management issues in large enterprises via data decentralization and offers potential directions for future work.
Domains
Computer Science [cs]
Origin: Files produced by the author(s)