7_Stadler_DaMaLOS_2024_paper_7.pdf 967.34 KB
Weight Name Value
1000 Title
  • FAIR Data Publishing with Apache Maven
1000 Author
  1. Stadler, Claus
  2. Bin, Simon
  3. Bühmann, Lorenz
1000 Year of publication 2024
1000 Publication type
  1. Conference publication
1000 Published online
  • 2024-06
1000 Published in
1000 Parent conference
1000 License
1000 Publisher's version
  • https://zbmed.github.io/damalos/
1000 Supplementary material
  • https://scaseco.github.io/maven4data/
  • https://doi.org/10.5281/zenodo.11636638
1000 Publication status
1000 Peer review status
1000 Language of publication
1000 Abstract/Summary
  • Design and management of a large number of data processing pipelines is a challenging task. Analogous to DevOps, the term DataOps was coined to capture all the practices, processes and technologies related to managing the life cycle of data artifacts, including the tracking of provenance. The solution space keeps growing as novel approaches and tools become available; with, for instance, more than 100 workflow engines available, it is no longer feasible to assess them all. Semantic Web technology features many aspects relevant to DataOps, such as interlinkability of resources, DCAT for building decentralized data catalogs, PROV-O for provenance descriptions, and VoID for describing statistics about the used classes and properties. Yet, there are only a few approaches that establish a coherent and holistic connection between these elements. In this work, we perform an in-depth analysis of the Apache Maven build system and its surrounding ecosystem for how they can be leveraged for automated data processing, publishing and RDF metadata generation with provenance tracking. We present three novel Maven plugins for SPARQL and RML execution, the creation of an RDF database file, and uploading artifacts to a CKAN instance. Finally, we present a prototype architecture where a Maven deployment of a geographic RDF dataset results in the automated generation of DCAT, PROV-O and VoID metadata, such that datasets can be browsed on a map and filtered, e.g., by the used classes and properties. All our resources are freely available as Open Source.
1000 Subject indexing
local DataOps
local Reproducible
local Data Management
local Semantic Web
local Apache Maven
local FAIR
1000 Subject classification (DDC)
1000 DOI 10.4126/FRL01-006474023
1000 List of contributors
  1. https://orcid.org/0000-0001-9948-6458|https://frl.publisso.de/adhoc/uri/QmluLCBTaW1vbg==|https://orcid.org/0000-0002-1023-9993
1000 (Academic) Editor
1000 Label
1000 Funder
  1. Bundesministerium für Wirtschaft und Klimaschutz
  2. Bundesministerium für Verkehr und Digitale Infrastruktur
1000 Grant number
  1. 01MK21007A
  2. 19F2266A
1000 Funding program
  1. Coypu
  2. Moby Dex
1000 Files
  1. FAIR Data Publishing with Apache Maven
1000 Funding
  1. 1000 joinedFunding-child
    1000 Funder Bundesministerium für Wirtschaft und Klimaschutz
    1000 Funding program Coypu
    1000 Grant number 01MK21007A
  2. 1000 joinedFunding-child
    1000 Funder Bundesministerium für Verkehr und Digitale Infrastruktur
    1000 Funding program Moby Dex
    1000 Grant number 19F2266A
1000 Object type article
1000 Described by
1000 @id frl:6474023.rdf
1000 Created on 2024-04-04T15:09:03.439+0200
1000 Created by 338
1000 describes frl:6474023
1000 Edited by 339
1000 Last edited Tue Jun 25 07:54:51 CEST 2024
1000 Object edited Tue Jun 25 07:54:36 CEST 2024
1000 Cf. frl:6474023
1000 OAI ID
  1. oai:frl.publisso.de:frl:6474023
1000 Metadata visibility public
1000 Data visibility public
1000 Subject of
