Resources




Facilitating Research Data Management with HELIPORT
DOI

11th April 2024

Oliver Knodel, Stefan E. Müller, David Pape

28th HiRSE Seminar

Abstract

Researchers rely on a variety of systems and tools when it comes to administering their research data. Processes involving research data management include proposal submission, data management planning, simulation campaigns, documentation during the experiment, and the creation and submission of journal and data publications. HELIPORT is a data management solution that aims at making all steps of the research experiment’s life cycle discoverable, accessible, interoperable and reusable according to the FAIR principles. This is done by linking to and interfacing with established tools and solutions, and exchanging metadata between systems involved in a project. The metadata are presented to the researchers through a web interface, but they are also accessible to computational agents via API and machine-readable landing pages. In this presentation, we will introduce the metadata project HELIPORT and what provided the impulse for the project, discuss the documentation of a real experiment in HELIPORT, and outline current developments and challenges.




Documenting ML Experiments in HELIPORT

March 2024

David Pape, Oliver Knodel, Sebastian Starke

deRSE24 - Conference for Research Software Engineering in Germany

Abstract

HELIPORT is a data management guidance system that aims at making the components and steps of the entire research experiment's life cycle findable, accessible, interoperable and reusable according to the FAIR principles. It integrates documentation, computational workflows, data sets, the final publication of the research results, and many more resources. This is achieved by gathering metadata from established tools and platforms and passing along relevant information to the next step in the experiment's life cycle. HELIPORT's high-level overview of the project allows researchers to keep all aspects of their experiment in mind.

Machine learning projects are a particularly interesting use case. They are often prototypical in nature and driven by iterative development, so reproducibility and transparency are a great concern. It is essential to keep track of the relationship between input data, choices in model parameters, the code version in use, and performance measures and generated outputs at all times. This requires a data management platform that automatically records the changes made and their effects. Existing MLOps tools (such as Weights and Biases, MLflow) live entirely in the ML domain and start their workflow with the assumption that data is available. HELIPORT, on the other hand, takes care of the data lifecycle as well. Our envisioned platform interoperates with the domain-specific tools already used by the scientists, and is able to extract relevant metadata (e.g. provenance). It can also make persistent any additional information such as papers the work was based on, documentation of software components, workflows, or failure cases. Moreover, it should be possible to publish these metadata in machine-readable formats.

The challenge arising from these aspects lies in integrating ML workflows into HELIPORT in such a way that they work on the provided data and metadata. The goal is also to enable the comprehensible development of ML models alongside the experiment documented in HELIPORT. This allows different teams (e.g. experimentalists and AI specialists) to work together on the same project in a seamless manner, and helps generate FAIRer outcomes. In the long term we hope to aid in establishing digital twins of facilities, and in making their maintenance a part of the data management process.
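
As an illustration of the bookkeeping described in this abstract, the following is a minimal sketch of a record that ties one training run to its input data, parameter choices, code version, performance measures and outputs, and serializes it in a machine-readable form. It is not HELIPORT's data model or API: the `RunRecord` class, its field names and the example values are all hypothetical and chosen only for this sketch.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest used here to fingerprint input data."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class RunRecord:
    """Hypothetical record linking one ML run to its inputs and outputs."""
    input_digest: str                              # fingerprint of the input data
    parameters: dict                               # model parameter choices
    code_version: str                              # e.g. a commit identifier
    metrics: dict = field(default_factory=dict)    # performance measures
    outputs: list = field(default_factory=list)    # identifiers of generated outputs

    def to_json(self) -> str:
        """Serialize to JSON with a stable key order."""
        return json.dumps(asdict(self), sort_keys=True)


# Example values are invented for illustration only.
record = RunRecord(
    input_digest=sha256_of(b"example input data"),
    parameters={"learning_rate": 0.01, "epochs": 5},
    code_version="abc1234",
    metrics={"accuracy": 0.9},
    outputs=["model-v1"],
)

# Round trip: the JSON form carries everything needed to rebuild the record.
restored = json.loads(record.to_json())
```

Keeping the record as plain data means it can be stored next to the experiment documentation and compared across iterations of a model.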




HELIPORT: An overarching Data Management System at HZDR

March 2024

Stefan E. Müller, Thomas Gruber, Oliver Knodel, Jeffrey Kelling, Mani Lokamani, David Pape, Martin Voigt, Guido Juckeland

DPG-Frühjahrstagung 2024

Abstract

Researchers at the Helmholtz-Zentrum Dresden-Rossendorf rely on a variety of systems and tools when it comes to administering their research data. Processes involving research data management include the project planning phase (proposal submission to the beamtime proposal management system, the creation of data management plans and data policies), the documentation during the experiment or simulation campaign (electronic laboratory notebooks, wiki pages), backup and archival systems, and the final journal and data publications (collaborative authoring tools, metadata catalogs, software and data repositories, publication systems). In addition, modern research projects are often required to interact with a variety of software stacks and workflow management systems to allow reproducibility on the underlying IT infrastructure. The "HELmholtz ScIentific Project WORkflow PlaTform" (HELIPORT), which is currently being developed by researchers at HZDR and their collaborators, aims to facilitate the management of research data and metadata by providing an overarching guidance system that combines all the information by interfacing with the underlying processes. It also includes a workflow engine that can be used to automate processes such as data analysis or data retrieval.




Pioneering Digital Research Landscapes: Innovations at HZDR

February 2024

Oliver Knodel

Helmholtz Open Science Forum: Towards Open Digital Research Ecosystems – Interconnecting Infrastructures

Abstract

Digital infrastructures have become indispensable in the field of modern research and science. These technological frameworks play a crucial role for the entire research cycle, supporting literature searches, aiding in data collection and analysis, facilitating the creation and publication of scholarly works, and ensuring the thorough documentation and long-term storage of research findings. Additionally, these infrastructures serve as a vital means for networking and communication among peers, creating the essential foundation of an open and transparent science and research ecosystem.
In this lecture, the entire digital research landscape at the HZDR will be presented and illustrated using a representative experiment.




Open Research Project Guidance System: HELIPORT

February 2024

Tobias Huste, Oliver Knodel, Thomas Gruber, Jeffrey Kelling, Mani Lokamani, Stefan E. Müller, David Pape, Martin Voigt, Guido Juckeland, Malte C. Kaluza, Joachim Hein, Alexander Kessler, Chien-Li Lee and Bernd Schuller

Helmholtz Open Science Forum: Research Software

Abstract

In this presentation, HELIPORT is outlined with a focus on the research software product itself. The project was initially funded by the Helmholtz Metadata Collaboration (HMC).




Overarching Data Management Ecosystem at HZDR
DOI

September 2023

Oliver Knodel, Thomas Gruber, Jeffrey Kelling, Mani Lokamani, Stefan E. Müller, David Pape, Martin Voigt and Guido Juckeland

Vol. 1 (2023): 1st Conference on Research Data Infrastructure (CoRDI) - Connecting Communities

Abstract

When dealing with research data management, researchers at Helmholtz-Zentrum Dresden-Rossendorf (HZDR) face a variety of systems and tools. These range from the project planning phase (proposal management, data management plans and policies), over documentation during the experiment or simulation campaign, to the publication (collaborative authoring tools, metadata catalogs, publication systems, data repositories). In addition, modern research projects usually are required to interact with a variety of software stacks and workflow management systems to allow comprehensible and FAIR science on the underlying IT infrastructure (HPC, data storage, network file systems, archival). This article first demonstrates the data management systems and services provided at HZDR, followed by an overview of a self-developed guidance system. It is concluded by a real-world example.




First HELIPORT Workshop: Book of Abstracts
DOI

05. - 06. October 2022

First HELIPORT Community Workshop 2023

Alexander Kessler, Alexey Ponomaryov, Andrew K. Mistry, Anton Barty, Arie Irman, Astrid Schneidewind, Bernd Schuller, Boxing Gou, Brian Edward Marre, Bridget Murphy, Carina Becker, Carolin Hundt, Chien-Li Lee, Christian Gutt, Christiane Schneide, Claudia Engelhardt, David Pape, Florian Rau, Frank Maas, Frank Schreiber, Friedrich Bethke, Gerrit Guenther, Guido Juckeland, Gunnar Pruß, Hans-Peter Schlenvoigt, Jan-Christoph Deinert, Jan-Dierk Grunwaldt, Jeffrey Kelling, Joachim Hein, Johannes Sperling, Kilian Schwarz, Kristin Elizabeth Tippey, Leon Steinmeier, Lisa Amelung, Malte Christoph Kaluza, Mani Lokamani, Marc Hanisch, Martin Voigt, Michael Bussmann, Moritz Kurzweil, Nico Hoffmann, Nicole Wagner, Oliver Knodel, Oonagh Mannix, Patrick Ufer, Peter Baumgärtel, Ralph Müller-Pfefferkorn, Sebastian Baunack, Sebastian Busch, Sebastian Sachse, Sebastian Starke, Sergey Kovalev, Simone Vadilonga, Stefan Bock, Stefan Mueller, Susanne Schoebel, Thomas Gruber, Thomas Kluge, Tobias Unruh, Wiebke Lohstroh, Wolfgang Horn

Abstract

In our HELIPORT workshop, we will provide insights into our project and share our results. In addition, we would like to provide a platform for the presentation of similar projects, as well as extensions or integrations from the surrounding research areas. The overall goal of the workshop is bringing together different institutions with similar challenges and establishing a community around our HELIPORT project. We welcome submissions on related projects, metadata in our scientific field in general or workflows, in the form of talks or posters. We also welcome first or future HELIPORT use-cases from within our community!




Project HELIPORT: The Integrated Research Data Lifecycle of the HELIPORT Project
DOI

05. - 06. October 2022

Helmholtz Metadata Collaboration | Conference 2022

Oliver Knodel, Martin Voigt, Robert Ufer, David Pape, Mani Lokamani, Jeffrey Kelling, Stefan E. Müller, Thomas Gruber, Guido Juckeland, Malte C. Kaluza, Joachim Hein, Alexander Kessler, Chien-Li Lee and Bernd Schuller

Abstract

The HELIPORT project aims to make the components or steps of the entire life cycle of a research project at the Helmholtz-Zentrum Dresden-Rossendorf (HZDR) and the Helmholtz Institute Jena (HIJ) discoverable, accessible, interoperable and reusable according to the FAIR principles. In particular, this data management solution deals with the entire lifecycle of research experiments, from the generation of the first digital objects, through the workflows carried out, to the actual publication of research results. For this purpose, a concept was developed that identifies the different systems involved and their connections. By integrating computational workflows (CWL and others), HELIPORT can automate calculations that work with metadata from different internal systems (application management, Labbook, GitLab, and others). This presentation will cover the first year of the project, the current status and the path taken so far in the life cycle of the project.




Integrated Data Workflow using HELIPORT at TELBE
DOI

05. - 06. October 2022

Helmholtz Metadata Collaboration | Conference 2022

Lokamani, David Pape, Thomas Gruber, Jan-Christoph Deinert, Martin Voigt, Oliver Knodel, Jeffrey Kelling, Stefan E. Müller and Guido Juckeland

Abstract

At the High-Field High-Repetition-Rate Terahertz facility @ELBE (TELBE), ultrafast terahertz-induced dynamics can be probed in various states of matter with highest precision. The TELBE sources offer both stable and tunable narrowband THz radiation with pulse energies of several microjoules at high repetition rates, and a synchronized coherent diffraction radiator that provides broadband single-cycle pulses. The measurements at TELBE are data intensive: a single experiment, which can last up to several minutes, can produce as much as 20 GB of data. As a result, the current data acquisition and data analysis stages are decoupled: in the first step the primary data is processed and stored at HZDR, and in a later step restricted data access is made available to the user for post-processing. In this poster contribution, we present an integrated workflow for post-processing of the experimental data at TELBE with built-in exchange of metadata between the experiment control software LabView and the workflow execution engine UNICORE. We also present the guidance system HELIPORT[3], which manages the metadata of the associated project proposal and job information from UNICORE, and integrates with the electronic lab notebook (MediaWiki), providing a user-friendly interface for monitoring the actively running experiments at TELBE.




HELIPORT — An Integrated Research Data Lifecycle
DOI

05. - 06. October 2022

Helmholtz Metadata Collaboration | Conference 2022

26. September 2022

8th Annual "Matter and Technologies" Meeting

22. September 2022

3. Sächsische FDM-Tagung - Forschungsdatenmanagement im Spannungsfeld zwischen Idealen, Anforderungen und Praxis

05. - 07. September 2022

German Conference for Research with Synchrotron Radiation, Neutrons and on Beams at Large Facilities

Oliver Knodel, David Pape, Martin Voigt, Thomas Gruber, Jeffrey Kelling, Mani Lokamani, Stefan E. Müller, Guido Juckeland, Alexander Kessler, Joachim Hein, Malte C. Kaluza and Bernd Schuller

Abstract

HELIPORT is a data management solution that aims at making the components and steps of the entire research experiment’s life cycle discoverable, accessible, interoperable and reusable according to the FAIR principles.
Among other information, HELIPORT integrates documentation, scientific workflows, and the final publication of the research results - all via already established solutions for proposal management, electronic lab notebooks, software development and devops tools, and other additional data sources. The integration is accomplished by presenting the researchers with a high-level overview to keep all aspects of the experiment in mind, and automatically exchanging relevant metadata between the experiment’s life cycle steps.
Computational agents can interact with HELIPORT via a REST API that allows access to all components, and landing pages that allow for export of digital objects in various standardized formats and schemas. An overall digital object graph combining the metadata harvested from all sources provides scientists with a visual representation of interactions and relations between their digital objects, as well as their existence in the first place. Through the integrated computational workflow systems, HELIPORT can automate calculations using the collected metadata.
By visualizing all aspects of large-scale research experiments, HELIPORT enables deeper insights into a comprehensible data provenance with the chance of raising awareness for data management.
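
To make the idea of a digital object graph in this abstract more concrete, here is a minimal sketch that stores typed relations between digital objects and follows them to collect the provenance of one object. It does not reproduce HELIPORT's implementation or its REST API: the `DigitalObjectGraph` class, the object names and the relation names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


class DigitalObjectGraph:
    """Hypothetical directed graph of digital objects and their relations."""

    def __init__(self):
        # Maps each object to a set of (relation, target) pairs.
        self._edges = defaultdict(set)

    def relate(self, subject, relation, target):
        """Record that `subject` is linked to `target` by `relation`."""
        self._edges[subject].add((relation, target))

    def provenance(self, start):
        """Return every object reachable from `start` by following relations."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for _, target in self._edges.get(node, ()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


# Invented example objects: a publication traced back to its proposal.
graph = DigitalObjectGraph()
graph.relate("publication", "is_based_on", "dataset")
graph.relate("dataset", "was_generated_by", "workflow")
graph.relate("workflow", "belongs_to", "proposal")

lineage = graph.provenance("publication")
```

Here `lineage` contains the dataset, the workflow and the proposal, which is the kind of chain a visual graph of the project would expose.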




A FAIRly Integrated Scientific Project Lifecycle

15. July 2022

Oliver Knodel, Martin Voigt, Robert Ufer, David Pape, Mani Lokamani, Jeffrey Kelling, Stefan E. Müller, Thomas Gruber, Guido Juckeland, Malte C. Kaluza, Joachim Hein, Alexander Kessler and Bernd Schuller

HMC Dialogue

Abstract

The talk introduces the general idea behind the HELIPORT project, which aims to make the entire life cycle of a scientific experiment or project discoverable, accessible, interoperable and reusable by providing an overview from a top-level perspective. Specifically, our data management solution addresses the areas from data generation to publication of primary research data, computing workflows performed and the actual research results.




HELIPORT - An Integrated Research Data Lifecycle

5. May 2022

Oliver Knodel, Martin Voigt, Robert Ufer, David Pape, Mani Lokamani, Jeffrey Kelling, Stefan E. Müller, Thomas Gruber, Guido Juckeland, Malte C. Kaluza, Joachim Hein, Alexander Kessler and Bernd Schuller

ZIH colloquium at TU Dresden

Abstract

The guidance system HELIPORT aims to make the components or steps of the entire life cycle of a research project at Helmholtz-Zentrum Dresden-Rossendorf (HZDR) discoverable, accessible, interoperable and reusable according to the FAIR principles. In particular, this data management solution deals with the entire lifecycle of research experiments, from the generation of the first digital objects, through the workflows carried out, to the actual publication of research results. For this purpose, a concept was developed that identifies the different systems involved and their connections. By integrating computational workflows (CWL and others), HELIPORT can automate calculations that work with metadata from different internal systems (application management, Labbook, GitLab, and others).
In this lecture, the overall system will be presented using a practical example.




9. March 2022

Alexander Kessler, Joachim Hein and Malte Kaluza

HIJ semi-annual palaver

Abstract

The presentation gives an overview of the HELIPORT project at the Helmholtz Institute Jena and related IT activities.




Presentation for ELBE Beamline Scientists at HZDR

4. February 2022

Oliver Knodel, Stefan E. Müller

Abstract

The presentation gives an overview of the HELIPORT project. It also gives insight into our motivation for developing a guidance system, which is now known as HELIPORT.




Project Poster

24. January 2022 (updated)

Oliver Knodel, Martin Voigt, Robert Ufer, David Pape, Mani Lokamani, Jeffrey Kelling, Stefan E. Müller, Thomas Gruber, Guido Juckeland, Malte C. Kaluza, Joachim Hein, Alexander Kessler and Bernd Schuller

Abstract

The HELIPORT poster provides a short overview of the project and introduces the usage of Handles, the workflow integration, the top-level project plan and the project metadata schema.

Used at various HMC events to provide an overview of the project and its progress.




Full Integrated Research Data Lifecycle – The Project HELIPORT
DOI

16. December 2021

Oliver Knodel

SaxFDM Digital Kitchen

Abstract

Scientific experiments use a wide range of software tools in the various phases of a project, from proposal submission through data acquisition to the final publication. A major challenge for research institutions is to provide scientists with additional metadata according to the FAIR principles for documenting the tools used in all phases of the research project. The aim of the HELmholtz ScIentific Project WORkflow PlaTform (HELIPORT) is therefore to register the complete life cycle of a scientific project and to link the associated programs and systems with one another. The machine-readable documentation of all work steps carried out in the respective research project, together with the associated metadata, makes every work step transparent, comprehensible and citable, and thus contributes to compliance with good scientific practice.
In this presentation, Dr. Oliver Knodel of HZDR introduces the HMC-funded project HELIPORT (2021-2023) and places it in the context of the data management structure at HZDR.




HELIPORT use case POLARIS: Integration of a High Intensity Laser in a complete data life cycle workflow

28. October 2021

Oliver Knodel, Joachim Hein, Alexander Kessler

Laserlab-Europe, ELI and CASUS Workshop "Better Data for Better Science - Research Data Management Workshop"

Abstract

The presentation outlines the POLARIS experiment at Helmholtz Institute Jena, including experimental chain, setup and first ideas regarding the description of the POLARIS experiment with HELIPORT.




HELIPORT (HELmholtz ScIentific Project WORkflow PlaTform)

28. October 2021

Oliver Knodel, Martin Voigt, Robert Ufer, David Pape, Mani Lokamani, Jeffrey Kelling, Stefan E. Müller, Thomas Gruber, Guido Juckeland, Malte C. Kaluza, Joachim Hein, Alexander Kessler and Bernd Schuller

Laserlab-Europe, ELI and CASUS Workshop "Better Data for Better Science - Research Data Management Workshop"

Abstract

The presentation outlines the HELIPORT project. The HELIPORT project aims at developing a platform which accommodates the complete life cycle of a scientific project and links all corresponding programs, systems and workflows to create a more FAIR and comprehensible project description.




HELIPORT: A Portable Platform for {FAIR Workflow | Metadata | Scientific Project Lifecycle} Management and Everything
DOI

June 2021

Oliver Knodel, Martin Voigt, Robert Ufer, David Pape, Mani Lokamani, Jeffrey Kelling, Stefan E. Müller, Thomas Gruber and Guido Juckeland

P-RECS '21: Proceedings of the 4th International Workshop on Practical Reproducible Evaluation of Computer Systems

Abstract

Modern scientific collaborations and projects (MSCPs) employ various processing stages, starting with the proposal submission, continuing with data acquisition and concluding with final publications. The realization of such MSCPs poses a huge challenge due to (1) the complexity and diversity of the tools, (2) the heterogeneity of various involved computing and experimental platforms, (3) flexibility of analysis targets towards data acquisition and (4) data throughput. Another challenge for MSCPs is to provide additional metadata according to the FAIR principles for all processing stages for internal and external use. Consequently, the demand has risen considerably for a system that assists the scientist in all project stages and archives all processes on the basis of metadata standards like DataCite, so that everything becomes transparent, understandable and citable. The aim of this project is the development of the HELmholtz ScIentific Project WORkflow PlaTform (HELIPORT), which ensures data provenance by accommodating the complete life cycle of a scientific project and linking all employed programs and systems. The modular structure of HELIPORT enables the deployment of the core applications to different Helmholtz centers (HZs) and can be adapted to center-specific needs simply by adding or replacing individual components. HELIPORT is based on modern web technologies and can be used on different platforms.
