October 2023: Two DFG Research Software grants
Successful DFG proposals “Research Software – Quality Assured and Re-usable”
The SSC supported two teams of researchers in the DFG call “Research Software – Quality Assured and Re-usable”. The call was heavily oversubscribed, clearly demonstrating the need for more funding related to research software development. Nevertheless, both proposals were successful!
About the scientific projects
The successful proposal with Dr Sebastian Lobentanzer from the Institute for Computational Biomedicine targets the software packages pypath, OmniPath and BioCypher. In the project “Establishing a knowledge graph community in biomedical science”, OmniPath and related software will be restructured to make them more usable, scalable and extensible. The OmniPath family consists of 1) pypath, a Python module for building databases; 2) a web service; and 3) web service clients with extra functionality for both R and Python. Pypath manages metadata about the original database resources, downloads them from the providers, pre-processes their data and builds the OmniPath databases. In doing so, it relies on several utilities, such as ID translation, orthologous gene mapping, and taxonomy. OmniPath integrates over 100 different resources focused on molecular knowledge, for example about genes, proteins, and diseases. Through the publicly available HTTP API, the resources are broadly usable, with a focus on the analysis of diseases and therapies by combining experimental data and prior knowledge to inform machine learning and mathematical models. To serve even broader user communities, BioCypher has been developed. BioCypher is a modular framework for creating knowledge graphs based on ontologies. It aims for high performance and scalability in order to support many new features, targeting single-cell and spatial omics, microbiomics, metabolomics, and various multi-omics modelling and machine learning methods.
The successful proposal with Prof. Bernhard Höfle from the Institute of Geography targets the software package HELIOS++. In the project “Fostering a community-driven and sustainable HELIOS++ scientific software”, HELIOS++ will be refactored and developed into a general-purpose integrated platform for users and developers from a broad spectrum of scientific disciplines. HELIOS++, the “Heidelberg LiDAR Operations Simulator”, is general-purpose scientific software for creating synthetic yet realistic geographic small-footprint laser scanning (LiDAR) point clouds and the associated full waveforms. Its main purpose is to mimic real topographic laser scanning in a computer environment, which we call virtual laser scanning (VLS). VLS enables full control of acquisition parameters and reproducibility in the data creation process, as well as highly automated workflows coupled with data analysis. HELIOS++ can simulate VLS over a large range of spatial scales, and can combine different scales and levels of geometric detail in one simulation. It supports a large and extendable set of platforms and scanners, which can easily be combined and exchanged. The modularity and flexibility of control over the simulation make HELIOS++ a strong tool to support improved algorithm development and machine learning for point cloud processing, field work and survey planning, and the development of novel acquisition strategies.
About the project management
In the bioinformatics project, one RSE (100% FTE) will work on the code base for 2.5 years of the three-year project runtime. A web developer (100% FTE) will support the web development part in the final year of the project. In addition, a student assistant will participate throughout the project, and two workshops are planned: one for users and one for developers. The RSE, the web developer, and the organization of the workshops are the responsibility of the SSC.
In the geoinformatics project, two RSEs (50% FTE each) will carry out the software engineering for the three years of the project runtime. One of these RSEs is associated with the SSC, the other with the research group. Two student assistants will support the project for the three years, one at the SSC and the other in the research group.
Both projects require close cooperation between the research groups and the SSC. The requirements, use cases, user stories and tests beyond unit tests have to be defined by the research groups. The RSEs who work on the projects do not necessarily have a background in the specific scientific domains, so it is very important to iterate between the software engineering and the scientific purpose of the software. The RSEs are, however, skilled in the scientific process and come from related scientific backgrounds. This matters because developing research software is quite different from developing application software and requires familiarity with approaches to research problems.