1 Introduction
1.1 Current Situation of Research Software
Software creation, extension and reuse are an integral part of research activity. A customised software solution is often necessary to analyse, select and visualise research data, due to the unique nature of each research project. The resulting software can be shared in connection with the research data, or stand on its own as an independent result. Research software is mainly developed by scientists with basic programming skills. The code is usually considered correct as long as the software works, but the adoption of good practices concerning comments, indentation, documentation, naming, versioning and citation is often rudimentary. Other aspects that are not always considered are the use of resources, referencing and licence-compliant reuse of existing software, long-term preservation and responsibilities. Thus, the aim of this text is to show the possibilities for software management plans (SMPs) and to enable infrastructure providers, e.g., libraries and data centers, to quickly get started with the topic.
There is a lot of community-driven work to foster good programming practices among scientists. This is not a recent phenomenon; such efforts have existed for as long as code has been used in a scientific context. However, they have been gaining increasing attention in recent years due to activities in research data management and the enhanced publication of datasets and code. This awareness building is supported by several initiatives and takes advantage of many instruments that are already commonly used among professional programmers. Among these instruments are the broad adoption of versioning platforms; the definition of software metadata schemas or special citation formats for research software like .cff (); the introduction of integrated development environments (IDEs) allowing bug fixing, testing and benchmarking; and the diffusion of software packages for the semi-automatic generation of documentation and reproducible examples. Information exchange via language-specific and generic fora additionally helps with the development of software in research.
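To illustrate the citation formats mentioned above, the following minimal sketch assembles the text of a CITATION.cff file. The field names follow the Citation File Format schema (version 1.2.0); the function name `build_cff` and all values (title, author, version, date) are hypothetical placeholders chosen for illustration.

```python
from pathlib import Path


def build_cff(title, authors, version, date_released):
    """Return the text of a minimal CITATION.cff file (CFF schema 1.2.0)."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    # Each author is given as a (family name, given name) pair.
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    lines.append(f'version: "{version}"')
    lines.append(f"date-released: {date_released}")
    return "\n".join(lines) + "\n"


# Hypothetical example values, not taken from any real project.
cff_text = build_cff(
    title="Example Analysis Toolkit",
    authors=[("Doe", "Jane")],
    version="0.1.0",
    date_released="2024-01-15",
)

if __name__ == "__main__":
    # Write the file next to the source code so that platforms can pick it up.
    Path("CITATION.cff").write_text(cff_text, encoding="utf-8")
```

Placing such a file in the root of a repository is the usual convention; the values would of course have to be replaced by those of the actual software.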
The structure of this article leads from the abstract to the concrete. The introduction presents the framework for the present work, such as motivation, definition and ongoing initiatives around SMPs. This is followed by section 2, which presents and compares five selected SMP templates which are the object of study. Part 3 shows some forms of support in the preparation of an SMP, including five Data Management Plan (DMP) tools. Next, chapter 4 gathers some considerations about SMPs, metadata, FAIR principles and machine-actionability. Section 5 discusses advantages and disadvantages of SMPs, also in relation to possible use cases. The article closes with a conclusion that wraps up the discussion and indicates the need for future work.
1.2 Environment for SMPs
Together with the aforementioned applications, tools and procedures for research software, amateur programmers need an instrument to collect ideas and information necessary for their programming activity, which helps them gain awareness and provide for the necessary resources. Such an instrument can be an SMP, which aims to collect all information relevant for the software development in a scientific context to support a structured approach or help distributed teams to adhere to (self-) defined standards.
Although no funding organization explicitly requires a software management plan at the moment, there are developments in this direction: the German Volkswagen Foundation, for example, addresses research software in its Open Science Policy (). It can be expected that other funding agencies, such as the European Union, will follow. In these cases, it is helpful for service providers such as university libraries, data centers or scientific computing centers to be prepared.
In this article, we aim to collect all relevant information and helpful material to enable end-users to create an SMP that is tailored to the needs of their community and institution. The following text will analyse and highlight some of the advantages of SMPs from various perspectives.
1.3 SMP definition
A software management plan is a living document which supports software development, addressing the most relevant aspects. There are different SMP definitions with different focuses and they all have their strengths (see, for example, or ). The following text is based on a definition that, in addition to a general formulation, attempts to understand an SMP and its possible uses via a descriptive listing. It was developed by a joint working group of the German Initiative for Network Information (DINI) and the German competence network for digital long-term storage (nestor), to which the authors belong, and is available in German and English at the German national information website on research data management:
A Software Management Plan (SMP) contains general and technical information about the software project, information on quality assurance, release and public availability as well as legal and ethical aspects that affect the software.
The SMP summarises information that adequately describes and documents the creation, documentation, storage, versioning, licensing, archiving and/or publication of the software created or used in a project. Associated hardware and other necessary resources, as well as other associated software and software libraries, text and data publications must also be described and are a special feature of the SMP.
The purpose of an SMP is initially to support the traceability and, if necessary, the long-term usability of the software (for direct application as well as for further processing) and to facilitate user support in the event of queries. The SMP therefore also serves the purpose of quality assurance (see FAIR for Research Software).
The SMP can be linked to one or more DMPs if the software is used for data generation or processing. SMP and DMP can be summarised as output plans.
This definition is one of the existing proposals for labelling the subject area and will be applied in the following sections. Ultimately, these ideas need to be filled with life. This requires researchers and support personnel who implement these developments in practice and at the same time deal with SMPs on a theoretical level. The following chapters also deal with this transition from theoretical considerations to practice.
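The content areas named in the definition can be pictured as a simple data structure. The following is a minimal sketch only: the class `SoftwareManagementPlan`, its attribute names and the helper `open_sections` are hypothetical and chosen for illustration; they are not part of the DINI/nestor definition or of any existing tool.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SoftwareManagementPlan:
    """Hypothetical container mirroring the content areas of the definition."""

    general: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)
    quality_assurance: dict = field(default_factory=dict)
    release_and_availability: dict = field(default_factory=dict)
    legal_and_ethical: dict = field(default_factory=dict)
    # An SMP can be linked to one or more DMPs; the link is optional.
    linked_dmps: list = field(default_factory=list)

    def open_sections(self):
        """Return the names of content areas that are still empty."""
        return [
            f.name
            for f in fields(self)
            if f.name != "linked_dmps" and not getattr(self, f.name)
        ]


# Hypothetical example values.
smp = SoftwareManagementPlan(
    general={"title": "Example Analysis Toolkit"},
    legal_and_ethical={"licence": "MIT"},
)
smp.linked_dmps.append("DMP of the example project")

print(smp.open_sections())
# prints ['technical', 'quality_assurance', 'release_and_availability']
```

Because an SMP is a living document, a listing of the still open sections is one simple way to see at a glance what remains to be filled in as the project progresses.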
1.4 Current initiatives dealing with SMPs
At the international, European and German levels, there are various institutions, working groups and initiatives that are already working on SMPs. The most important players are briefly named and presented below, without any claim to completeness.
The Software Sustainability Institute (SSI)
The SSI is an Edinburgh-based scientific consortium promoting good practice in software development. It was founded in 2010 to support individuals and institutions in understanding the central role of software in research. The aim is to use better research software to make science more sustainable and better overall. With the ‘Checklist for a Software Management Plan’, the first draft for an SMP was created and published there in 2016 (). It is therefore also the first SMP template to have been implemented in DMPonline (see section 2.2). Many of the current discussions, applications and solutions were addressed by SSI at an early stage.
DINI/nestor sub-working group SMP
DINI, the German initiative for network information, and nestor, the German competence network for digital long-term storage, have jointly hosted a working group about research data (AG Forschungsdaten) since 2014. The working group is organised on a voluntary basis as an association and is divided into sub-working groups, one of which is the sub-working group on data management plans (UAG DMP). Inspired by a support request in July 2022, the sub-working group started to discuss software development as a special task within scientific work, which eventually led to the creation of an autonomous sub-working group focused on providing guidance on SMPs and SMP tools for the German scientific community. So far, the working group has provided a thorough SMP definition and collected relevant literature; it is currently collecting templates and use cases and writing guidance documents like the present one.
Research Data Alliance (RDA)
The RDA is an international initiative that was founded in 2013 with the aim of overcoming social and technical hurdles to increase data exchange and sharing (RDA 2024). Among the activities of the RDA, research software is gaining increasing importance. DMPs and SMPs were the subject of a session at the 20th RDA Plenary Meeting in 2023 in Sweden. Within RDA, SMPs are currently being discussed in working and interest groups. In addition, research software was declared an innovation aim within the RDA strategic plan 2024–2028 (). What is relevant in all this work is the international readiness with which the topic of SMPs is discussed and, thus, also brought into the respective national scientific systems.
RSE and DE-RSE
The development of good practices for sustainable scientific software is the core activity of several sister institutions with national and international coverage, called Research Software Engineering Associations. They are federated in the International Council of RSE Associations. National/Multinational RSE associations exist in Germany (DE-RSE), Belgium (BE-RSE), the Netherlands (NL-RSE), Denmark (Danish RSE), United Kingdom (Soc-RSE), United States (US-RSE), Australia and New Zealand (RSE-AUNZ), Northern European countries (NORDIC-RSE), and Asian countries (RSE Asia).
These institutions are centrally involved in all matters relating to research software, and publish recommendations, such as the DE-RSE position paper on the current situation of research software in Germany (), and the US-RSE Career Guidebook ().
In this context of career and institutional anchoring, SMPs are seen as a tool for the sustainability of research software. Currently, DE-RSE is working on a position paper, planned for summer 2024, on establishing RSE departments in German research institutions. There, SMPs will also be included as part of the requirements placed on RSEs by, e.g., funding bodies. As a quasi-professional organisation in Germany, DE-RSE is therefore attempting to position itself in this regard and also to identify pragmatic options for action. This includes SMPs.
NFDI Workgroup Software Metadata
The National Research Data Infrastructure (NFDI) is a German initiative aiming to define standards, infrastructures and good practices to encourage and improve sharing of research data. The so-called ‘Sections’ deal with cross-cutting topics, bundling forces and competencies orthogonal to the discipline-specific ‘Consortia’. Research software has also been recognised as a cross-sectional issue and has found its place within the Section ‘Metadata, Terminology and Provenance’ in the working group ‘Software metadata’. This group aims to enhance transparency, reproducibility, and re-usability of software in research, overcoming the limitations of existing metadata schemas for research software. To address these issues, this working group aims to provide a comprehensive metadata vocabulary for research software, compatible with existing frameworks such as Schema.org and CodeMeta. The working group will also support all NFDI consortia in applying the vocabulary as well as develop domain-specific extensions if needed.
2 SMP Content
At present several SMP templates exist, which can be freely used standalone or as a basis for consultations. This section presents the use of SMPs and gives an overview of the different SMP templates.
2.1 Facilitation of SMP creation through templates
Different approaches can be adopted for writing SMPs, similarly to DMPs. A plan can be prepared from scratch as a result of a brainstorming session, or by answering a prepared collection of questions, which is usually provided as a simple text document. Some selected templates are presented and discussed in this chapter, see Figure 1. Meanwhile, templates for management plans that are prepared in special software environments are more common. Such tools enable the provision of different question complexes in a standardised way, along with support in the form of help texts and automated export options, e.g., as PDF or via API calls. Some selected DMP/SMP tools are presented in the following chapter.
2.2 Overview of the considered SMP templates
For this study, we collected five templates which are available online (see also Figure 1). These are briefly described below in order to show their context of origin, their strengths and possible areas of application.
Max Planck Digital Library (MPDL) – SMP for Researchers
This template was created in 2022 by the team of the Max Planck Digital Library. It focuses mainly on software development by scientists, aiming to spread good development practices and to improve information exchange and resource management within an institution. It is organised in the following sections: General; Technical Information; Quality Assurance; Release and Publish; Legal; and Ethics. The template is implemented for the software RDMO in XML, but it is also available as a .docx file ().
PRESOFT: Research Software Management Plan template
It was developed by Teresa Gomez-Diaz and Genevieve Romier within the project PREservation for REsearch SOFTware (PRESOFT, 2017–2019) funded by the Centre national de la recherche scientifique (CNRS). The results of the PRESOFT project are published online ( as well as ). The template is implemented in DMPOpidor and consists of seven main sections: Metadata; Software context; Software features; Team organisation; Development organisation; Distribution organisation; and SMP management. The whole template is also available as the Research Software Management Plan template (PRESOFT project) (). There are 98 questions in total. Most answers are organised as free text. There are no check-boxes, ready-made answers or suggested texts. Some questions are accompanied by help texts or further information to support writing the SMP. Vocabularies are proposed for the scientific classification of the research software.
ELIXIR SMP
The SMP template in the Data Stewardship Wizard (DSW) is developed by a team in ELIXIR, a distributed infrastructure for life-science information. With the SMP, they have set themselves the task of providing support for research software in the life sciences. The first draft of their SMP was prepared in 2021 (. See also ). The ELIXIR SMP is characterised by the ready use of controlled vocabularies. It therefore offers many possibilities for machine evaluation, such as machine-actionable SMPs (maSMPs) (see chapter 4.3). This goes hand in hand with fewer free text fields, so that it is usually only possible to select from the predefined answers. Depending on the context and location, this can be an advantage or a disadvantage. The template was updated at a hackathon in Barcelona in fall 2023.
SSI Checklist
This SMP template in the form of a checklist was one of the first drafts formulated in this field (). Published in 2018 on Zenodo, it anticipated some of the current discussions. It was developed by the Software Sustainability Institute. The questions listed in the checklist were implemented in DMPonline. Therefore, the checklist can now also be used interactively for writing SMPs.
ZIB – Ubiquity Generator
This SMP template was developed by an ad-hoc working group at the Zuse Institute Berlin in the project ‘HPO-navi’. The project was funded by the German Research Foundation via the call ‘Sustainability for research software’ in the eResearch Technologies framework. The aim of the call was twofold: On the one hand it was targeted at giving the necessary resources to make a particular research software more mature and shift it from ‘demonstrator’ to at least ‘prototype’, or even better, to a 1.0 version. On the other hand, it had the goal of making research software more sustainable by means of documentation and archiving. The SMP for the software UG was aligned along the Software Sustainability Institute recommendations for Software Management Plans and was published in 2023 (; ). The SMP template is implemented in GitLab/Wiki for the internal development of research software.
2.3 Comparison of SMP templates
The five chosen SMP templates are compared in the two following Tables 1 and 2. They differ significantly in coverage, focus, and granularity. All have in common that they act as standalone documents, i.e., they neither depend on nor reference a DMP.
GENERAL CLUSTER | SUB-CLUSTER | ELIXIR | MPDL | PRESOFT | SSI | ZIB |
---|---|---|---|---|---|---|
Administrative information | Costs | 3% | 4% | 2% | 3% | |
FAIR principles | 3% | 2% | 3% | 3% | ||
Governance | 2% | 8% | 3% | 5% | 8% | |
Requirements | 2% | 2% | 3% | |||
Documentation and versioning | Documentation | 2% | 1% | 3% | 1% | |
Quality procedures | 2% | 2% | 6% | 5% | 9% | |
Version control | 4% | |||||
Legal and ethical aspects | Intellectual property | 2% | 2% | |||
Rights | 5% | 2% | 3% | 3% | 1% | |
Performance and security | Risk analysis | 2% | 4% | |||
Security | 5% | 4% | 1% | |||
Testing | 5% | 1% | 1% | |||
Preservation and sharing | Availability | 3% | ||||
Citation | 3% | 5% | 1% | |||
Community | 2% | 4% | 1% | |||
Impact | 4% | 6% | 11% | 5% | ||
Preservation | 4% | 2% | 13% | 8% | ||
Release | 3% | 2% | 1% | 8% | 5% | |
Repository | 9% | 4% | 2% | 8% | 4% | |
Support | 4% | 1% | 3% | 4% | ||
Related objects | Related items | 1% | 8% | 3% | ||
SMP | 2% | 6% | 1% | 3% | 1% | |
Software description | Content | 2% | 1% | 3% | 3% | |
Deliverable | 2% | 4% | 3% | 11% | 1% | |
Examples | 14% | 4% | 8% | |||
Languages, formats | 4% | 1% | ||||
Metadata | 2% | 1% | ||||
Persistent identifiers | 2% | 6% | 4% | 5% | 5% | |
Research field | 3% | 1% | ||||
Road-map | 13% | 2% | 7% | |||
Scope | 2% | 1% | 1% | |||
State of the art | 27% | 6% | 2% | 4% | ||
Title | 4% | 2% | 5% | 4% | ||
Technical infrastructure | Dependencies | 8% | 21% | 3% | ||
Development | 3% | |||||
Environment | 2% | 2% | 3% | 3% | ||
Number of questions | 64 | 50 | 98 | 38 | 75 | |
The longest SMP template is the one from PRESOFT, with just under a hundred questions; the SSI checklist is the shortest with ‘only’ 38 questions. It is striking that each of the templates has a different focus and orientation. Table 1 demonstrates this impressively. Some of the templates come more from project management, others are more closely aligned with DMPs.
The SMPs from ZIB and ELIXIR are closely oriented to software development. Both, therefore, have comparatively many questions in the area of documentation and preservation. The SSI checklist, on the other hand, is aimed primarily at future re-use and less at the phase during software development. One special feature of ELIXIR is the high degree of machine-actionability. It is noticeable here that many question sets that are included in the other SMP templates in different ways do not appear. This is certainly also due to the fact that the SSI-SMP template is the oldest and is therefore regarded by many as a benchmark. When using PRESOFT, a particularly large amount of information is collected in the administrative area and in the software description. The MPDL template, in turn, is aimed primarily at scientists as self-taught software developers. The breadth of questions and subject areas is more striking here than the depth to which different aspects are queried. The detailed overview in Table 2 also shows which sub-items are addressed in each SMP template and to what extent.
The two tables give a first visual impression of which topics are covered by the various SMP templates and to what extent, and allow a comparison between them. The questions of all considered templates were first annotated with the most relevant keyword they contain and subsequently clustered heuristically into major categories. Finally, the relative importance of each topic cluster in each catalogue was reported as the fraction of the total number of questions in that catalogue. The resulting structure helps to compare the question collections qualitatively.
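The last step of this procedure, turning annotated questions into relative shares per cluster, can be sketched as follows. This is an illustration under stated assumptions: the function name `cluster_shares`, the example questions and their cluster labels are hypothetical and are not taken from the analysed templates.

```python
from collections import Counter


def cluster_shares(annotated_questions):
    """Map each cluster to its share of all questions, in rounded percent.

    `annotated_questions` is a list of (question text, cluster label) pairs.
    """
    counts = Counter(cluster for _text, cluster in annotated_questions)
    total = sum(counts.values())
    return {cluster: round(100 * n / total) for cluster, n in counts.items()}


# Hypothetical question catalogue of a single template.
template_questions = [
    ("Where will the source code be hosted?", "Repository"),
    ("Which platform is used for versioning?", "Repository"),
    ("How will releases be published?", "Release"),
    ("How should the software be cited?", "Citation"),
]

shares = cluster_shares(template_questions)
print(shares)
# prints {'Repository': 50, 'Release': 25, 'Citation': 25}
```

Since each share is rounded separately, the percentages of one template do not necessarily add up to exactly 100.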
The FAIR4RS principles are a significant point of reference. Depending on the internal logic of the SMP templates, they are directly invoked. This is naturally less the case with SMP templates that were created before 2021, although the Findable, Accessible, Interoperable and Reusable (FAIR) Data principles already appear here indirectly as well. However, it is clear that these SMP templates will also be adapted, expanded, readjusted or phased out over time as the needs, assumptions and expectations regarding the use of research software change. This is also reflected in the SMP templates.
3 Support for SMP creation
SMPs are still a relatively new phenomenon. Similarly to DMPs in the mid-2010s, it may be that this method for the sustainable management of research software develops as a valued approach. In addition to the templates presented in chapter 2.2, there are also services that provide support for SMP creation. This chapter gives an overview in the first part and presents the existing services in more detail in the second part.
3.1 Consulting and training on SMPs
The use of SMPs is still under discussion. Therefore, a comparatively small amount of experience can be drawn on for training and consulting. As a consequence, it should not be surprising if only a few scientists and stakeholders have started to recognise the validity of their use for project and quality management in the development of research software within their own institution.
The increasing attention around research software at scientific institutions will most likely be accompanied by funders paying more attention to the topic. At the same time, more consulting and training offers will also be necessary for users. This can, for example, come from the implementation of the FAIR4RS Principles () by scientific committees or funding bodies such as the European Commission and the upcoming 10th EU Framework Programme for Research and Innovation. In the German context, the Helmholtz Association has made significant progress: From 2025 on, all Helmholtz Centres will be able to present research software management procedures in publicly available policies. In addition, an incentive for the publication of research software is being developed (). Such institutional incentives can again lead to greater demand for research software management. SMPs can provide support here.
Such SMPs as a service can be offered, for example, by research data managers, information specialists or software experts. For such a service, it is necessary that these professionals are trained in using SMPs. The sub-working group Training/Further education of the already mentioned DINI/nestor Working Group Research Data has prepared generic training materials for teaching SMPs that can be used in workshops for this target group (. This is a module of ).
, on the other hand, demonstrate how SMPs can be incorporated in Computer Science Higher Education training, providing working materials and use cases for students and researchers (). In the long term, it is therefore also recommended that consulting and training on SMPs will be included in the standard study curricula in the area of research software, and also in research data management ).
Still, most of the existing training materials and documents so far are strongly related to the platform used. This will become visible in 3.3.
3.2 Manuals and guides for SMPs
There are already some handouts about the preparation and use of SMPs. These can be particularly helpful for the use of SMPs as a service. The first generic reference was published by Neil Chue Hong (). Since 2018, the German Aerospace Center (DLR) guidelines have also included concrete recommendations for the general use of research software. The guidelines introduce four application classes (ac) for scientific software, ranging from small scripts intended for personal use (ac0) to production-type software intended for use in mission-critical systems (ac3). Categorising one’s own software according to scope and area of use can be a substantial help during the development process. The four classes are helpful as an initial orientation for an assessment of the code: is it a small script or an entire infrastructure? Depending on the class selected, fewer or more measures are necessary (). For the life sciences, recommendations for a low-threshold SMP were also developed and discussed in the context of ELIXIR in 2021 (). The most recent contribution is the practical guide by the Netherlands eScience Center and the Dutch Research Council (NWO) from 2022 (). However, the ongoing discussions brought SMPs into the mainstream in 2023, so that further handouts, open training material and more can be expected soon (see chapter 1.4 for current initiatives).
3.3 Implementation in DMP tools
Currently, there is no service that offers SMPs alone. In all cases, SMPs are offered as part of a platform that was actually designed for DMPs. Not surprisingly, these are always community-driven platforms from a scientific context that are committed to the open source idea. The platforms and their content have evolved. At the same time, it should be positively emphasised that for new services – such as an SMP offer – it is not always necessary to set up new infrastructure. Rather, it can make sense to expand existing solutions in order to make them usable for new demands.
DMPonline
DMPonline is a service by the British Digital Curation Centre (DCC). It uses the open source software DMPRoadmap, available on GitHub. A large community has long since formed around this software, contributing to maintaining and expanding the code. The Digital Curation Centre offers DMPonline as a service for many research centres in Britain and abroad. The software is also used independently by many other research infrastructure providers, including DMPTool (University of California Curation Center (UC3)) and national DMP services such as DMPTuuli (Finland) and DMP OPIDoR (France), the latter being presented below.
DMP OPIDoR
DMP OPIDoR, also based on DMPRoadmap, is operated by the Institut de l’information scientifique et technique (Inist), based in Nancy, a research support unit of the Centre national de la recherche scientifique (CNRS). The main function of DMP OPIDoR is to facilitate access to scientific information, its analysis and evaluation with the special focus on research publications and data.
The DMP OPIDoR service is designed for French scientists and their cooperation partners. Different templates for DMPs are available. Besides, there is also a template for an SMP. Moreover, DMP OPIDoR offers the possibility to make written SMPs publicly available. There are already a few written SMPs available for re-use and orientation.
Software Management Wizard (DSW)
The ELIXIR SMP platform is made available at DSW as a subsection of the Data Stewardship Wizard and is therefore part of the knowledge collection there on research data and research software in the life sciences. DSW is a place to gather knowledge on data stewardship that has been run by its own community for many years. The links to ELIXIR are extensive, which means that infrastructures and knowledge accumulation are particularly well combined here. The SMP as a service from DSW is of particular interest to scientists from the life sciences. Of course, it can also be interesting for other specialist areas.
Research Data Management Organiser (RDMO)
The Research Data Management Organiser (RDMO) is a tool for the documentation of a scientific project which, in addition to the creation of DMPs, supports the structured planning, implementation and administration of research data management throughout the entire data life cycle and enables the notation and initiation of tasks (hence the name ‘Organiser’). It has been developed for operation as a local instance and is completely adaptable to the needs of the operating institution and its community.
RDMO has reached the status of operational software and is available as open-source software on GitHub. In 2024, RDMO is in operational use at 45 institutions in Germany, Austria, France and Kenya.
Argos
Argos is a DMP tool offered and developed by the Horizon 2020 project OpenAIRE. It is primarily used by funded projects of the European Commission. Argos is not only a tool for writing DMPs, but also a platform for publishing its content via many extensions. Starting in October 2022, the Argos developers planned to implement a specific template for software management planning. In April 2023, Argos stated that a template for software management plans based on the SSI questionnaire will be published with one of the next releases. In February 2024, a template for SMPs was implemented.
4 Specifications for SMP
4.1 FAIR4RS principles, validity, and limitations
The FAIR Data principles, which constitute a set of guidelines facilitating sharing and re-use of data, were officially published in Scientific Data in 2016 by FORCE11 (), a consortium consisting of researchers, librarians, archivists, publishers and research funding bodies. The primary objective of these principles is to ensure that data are findable, accessible, interoperable and reusable for both humans and machines. Although they are not a formal standard, the principles have gained growing significance due to their compelling advocacy for data reusability. Attaining this objective requires incorporating the principles throughout the entire spectrum of Research Data Management (RDM) ().
It soon became clear that the FAIR principles needed to be adapted and modified to include research software. This is also because software, despite being closely related to research data, has special characteristics. Among other things, it is characterised by the fact that it is executable, that its versions are frequently updated and that it is rarely created from scratch. Recognising this need, Lamprecht et al. reformulated the FAIR principles in a publication on the progress of the joint working group, tailoring them to be more aptly applied to software (). Subsequently, in 2022, these revised FAIR4RS principles were officially published by Chue Hong et al. as a direct outcome of the collaborative efforts of the FAIR for Research Software Working Group (FAIR4RS WG) within the RDA ().
There are many obvious benefits of FAIR4RS, such as improving the discoverability of research software, enabling its identification and reuse by other researchers and communities, and facilitating compatibility and integration with different tools and platforms. However, certain barriers hinder their successful implementation. Among the significant limitations is the inherent complexity of research software, which makes it challenging to apply FAIR4RS principles uniformly across all software types. Furthermore, implementing FAIR principles for research software may require significant resources, including time, expertise and infrastructure. Another challenge is to convey the benefits of FAIR4RS to developers, funded projects and the wider research community (). While the FAIR principles have become broadly adopted and are viewed as a benchmark for research data management, the FAIR4RS principles remain relatively unrecognised or not widely embraced, particularly as RSEs ‘typically’ do not pursue conventional academic career pathways and may not place significant emphasis on traditional metrics like citations or recognition. As with the implementation of the FAIR principles, it makes sense to integrate the FAIR4RS principles into the SMP.
4.2 Metadata and research software
Like any other publicly available digital object, research software needs to be described with structured and machine-interpretable metadata. Such metadata make SMPs interoperable within the software development process, make it easier for researchers to discover, understand and reuse the software, and help it conform to the FAIR4RS principles. In the realm of research software, several initiatives have been developed to address the need for machine-interpretable metadata, simplifying the process of describing and sharing software resources effectively. Notable among these are Bioschemas and CodeMeta. Bioschemas is a toolbox to add structured metadata to research outputs, including software, with a particular focus on life sciences. It defines a set of metadata schemas and vocabularies, built on top of existing technologies and standards, that can be used to represent such tools in Web pages and applications, and provides resources such as the ComputationalTool profile and the Markup Generator (). CodeMeta is a minimal metadata schema for scientific software and code, arising from a cross-disciplinary community-driven effort (). Both schemas are based on the Schema.org classes SoftwareApplication and SoftwareSourceCode, linking the data to facilitate semantic web discovery. To facilitate this, the FAIR-IMPACT project published the Research Software MetaData (RSMD) guidelines focusing on an effective collection and curation of metadata (). These guidelines help with archiving, referencing, describing and citing software and with making its metadata FAIR, benefiting researchers, developers, institutions, publishers, and infrastructure managers by offering tailored best practices.
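As an illustration of such structured metadata, the following minimal sketch builds a CodeMeta description as a Python dictionary and serialises it as JSON-LD. The property names (`name`, `version`, `codeRepository`, `license`, `programmingLanguage`, `author`) are CodeMeta 2.0 terms based on the Schema.org class SoftwareSourceCode; the function name `build_codemeta`, the repository URL and all other values are hypothetical placeholders.

```python
import json


def build_codemeta(name, version, repository, license_url, language, authors):
    """Return a minimal CodeMeta 2.0 record as a dictionary."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "codeRepository": repository,
        "license": license_url,
        "programmingLanguage": language,
        # Each author is given as a (given name, family name) pair.
        "author": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
    }


# Hypothetical example values, not taken from any real project.
record = build_codemeta(
    name="Example Analysis Toolkit",
    version="0.1.0",
    repository="https://example.org/example/analysis-toolkit",
    license_url="https://spdx.org/licenses/MIT",
    language="Python",
    authors=[("Jane", "Doe")],
)

# By convention, this text would be stored as codemeta.json in the repository root.
codemeta_json = json.dumps(record, indent=2)
```

Because the record is plain JSON-LD, the same information can be read by harvesters, repositories and SMP tools without parsing free text.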
4.3 Machine-actionable SMPs (maSMPs)
Adopting standardised metadata for SMPs has an additional advantage: it allows processes and tasks in software development workflows to be initiated automatically. The RDA DMP Common Standards working group defined a machine-actionable DMP (maDMP) to overcome the limitations of text-based documents (). These efforts are now being transferred to SMPs as well; such an SMP is called a maSMP. Building on the structured metadata (), a team from ELIXIR presents a machine-actionable version of their ELIXIR SMP (). This development reuses and harmonises elements from the maDMP, Schema.org, Bioschemas and CodeMeta specifications, while also adding new types and properties (see Figure 2).
Although most of the elements are focused on life sciences, the recommendations are domain-independent. To align the different parties involved in the existing SMP models and to identify gaps within ELIXIR SMPs, RDMO SMPs and RDA maDMPs, a workshop on maSMPs was held in Cologne in the summer of 2023 (). As a result, a full metadata model was published that includes the entities involved in software management planning, such as the SMP itself, software source code, software releases, documentation, authors and their relations (). This was followed by an NFDI4DataScience hackathon at ZB MED in December 2023.
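The idea of machine-actionability can be sketched in a few lines of Python. The example below models a plan with the entities named above (plan, source code, release, documentation, authors) and derives open tasks from missing entries. All class names, field names and task texts are hypothetical and chosen for illustration only; they do not reproduce the published maSMP metadata model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoftwareRelease:
    version: str
    identifier: Optional[str] = None  # e.g. a DOI, once registered


@dataclass
class SoftwareSourceCode:
    name: str
    code_repository: Optional[str] = None
    license: Optional[str] = None
    documentation_url: Optional[str] = None
    authors: List[str] = field(default_factory=list)
    releases: List[SoftwareRelease] = field(default_factory=list)


@dataclass
class SoftwareManagementPlan:
    title: str
    software: SoftwareSourceCode


def open_tasks(plan: SoftwareManagementPlan) -> List[str]:
    """Derive follow-up tasks from entries that are missing in the plan."""
    software = plan.software
    tasks = []
    if not software.code_repository:
        tasks.append("set up a version-controlled repository")
    if not software.license:
        tasks.append("choose a licence")
    if not software.documentation_url:
        tasks.append("write user documentation")
    if not software.authors:
        tasks.append("record the authors")
    for release in software.releases:
        if release.identifier is None:
            tasks.append(
                f"register a persistent identifier for release {release.version}"
            )
    return tasks


# Hypothetical plan: repository and author are known, the rest is still open.
plan = SoftwareManagementPlan(
    title="SMP for example-analysis-tool",
    software=SoftwareSourceCode(
        name="example-analysis-tool",
        code_repository="https://example.org/repo/example-analysis-tool",
        authors=["Ada Example"],
        releases=[SoftwareRelease(version="0.1.0")],
    ),
)

tasks = open_tasks(plan)
for task in tasks:
    print(task)
```

A text-based plan would require a person to read it in order to notice these gaps, whereas a structured plan lets a workflow system detect them and, for instance, open corresponding issues in a project tracker.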
5 Discussion
It is still open whether SMPs, like DMPs, will become an established method in research. Following the overview of existing solutions and the presentation of existing materials in the previous chapter, this chapter discusses the advantages and disadvantages of SMPs.
5.1 Advantages and disadvantages of SMPs
Using SMPs has many advantages. First of all, information about research software is transferred from tacit to explicit knowledge. Making knowledge explicit in this way offers several benefits. It becomes clear which aspects of working with research software have not yet been taken into account or have only been discussed incompletely. The direct specification of questions activates such ‘hidden’ knowledge. Writing down the findings then leads to an explicit form of knowledge that all participants in the software project can use (). Furthermore, a more detailed description leads to the emerging software being perceived as an entity in its own right: it has a name, a management plan with responsibilities, and the like. The software thus becomes an object that can be named directly, unlike parts that remain implicit. At the same time, software thereby also receives more institutional attention and, above all, appreciation. An SMP can significantly accelerate this development. In addition, an SMP can be used in a similar way to a DMP. For example, in the case of a third-party-funded project, an SMP can help to convince decision-makers of the excellence of one’s own research idea. In the same way, an SMP could also become a kind of ‘deliverable’ for larger software-using research projects in the future.
There are also disadvantages in using SMPs. First, formalising knowledge is a decision that should be made consciously. An SMP creates explicit knowledge, and especially in the field of software development this can be seen as a disadvantage. Agile methods benefit from the ability to iterate and from having fewer up-front management methods, so an SMP can be a disadvantage here due to the loss of flexibility. Second, although an SMP costs time at the beginning, the effort pays off in the later phases; however, the scientist must be willing to invest this time up front. Finally, it should be discussed on a fundamental level whether an SMP is necessary at all. Not every small script needs its own SMP. Categorisations of software, such as those proposed by the German Aerospace Center in 2018, can help in deciding whether the effort and return of an SMP are in a reasonable ratio ().
There are also quite different directions in which one could think. Instead of treating DMP and SMP as separate entities, one could also consider a Reproducibility Management Plan (RMP) or a Data and Digital Object Management Plan (D(DO)MP) for everything (; ). Such a plan could include everything: text, data and code. This can be an advantage, as all the management information concerning the outputs produced within a project remains collected in one document. However, research is often complex. As the size of a scientific project increases, so does its complexity, so that a central place for managing all the results also leads to a large, voluminous document. In practice, there is then a danger that it will hardly ever be read, let alone applied.
6 Conclusion
The objective of this article was to provide an overview of the potential and possibilities of SMPs. This is intended to enable service providers to make quick and informed decisions about where and to what extent SMPs are relevant for their institution. In particular, as chapter 4 shows, an analysis of the environment of the current discussion is necessary for understanding and classifying SMPs. The various SMP templates already available are presented in chapter 2. They differ in many respects, so this presentation summarises the essential points for getting started with applying SMPs.
The main findings of this article are as follows:
- SMPs are currently under discussion. There are good reasons and advantages for the use and application of SMPs.
- SMPs can go a considerable way in supporting the sustainability and reproducibility of research software.
- There are already some freely available solutions for creating SMPs and for offering them as a service. The existing templates have different focuses and intentions.
- With most SMP-capable tools, there is no need to set up your own infrastructure. Rather, they make use of existing systems.
- Initial handouts and guidance on SMPs are already available. However, there is still no comprehensive experience with the use of SMPs.
The overall aim of the article is to give an overview of SMPs for research software. This overview reflects only an interim state. However, it should give all interested parties the opportunity to quickly familiarise themselves with the concept and applications of SMPs.
Data Availability Statement
The table contents are openly available via Zenodo: https://doi.org/10.5281/zenodo.10047950.