Digital Scholarship@Leiden

Can research software be managed?

Read about the discussions that took place during the recent Data Management Network event on software management where we heard more about TU Delft’s research software policy and the eScience Center’s software sustainability plan.

On 15 June 2021, the Leiden University Data Management Network came together online for Connect and Imagine: Can research software be managed? - a discussion on research software sustainability.

This was the third Connect and… session, after Connect and Inspire: from F and A to I and R, how to implement research funders’ requirements and Connect and Learn: Making data sharing possible in the context of the GDPR. The meetings have been coordinated by the Centre for Digital Scholarship of the Leiden University Libraries.

Summary of the LCRDM research and recommendations

Laurents Sesink kicked off the meeting by providing insights on the state of research software sustainability in the Netherlands as researched by an LCRDM task group. Laurents is head of the Centre for Digital Scholarship at Leiden University Libraries and contributed to the resulting report "Research software sustainability in the Netherlands: current practices and recommendations" published in January 2021.

"While open access and FAIR data have received considerable attention in the context of open science policy in the Netherlands (…), this is not the case for research software and software sustainability." Research software sustainability in the Netherlands, LCRDM report, January 2021

Research software is often mentioned in the context of Open Science and research integrity, but various terms are used to focus on different aspects of sustainability, like FAIR software, sustainable software, software archiving, and software re-use.

The LCRDM task group initially set out to investigate software archiving (that is, storing source code), before concluding that they needed to include other aspects as well. Besides archiving, sustainable software requires clear licenses, documentation, and maintenance, among other things.

Members of the task group performed structured interviews with researchers who create software at different institutions in the Netherlands. The interviewees gave various reasons for sharing software, and had various ways of carrying it out. They also expressed different levels of understanding of software licenses, concerns about not having enough time to properly document everything, feelings of a lack of appreciation for software development, and the need for more support for these issues and for the availability of more training and funding.

Based on these interview results and on group discussions, the task group provided recommendations for funders, institutions, support staff, and researchers. These recommendations demonstrate that software sustainability is very much a joint effort, requiring education, guidance, policies, planning, recognition, and technical and financial support.

TU Delft research software policy

Next up was Paula Martinez Lavanchy, Research Data Officer at TU Delft Library and 4TU.ResearchData. She presented the Research Software Policy that was published and enacted by the TU Delft earlier this year and explained the road to getting there.

Discussions about improving the support for sharing research software started at TU Delft in 2017, when a researcher expressed their annoyance with the required procedure for making software open source at the Library. Because software potentially has commercial value, the TU’s valorisation centre had to sign off before it could be published with an open-source license, which could take weeks.

But annoyance with labour-intensive administrative procedures was not the only reason for creating the new policy. Research software is ubiquitous: 92% of researchers at TU Delft use software, and when researchers leave TU Delft, any loss of access to their software is a real risk to the continuation and integrity of their research. Moreover, developments around Open Science have driven an increased recognition of research software as a research output, but this recognition can only work when individual contributions to software are identified and recorded.

It took four years of drafting and discussions with research data officers, librarians, the IT department, valorisation centre, and experts outside the institution, before the policy and accompanying guidance were ready to be published.

Importantly, the policy provides clarification on copyright ownership of software produced at TU Delft: the university is the owner of the copyright, but the policy provides more freedom for researchers to publish open source software if they want. It also includes high-level requirements for how to manage software, establishes responsibilities in development, and describes workflows for sharing. A separate guidelines document helps researchers with licensing, registration, and commercialisation.

Asked what caused the most difficulty during the writing of this policy, Paula explained that one of the hardest points was agreeing on when the valorisation centre would not need to be involved. Because TU Delft owns the copyright of software created by its employees and, as a technical university, has an interest in commercialisation, the copyright is not transferred to researchers by default, even though in many cases no commercialisation interest exists. The clarification that the policy provides will hopefully make it easier, however, for researchers to practise Open Science.

Practical sustainability: the Netherlands eScience Center’s software sustainability protocol

Jason Maassen is Technology Lead, Efficient Computing, at the Netherlands eScience Center (NLeSC) in Amsterdam. The NLeSC builds software for research, in collaboration with researchers.

In 2018, the NLeSC published their Software Sustainability Protocol to ensure all projects funded by, or involving, the eScience Center have a plan for managing software produced in those projects. The most important part of the Protocol is the Software Sustainability Plan, a two-page list of questions that guides the reader to best practices. Completing this plan is a requirement when getting funds from the NLeSC.

The questions are divided into three sections:

  • Minimum effort (mandatory practices): pick an open-source license; put software in a public code repository like GitHub; provide citation information; assign a persistent identifier to the repository location
  • Recommended practices: create an entry in a relevant software directory; provide documentation for end users; consider using a more extensive software quality checklist
  • Long-term aspects: how do you plan to keep the software maintained after the end of the project? Will you build a community? Do you have funding, or will there be spin-offs or a start-up?
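
As a rough illustration, the three sections above could be encoded as a checklist that a script can verify. The sketch below is hypothetical: the item keys, the `SustainabilityPlan` class, and the helper functions are assumptions made for illustration and are not part of the NLeSC protocol.

```python
from dataclasses import dataclass

# Hypothetical item keys paraphrasing the sections listed above;
# they are illustrative assumptions, not wording from the protocol.
MANDATORY = (
    "open_source_license",
    "public_repository",
    "citation_information",
    "persistent_identifier",
)
RECOMMENDED = (
    "software_directory_entry",
    "end_user_documentation",
    "quality_checklist",
)


@dataclass
class SustainabilityPlan:
    project: str
    answers: dict  # maps item key -> free-text answer
    long_term_notes: str = ""  # maintenance, community, funding plans


def missing_items(plan, items):
    """Return the items in `items` that have no non-empty answer."""
    return [i for i in items if not plan.answers.get(i, "").strip()]


def is_minimally_complete(plan):
    """True when every mandatory item has been answered."""
    return not missing_items(plan, MANDATORY)
```

A plan with only a license and a repository filled in would be reported as missing the citation information and the persistent identifier, which mirrors how the "minimum effort" section acts as a gate.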

After the NLeSC started to request these Software Sustainability Plans from projects in which it was involved, the Dutch Research Council (NWO) asked whether the NLeSC would create a more generic template for software management/sustainability plans. NWO wanted to explore with the community whether existing templates were equivalent and acceptable to NWO, and whether it should accept multiple software management templates depending on the institution (as it accepts various DMP templates) or whether a single template for all would be better. To gather input, a workshop was organised for 22 June 2021.

This current example of software management through planning raised questions regarding the relation between software management plans (SMPs) and data management plans (DMPs). NLeSC projects typically have a DMP as well as an SMP. Sometimes software outputs can be described in the DMP, but if software is a main output of the project it needs a separate plan. However, when a DMP is already created for a project, it does make sense to reuse information from the DMP and not bother researchers with more questions. In some new calls, the NLeSC plans to take care of the first two sections of the plan and only ask applicants to fill out the third part on long-term aspects.

Asked about differences among software practices in various disciplines, Jason acknowledged that such differences exist and that they are a reason for having domain-specific software directories and metadata for finding and describing software. Examples of research software directories are the NLeSC Research Software Directory and BioTools. Software development differs considerably across domains in quality and openness. For example, technical universities see commercial value in software and produce spin-off companies around software, whereas fields like astronomy and physics are very open and so is their software.

Discussion

The three concise presentations inspired many attendees to contribute to the discussion.

All were in favour of asking researchers to produce some kind of SMP (software management plan), so that they have to think about software beyond the duration of the project. Current DMP (data management plan) templates do not seem to cover software well enough. However, we would like to avoid presenting researchers with many more administrative burdens, so it shouldn’t necessarily be a separate SMP. An SMP could be a supplement to, or section of, a DMP, similar to current DMP sections that ask about personal data. Could interactive, machine-actionable DMPs provide a solution, by dynamically presenting the applicable questions?
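
One way to picture the "dynamic" idea is a question list in which each question carries a condition on the answers given so far. The following is a minimal sketch under that assumption. The question texts, the keys, and the `applicable_questions` function are hypothetical and are not taken from any existing DMP tool or template.

```python
# Each entry is (key, question text, condition). The condition receives
# the answers given so far and returns True when the question applies.
# All keys and texts are illustrative assumptions.
QUESTIONS = [
    ("produces_software",
     "Will the project produce software?",
     lambda a: True),
    ("software_is_main_output",
     "Is software a main output of the project?",
     lambda a: a.get("produces_software") is True),
    ("license",
     "Which open-source license will you use?",
     lambda a: a.get("produces_software") is True),
    ("long_term_maintenance",
     "How will the software be maintained after the project ends?",
     lambda a: a.get("software_is_main_output") is True),
]


def applicable_questions(answers):
    """Return keys of unanswered questions that apply given `answers`."""
    return [
        key
        for key, _text, condition in QUESTIONS
        if key not in answers and condition(answers)
    ]
```

In this sketch, a project that produces no software is never shown the software questions, while a project whose main output is software is also asked about long-term maintenance.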

Reiterating the eScience Center’s experience, sometimes it is hard to determine when software is a research output in its own right – even for experienced research software engineers. That means it may be difficult to decide in advance how much software management needs to be planned.

Of course, it is important to archive software properly beyond the end of the project, regardless of its size and impact. And planning the management of software may avoid conflicts with regard to copyright, especially in consortiums.

What about the other topic that should be part of this discussion: recognition and rewards? How does that work at TU Delft and NLeSC?

Paula told us that TU Delft has an Open Science programme that includes FAIR software and Recognition and Rewards, but discussions on the topic are ongoing. Registration in their CRIS system, PURE, is the first step towards receiving recognition and rewards. Delft has made the registration process automatic for software archived in 4TU Data Centre, but software archived elsewhere still has to be registered manually.

Jason explained that at NLeSC they focus on encouraging and facilitating citation: the NLeSC Protocol requires that software is cited, with the aim of increasing recognition. It’s an easy thing to forget to do though, and eScience Research Engineers sometimes even forget to cite their own software in publications. Having systems to register software is key to getting this going.

How can we keep up with what's going on, and contribute to discussions about software sustainability?

There are several international groups thinking about improving software sustainability, like the Research Software Alliance, the Software Sustainability Institute (which also commented on drafts of the TU Delft policy), and the RDA Interest Group on FAIR Software.

We also expect to continue talking about how we at Leiden University can help and support researchers in the management of their own research software, so watch out for more opportunities to connect and discuss this topic in future.
