Methods and Tools to Make Research Source Code Shareable and Reproducible: A Brief Overview


logo open access week Nanterre logo open EDUC Alliance

  1. Where can scholars seek help to make their research source code shareable and reproducible?
  2. CODE methodology (Collaborate, Open, Document, Execute)
  3. Focus on the E (Execute = replicability)
Figure 1

1. Where scholars can seek help for their source code.

Figure 2

If we had an OSPO

PhD training:
- Developing and sharing best practices

Metadata curation:
- how to cite and describe software (SMP)

Links with other outputs and archiving:
- HAL, data repositories, Software Heritage

Could be covered by others:
- licences (Innovation & Legal Departments)
- infrastructures (e.g. GitLab forge run by IT staff)
- best development practices promoted by research engineers

📓 Louvet (2025)

Questions received show rising concerns about software reproducibility:

  • Do I have the right to reuse this piece of code in my application?
  • Which licence should I use to publish my code?
  • How should I describe my software?
  • How can I credit contributors in my code?
  • How can I comply with Open Science principles while developing my software (and get a prize for it!)?

link to COSO prizes application document

2. CODE methodology

Figure 4

Code Beyond FAIR 📓 Rougier et al. (2025)

Open as public (=OPEN) 📓 Cosmo et al. (2020)

Open as “open to contributions” (=COLLABORATE)

  • Should I invite others to send issues?
  • Should I invite others to send merge requests?
  • How can I credit others for their contributions?
  • What governance mechanisms protect the original intent of my software from being modified by others?

Documentation

A README file (name in capitals) sorts to the top of a file listing (in ASCII order) 📓 Abdelhafith (2015)
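A minimal sketch of why the capitalised name ends up first. The file names are hypothetical; the only assumption is that the listing is sorted in ASCII order, where uppercase letters come before lowercase ones.

```python
# Hypothetical directory listing; the file names are illustrative only.
entries = ["src", "docs", "main.c", "README", "makefile"]

# Python compares strings by code point, which matches ASCII order here:
# uppercase letters (65-90) sort before lowercase letters (97-122).
listing = sorted(entries)
print(listing)
```

Note that another capitalised name (for instance a file called Makefile) would sort before README for the same reason.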

A README file should contain at least:

  • goals and functions of the software
  • environment (system and software stack required)
  • installation and execution commands

Other information can be placed elsewhere:
contribution guidelines -> CONTRIBUTING
authors -> CITATION.cff
release notes -> CHANGELOG
general information in a machine-readable format (JSON) -> CODEMETA (codemeta.json)
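As an illustration of the machine-readable description, here is a minimal sketch that builds a CodeMeta record with Python's standard json module. Every value is a placeholder (hypothetical project name, author, repository URL and licence), and only a handful of CodeMeta 2.0 properties are shown.

```python
import json

# All values are placeholders for a hypothetical project.
metadata = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "war_cemeteries",
    "description": "Maps war cemeteries from Wikidata (illustrative description).",
    "license": "https://spdx.org/licenses/MIT",  # assumed licence, for the example only
    "codeRepository": "https://example.org/jdoe/war_cemeteries",
    "programmingLanguage": "R",
    "author": [
        {"@type": "Person", "givenName": "Jane", "familyName": "Doe"}
    ],
}

# Serialise to the text that would be saved as codemeta.json.
codemeta_text = json.dumps(metadata, indent=2, ensure_ascii=False)
print(codemeta_text)
```

Saving the printed text as codemeta.json at the root of the repository is enough for tools that look for this file.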

README file of the circuit analysis programs SPICE, SINC and SLIC, 1974

comment your code

comment it yourself, don’t use a GenAI for that purpose

Figure 5

3. A focus on replicability and execution

Figure 6

replicability vs reproducibility

  • reproducibility = same software but other data
  • replicability = same software, same data (we only consider test data)
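One possible way to check replicability in the sense above (same software, same test data) is to compare a digest of the results from two runs. This is only a sketch: the analysis function is a stand-in for the real software, and the test data are invented.

```python
import hashlib
import json


def analysis(values):
    """Stand-in for the real analysis: returns simple summary statistics."""
    return {"n": len(values), "mean": sum(values) / len(values)}


def digest(result):
    """Hash a result; sorted keys make equal results produce equal bytes."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


test_data = [2, 4, 6, 8]  # invented test data
first_run = digest(analysis(test_data))
second_run = digest(analysis(test_data))

# The runs replicate if both digests are identical.
replicated = first_run == second_run
print("replicated:", replicated)
```

In practice the first digest would be published with the code, and the second computed by whoever re-runs it on another machine.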

missing dependencies
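A minimal sketch of how missing dependencies can be detected before a script is run, using importlib from the Python standard library. The list of required modules is hypothetical, and the check only covers top-level module names.

```python
from importlib import util


def missing_modules(names):
    """Return the top-level module names not found in the current environment."""
    return [name for name in names if util.find_spec(name) is None]


# Hypothetical requirements: two standard modules and one that does not exist.
required = ["json", "csv", "a_module_that_does_not_exist"]
print(missing_modules(required))
```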

Language specific dependencies

Figure 7

virtual environments with Python
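A minimal sketch of creating a virtual environment with the venv module from the Python standard library. A temporary directory stands in for the project folder, and the .venv directory name is just a common convention.

```python
import tempfile
import venv
from pathlib import Path

# A temporary directory stands in for the project folder.
project_dir = Path(tempfile.mkdtemp())
env_dir = project_dir / ".venv"

# with_pip=False keeps the sketch fast; a real project would usually want pip.
venv.create(env_dir, with_pip=False)

# pyvenv.cfg marks the directory as a virtual environment.
print((env_dir / "pyvenv.cfg").exists())
```

From a shell, the equivalent is python -m venv .venv; once packages are installed, pip freeze > requirements.txt records their versions so that others can reinstall them.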

virtual environments with R

Figure 8

renv::init()     # initialise a project-local library (renv/ folder within the project) and create the renv.lock lockfile
renv::status()   # make sure that no package is used without having been recorded in the renv.lock file
renv::snapshot() # record the versions of the packages used by the project in renv.lock
renv::restore()  # reinstall in our own environment the package versions recorded in another person's project

📓 Package Renv (n.d.)

beyond virtual environments and software dependencies 📓 Hinsen (2018)

  • The Guix package manager lets you keep different versions of the same package on the same computer
  • Guix is language-agnostic: R and Python are both covered (with better coverage for R than for Python)
  • Guix's declarative nature makes it simpler to define exactly which version of a package was used for a given source code
  • Rollback is always possible (return to the previously used version of a piece of software)

different profiles, different machines, same configurations

Figure 9

how it works

Whenever we need a package, we run guix install <package> from the shell. If the package comes from a channel other than the main one (for instance CRAN for R packages), the command is guix import <repository> -r <package> instead.

guix import cran -r wikidataR > manifest.scm
guix import cran -r leaflet >> manifest.scm
# imports wikidataR and Leaflet from the CRAN channel, which extends the main Guix channel, and adds them to the manifest.scm config file
# (>> appends, so the second import does not overwrite the first).

# in the manifest.scm file, hashes identify the specific version of each package in the current profile:

#1l0slppa61hmzkqj9wmplb9ldsyvg61igsd9dlv5yx06k2by77xg for Leaflet v.2.2.3
#120833b7zyq1rhmn9c8iv0j6br60af7gbn5lc4dil55qhh2lp9rx for wikidataR v.2.3.3

# to run the war_cemeteries R script later with these specific package versions:
guix shell -m manifest.scm -- Rscript war_cemeteries.R

Figure 10

Containers are useful for reproducing software on different systems (AMD or ARM, GNU/Linux or Windows)

Docker without Guix

Figure 11

Docker with Guix 📓 Tournier (2024)

You need neither a Dockerfile nor a specific base image such as Ubuntu or Rocker to build the image. A single command and the manifest.scm file will do the job for you.

guix pack -f docker -m manifest.scm 

# this command packages the script and the dependencies listed in manifest.scm and converts the result into a Docker image, which can then be loaded with the following command:

docker load

In conclusion

Findability: HAL
Accessibility: Software Heritage
Interoperability: virtual environments, containers, replicable package managers
Reusability: licence, documentation, tests

Data Without Software Are Just Numbers

📓 Davenport et al. (2020)

figures

figure sources and credits
Figure 1 cover of a video shot by Issaquah Library https://www.youtube.com/watch?v=TOjBXuJeuRk
Figure 6 Library as an R package: made with ChatGPT with the following prompt: “fill an hexagone (the common shape for R packages logos) with a drawing of a uni library”
Figure 3 ARDoISE Data Hub’s official logo (ARDoISE is The Rennes data hub)
Figure 5 University of Buffalo https://www.math.buffalo.edu/~badzioch/MTH337/Report_guide/report9.html
Figure 7 made with R package Cranly
Figure 9 from https://linuxfr.org/news/nix-1-7-nixpkgs-et-nixos-14-04-guix-0-6
Figure 10 image made with Avataars by Damien Belvèze, CC-by ; original idea by Candice Savonen
Figure 11 see https://gitlab.huma-num.fr/dbelveze/guide_replicability_practice/-/blob/main/code_r/Dockerfile?ref_type=heads

software used for this presentation

[1] "Quarto version: 1.6.40"
R version 4.5.2 (2025-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] digest_0.6.37     later_1.3.2       fastmap_1.2.0     xfun_0.53        
 [5] knitr_1.50        htmltools_0.5.8.1 rmarkdown_2.29    lifecycle_1.0.4  
 [9] ps_1.7.6          cli_3.6.5         processx_3.8.3    compiler_4.5.2   
[13] rstudioapi_0.15.0 tools_4.5.2       quarto_1.5.1      evaluate_1.0.4   
[17] Rcpp_1.1.0        yaml_2.3.10       rlang_1.1.6       jsonlite_2.0.0   

References

Abdelhafith, O. (2015). README.md: History and Components. In Medium.
Cosmo, R. di, Gruenpeter, M., Marmol, B., Monteil, A., Romary, L., & Sadowska, J. (2020). Curated archiving of research software artifacts: Lessons learned from the french open archive HAL. International Journal of Digital Curation, 15(1), 16–16. https://doi.org/10.2218/ijdc.v15i1.698
Davenport, J. H., Grant, J., & Jones, C. M. (2020). Data Without Software Are Just Numbers. Data Science Journal, 19(1). https://doi.org/10.5334/dsj-2020-003
Hinsen, K. (2018). Verifiability in computer-aided research: The role of digital scientific notations at the human-computer interface. PeerJ Computer Science, 4, e158. https://doi.org/10.7717/peerj-cs.158
Louvet, V. (2025). Official opening of the UGA OSPO. UGA.
Package renv : Présentation et retour d’expérience. (n.d.). Retrieved May 13, 2025, from https://elisemaigne.pages.mia.inra.fr/2021_package_renv/presentation.html#41
Rougier, N., Di Cosmo, R., Hinsen, C., Maurice, C., Le Berre, D., Monat, R., Louvet, V., Jullien, N., Granger, S., & Maumet, C. (2025). Code beyond FAIR. https://inria.hal.science/hal-04930405v1
Tournier, S. (2024, December). (Re)déploiement de conteneurs et machines virtuelles avec Guix. JRES (Journées réseaux de l’enseignement et de la recherche) 2024.