A checklist for open-sourcing your code for reproducibility

  • Start with a fresh computer

  • Document in detail / document hierarchically

  • Use Docker or Anaconda to isolate the environment

Reproducibility in machine learning

  • Use YAML for hyperparameters; unlike JSON, YAML allows comments

  • Keep a separate config file for each experiment

  • Save checkpoints within epochs, not only at epoch boundaries

  • Log all features of training and evaluation

  • Use an experiment-management tool: PyTorch Lightning, Sacred, MLflow

  • Set the random seed and save it in the config
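The seed and config items above can be sketched as follows. This is a minimal illustration, not a definitive setup: the config keys (`experiment`, `learning_rate`, `seed`) and the helper names `set_seed` and `dump_flat_yaml` are hypothetical, NumPy and PyTorch are treated as optional, and the tiny YAML writer only handles flat dictionaries of simple scalars. A real project would more likely use a YAML library.

```python
import random


def set_seed(seed: int) -> None:
    """Seed the random number generators that are available."""
    random.seed(seed)
    try:  # NumPy is optional in this sketch
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:  # PyTorch is optional in this sketch
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


def dump_flat_yaml(config: dict) -> str:
    """Render a flat dict of simple scalars as YAML 'key: value' lines."""
    return "".join(f"{key}: {value}\n" for key, value in config.items())


# Hypothetical experiment config; the keys are illustrative only.
config = {"experiment": "baseline", "learning_rate": 0.001, "seed": 1234}

# Seed from the config so the saved file fully describes the run.
set_seed(config["seed"])
config_text = dump_flat_yaml(config)
```

Writing `config_text` to a file next to the run's outputs keeps the seed with the results it produced.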

Record exact dependency versions: pip freeze > requirements.txt
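A similar snapshot can be taken from inside a training script, so the installed versions are logged with each run. The sketch below uses the standard-library `importlib.metadata` module; the function name `frozen_requirements` is hypothetical, and the output only approximates `pip freeze` (it does not handle editable or VCS installs the way pip does).

```python
from importlib import metadata


def frozen_requirements() -> list[str]:
    """Return sorted 'name==version' lines for installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions without a usable name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


# Snapshot of the current environment, one pinned requirement per entry.
requirements = frozen_requirements()
```

Joining `requirements` with newlines and saving it next to the experiment logs gives a per-run record of the environment.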

  • Test and validate: run the setup on a separate machine to ensure a reproducible build

  • Make sure dependencies resolve correctly and no hardcoded paths remain in the code

  • Checklist in the README:

      - dependencies

      - training script

      - evaluation script

      - pretrained models

      - results

  • Good to have:

      - contributing guide

      - blog post
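The hardcoded-paths item in the checklist can be partly automated. The following is a rough sketch: the regular expression is a heuristic assumed for illustration (string literals starting with `/home/`, `/Users/`, or a Windows drive letter), and the function name `find_hardcoded_paths` and the sample source are hypothetical.

```python
import re

# Heuristic for user-specific absolute paths inside string literals.
# This pattern is an assumption of the sketch, not an exhaustive list.
ABSOLUTE_PATH = re.compile(r"""["'](/home/|/Users/|[A-Za-z]:\\)""")


def find_hardcoded_paths(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) pairs that match the heuristic."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if ABSOLUTE_PATH.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical source text: the first line has a hardcoded path.
sample = 'data = load("/home/alice/data/train.csv")\nmodel = build(config["model"])\n'
hits = find_hardcoded_paths(sample)
```

Running such a scan over the repository before release flags candidate lines for manual review; it will miss paths built at runtime and may flag harmless strings.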
