A checklist for open-sourcing your code for reproducibility
Start with a fresh computer
Document in detail / document hierarchically
Use Docker or Anaconda to pin the environment
Reproducibility in machine learning:
Use YAML for hyperparameters; .yaml files support comments, so each value can be annotated
Have separate config file for each experiment
Save checkpoints within epochs, not only at the end of training
Log all features of training and evaluation
Use an experiment management tool: PyTorch Lightning, Sacred, MLflow
Set seed and save it in config
pip freeze > requirements.txt
Test and validate (test the setup on a separate machine to ensure a reproducible build)
Make sure dependencies are correct and that no hardcoded paths exist in the code
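
The seed and config items above can be sketched as follows. This is a minimal illustration, not a definitive setup: `set_seed`, `save_run_config`, `run_experiment`, and the config keys are hypothetical names, and the "training" is a stand-in that only draws random numbers. The sketch writes JSON with the standard library only so that it runs without extra packages; with PyYAML installed, `yaml.safe_dump` could write the same dictionary to a `.yaml` file instead.

```python
import json
import random
import tempfile
from pathlib import Path


def set_seed(seed: int) -> None:
    """Seed the standard-library RNG."""
    random.seed(seed)
    # Assumption: NumPy or PyTorch, if used, need their own seeding calls,
    # e.g. numpy.random.seed(seed) and torch.manual_seed(seed).


def save_run_config(config: dict, run_dir: Path) -> Path:
    """Write the exact config (including the seed) next to the run outputs."""
    run_dir.mkdir(parents=True, exist_ok=True)
    config_path = run_dir / "config.json"
    config_path.write_text(json.dumps(config, indent=2, sort_keys=True))
    return config_path


def run_experiment(config: dict, run_dir: Path) -> list:
    """Hypothetical stand-in for training: draws a few seeded random numbers."""
    set_seed(config["seed"])
    save_run_config(config, run_dir)
    return [random.random() for _ in range(3)]


if __name__ == "__main__":
    # One config per experiment; the output directory is passed in
    # rather than hardcoded in the code.
    config = {"experiment": "baseline", "seed": 42, "learning_rate": 0.001}
    with tempfile.TemporaryDirectory() as tmp:
        first = run_experiment(config, Path(tmp) / "run_1")
        second = run_experiment(config, Path(tmp) / "run_2")
        print("runs match:", first == second)
```

Because the seed is stored in the saved config, anyone rerunning the experiment can restore it from the run directory instead of guessing which value was used.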
Checklist for the README:
- dependencies
- training script
- evaluation script
- pretrained models
- results
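
The README checklist above could be checked automatically before a release. The sketch below is one possible approach under stated assumptions: the keyword list and the function name `missing_readme_sections` are hypothetical, and a plain substring match is only a rough proxy for a section actually being present.

```python
# Assumed keywords, one per checklist item; adjust to the README's wording.
REQUIRED_SECTIONS = [
    "dependencies",
    "training",
    "evaluation",
    "pretrained models",
    "results",
]


def missing_readme_sections(readme_text: str, required=REQUIRED_SECTIONS) -> list:
    """Return the checklist keywords that do not appear in the README text."""
    text = readme_text.lower()
    return [section for section in required if section not in text]


if __name__ == "__main__":
    sample = "## Dependencies\n\n## Training\n\n## Evaluation\n"
    print(missing_readme_sections(sample))
```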
Good to have:
- contributing guide
- blog post