What is reproducibility?

Reproducibility vs Repeatability vs Replicability

Why Reproducibility is important…

We can’t get to replicability without reproducibility

It’s worth the wait

Reproducibility can also be for your future self!

The process

R Markdown

R Markdown lets you test your work

R Markdown allows you to more clearly show what you did

R Markdown makes it easier to update code and see results

Clean your environment

Regularly cleaning your environment and rerunning your code can help ensure that your code runs as expected.

Occasionally we might run a step only in the console and forget to save it in our R Markdown file. Rerunning the code in a clean environment will help us catch that.
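
One way to clean the environment from code is sketched below. The object name scratch_value is hypothetical and stands in for something that was created only in the console. Note that rm() only removes objects; it does not unload packages or reset options, so restarting R is a more thorough check.

```r
# Hypothetical object that was created only in the console
scratch_value <- 42

# Remove every object from the current (global) environment
rm(list = ls())

# Check whether the object survived; FALSE means it is gone
still_there <- exists("scratch_value")
print(still_there)
```

If the code in your R Markdown file still runs after this step, it does not depend on leftovers from the console.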

Regularly check if your file knits

Regularly checking whether your file knits will help you spot a missing step or error earlier, when there is less code to search through to find what went wrong.

session info

Image by Allison Horst.

Tell your future self and others what you did!

Provide sufficient detail so that you can understand what you did and why.

# Taking a random sample of 100 individuals from the population
# WITHOUT replacement
samp_pop <- sample(100, replace = FALSE)

# Then split them into two groups of 50
# a[x:xx] is the syntax for indexing a vector
samp_pop1 <- samp_pop[1:50]
samp_pop2 <- samp_pop[51:100]

Need random numbers to stay consistent?

Use set.seed(): sets the starting state for the random number generator.

set.seed(123)
sample(10)
 [1]  3 10  2  8  6  9  1  7  5  4
set.seed(123)
sample(10)
 [1]  3 10  2  8  6  9  1  7  5  4
set.seed(456)
sample(10)
 [1]  5  3  6 10  4  9  1  2  8  7

Note that these values are only pseudorandom: they are generated by calculations based on the given seed. Thus everyone who calls set.seed() with the same seed will reproduce the same “random” values.
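
A small sketch of how you might confirm this yourself. The variable names first_draw, second_draw, and third_draw are only illustrative.

```r
# Draw the same "random" numbers twice by resetting the seed
set.seed(123)
first_draw <- sample(10)

set.seed(123)
second_draw <- sample(10)

# TRUE: both draws started from the same generator state
print(identical(first_draw, second_draw))

# Without resetting the seed, the generator keeps advancing,
# so a further draw is a new shuffle of 1 to 10
third_draw <- sample(10)
print(third_draw)
```

Setting the seed once near the top of an R Markdown file is usually enough for the whole document to produce the same values each time it is knit from start to finish.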

R Markdown syntax

Before: Markdown syntax before rendering

After knit: Result of Markdown syntax after rendering

R Markdown syntax

Go to the RStudio toolbar: Help > Cheat Sheets > R Markdown Cheat Sheet (which will download it)

Or Help > Cheat Sheets > R Markdown Reference Guide

The End

Additional references

Versions matter

Session info can help

session info
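
A minimal sketch of printing session info with base R's sessionInfo(). The variable name info is only illustrative, and the output depends on your own machine, so none is shown here.

```r
# Capture details about the current R session
info <- sessionInfo()

# The R version used to run the analysis
print(info$R.version$version.string)

# The full summary: R version, platform, and loaded packages
print(info)
```

Placing a call like this in the last chunk of an R Markdown file records the versions alongside the results every time the file is knit.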

GUT CHECK

Why is reproducibility so important?

A. It helps to ensure that your code is working consistently and it helps others understand what you did

B. It ensures that your code is correct

GUT CHECK

What is NOT a practice to improve the reproducibility of our work?

A. Using R Markdown files to describe what your code is doing

B. Using scripts instead of R Markdown files

C. Testing your code with R Markdown files or the run previous button

D. Regularly cleaning the environment

More resources

Summary

To help make your work more reproducible:

  • Use R Markdown
  • Clean your environment regularly
  • Regularly check that your R Markdown file knits
  • Tell your future self and others what you did!
  • Print session info!

Resources & Lab