Chapter 8 Documenting analyses
8.1 Learning Objectives
8.2 Why documentation?
Documentation is an important but sometimes overlooked part of creating a reproducible analysis! We will discuss two parts of documentation here: 1) in-notebook descriptions and 2) READMEs.
Both notebook descriptions and READMEs are written in Markdown, a shorthand for HTML (the same as the documentation parts of your code). If you aren’t familiar with it, Markdown is a handy tool and we encourage you to learn it (it doesn’t take too long). Here’s a quick guide to get you started.
8.2.1 Notebook descriptions
As we discussed in chapter 5, data analyses can lead one on a winding trail of decisions, but notebooks allow you to narrate your thought process as you travel along these explorations!
Your scientific notebook should include descriptions that cover:
8.2.1.1 The purposes of the notebook
What scientific question are you trying to answer? Describe the dataset you are using to try to answer this question and why it helps answer it.
8.2.1.2 The rationales behind your decisions
Describe why a particular code chunk is doing a particular thing – the odder the code looks, the greater the need for you to describe why you are doing it.
Describe any particular filters or cutoffs you are using and how you decided on them.
For data wrangling steps, why are you wrangling the data in such a way – is this because a certain package you are using requires it?
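As a minimal sketch of what a documented decision can look like, the Python snippet below filters genes by variance and records the reason for the cutoff right next to the code. The gene names, the values, the `filter_by_variance` helper, and the cutoff itself are all hypothetical and chosen only for illustration; they are not taken from the course notebooks.

```python
from statistics import variance

# Hypothetical expression values: one list of sample measurements per gene.
expression = {
    "GENE_A": [5.0, 5.1, 4.9, 5.0],
    "GENE_B": [1.0, 8.0, 3.0, 9.5],
    "GENE_C": [2.0, 2.0, 2.1, 1.9],
}

# Rationale (made up for this example): genes that barely change across
# samples add no visible pattern to a heatmap, so we drop them.
# The cutoff of 1.0 is an illustrative choice; in a real notebook, say
# how you picked it (for example, from a plot of the variances).
VARIANCE_CUTOFF = 1.0


def filter_by_variance(data, cutoff):
    """Keep only genes whose sample variance is above the cutoff."""
    return {
        gene: values
        for gene, values in data.items()
        if variance(values) > cutoff
    }


variable_genes = filter_by_variance(expression, VARIANCE_CUTOFF)
print(sorted(variable_genes))
```

In a notebook, the rationale comment could just as well live in a Markdown chunk directly above the code chunk; the point is that the reason for the cutoff sits where a reader will see it.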
8.2.1.3 Your observations of the results
What do you think about the results? The plots and tables you show in the notebook – how do they inform your original questions?
8.2.2 READMEs!
READMEs are also a great way to help your collaborators get quickly acquainted with the project.
READMEs stick out in a project and are generally a universal signal for people new to the project to start by READing them. GitHub automatically previews the file called “README.md” when someone comes to the main page of your repository, which further encourages people looking at your project to read the information in your README.
Information that should be included in a README:
- General purpose of the project
- Instructions on how to re-run the project
- Lists of any software required by the project
- Input and output file descriptions
- Descriptions of any additional tools included in the project
You can take a look at this template README to get you started.
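If you would rather start from a blank skeleton, the list above can be turned into one with a few lines of Python. This is only a sketch: the `build_readme` helper, the section headings, and the prompt wording are assumptions made for illustration, and the result is not the course's template README.

```python
# Headings mirror the list of README contents above; the prompt text in
# curly braces is hypothetical placeholder wording to be replaced by hand.
README_SECTIONS = [
    ("Purpose", "{ What is the general purpose of this project? }"),
    ("How to re-run", "{ What steps re-run the analysis from start to finish? }"),
    ("Software requirements", "{ What software does this project need? }"),
    ("Input and output files", "{ What files go in, and what files come out? }"),
    ("Additional tools", "{ What other tools are included in this project? }"),
]


def build_readme(title, sections=README_SECTIONS):
    """Return Markdown text for a README skeleton with one heading per section."""
    lines = [f"# {title}", ""]
    for heading, prompt in sections:
        lines += [f"## {heading}", "", prompt, ""]
    return "\n".join(lines)


readme_text = build_readme("My heatmap project")
print(readme_text)
```

The sketch only prints the text. To keep it, save the output as `README.md` at the top level of your project and replace each prompt with your own description.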
8.2.2.1 More about writing READMEs:
8.3 Get the exercise project files (or continue with the files you used in the previous chapter)
Get the Python project example files
Now double click your chapter zip file to unzip. For Windows you may have to follow these instructions.
Get the R project example files
Now double click your chapter zip file to unzip. For Windows you may have to follow these instructions.
8.4 Exercise 1: Practice beefing up your notebook descriptions
Python project exercise
- Start up JupyterLab by running `jupyter lab` from your command line.
- Activate your conda environment using `conda activate reproducible-python`.
- Open up the notebook you’ve been working on in the previous chapters: `make_heatmap.ipynb`
- Create a new chunk in your notebook and choose the “Markdown” option in the dropdown menu.
- Continue to add more descriptions where you feel they are necessary. You can reference the descriptions in the “final” version in the example Python repository. (Again, final here is in quotes because we may continue to make improvements to this notebook too – remember what we said about iterative?)
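One rough way to see how much narration a notebook has is to count its Markdown chunks against its code chunks. The sketch below relies on the fact that an `.ipynb` file is JSON with a top-level `cells` list whose entries carry a `cell_type`; the `count_cell_types` helper and the tiny stand-in notebook are hypothetical and only there to make the example self-contained.

```python
import json
from collections import Counter


def count_cell_types(notebook):
    """Count cells by type in a notebook that has already been parsed from JSON."""
    return Counter(cell["cell_type"] for cell in notebook.get("cells", []))


# A tiny stand-in notebook (hypothetical content) so the example runs
# without needing a file on disk.
example_notebook = json.loads("""
{
  "cells": [
    {"cell_type": "markdown", "source": ["## Purpose"]},
    {"cell_type": "code", "source": ["import pandas as pd"]},
    {"cell_type": "code", "source": ["df.head()"]}
  ],
  "nbformat": 4,
  "nbformat_minor": 5
}
""")

counts = count_cell_types(example_notebook)
print(counts["markdown"], "markdown cells,", counts["code"], "code cells")
```

To try it on your own notebook, parse the file with `json.load` and pass the result to the helper. A low Markdown count is only a hint, not a rule: what matters is whether the purpose, rationale, and observations are actually written down.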
R project exercise
- Open up RStudio.
- Open up the notebook you’ve been working on in the previous chapters: `make_heatmap.Rmd`
- In between code chunks, add more descriptions using Markdown.
- You can test how this renders by saving your `.Rmd`, then opening up the resulting `nb.html` file and choosing View in Browser.
- Continue to add more descriptions where you feel they are necessary. You can reference the descriptions in the “final” version in the example R repository. (Again, final here is in quotes because we may continue to make improvements to this notebook too – remember what we said about iterative?)
8.5 Exercise 2: Write a README for your project!
- Download this template README.
- Fill in the questions inside the `{ }` to create a README for this project.
- You can reference the “final” versions of the README, but keep in mind it will reference items that we will discuss in the “advanced” portion of this course. See the R README here and the Python README here.
- Add your README and updated notebook to your GitHub repository. Follow these instructions to add the latest version of your notebook to your GitHub repository. Later, we will practice and discuss how to more fully utilize the features of GitHub but for now, just drag and drop it as the instructions linked describe.
Any feedback you have regarding this exercise is greatly appreciated; you can fill out this form!