This project is intended to get you used to working with R in a way that is conducive to collaboration and reproducible analysis.
Send us a link to your data if it is hosted publicly, or turn in a data file (CSV, Excel, etc.) on CoursePlus. If the data is not hosted publicly, you must turn it in on Drop Box with your other files.
You are also free to create your own data if you wish, but please make sure it is large enough to meet the requirements that follow.
Options for places to find data are:
- tidytuesdayR package: https://github.com/thebioengineer/tidytuesdayR
- catdata package: https://CRAN.R-project.org/package=catdata
- dataplay package: https://github.com/avahoffman/dataplay

To use the data that comes with R, enter `datasets::` and press tab in RStudio to see the names of the datasets. For example, `datasets::ability.cov` will load the ability dataset.
You are not limited to these options for finding your data.
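
As a minimal sketch of the built-in option, the snippet below lists the datasets that ship with R and loads one of them. The choice of `mtcars` is only an assumption for illustration (it is a plain data frame, which makes it convenient for wrangling and plotting); any other dataset would work the same way.

```r
# List the names of the datasets bundled with the `datasets` package
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Load one dataset by name; `mtcars` is an illustrative choice, not a requirement
cars <- datasets::mtcars

# Inspect its size and the type of each variable
dim(cars)
str(cars)
```

Looking at `str()` first is a quick way to check that the data has enough rows and variables for the rest of the project.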
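- Describe your data and where you got it. (0.5 points total - 0.25 points for describing data, 0.25 points for describing where you got your data)
- Clean, subset, or otherwise manipulate your data using at least three different methods. (3 points - per successful method)
- Describe what you did to wrangle your data and why. (1 point)
- Make 2 different kinds of plots with `ggplot2`. (2 points total: 1 point for each of 2 different kinds of plots completed)
- Perform a simple analysis of your data. (2 points total, 1 point for attempt)
- Describe and interpret your analysis. (1 point total - 0.5 points for describing analysis, 0.5 points for interpretation)
- Add `sessionInfo()` to the end of your analysis so that you and others can see what version of R and packages you used. (0.5 points)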
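
The wrangling, analysis, and session info steps might look roughly like the sketch below. It assumes the `dplyr` package is installed and uses the built-in `mtcars` data purely as a stand-in; the column names, the chosen methods, and the t-test are illustrative assumptions, not the required approach.

```r
library(dplyr)

cars <- datasets::mtcars

# Three different wrangling methods applied in one pipeline
wrangled <- cars %>%
  rename(miles_per_gallon = mpg) %>%                               # 1. renaming
  mutate(transmission = if_else(am == 1, "manual", "automatic")) %>%  # 2. recoding
  filter(cyl %in% c(4, 6))                                         # 3. filtering

# Simple analysis: summarize fuel economy by transmission type
mpg_summary <- wrangled %>%
  group_by(transmission) %>%
  summarize(
    n = n(),
    mean_mpg = mean(miles_per_gallon),
    median_mpg = median(miles_per_gallon)
  )
print(mpg_summary)

# Optional simple statistical test comparing the two groups
test_result <- t.test(miles_per_gallon ~ transmission, data = wrangled)
print(test_result)

# Record the R version and package versions used for this analysis
sessionInfo()
```

In a real submission, each step would be accompanied by a sentence or two explaining what was done and why, since the description is graded separately from the code.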
Please see the example project on our website: the source code (Rmd) and the output (HTML).
Bonus: Create a function as part of your analysis. If you do this correctly, it can make up for lost points on other sections. (1.5 points)
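
For the bonus, a function can be as small as a helper that you reuse on several columns. The sketch below is one hypothetical example; the name `summarize_numeric` and its `digits` argument are made up for illustration.

```r
# Hypothetical helper: return common summary statistics for a numeric vector
summarize_numeric <- function(x, digits = 2) {
  stopifnot(is.numeric(x))
  stats <- c(
    min    = min(x, na.rm = TRUE),
    q1     = unname(quantile(x, 0.25, na.rm = TRUE)),
    median = median(x, na.rm = TRUE),
    mean   = mean(x, na.rm = TRUE),
    q3     = unname(quantile(x, 0.75, na.rm = TRUE)),
    max    = max(x, na.rm = TRUE)
  )
  round(stats, digits)
}

# Example use on two columns of a built-in dataset
summarize_numeric(datasets::mtcars$mpg)
summarize_numeric(datasets::mtcars$wt)
```

Calling the same function on more than one variable is a simple way to show that it is actually part of the analysis rather than a standalone demonstration.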
10 points total plus bonus
Grading Rubric:
Item | Description | Points |
---|---|---|
Describe Data | Describe what your data looks like. Identify what the variables and samples are. Describe how the data was originally created. | 0.25 |
Describe Data Source | Describe where you got your data. | 0.25 |
Wrangling - cleaning, subsetting, manipulation (e.g., renaming, recoding, reshaping, filtering) | Apply at least three different methods | 3 |
Describe wrangling and reason | Please describe what you did to clean/subset/wrangle/manipulate your data and why. | 1 |
Data Viz | Make 2 different kinds of plots | 2 |
Data Analysis | Perform a simple analysis of your data. This can involve summarizing the data to describe aspects of it (quartiles, means, range, etc.), or you may perform a simple statistical test. | 2 |
Describe Analysis | Describe what analysis you performed and why. | 0.5 |
Interpret Analysis | Provide a simple interpretation of what your analysis might indicate about your data. You will be graded on the implementation and description of what you did, not on the validity of your statistical interpretation. | 0.5 |
Session info | Include the output of sessionInfo() at the end of your analysis | 0.5 |
Bonus | Create a function as part of your analysis. If you do this correctly, it can make up for lost points on other sections. | 1.5 extra |
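
For the Data Viz row of the rubric, "2 different kinds of plots" could be, for example, a scatter plot and a box plot. The sketch below assumes `ggplot2` is installed and again uses `mtcars` as a stand-in dataset; the variables and labels are illustrative choices.

```r
library(ggplot2)

cars <- datasets::mtcars

# Plot kind 1: scatter plot of fuel economy against weight
scatter_plot <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "Fuel economy by weight"
  )

# Plot kind 2: box plot of fuel economy by number of cylinders
box_plot <- ggplot(cars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(
    x = "Number of cylinders",
    y = "Miles per gallon",
    title = "Fuel economy by cylinder count"
  )

print(scatter_plot)
print(box_plot)
```

The two plots use different geoms (`geom_point()` and `geom_boxplot()`), which is what distinguishes them as different kinds of plots rather than two versions of the same one.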