Statistical and Data Sciences: Faculty Publications

Creating Optimal Conditions for Reproducible Data Analysis in R with ‘Fertile’

Document Type

Article

Publication Date

11-26-2020

Publication Title

Stat

Issue

e332

Abstract

The advancement of scientific knowledge increasingly depends on ensuring that data-driven research is reproducible: that two people with the same data obtain the same results. However, while the necessity of reproducibility is clear, there are significant behavioral and technical challenges that impede its widespread implementation and no clear consensus on standards of what constitutes reproducibility in published research. We present fertile, an R package that focuses on a series of common mistakes programmers make while conducting data science projects in R, primarily through the RStudio integrated development environment. fertile operates in two modes: proactively, to prevent reproducibility mistakes from happening in the first place, and retroactively, analyzing code that is already written for potential problems. Furthermore, fertile is designed to educate users on why their mistakes are problematic and how to fix them.

Recommended Citation

Bertin, Audrey M. and Baumer, Benjamin, "Creating Optimal Conditions for Reproducible Data Analysis in R with ‘Fertile’" (2020). Statistical and Data Sciences: Faculty Publications, Smith College, Northampton, MA.
https://scholarworks.smith.edu/sds_facpubs/31

Version

Version of Record

Included in

Data Science Commons, Other Computer Sciences Commons, Statistics and Probability Commons

DOI

https://doi.org/10.1002/sta4.332
