Publication Date

2021

First Advisor

Benjamin S. Baumer

Second Advisor

Albert Y. Kim

Document Type

Honors Project

Degree Name

Bachelor of Arts

Department

Statistical and Data Sciences

Keywords

Reproducibility, R, Statistical computing, Education

Abstract

Data science research is considered reproducible when the associated code and data files produce identical results when run by another analyst. Although reproducibility is a key component in the advancement of scientific knowledge, a significant proportion of research articles and other analyses fail to meet reproducibility standards. Steps have been taken to address this issue, including academic courses on reproducibility, additional requirements or recommendations for journal article acceptance, and a variety of software tools. However, many of these are challenging to use, are too generalized, or are not accessible to a wide audience. In this thesis, I present my work on developing fertile, an R package designed to help improve the reproducibility of R Projects and address the limitations of other solutions by being 1) simple to use, 2) easily accessible, 3) broad in scope, 4) tailored to the specific challenges faced by R users, 5) customizable, and 6) educational. Chapter 1 considers the background information motivating fertile, including an explanation of reproducibility, its issues, current solutions, and their limitations. Chapter 2 is code-focused, demonstrating the functions available in fertile to address different aspects of reproducibility and delving into some of the details of how the software works. Finally, Chapter 3 considers fertile's potential applications in the real world, including an in-depth analysis of an experiment involving fertile's integration into an introductory data science course at Smith College.

Rights

©2021. Audrey Margaret Bertin

Language

English

Comments

118 pages : color illustrations. Includes bibliographical references (pages [113]-118).
