Document Type

Article

Publication Date

4-3-2019

Publication Title

Journal of Computational and Graphical Statistics

Volume

28

Issue

2

Abstract

Many interesting datasets available on the Internet are of a medium size—too big to fit into a personal computer’s memory, but not so large that they would not fit comfortably on its hard disk. In the coming years, datasets of this magnitude will inform vital research in a wide array of application domains. However, due to a variety of constraints they are cumbersome to ingest, wrangle, analyze, and share in a reproducible fashion. These obstructions hamper thorough peer-review and thus disrupt the forward progress of science. We propose a predictable and pipeable framework for R (the state-of-the-art statistical computing environment) that leverages SQL (the venerable database architecture and query language) to make reproducible research on medium data a painless reality. Supplementary material for this article is available online.

Comments

Peer reviewed accepted manuscript.

First Page

256

Last Page

264

Digital Object Identifier (DOI)

10.1080/10618600.2018.1512867

Rights

“Licensed to Smith College and distributed CC-BY under the Smith College Faculty Open Access Policy.”

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
