Document Type

Article

Publication Date

12-29-2015

Publication Title

The American Statistician

Abstract

Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the increasingly sophisticated array of data available in many settings. These data tend to be nontraditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level typically introduces students to a variety of techniques to analyze small, neat, and clean datasets. However, whether they pursue more formal training in statistics or not, many of these students will end up working with data that are considerably more complex, and will need facility with statistical computing techniques. More importantly, these students require a framework for thinking structurally about data. We describe an undergraduate course in a liberal arts environment that provides students with the tools necessary to apply data science. The course emphasizes modern, practical, and useful skills that cover the full data analysis spectrum, from asking an interesting question to acquiring, managing, manipulating, processing, querying, analyzing, and visualizing data, as well as communicating findings in written, graphical, and oral forms. Supplementary materials for this article are available online.

Keywords

Computational statistics, Data science, Data visualization, Data wrangling, Machine learning, Statistical computing, Undergraduate curriculum

Volume

Issue

First Page

334

Last Page

342

DOI

dx.doi.org/10.1080/00031305.2015.1081105

Rights

Licensed to Smith College and distributed CC-BY under the Smith College Faculty Open Access Policy.

Comments

Peer reviewed accepted manuscript.

Recommended Citation

Baumer, Benjamin, "A Data Science Course for Undergraduates: Thinking with Data" (2015). Mathematics and Statistics: Faculty Publications, Smith College, Northampton, MA.
https://scholarworks.smith.edu/mth_facpubs/25

Included in

Statistics and Probability Commons

A Data Science Course for Undergraduates: Thinking with Data
