What this course isn’t, and is

Is Not:

  • a statistics course (in the way you think)
  • a general introduction to software engineering
  • a basic introduction to R
  • an introduction to machine learning or AI


Is:

  • an introduction to data science
  • about exploratory data analysis, data management, reproducibility — and a little philosophy of science
  • habituation to some good software engineering practices that are especially valuable for data science work


This course assumes basic competence with introductory R.

“Introductory R”

Lessons 1-5 of the Carpentries “R for Social Scientists” curriculum

  • Installing R and packages
  • Working in the RStudio IDE
  • Common data types
  • Reading and writing CSV files
  • Tidyverse R: mutate(), filter(), select(); plotting with ggplot2

“Basic competence”

Given time and a reference (a cheatsheet, Stack Exchange, a mentor), you can figure out how to solve a problem.



There are 6 labs. For the labs, you need to:

  • Submit all 6 labs
  • Revise and resubmit at least 4 labs, as necessary, until I accept them (usually based on automatic checks)

lab                         deadline
1 Git                       Sep 15
2 Debugging                 Sep 22
3 Functional Programming    Oct 6
4 EDA                       Oct 27
5 Code Review               Dec 1
6 Reproducibility           Dec 1

The course project is divided into stages. Detailed instructions and guidelines are here.

stage                          deadline
1 Proposal                     Sep 22
2 Data journey narrative       Oct 13
3 Exploratory data analysis    Nov 3
4 Code review                  Dec 8
5 Reproducible report          Dec 13
6 Flash talk presentation      Dec 13

The usual
Do the assigned reading, come to class prepared to discuss it, contribute to class and your lab collaborations, and so on.

Pedagogical note

This course uses a version of contract grading. This means:

  • Your grade will be determined by the work you complete, not by an assessment of its quality.
  • There are only two possible grades: completed (A) and incomplete (B).
    • In exceptional cases (e.g., almost no work completed), I might also assign a failing grade.

Contract grading was originally developed in writing courses, where the primary goal was to align grading with explicit (“objective”) measures of effort or productivity rather than tacit (“subjective”) measures of quality. The underlying idea is that, for most undergraduate students, simply practicing writing a lot is more valuable than trying to write well. Contract grading also simplifies the grading process.

I hate the term “contract grading,” which reinforces the idea that education is a commodity that you, the student, are purchasing from me, the teacher. The student-written “contract” of contract grading also seems basically unnecessary.