Software, hardware, and accounts
Checklist
- R 4.1 or higher
- RStudio
- Package manager for your OS
- git and GitKraken
- pandoc or Quarto
- GitHub account
Hardware
- Many sessions will involve writing code in class
- Tablets (iPad, Surface) will only work if you’re using them to access a cloud computing setup
- I recommend that your machine have at least
- 200 GB free hard drive space
- 8 GB RAM
- Intel i5 processor
- Operating system capable of running R 4.1 or higher
- Mac: 10.13 (High Sierra) or more recent
- Windows: See here
- If you’re running Linux I’m going to assume you can figure this out yourself
Accessibility
I chose the tools and platforms for this course in part because they’re industry-standard. If you pursue a career as a data scientist in industry, you’ll be expected to use GitHub (or something similar) on a daily basis. And RStudio is by far the most commonly used IDE (“integrated development environment”) for R.
However, like many other technologies, they were originally developed using ableist assumptions about “normal” computer users. In response to criticism, the developers of these systems and tools have gone back and made their technologies more accessible. But there may still be barriers to accessibility that I have not anticipated.
If you encounter a barrier to participating in this course — even a small inconvenience — please let me know. Similarly, if you have ideas for making the course more accessible, please share them with me.
R and RStudio
- R is updated regularly, and there’s a breaking change about once per year
- RStudio also regularly gets new features, though breaking changes are less frequent
- For consistency in class, I encourage you to use the latest version of both R and RStudio
- I use rig to switch between different versions of R
Compilers
Sometimes you’ll need to compile a package in order to install it.
- Mac
- In a terminal, simply enter
You’ll need to enter an administrator passwordsudo xcode-select --install
- If you also need a Fortran compiler
(personally I’ve never needed one)
- Windows
- Linux
- I’m pretty sure Linux and the like come with the necessary compilers
OS
- All of the tools we’ll use are available on all major OSes
- I have much less experience debugging Windows and Linux, but will do what I can
Package manager
- Software to facilitate installing, upgrading, and removing other software
- Highly recommended for installing git, pandoc, and some other tools we’ll use later
- Mac OS: Homebrew or MacPorts
- Pros and cons of these two main options
- I use Homebrew
- Windows: Chocolatey
- I guess there are others, but this is what previous students have recommended
- Either: Anaconda is popular among Python folks, and works as a package manager for multiple OSes
- But sometimes packages aren’t up to date
- Ex: Most recent version of R appears to be 3.6.0, released April 2019
Once your package manager is ready to go, install git and pandoc. (You might consider Quarto, which includes pandoc and possibly also LaTeX.)
Mac, Homebrew:
brew install git brew install pandoc
Windows, Chocolatey:
- Needs to be in elevated/administrator mode
- If prompted, Yes to running additional scripts
choco install git choco install pandoc
LaTeX
We don’t use it directly, but it turns Rmarkdown into PDFs
If you have no idea what that meant, don’t worry about it for now
I recommend the TinyTeX distribution, which you can install via R:
install.packages('tinytex') tinytex::install_tinytex()
GitHub and GitKraken
- You’ll need
- a GitHub account,
- the GitHub Student Developer Pack, and
- GitKraken
- After you have access to the Student Developer Pack, activate GitHub integration in GitKraken. This should enable all of GitKraken’s Pro features.
- GitHub is owned by Microsoft
- And there are concerns about
- But it offers some key features that I want to use in this course
- You can do everything this course requires in private repositories, so long as you give access to the people who need access