Today

By the end of today you will…

Getting started

Download this application exercise by pasting the code below into your console (bottom left of the screen):

download.file("https://sta101.github.io/static/appex/ae1.Rmd",
              destfile = "ae1.rmd")

R as a calculator

5 * 5 + 10
x = 3
x + x^2
x = 1:10
x * 7

In the last couple of examples, we saved a value as the object “x”. The second assignment overwrites the first, so x now holds the numbers 1 through 10.

We can “print” x to the screen by typing the name of the object (“x”) in the console or in a code chunk.
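As a small sketch of the idea (the object name y is made up for this illustration), here is how saving and printing fit together:

```r
x = 1:10     # save the integers 1 through 10 as x
x            # typing the name prints the object
print(x)     # print() does the same thing explicitly

y = x * 7    # arithmetic is applied to every element of x
y
```

Note that the line that saves y does not print anything. Nothing appears on screen until you type the object's name.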

Tour of RStudio

Load a package

library(tidyverse) 

Load data

roster = read_csv("https://sta101.github.io/static/appex/data/sample-roster.csv")
survey = read_csv("https://sta101.github.io/static/appex/data/sample-survey.csv")

Question: What objects store the data in the code chunk above? Can you print them to the screen?

Create a new code chunk with CMD+OPTION+I (Mac) or CTRL+ALT+I (Windows/Linux).

So far we’ve already seen two functions: library and read_csv. In R, a function name is followed by parentheses. A function takes one or more inputs, called arguments, and often (but not always) returns an output. To learn more about a function, you can check its documentation with ?, e.g. ?library.
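To make the vocabulary concrete, here is a minimal sketch using two base R functions, round and seq. They are not part of today's exercise and are chosen only for illustration, as are the object names rounded and steps.

```r
# round() takes a number plus an argument called digits
rounded = round(3.14159, digits = 2)
rounded

# Arguments can be given by name, which makes a call easier to read
steps = seq(from = 1, to = 10, by = 3)
steps

# To read the documentation, run this in the console:
# ?round
```

Here 3.14159 and digits = 2 are the arguments, and the rounded number is the output.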

Demos

Let’s glimpse the data frame.

glimpse(survey)
## Rows: 12
## Columns: 5
## $ name                 <chr> "A", "Appa", "Bumi", "Soka", "Katara", "Suki", "Z…
## $ email                <chr> "the-last-Rbender@duke.edu", "yip-yip-appa@duke.e…
## $ bender               <chr> "Airbender", "Airbender", "Earthbender", "None", …
## $ previous_programming <chr> "No", "No", "No", "Somewhat", "Yes", "Yes", "Yes"…
## $ cat_dog              <chr> "dog", "cat", "cat", "dog", "dog", "cat", "cat", …

To look at all of it, we can use view():

view(survey)

View the roster data in the console

Terminology: the “columns” of a data frame are called variables, whereas the “rows” are called observations.
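One way to check these counts in code is sketched below on a made-up data frame called toy (hypothetical values, not the course data):

```r
# A hypothetical three-row data frame, for illustration only
toy = data.frame(
  name = c("A", "B", "C"),
  cat_dog = c("dog", "cat", "dog")
)

ncol(toy)   # number of variables (columns)
nrow(toy)   # number of observations (rows)
dim(toy)    # both at once: rows first, then columns
```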

Question: How many variables are in the data frame survey? How many observations? What about the data frame roster?

Why must I enter my NetID email?

roster %>% 
  left_join(survey, by = "email")
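The join above matches rows across the two tables using the email column, so each person needs an email that appears identically in both. Below is a rough sketch on two made-up tables; toy_roster, toy_survey, the section column, and the addresses are all hypothetical and only loosely mimic the course data.

```r
library(dplyr)

# Hypothetical miniature tables, for illustration only
toy_roster = tibble(
  email = c("a@example.edu", "b@example.edu", "c@example.edu"),
  section = c(1, 2, 1)
)
toy_survey = tibble(
  email = c("a@example.edu", "c@example.edu"),
  cat_dog = c("dog", "cat")
)

# Keep every row of toy_roster and attach survey answers where emails match
joined = toy_roster %>%
  left_join(toy_survey, by = "email")
joined
```

In this sketch, the person with no matching survey email still appears in the result, but their cat_dog value is NA.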

Count the benders in the data

count(survey, bender)
## # A tibble: 5 × 2
##   bender          n
##   <chr>       <int>
## 1 Airbender       3
## 2 Earthbender     3
## 3 Firebender      4
## 4 None            1
## 5 Waterbender     1

Find the proportion of each bender type that prefers dogs

survey %>%
  mutate(pet = ifelse(cat_dog == "dog", 1, 0)) %>%
  group_by(bender) %>%
  summarize(proportion_dog = mean(pet))
## # A tibble: 5 × 2
##   bender      proportion_dog
##   <chr>                <dbl>
## 1 Airbender            0.667
## 2 Earthbender          0.333
## 3 Firebender           0    
## 4 None                 1    
## 5 Waterbender          1
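Why does mean(pet) give a proportion? The pet column is 1 for "dog" and 0 otherwise, so its average within a group is the number of dog answers divided by the group size. Here is a minimal sketch on a made-up table called toy (hypothetical values, not the survey data):

```r
library(dplyr)

# Hypothetical data: three Airbenders (two prefer dogs) and one Firebender
toy = tibble(
  bender = c("Airbender", "Airbender", "Airbender", "Firebender"),
  cat_dog = c("dog", "dog", "cat", "cat")
)

toy_summary = toy %>%
  mutate(pet = ifelse(cat_dog == "dog", 1, 0)) %>%  # 1 = dog, 0 = not dog
  group_by(bender) %>%
  summarize(proportion_dog = mean(pet))             # mean of 0s and 1s
toy_summary

# R treats TRUE as 1 and FALSE as 0, so the mutate step can be skipped
toy_shortcut = toy %>%
  group_by(bender) %>%
  summarize(proportion_dog = mean(cat_dog == "dog"))
toy_shortcut
```

Both versions should give the same answer: two out of three Airbenders prefer dogs, and the lone Firebender does not.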