dplyr
After today’s session you will be able to:
dplyr
RMarkdown (Rmd) files have three sections: (1) YAML, (2) plain text, and (3) code chunks
# = headings
_text_ = italics
**text** = bold
[text](link) = hyperlinked text
Other format options here: markdownguide.org/basic-syntax
Let’s look at the structure of an example code chunk
Note that chunk start must be formatted like:
- ```{language chunk_name, option_1, option_2, ...}
Let’s check out three crucial code chunk options!
Install the rmarkdown package with the install.packages function
Look at (1) YAML, (2) Plain text, and (3) code chunks
Import data with read.csv or readxl::read_excel
Use read.csv to read “minnow.csv” into R
Check the structure with str or dplyr::glimpse
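A minimal sketch of the read-then-inspect step, using only base R. The contents of “minnow.csv” are not shown in these slides, so the block writes a small made-up file to a temporary folder first; the column names (site, species, length_mm) and values are assumptions for illustration only.

```r
# Made-up stand-in for "minnow.csv"; the real file's columns are not shown here
csv_path <- file.path(tempdir(), "minnow.csv")
writeLines(
  c("site,species,length_mm",
    "A,fathead,52",
    "A,fathead,48",
    "B,bluntnose,61"),
  csv_path
)

# Read the CSV into a data frame
minnow <- read.csv(file = csv_path)

# Check the structure: column names, column types, and the first few values
str(minnow)
```

With the real file, only the read.csv and str lines are needed, with the path pointing to wherever “minnow.csv” is saved.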
Two ways in base R to access data:
data[row number, column number]
# Get first column
my_df[,1]
# Get first row
my_df[1,]
# Get the value in the tenth row and fourth column
my_df[10, 4]
my_df[c(1, 2, 3), 1] would get rows 1 through 3 of column 1
data$column
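Both access styles can be tried on a small data frame. The sketch below builds a made-up my_df (its columns id, species, and count are assumptions, not the workshop data) and uses bracket and dollar-sign notation on it.

```r
# Small made-up data frame, used only to illustrate indexing
my_df <- data.frame(
  id = 1:4,
  species = c("a", "b", "c", "d"),
  count = c(10, 20, 30, 40)
)

# Bracket notation: data[row number, column number]
first_col <- my_df[, 1]             # first column, returned as a vector
first_row <- my_df[1, ]             # first row, returned as a one-row data frame
one_value <- my_df[2, 3]            # value in the second row and third column
top_three <- my_df[c(1, 2, 3), 1]   # rows 1 through 3 of column 1

# Dollar-sign notation: data$column
counts <- my_df$count
```

Leaving the row or column position empty (as in my_df[, 1]) means "all rows" or "all columns".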
dplyr Part 1: filter
Conditions use ==, |, and &
filter
dplyr::filter == subset
# Subset to only butterfly milkweed records
milkweed <- filter(.data = flowers, species == "Asclepias tuberosa")
We use filter instead of subset just to live fully in the Tidyverse
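To see ==, |, and & in action, here is a hedged sketch on a made-up flowers data frame (only the species column appears in the slides; the count column and all values are assumptions). It also runs the base R subset call next to filter for comparison.

```r
library(dplyr)

# Made-up stand-in for the flowers data
flowers <- data.frame(
  species = c("Asclepias tuberosa", "Asclepias syriaca",
              "Asclepias tuberosa", "Echinacea purpurea"),
  count = c(3, 12, 8, 5)
)

# == keeps rows where the value matches exactly
milkweed <- filter(.data = flowers, species == "Asclepias tuberosa")

# | means "or": keep rows that meet either condition
either <- filter(.data = flowers,
                 species == "Asclepias tuberosa" | species == "Echinacea purpurea")

# & means "and": keep rows that meet both conditions
both <- filter(.data = flowers,
               species == "Asclepias tuberosa" & count > 5)

# Base R version of the first call, for comparison
milkweed_base <- subset(flowers, species == "Asclepias tuberosa")
```

On this toy data, milkweed and milkweed_base contain the same two records.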
dplyr Part 2: mutate
mutate
-
) in column names
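As an illustration of mutate, the sketch below adds a new column computed from an existing one. The input data frame df_v1 and its weight_g column are hypothetical; only df_v2 and weight_kg appear elsewhere in these slides, and how they were actually created is an assumption here.

```r
library(dplyr)

# Made-up input; df_v1 and its column names are assumptions
df_v1 <- data.frame(
  species = c("a", "b", "c"),
  count = c(4, 7, 2),
  weight_g = c(1500, 250, 3200)
)

# Add a new column computed from an existing one
df_v2 <- mutate(.data = df_v1, weight_kg = weight_g / 1000)
```

The existing columns are kept, and the new column is added at the end.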
dplyr Part 3: select
select
# Keep only species information and count columns
df_v3a <- select(.data = df_v2, species, count)
# Remove the weight column
df_v3b <- select(.data = df_v2, -weight_kg)
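To run the two select calls above end to end, a small made-up df_v2 is enough. The values below are assumptions; the column names follow the ones used in the calls.

```r
library(dplyr)

# Made-up stand-in for df_v2
df_v2 <- data.frame(
  species = c("a", "b", "c"),
  count = c(4, 7, 2),
  weight_kg = c(1.5, 0.25, 3.2)
)

# Keep only the named columns
df_v3a <- select(.data = df_v2, species, count)

# Drop one column and keep everything else
df_v3b <- select(.data = df_v2, -weight_kg)
```

Because this toy df_v2 has only three columns, both calls return the same two columns. On a wider data frame they would differ: the first keeps exactly the columns listed, the second keeps everything except weight_kg.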
dplyr