Intro to Data Science

Lab 2 – Conditionals & Version Control

A Guide to Your Process

Scheduling

Learning Objectives

Practice

Supporting Information

Class Discussion

Today’s Plan

  • Muddiest Point Review
  • Conditionals
  • Version Control Background
  • Navigating GitHub

Today’s Learning Objectives

After today’s session you will be able to:

  • Write conditional statements
  • Manage missing data in objects with conditionals
  • Explain the difference(s) between Git and GitHub
  • Define fundamental version control vocabulary

Muddiest Point Review

  • Recurring topics from most recent MPs


  • What other topic(s) would you like to review?

Conditionals

  • You can write code that runs only if an ‘if statement’ is true
    • Otherwise that chunk of code is skipped!


  • This allows you to write flexible code that can handle any outcome that you can anticipate!
    • Particularly useful for subsetting data based on the contents of a column


  • These ‘if statements’ are called conditionals


  • The answer to a conditional must be either TRUE or FALSE
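
As a minimal sketch of the idea above, the snippet below places a conditional inside R's `if` statement so that a chunk of code runs only when the conditional is TRUE. The object names (`temperature`, `advice`) and the cutoff of 40 are made up for illustration.

```r
# Hypothetical value to test (made up for this example)
temperature <- 30

# Starting value, kept if the conditional turns out to be FALSE
advice <- "No jacket needed"

# The code inside the braces runs only if the conditional is TRUE
if (temperature < 40) {
  advice <- "Bring a jacket"
}

# Check the result
advice
# [1] "Bring a jacket"
```

If `temperature` were 50 instead, the chunk inside the braces would be skipped and `advice` would keep its starting value.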

Fundamentals: EQUAL

  • Are two things exactly equal?


# Check a conditional
"hello" == "hello"
[1] TRUE


  • Uses == operator
    • Just two equal signs

Fundamentals: OR

  • Are any of these conditions met?


# Check either one conditional *or* the other
"hello" == "hello" | 2 == 7
[1] TRUE


  • Uses | operator
    • Shift + \ on keyboard

Fundamentals: AND

  • Are all of the conditions met?


# Are *all* conditions TRUE?
"hello" == "hello" & 2 == 7
[1] FALSE


  • Uses & operator
    • Shift + 7 on keyboard

Fundamentals: Summary

EQUAL

  • Are two things exactly equal?


"hello" == "hello"
[1] TRUE


  • Uses == operator
    • Just two equal signs

OR

  • Are any of these conditions met?


"hello" == "hello" | 2 == 7
[1] TRUE


  • Uses | operator
    • Shift + \ on keyboard

AND

  • Are all of the conditions met?


"hello" == "hello" & 2 == 7
[1] FALSE


  • Uses & operator
    • Shift + 7 on keyboard
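
The three operators above can also be mixed in one conditional. The sketch below, using made-up objects (`fruit`, `count`), shows that R checks `&` before `|` unless parentheses say otherwise, so parentheses are a reasonable habit whenever the two are combined.

```r
# Hypothetical values (made up for this example)
fruit <- "apple"
count <- 3

# Without parentheses, R evaluates & before |
no_parens <- fruit == "apple" | fruit == "pear" & count == 10
no_parens
# [1] TRUE

# Parentheses force the OR to be checked first
with_parens <- (fruit == "apple" | fruit == "pear") & count == 10
with_parens
# [1] FALSE
```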

Practice: Fundamental Conditionals

  1. Write a conditional that tests whether “apple” is equal to “orange”


  2. Write a conditional that uses the & operator and returns TRUE


  3. Write a conditional that uses the | operator and returns FALSE

Conditionals & Vectors

  • We can also use conditionals on vectors


  • When we do this, we get one TRUE or FALSE per element in the vector


  • Let’s explore that with a demonstration:
# Make a vector
my_vec <- c(1, 2, 1, 1, 3, 0, 2)

# Find all elements that are equal to 1
my_vec == 1
[1]  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE
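
Building on the demonstration above, here is a short sketch of what the TRUE / FALSE results can be used for: counting matches, keeping matching elements, and finding their positions. The `is_one` object name is an assumption added for this example.

```r
# Same vector as in the demonstration above
my_vec <- c(1, 2, 1, 1, 3, 0, 2)

# Store the TRUE / FALSE results
is_one <- my_vec == 1

# Count how many elements met the condition (each TRUE counts as 1)
sum(is_one)
# [1] 3

# Keep only the elements that met the condition
my_vec[is_one]
# [1] 1 1 1

# Find the positions of those elements
which(is_one)
# [1] 1 3 4
```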

Conditionals & Subsetting

palmerpenguins R package hex logo

  • Often we want to use conditional statements to “subset” data
    • I.e., keep only rows that meet the condition


  • Subsetting is supported because each column in a data frame is a vector


  • Let’s look at the first column in an example dataset from the palmerpenguins package
# Check out a conditional on one column in the dataset
palmerpenguins::penguins[1] == "Adelie"
       species
  [1,]    TRUE
  [2,]    TRUE
  [3,]    TRUE
  ...
[152,]    TRUE
[153,]   FALSE
  ...
[344,]   FALSE
# (rows 1 to 152 are TRUE; rows 153 to 344 are FALSE)

Subsetting Cont.

  • We can use the base R subset function to keep only the rows of data where a specified column meets a condition


  • Let’s subset the penguin dataset we explored earlier
# Load needed library
library(palmerpenguins)

# Check row number of penguins data
nrow(penguins)
[1] 344


# Subset it to only 2008
peng_sub <- subset(x = penguins, year == 2008)

# Check row number again
nrow(peng_sub)
[1] 114
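
For comparison, the sketch below shows the same kind of subset done two ways: with the subset function and with square brackets. It uses a tiny made-up data frame (`toy_df`, an assumption for this example) rather than the penguins data so that it runs without any packages.

```r
# Hypothetical mini dataset (values made up for this example)
toy_df <- data.frame(
  species = c("Adelie", "Gentoo", "Adelie", "Chinstrap"),
  year = c(2007, 2008, 2008, 2009)
)

# Option 1: the subset function
sub_a <- subset(x = toy_df, year == 2008)

# Option 2: square brackets with the conditional in the row position
sub_b <- toy_df[toy_df$year == 2008, ]

# Both options keep the same two rows
nrow(sub_a)
# [1] 2
nrow(sub_b)
# [1] 2
```

One difference to keep in mind: when the conditional returns NA for a row (e.g., a missing year), subset drops that row, while square brackets return it as a row of NAs.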

Practice: Subsetting

  • We’ll use the base R subset function with the peng_df object
    • If needed, consult the help file for more details (?subset)


  • Subset peng_df to only Adelie penguins (and assign to a sub_v1 object)
    • I.e., species == "Adelie"


  • How many rows does that subset have?

More Practice: Fundamental Conditionals

  • Subset peng_df to Adelie or Gentoo penguins
    • Assign this subset to sub_v2 object


  • Subset peng_df to only male Gentoo penguins
    • Assign to sub_v3 object


  • How many rows does that subset have?

Discussion: Conditionals

  • We’ve covered EQUAL, OR, and AND
    • ==, |, and &


  • What unanswered questions do you have?


  • What other types of conditional statements would be useful?
    • Think about it in the context of wanting to subset some data

Numeric Conditionals

  • For numbers, we can use greater/less than conditionals!


  • Greater / less than are expressed as normal
    • > and <


  • Adding ‘or equal to’ is done by adding an equal sign
    • >= and <=
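
A quick sketch of these operators on a vector is shown below. The `mass_g` values are made up for illustration; the point is the difference between > and >= at the boundary value.

```r
# Hypothetical body masses in grams (made up for this example)
mass_g <- c(3500, 4000, 4250, 5100)

# Strictly greater than 4,000
mass_g > 4000
# [1] FALSE FALSE  TRUE  TRUE

# Greater than *or equal to* 4,000
mass_g >= 4000
# [1] FALSE  TRUE  TRUE  TRUE

# Combine with & to keep only values inside a range
in_range <- mass_g[mass_g >= 4000 & mass_g < 5000]
in_range
# [1] 4000 4250
```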

Practice: Numeric Conditionals

  • Subset peng_df to only penguins with a bill length greater than 40 mm
    • Assign to sub_v7


  • Subset peng_df to only penguins with a body mass less than or equal to 4,000 g
    • Assign to sub_v8

Temperature Check

How are you Feeling?

Comic-style graph depicting someone's emotional state as they debug code (from initial struggle and defeat to eventual triumph)

Version Control Background

  • “Version control” is a set of tools for tracking changes to a code file


  • Version control lets you work in a single file while still preserving its full history
    • No comments or tracked changes to resolve (as in MS Word, etc.)
    • No need to “Save As” and retain many different files for each draft

Version Control Rationale

Why use version control?


Collaboration

Easily share & work together on files

Reproducibility

Fully document your progress and end results

Portfolio

Demonstrate your skills to others with coding know-how

Git Versus GitHub

Git logo GitHub logo

Git

  • Actual version control software


  • Does the tracking of changes


  • Works locally on your computer


  • Not something others can interact with

GitHub

  • Website that hosts Git-tracked projects and provides a graphical user interface (GUI) for Git


  • Allows viewing of Git-tracked changes
    • Not actually doing version control itself


  • Other users can see your content
    • Depending on your settings

Create a GitHub Profile


  • Create a profile


  • Let me know if you run into any issues!

Practice: Create a Repository

  • To practice what we’ve just covered, make a practice repository!


  • This is a test repository so:
    • Set it to “Private”
    • Create the repo with both a README and a .gitignore


  • As you go through this process, take notes for ‘future you’
    • This course will require you to make two more repositories
    • So you’ll want to have a good resource to remind yourself with down the line

Temperature Check

How are you Feeling?

Comic-style graph depicting someone's emotional state as they debug code (from initial struggle and defeat to eventual triumph)

Upcoming Due Dates

Due before lecture

(By midnight)

Due before lab

(By midnight)

  • Muddiest Point #3

Bonus Git Info

Git & RStudio

  • If desired, you can get GitHub to talk directly with RStudio
    • Done through Git!


  • Advantage is a clearer connection between your RStudio work and GitHub

Prep Steps

  1. Install R & RStudio


  2. Create a GitHub Account


  3. Install Git


  4. Connect RStudio and GitHub

Git and RStudio

  • You can work through part of an established GitHub workshop for this section


Bonus Conditionals

OR with >2 Options

  • OR conditionals with many options get cumbersome quickly
    • E.g., x == 1 | x == 2 | x == 3 | x == 4 …


  • We can use concatenation and the %in% operator to simplify this!


  • %in% is effectively “if any of <this vector> matches the value”
    • E.g., x %in% c(1, 2, 3, 4, …)
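
To make the comparison concrete, the sketch below runs both versions on a made-up vector and confirms they give the same answer. The object names (`long_way`, `short_way`) are assumptions for this example.

```r
# Hypothetical vector (made up for this example)
x <- c(1, 5, 2, 9, 4)

# The long way: one == per option, joined with |
long_way <- x == 1 | x == 2 | x == 3 | x == 4

# The short way: %in% checks each element of x against the set of options
short_way <- x %in% c(1, 2, 3, 4)

short_way
# [1]  TRUE FALSE  TRUE FALSE  TRUE

# Both approaches return the same TRUE / FALSE values
identical(long_way, short_way)
# [1] TRUE
```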

Conditionals: NOT

  • You can also exclude based on conditions
    • Two different ways of doing this


  • For one / a few options: use != for “not equal to”
    • E.g., x != 10


  • Can be combined with %in% to exclude a set of options
    • E.g., !x %in% c(...)
    • Note the exclamation mark is before the object
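
Both ways of excluding can be sketched on a small made-up vector, as below. The optional parentheses in the last step are not required but may make the intent easier to read.

```r
# Hypothetical vector (made up for this example)
x <- c(10, 20, 30, 10, 40)

# Exclude a single value with !=
not_ten <- x[x != 10]
not_ten
# [1] 20 30 40

# Exclude a set of values by putting ! in front of an %in% conditional
not_in_set <- x[!x %in% c(10, 20)]
not_in_set
# [1] 30 40

# Parentheses around the %in% conditional give the same result
not_in_set_parens <- x[!(x %in% c(10, 20))]
not_in_set_parens
# [1] 30 40
```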

Practice: Advanced Conditionals

  • Subset peng_df to rows where species is any of “Adelie”, “Gentoo”, or “Chinstrap”
    • Use the %in% operator


  • Subset peng_df to all islands except “Dream”