Intro to Data Science

Lab 6 – Visualization II

A Guide to Your Process

Scheduling

Learning Objectives

Practice

Supporting Information

Class Discussion

Today’s Plan

  • Muddiest Point Review
  • Data Visualization with ggplot2
    • Editing theme elements
  • Multi-Panel Graphs
  • GitHub Presence Check-Ins (1-on-1)
    • Not graded! Don’t stress!

Today’s Learning Objectives

After today’s session you will be able to:

  • Modify background elements in a ggplot2 graph
  • Create publication-quality figures with ggplot2
  • Explain the difference between plot faceting and plot grids

Muddiest Point Review

  • Recurring topics from most recent MPs:


  • What other topic(s) would you like to review?

ggplot2 Review

  • ggplots require: (1) data, (2) aesthetics, (3) geometries
    • Optionally, you can also adjust theme parameters
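
As a quick refresher, the three required pieces can be sketched like this. The built-in mtcars data frame and its wt / mpg columns are stand-ins chosen for illustration (they are not part of the lab data), and basic_plot is just a placeholder name.

```r
library(ggplot2)

# (1) data: the built-in mtcars data frame (a stand-in for your own data)
# (2) aesthetics: which columns go on the X and Y axes
# (3) geometry: points, i.e., a scatterplot
basic_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

# Calling the object by name draws the graph
basic_plot
```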

Theme Internal Structure

  • Theme is composed of elements


  • Elements can be modified as desired inside of theme function


  • Each type of element has a different ‘helper function’ needed to modify that element
    • Change text = use element_text
    • Change line = use element_line
    • Remove an element with element_blank

Theme Syntax

  • You use the theme function once with as many element_... functions as you need


  • Here’s an example of the proper syntax
# Make a simple scatterplot
ggplot(data = my_df, mapping = aes(x = x_var, y = y_var)) +
    geom_point() +
    # Modify its theme to make the axis font size bigger
    theme(axis.text = element_text(size = 20),
          # Also remove the grid lines
          panel.grid = element_blank())


  • Note how the theme and element_... functions are used together

Gridline Theme Components

  • You’ll learn theme argument names as you work more with ggplot2


  • Here are a few broadly relevant ones:
    • Gridlines = panel.grid
    • Plot background = panel.background
    • Axis lines (X & Y) = axis.line
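
Here is a minimal sketch that uses all three of those arguments in one theme call. It assumes the built-in mtcars data as a stand-in, and clean_plot is a hypothetical object name.

```r
library(ggplot2)

clean_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point() +
  # One theme call, several element_... helpers
  theme(panel.grid = element_blank(),              # remove the gridlines
        panel.background = element_blank(),        # remove the gray background
        axis.line = element_line(color = "black")) # draw black X & Y axis lines

clean_plot
```

Note that each argument is paired with the helper that matches its type: blank for things being removed, element_line for the axis lines.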

Get Ready

  1. Create a script for this week


  2. Load ggplot2


  3. Read “minnow.csv” into R and check the structure!


  4. Copy the final graph we made last time


  5. Assign the graph to an object
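
The setup steps might look roughly like the sketch below. Because the columns of “minnow.csv” are not shown here, this example writes the built-in iris data to a temporary CSV as a stand-in; the file path, column names, and object names (tmp_csv, my_df, my_graph) are all assumptions for illustration only.

```r
library(ggplot2)

# Stand-in for "minnow.csv": save a built-in dataset to a temporary CSV file
tmp_csv <- tempfile(fileext = ".csv")
write.csv(iris, file = tmp_csv, row.names = FALSE)

# Read the CSV into R and check its structure
my_df <- read.csv(file = tmp_csv)
str(my_df)

# Assign the graph to an object so more layers / theme tweaks can be added later
my_graph <- ggplot(data = my_df, mapping = aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

# Nothing is drawn until the object is called by name
my_graph
```

With the graph stored in an object, later tweaks can be written as my_graph + theme(...) instead of re-typing the whole plot.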

Remove Gridlines

  • To that graph, add the following code:
    • theme(panel.grid = element_blank())


  • What does this do to your graph?


  • What happens if you add these two lines as well (inside of the theme parentheses!)?
    • panel.background = element_blank()
    • axis.line = element_line(color = "black")

Changing Text Size

  • We can also modify text size inside of theme


  • Axis “title” vs. axis “text”
    • axis.title = axis label text (given to labs function)
    • axis.text = text on axis tick marks


Screen capture of the x-axis of a ggplot2-style graph with the axis label (i.e., the bigger text -- a.k.a. 'title') in a rectangle and the axis tick labels (i.e., the smaller text -- a.k.a. 'text') in a separate rectangle

  • Want to modify just X or Y? Add that to the argument name!
    • E.g., theme(axis.text.x = element_text(...))
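
A short sketch of the title / text distinction, assuming the built-in mtcars data as a stand-in. The font sizes and the 45 degree angle are arbitrary illustration values, and text_plot is a hypothetical object name.

```r
library(ggplot2)

text_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point() +
  # These labels are what axis.title controls
  labs(x = "Weight (1000 lbs)", y = "Miles per Gallon") +
  theme(axis.title = element_text(size = 16),   # both axis labels
        axis.text = element_text(size = 12),    # tick labels on both axes
        # Adding ".x" targets only the X axis tick labels
        axis.text.x = element_text(angle = 45, hjust = 1))

text_plot
```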

Change Text Size

  • Take your most recent graph
    • No gridlines, no background gray square, black axis lines


  • And make the following tweaks:
    • Make the axis title font size 15
    • Make the axis text font size 13


  • What does that leave you with?

Customizing Legend

  • You can also customize the plot legend in the theme function!
    • Legend placement = legend.position
    • Legend title = legend.title


  • legend.position works differently than most other elements!
    • Instead of wanting element_... it wants c(<x position>, <y position>)
    • Positions range from 0 (left / bottom) to 1 (right / top)

Customizing Legend Syntax

  • Check out an example where we put the legend in the middle of the plot
# Example scatterplot (a legend only appears when an aesthetic like color is mapped)
ggplot(data = my_df, mapping = aes(x = x_var, y = y_var, color = group_var)) +
    geom_point() +
    # With legend in the middle of the space
    theme(legend.position = c(0.5, 0.5))


  • That graph would have the legend in both:
    • the center (left / right)
    • the middle (top / bottom)
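
A runnable version of that idea is sketched below, assuming the built-in mtcars data as a stand-in (legend_plot is a hypothetical object name). One caveat: this follows the numeric legend.position syntax used in these slides, but ggplot2 version 3.5.0 and later warn that it is deprecated and prefer legend.position = "inside" together with legend.position.inside = c(x, y), so check which version you have installed.

```r
library(ggplot2)

legend_plot <- ggplot(data = mtcars,
                      mapping = aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  # Place the legend near the top right of the plotting area
  theme(legend.position = c(0.9, 0.8),
        # Drop the legend title entirely
        legend.title = element_blank())

legend_plot
```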

Customize Legend

  • To the graph you created last practice:
    • Remove the legend title
    • Experiment with legend placement until you’re happy


  • You may put the legend wherever you’d like but:
    • It should not overlap any points / boxplots


  • What does that graph look like?

Temperature Check

How are you Feeling?

Comic-style graph depicting someone's emotional state as they debug code (from initial struggle and defeat to eventual triumph)

Multi-Panel Background

  • Sometimes nice to have multiple graphs next to each other
    • Makes direct comparison easier
    • Journals have limits on number of figures but multi-panels still count as 1


  • Two methods (for ggplots):
    1. ggplot2::facet_grid()
    2. cowplot::plot_grid()

Three panels of maps of the continental US where each panel shows conditions under a different climate change future

Faceted Graphs

  • ggplot2 has an internal way of handling this called facets


  • Facets work similarly to geometries
    • You get separate plots for each level of the facet variable


  • Facets must all be the same plot type and, by default, share identical axes
    • Sometimes not an issue but good to keep in mind!

Facet Syntax

  • To facet into 1 row x many columns:
# Example scatterplot
ggplot(data = my_df, mapping = aes(x = x_var, y = y_var)) +
    geom_point() +
    # Facet into columns of some other variable
    facet_grid(. ~ facet_variable)


  • To facet into many rows x 1 column:
# Example scatterplot
ggplot(data = my_df, mapping = aes(x = x_var, y = y_var)) +
    geom_point() +
    # Facet into rows of some other variable
    facet_grid(facet_variable ~ .)
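
The two patterns above can also be combined by putting a variable on each side of the ~ (rows on the left, columns on the right). This is a minimal sketch using the built-in mtcars data as a stand-in, where am and cyl are simply convenient grouping columns and facet_plot is a hypothetical object name.

```r
library(ggplot2)

facet_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point() +
  # Rows = transmission type (am), columns = number of cylinders (cyl)
  facet_grid(am ~ cyl)

facet_plot
```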

Facet Example

# Load the plotting package and the package that provides the penguins data
library(ggplot2)
library(palmerpenguins)

ggplot(data = penguins, aes(x = body_mass_g, y = flipper_length_mm, color = species)) +
  geom_point() +
  # One panel (column) per island
  facet_grid(. ~ island) +
  labs(x = "Body Mass (g)", y = "Flipper Length (mm)") +
  theme(legend.position = c(0.87, 0.85),
        legend.title = element_blank(),
        panel.background = element_blank())

Facet

  • Using the fish data, make a new graph that:
    • Has nest diameter on the X axis
    • Has nest depth on the Y axis
    • Is a scatterplot
    • Faceted by species
    • Plus any additional theme tweaks you want to make!


  • What does your plot look like?

Plot Grids

  • Facets work great when all panels are the same
    • What about when you want different graphs in each panel?


  • cowplot::plot_grid lets you put multiple different graphs together


  • Have to make graphs separately first, then combine them


  • Example syntax:
cowplot::plot_grid(plot1, plot2, ncol = 1, nrow = 2)
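
A fuller sketch of that workflow is below. It assumes the cowplot package is installed and uses the built-in mtcars data as a stand-in; the object names (scatter, box, combined) are placeholders. The labels argument is optional and is included only to show how panels can be lettered.

```r
library(ggplot2)
library(cowplot)

# Step 1: make each graph separately and assign it to an object
scatter <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

box <- ggplot(data = mtcars, mapping = aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

# Step 2: combine them, here stacked in 1 column x 2 rows
combined <- plot_grid(scatter, box, ncol = 1, nrow = 2, labels = "AUTO")

combined
```

Unlike facets, the two panels here are different plot types with different axes, which is the main reason to reach for plot_grid.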

Plot Grid Example

Plot Grids

  • Make two graphs using the fish data:
    1. Copy your faceted graph of diameter vs. depth
      • But remove the facet by species
    2. Make a boxplot with flow on the y-axis and species on the x-axis


  • Using plot_grid, make a multi-panel graph with these two graphs
    • Make them side by side (i.e., 2 columns, 1 row)

Temperature Check

How are you Feeling?

Comic-style graph depicting someone's emotional state as they debug code (from initial struggle and defeat to eventual triumph)

GitHub Presence FAQ

  • Worth 40 pts (16% course grade)


  • Checklist-style rubric on Canvas


  • Due day before last lab


  • Can basically finish all of it now though if you want!

GitHub Presence Assignment

  • This assignment will seriously help in interviews / job apps!
    • Demonstrates your data science skills


  • I don’t want anyone caught unawares by this assignment
    • So I’ll meet with each of you 1-on-1 today to see where you’re at so far


  • Good chance for you to ask any questions you have!
    • Also lets me give you tips for success

Upcoming Due Dates

Due before lab

(By midnight)

  • Homework #6
  • Muddiest Point #7

Due before lecture

(By midnight)

  • Homework #7
  • Submit Draft 2 of Function Tutorials
    • Double check rubric to see that you’re not leaving any points on the table!
    • Remember to also submit the Revision Response

GitHub 1-on-1s

  • Stick around until we do our 1-on-1


  • After you have met with me you can leave
    • Though you’re welcome to stay and work on Homework #6 / course assignments!


  • Any volunteers to go first?