1 Why ggplot2?

# Load library ggplot2
library(ggplot2)

2 Basics of ggplot2

2.1 Data

  • Must be a data.frame
  • Gets pulled into the ggplot() object
head(iris)
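As a minimal sketch of the data.frame requirement, suppose the data start out as a matrix. The matrix m below is hypothetical example data, not part of the tutorial; converting it explicitly with as.data.frame() before plotting is a safe route.

```r
library(ggplot2)

# Hypothetical example data: a numeric matrix with two named columns
m <- cbind(x = 1:10, y = (1:10)^2)

# Convert the matrix to a data frame before handing it to ggplot()
m_df <- as.data.frame(m)
str(m_df)

# The column names of the data frame are what aes() refers to
p_data <- ggplot(m_df, aes(x, y)) + geom_point()
p_data
```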

2.2 Aesthetics

  • How your data are represented visually
    • a.k.a. mapping
  • which data on the x
  • which data on the y
  • but also: color, size, shape, transparency
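The bullets above can be illustrated by contrasting a mapped aesthetic with a fixed one. This is a sketch only; the object names p_mapped and p_fixed are made up for the illustration.

```r
library(ggplot2)

# Mapping: color depends on a variable, so it goes inside aes() and gets a legend
p_mapped <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point()

# Setting: color and transparency are fixed values, so they go outside aes()
p_fixed <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(color = "steelblue", alpha = 0.6)

p_mapped
p_fixed
```

A common slip is writing aes(color = "steelblue"), which treats the string as a data value rather than as a color.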

2.3 Geometry

  • The geometric objects in the plot
  • points, lines, polygons, etc
  • shortcut functions: geom_point(), geom_bar(), geom_line()

2.3.1 Basic Structure

  • Specify the data and variables inside the ggplot function.
  • Anything else that goes in here becomes a global setting.
  • Then add layers: geometric objects, statistical models, and facets.
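To make the "global setting" point concrete, the sketch below puts the color mapping first in ggplot() and then in a single layer. The use of geom_smooth() with a linear model is just one convenient second layer for showing the difference, and the names p_global and p_local are illustrative.

```r
library(ggplot2)

# Global mapping: x, y and color are inherited by every layer,
# so geom_smooth() fits one line per species
p_global <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Layer-level mapping: color applies to the points only,
# so geom_smooth() fits a single line through all the data
p_local <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_global
p_local
```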

2.3.2 An Example:

# One-step form:
# ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point()
# Two-step form: store the base plot, then add a layer to it
myplot <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width))
myplot + geom_point()

2.3.3 Changing the aesthetics of a geom: Increase the size of points

ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point(size = 3)

2.3.4 Changing the aesthetics of a geom: Add some color

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
geom_point(size = 3)

2.3.5 Changing the aesthetics of a geom: Differentiate points by shape

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
geom_point(aes(shape = Species), size = 3)

2.4 Stats

  • Statistical transformations and data summary
  • All geoms have associated default stats, and vice versa
  • e.g. binning for a histogram or fitting a linear model
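One way to see a stat at work is to look at the data after the transformation has run. The sketch below assumes layer_data() is an acceptable way to inspect this; the object names are illustrative.

```r
library(ggplot2)

# geom_histogram() uses stat_bin() by default:
# the raw values are binned and counted before anything is drawn
p_hist <- ggplot(iris, aes(Sepal.Length)) +
  geom_histogram(binwidth = 0.5)

# layer_data() returns the layer's data after the stat has been applied
binned <- layer_data(p_hist)
head(binned[, c("x", "count", "xmin", "xmax")])

# stat_summary() computes a summary per group, here the mean per species
p_mean <- ggplot(iris, aes(Species, Sepal.Length)) +
  stat_summary(fun = mean, geom = "point", size = 3)
p_mean
```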

2.4.1 Example:

# Box-plot illustrating birth weight by race
library(MASS) # for loading birthwt data 
ggplot(birthwt, aes(factor(race), bwt)) + geom_boxplot()

2.5 Facets

  • Subsetting data to make lattice plots (one panel per subset)
  • Panels share axes by default, which makes groups easy to compare

2.5.1 Faceting: single column, multiple rows

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  facet_grid(Species ~ .)

2.5.2 Faceting: single row, multiple columns

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  facet_grid(. ~ Species)

2.5.3 Or just wrap your facets

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
geom_point() +
facet_wrap( ~ Species) # notice lack of .
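facet_wrap() also takes layout arguments. The sketch below uses ncol and scales, both documented arguments of facet_wrap(); the choice of two columns and free scales is arbitrary and only meant to show the effect.

```r
library(ggplot2)

# Wrap the three species into two columns and let each panel
# choose its own axis ranges instead of sharing them
p_free <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point() +
  facet_wrap(~ Species, ncol = 2, scales = "free")
p_free
```

Free scales make each panel easier to read on its own but harder to compare across panels, so the default shared axes are usually the better starting point.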

2.6 Scales

  • Control the mapping from data to aesthetics
  • Often used for adjusting color mapping

2.6.1 Example

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  facet_grid(Species ~ .) +
  scale_color_manual(values = c("red", "green", "blue"))

2.6.2 Commonly used scales

scale_fill_discrete(); scale_colour_discrete()
scale_fill_hue(); scale_color_hue()
scale_fill_manual(); scale_color_manual()
scale_fill_brewer(); scale_color_brewer()
scale_linetype(); scale_shape_manual()
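A short sketch of a few of these scales in use. The palette name "Dark2", the shape codes and the specific colors are arbitrary choices for illustration; scale_color_brewer() relies on the RColorBrewer package being installed.

```r
library(ggplot2)

# A ColorBrewer palette for color plus hand-picked point shapes
p_scales <- ggplot(iris, aes(Sepal.Length, Sepal.Width,
                             color = Species, shape = Species)) +
  geom_point(size = 3) +
  scale_color_brewer(palette = "Dark2") +
  scale_shape_manual(values = c(16, 17, 15))

# A named vector ties each color to a specific level,
# independent of the order of the factor levels
p_named <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  scale_color_manual(values = c(setosa = "darkorange",
                                versicolor = "purple",
                                virginica = "cyan4"))

p_scales
p_named
```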

3 Histogram

h <- ggplot(faithful, aes(x = waiting))
h + geom_histogram(binwidth = 8, fill = "steelblue",
colour = "black")

4 Line plot

# Read the climate data (adjust the path to where climate.csv is stored on your machine):
climate <- read.csv("C:/Users/Chiranjit Dutta/Dropbox/Chiranjit Dutta/R Tutorial Summer 2022/R Tutorial 2022/Lecture_materials/Data/climate.csv", header = TRUE)
ggplot(climate, aes(Year, Anomaly10y)) + geom_line()

We can also plot confidence regions with geom_ribbon(), using the Unc10y column as the half-width of the band:

ggplot(climate, aes(Year, Anomaly10y)) +
  geom_ribbon(aes(ymin = Anomaly10y - Unc10y, ymax = Anomaly10y + Unc10y),
              fill = "blue", alpha = .1) +
  geom_line(color = "steelblue")

5 Bar Plot

library(tidyr)
# Reshape iris from wide to long format: one row per Species/variable/value
df <- gather(iris, variable, value, -Species)
ggplot(df, aes(Species, value, fill = variable)) +
geom_bar(stat = "identity")

# Bars side by side instead of stacked
ggplot(df, aes(Species, value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge")

# Same plot with a black outline around each bar
ggplot(df, aes(Species, value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge", color = "black")
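As a possible variation, the reshape can be written with pivot_longer(), the newer tidyr interface that supersedes gather(), and the bars can show a per-group mean rather than raw values. This is a sketch under those assumptions; df_long and p_means are illustrative names.

```r
library(ggplot2)
library(tidyr)

# Same wide-to-long reshape as gather(), written with pivot_longer()
df_long <- pivot_longer(iris, cols = -Species,
                        names_to = "variable", values_to = "value")

# stat = "summary" with fun = "mean" draws one bar per Species/variable
# whose height is the mean of the 50 observations in that group
p_means <- ggplot(df_long, aes(Species, value, fill = variable)) +
  geom_bar(stat = "summary", fun = "mean", position = "dodge")
p_means
```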

6 Density Plot

ggplot(faithful, aes(waiting)) + geom_density()

ggplot(faithful, aes(waiting)) + geom_density(fill = "blue", alpha = 0.1)

7 Themes

7.0.1 Adding themes

Themes control the non-data elements of a plot, such as fonts, backgrounds, legends, and axis titles.

+ theme()
# see ?theme() for more options
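Besides individual theme() settings, ggplot2 ships complete themes such as theme_bw() and theme_minimal(). A minimal sketch, with an arbitrary base font size, of applying a complete theme and then adjusting one element on top of it:

```r
library(ggplot2)

p_base <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point()

# Apply a complete built-in theme first, then override single elements;
# the order matters because a later complete theme would reset the tweak
p_themed <- p_base +
  theme_minimal(base_size = 14) +
  theme(legend.position = "bottom")
p_themed
```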

7.0.2 Example of a themed plot

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 1.2, shape = 16) +
  facet_wrap( ~ Species) +
  theme(legend.key = element_rect(fill = NA),
        legend.position = "bottom",
        strip.background = element_rect(fill = NA),
        axis.title.y = element_text(angle = 0))

8 Plotting multiple time series on a single graph:

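This example uses economics, the US economic time series dataset that ships with ggplot2. It is a data frame with 6 variables; the number of rows depends on the ggplot2 version (574 in recent releases, 478 in older ones).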

head(economics)
ggplot(economics, aes(x=date)) + 
  geom_line(aes(y = psavert), color = "darkred") + 
  geom_line(aes(y = uempmed), color="steelblue", linetype="twodash") + 
  ggtitle("Multiple time series plot on a single graph") # putting the title on the graph
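An alternative sketch: reshaping the two series into long format lets color be mapped to the series name, which produces a legend automatically. The names econ_long, series and value are introduced here for illustration, and pivot_longer() from tidyr is assumed to be available.

```r
library(ggplot2)
library(tidyr)

# Keep the date and the two series, then stack the series into one column
econ_long <- pivot_longer(economics[, c("date", "psavert", "uempmed")],
                          cols = -date,
                          names_to = "series", values_to = "value")

# Mapping color to the series name gives one line per series plus a legend
p_econ <- ggplot(econ_long, aes(date, value, color = series)) +
  geom_line() +
  scale_color_manual(values = c(psavert = "darkred", uempmed = "steelblue")) +
  labs(title = "Multiple time series plot on a single graph",
       x = "Date", y = "Value", color = "Series")
p_econ
```

This form scales better than adding one geom_line() per column when there are more than two or three series.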

9 Saving plots

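# Save the most recent plot; the file type is taken from the extension
ggsave("~/path/to/figure/filename.png")
# Save a specific plot (plot1 stands for any ggplot object you have created)
ggsave("~/path/to/figure/filename.png", plot = plot1)
# Set the size of the output (in inches by default)
ggsave(filename = "/path/to/figure/filename.png", width = 6, height = 4)
# Other output formats
ggsave(filename = "/path/to/figure/filename.eps")
ggsave(filename = "/path/to/figure/filename.jpg")
ggsave(filename = "/path/to/figure/filename.pdf")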
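A self-contained sketch of saving a plot. It writes to the session's temporary directory so that nothing lands in your project; the file name density_example.pdf and the object names are made up for the example.

```r
library(ggplot2)

p_save <- ggplot(faithful, aes(waiting)) + geom_density()

# Build a path inside the temporary directory; replace with a real path as needed
out_file <- file.path(tempdir(), "density_example.pdf")

# Name the arguments explicitly so the plot and the file cannot be mixed up
ggsave(filename = out_file, plot = p_save, width = 6, height = 4, units = "in")

# Check that the file was written
file.exists(out_file)
```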