12 changes: 6 additions & 6 deletions episodes/different_plots.Rmd
@@ -73,7 +73,7 @@ Not to be confused with histograms, barcharts count the number of
observations in different groups. Where the scale in histograms is
continuous, and split into bins, the scale in barcharts is discrete.

Here we map the color-variable to the x-axis in the barchart. `geom_bar` counts
Here we map the `color`-variable to the x-axis in the barchart. `geom_bar` counts
the number of observations itself - we do not need to
provide a count:

@@ -89,17 +89,17 @@ Why are the columns in the barchart above in that order?

One might guess that they are simply in alphabetical order.

Not so! Color is a categorical variable. Diamonds either have the color
"D" (which is the best color), or another color (like "J", which is the worst).
Not so! `color` is a categorical variable. Diamonds either have the colour
"D" (which is the best colour), or another colour (like "J", which is the worst).

There are no "D.E" colors, they do not exist on a continous range.
There are no "D.E" colours, they do not exist on a continuous range.

Such variables are called "factors" in R.
The data in a factor can take one of several values, called levels. And the
order of these levels is what controls the order in the plot.

The order can be either arbitrary, or there can exist an implicit order in the
data, like with the color of the diamonds, where D is the best color, and J is
data, like with the colour of the diamonds, where D is the best colour, and J is
the worst. These types of ordered categorical data are called ordinal data.

They look like this:
@@ -109,7 +109,7 @@ diamonds %>%
str()
```
Note that even though the colour "D" is better than "E", the levels
of the color factor indicates that "D<E".
of the `color` factor indicate that "D<E".
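
As a small sketch of what an ordered factor looks like from the console (assuming only that `ggplot2`, which ships the `diamonds` data, is installed), the levels and their order can be inspected directly:

```r
library(ggplot2)  # provides the diamonds dataset

# The levels, in the order R stores them
levels(diamonds$color)

# TRUE when the factor is ordinal (an ordered factor)
is.ordered(diamonds$color)

# For ordered factors, min() returns the first level in that order
min(diamonds$color)
```

The order reported here is the order of the levels, not a statement about which colour is considered better.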

All this just to say: We can control the order of columns in the plot, by
controlling the order of the levels of the categorical value we are plotting:
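
One way this could look is sketched here with base R's `factor()` and a hypothetical copy of the data called `diamonds_rev`. This is an illustration under those assumptions; the lesson itself may use a different approach:

```r
library(ggplot2)

# Work on a copy so the original diamonds data is left untouched
diamonds_rev <- diamonds

# Rebuild the factor with the levels in reverse order ("J" first, "D" last)
diamonds_rev$color <- factor(
  diamonds_rev$color,
  levels = rev(levels(diamonds$color))
)

# The columns now follow the new level order
ggplot(data = diamonds_rev, mapping = aes(x = color)) +
  geom_bar()
```

Only the order of the levels changes; the counts behind each column stay the same.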
16 changes: 8 additions & 8 deletions episodes/facets.Rmd
@@ -25,11 +25,11 @@ library(tidyverse)

If we only make one plot we quickly run into the problem of trying to plot
too much information in the plot. Here we plot the price against carat,
color by the color of the diamonds. And represent their clarity by the
colour by the `color` of the diamonds. And represent their clarity by the
shape of the points:

```{r too_much_in_plot}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color, shape = clarity)) +
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color, shape = clarity)) +
geom_point()
```

@@ -43,7 +43,7 @@ along with all the other information, we make one plot for each value of
clarity. This is called faceting:

```{r first_facet}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color)) +
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point() +
facet_wrap(~clarity)
```
@@ -76,14 +76,14 @@ through them one category at a time, and make comparisons.
## Exercise

Plot price as a function of depth (price on the y-axis, depth on the x-axis),
and facet by cut. If you want a colorful plot, color the points
by color.
and facet by cut. If you want a colourful plot, colour the points
by `color`.

:::: solution
## Solution

```{r solution, eval = FALSE}
ggplot(data = diamonds, mapping = aes(x = depth, y = price, color = color)) +
ggplot(data = diamonds, mapping = aes(x = depth, y = price, colour = color)) +
geom_point() +
facet_wrap(~cut)

@@ -103,11 +103,11 @@ We can expand on the "small multiple" concept, by plotting the
facets in a grid, defined by two categorical values.

In this plot we plot price as a function of carat, and
make individual plots for each combination of clarity and color:
make individual plots for each combination of `clarity` and `color`:

```{r facet_grid}
diamonds %>%
ggplot(aes(x = carat, y = price, color = color)) +
ggplot(aes(x = carat, y = price, colour = color)) +
geom_point() +
facet_grid(clarity ~ color)
```
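
The same kind of grid can also be described with the `rows` and `cols` arguments of `facet_grid()` together with `vars()`, which some readers may find easier to follow than the formula. This is a sketch for comparison, not part of the lesson; the name `p_grid` is made up for illustration:

```r
library(ggplot2)

# Rows are defined by clarity, columns by the color variable
p_grid <- ggplot(diamonds, aes(x = carat, y = price, colour = color)) +
  geom_point() +
  facet_grid(rows = vars(clarity), cols = vars(color))

p_grid
```

In the formula `clarity ~ color`, the variable to the left of `~` defines the rows and the one to the right defines the columns.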
22 changes: 13 additions & 9 deletions episodes/further_mapping.Rmd
@@ -25,15 +25,16 @@ knitr::opts_chunk$set(echo = TRUE)
```

We saw how to map data to a position in a scatterplot. But we are able to map
the data to other elements of a plot, eg the color of the points.
the data to other elements of a plot, e.g. the colour of the points.


```{r more_mapping}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point()
```

The argument to which we are mapping the values in the column *color* is also called *colour*, making the code look a bit weird.
The argument to which we are mapping the values in the column *color* is also
called *colour*, making the code look a bit weird.

Are these colours suitable? Probably not. The authors of this course
material are not able to distinguish all of the colours. We will return to how
@@ -43,20 +44,23 @@ to change colours in plots later in this course.
:::: callout
## Spelling

Color, and some other words can be spelled in more than one way.
For arguments ggplot understands both the correct english spelling
Colour, and some other words can be spelled in more than one way.
For arguments, ggplot understands both the correct English spelling
*colour* and the American spelling *color*.

Note that this only applies to the arguments in the functions. If the
column in the dataset is called *color* ggplot will not find it if
column in the dataset is called *color*, ggplot will not find it if
you write *colour* instead.

In an attempt to reduce confusion, we use *colour* for the arguments and
*color* when we refer to the variable `color`.

::::
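
The difference between argument names and column names can be checked directly. A minimal sketch, assuming only that `ggplot2` is loaded:

```r
library(ggplot2)

# Both spellings of the argument end up as the same aesthetic
names(aes(colour = color))
names(aes(color = color))

# The column name has no alternative spelling
"color" %in% names(diamonds)
"colour" %in% names(diamonds)
```

The first two calls should report the same aesthetic name, while only the `color` column is present in the dataset.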

Not surprisingly, the "best" color, D have higher prices than the "worst"
color, "J".
Not surprisingly, diamonds with the "best" colour, "D", have higher prices than
those with the "worst" colour, "J".

A common mistake is to place the color argument a wrong place:
A common mistake is to put the colour argument in the wrong place:
```{r chunk2}
ggplot(data = diamonds, mapping = aes(x = carat, y = price), colour = color) +
geom_point()
@@ -106,7 +110,7 @@ values directly. One very useful aesthetic to play with, at least when
we have as many datapoints as we have here, is `alpha`:

```{r}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color)) +
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point(alpha = 0.1)
```
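
To get a feel for `alpha`, it can help to compare a few values. In this sketch (the object names are made up for illustration) `alpha` is set to a fixed value outside `aes()`, so it applies to every point rather than being mapped from the data:

```r
library(ggplot2)

base_plot <- ggplot(diamonds, aes(x = carat, y = price, colour = color))

# Nearly opaque points: dense regions merge into solid areas
p_opaque <- base_plot + geom_point(alpha = 0.8)

# Very transparent points: only dense regions remain clearly visible
p_faint <- base_plot + geom_point(alpha = 0.02)

p_opaque
p_faint
```

Which value works best depends on how many points overlap, so some trial and error is usually needed.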

2 changes: 1 addition & 1 deletion episodes/getting_started.Rmd
@@ -57,7 +57,7 @@ We are going to cover each element in the following.
## What is the difference?

ggplot2 is the library, containing different types of
functions for plotting, theming the plots, changing colors and lots of other stuff.
functions for plotting, theming the plots, changing colours and lots of other stuff.

ggplot is one of these functions in ggplot2, and the one that begins every plot we make.

15 changes: 12 additions & 3 deletions episodes/introduction.Rmd
@@ -11,11 +11,20 @@ exercises: 5


:::: objectives
- "Get to know the importance of visualisations"
- "Get to know the data we are going to work with"
- "Get to know the importance of visualisations"
- "Get to know the data we are going to work with"

::::

::::instructor
We have a slight problem with the column `color` in the dataset, which
can be confusing when we also have a `colour` argument in play.

We try to be consistent in this material, and only use `color` when referring
to the variable, and the correct spelling when referring to the arguments, or
the phenomenon "colour".

::::


```{r setup, include=FALSE}
@@ -63,7 +72,7 @@ There are 10 variables in the dataset:
|----------|-------------|
| carat | Weight of the diamond in carat (0.200 gram) |
| cut | Quality of the cut of the diamond (Fair, Good, Very Good, Premium, Ideal) |
| color | Color of the diamond from D (best), to J (worst) |
| color | Colour of the diamond from D (best), to J (worst) |
| clarity | How clear is the diamond. I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best) |
| depth | Total depth percentage = z / mean(x, y) |
| table | Width of the top of the diamond relative to its widest point |
60 changes: 30 additions & 30 deletions episodes/scaling-and-coordinates.Rmd
@@ -7,15 +7,15 @@ exercises: 5
:::: questions
- "How can we adjust the scales in a plot?"
- "How can we zoom-in to specific parts of a plot?"
- "How can we change the colors of the plot?"
- "How can we change the colours of the plot?"
- "How do I make a pie-chart?"
::::

::::objectives
- "Learn to zoom by adjusting scales"
- "Learn how to make log-scale plots"
- "Learn why you should not make a pie-chart"
- "Learn how to control the color-scale"
- "Learn how to control the colour-scale"
::::


@@ -143,7 +143,7 @@ could interchange the x and y values in the mapping argument.
Or we could add a coordinate function that changes the coordinate system:

```{r flipped-coords}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color)) +
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point() +
coord_flip()
```
@@ -213,7 +213,7 @@ can be defined by making a stacked bar-chart, and changing the
coordinate system to polar.

We begin by filtering the data set to only include diamonds
with the color "G", and then make a barchart. We add the
with the `color` "G", and then make a barchart. We add the
argument `position = "stack"` to `geom_bar` to stack the bars
rather than having them side by side. And then we adjust
the coordinate system to be polar (the y-axis specifically),
@@ -223,7 +223,7 @@ beginning at 0:
diamonds %>%
mutate(color = as.character(color)) %>%
filter(color == "G") %>%
ggplot(aes(x= color, fill = cut)) +
ggplot(aes(x = color, fill = cut)) +
geom_bar(position = "stack") +
coord_polar("y", start=0)
```
@@ -264,61 +264,61 @@ Rare exceptions exists. But making pie charts should be done with EXTREME cautio
::::


## Coloring the scale
## Colouring the scale

Looking at the plot below, the authors of this course get pretty frustrated.


```{r scale_color}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color)) +
```{r scale_colour}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point()
```

We are not really able to distinquish the color for "D" and "E". Or for "G" and
"H". Controlling the colors is important not only for aesthetic reasons, but
We are not really able to distinguish the colour for "D" and "E". Or for "G" and
"H". Controlling the colours is important not only for aesthetic reasons, but
also for actually illustrating what the plot is showing.

Here, the color is introduced by mapping the color of the diamonds to the
coloring of the points. This actually is mapping a value to a scale, no different
Here, the colour is introduced by mapping the `color` of the diamonds to the
colouring of the points. This actually is mapping a value to a scale, no different
from the mapping of the price to the y-axis.

In the same way we can adjust the scale of the y-axis as shown above, we are
able to adjust the actual colors in the plot.
able to adjust the actual colours in the plot.

The names of these functions (almost) all begin with `scale_` and then continue
with `color` if we are coloring points, `fill` if we want to control the
fill-color of a solid object in the plot, and finally something that specifies
with `colour` if we are colouring points, `fill` if we want to control the
fill-colour of a solid object in the plot, and finally something that specifies
either the type of data we are plotting, or specific functionality to control
the color.
the colour.
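
The naming pattern can be sketched with a fill scale. Here `scale_fill_brewer` follows the scheme `scale_` + `fill` + `brewer`; the barchart and the choice of the "Blues" palette are assumptions made for illustration, not part of the lesson:

```r
library(ggplot2)

# Bars are solid objects, so the relevant scale is a fill scale
p_fill <- ggplot(diamonds, aes(x = cut, fill = color)) +
  geom_bar() +
  scale_fill_brewer(palette = "Blues")

p_fill
```

Using a `colour` scale here would instead affect the outline of the bars, which is why the aesthetic in the function name has to match the aesthetic in the mapping.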

Below we adjust the color using the special family of functions `brewer`:
`scale_color_brewer`. Nice colors, but even worse:
Below we adjust the colour using the special family of functions `brewer`:
`scale_colour_brewer`. Nice colours, but even worse:

```{r color_brewer}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color)) +
```{r colour_brewer}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point() +
scale_color_brewer() +
scale_colour_brewer() +
theme(panel.background = element_rect(fill = "black"))
```
What we did to change the background will be covered in the next episode.

Finding the optimal colors usually requires a lot of fiddling around. Rather
than using functions to choose the colors, we can chose the manually,
Finding the optimal colours usually requires a lot of fiddling around. Rather
than using functions to choose the colours, we can choose them manually,
like this:

```{r manual_colors}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, color = color)) +
```{r manual_colours}
ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
geom_point() +
scale_color_manual(values=c('#7fc97f','#beaed4','#fdc086','#ffff99','#386cb0','#f0027f','#bf5b17'))
scale_colour_manual(values=c('#7fc97f','#beaed4','#fdc086','#ffff99','#386cb0','#f0027f','#bf5b17'))
```

The codes #7fc97f are "hex-codes", specifying the colors. You can find websites
allowing you to chose a color, and get the code. A good place to get suggestions
for color-pallettes is [Colorbrewer2](https://colorbrewer2.org/).
Codes such as `#7fc97f` are "hex-codes", specifying the colours. You can find websites
allowing you to choose a colour, and get the code. A good place to get suggestions
for colour palettes is [Colorbrewer2](https://colorbrewer2.org/).
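
When choosing colours manually, it can be easier to keep track of which code belongs to which level by naming the values. A sketch, assuming a made-up vector name `diamond_colours` and reusing the hex-codes from the example above:

```r
library(ggplot2)

# One hex-code per level of color; the names tie each code to a level
diamond_colours <- c(
  D = "#7fc97f", E = "#beaed4", F = "#fdc086", G = "#ffff99",
  H = "#386cb0", I = "#f0027f", J = "#bf5b17"
)

ggplot(data = diamonds, mapping = aes(x = carat, y = price, colour = color)) +
  geom_point() +
  scale_colour_manual(values = diamond_colours)
```

With a named vector the colours are matched to the levels by name, so the order in which they are written no longer matters.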


:::: keypoints
- "Pie charts are a bad idea!"
- "Zooming might exclude data if done wrong"
- "Play around to find the colors you like"
- "Play around to find the colours you like"
::::
2 changes: 1 addition & 1 deletion episodes/theming.Rmd
@@ -140,7 +140,7 @@ Most of the `elements` of the plot need to be defined in a special way. If we
want to "theme" a text element, we set the `axis.text` to be an `element_text()`
function with specific arguments to specify *what* we want to do. For the background
of the plot we are changing a rectangular object `element_rect`, and can set the background
color like this:
colour like this:

```{r background}
diamonds %>%
Expand Down