120 changes: 50 additions & 70 deletions episodes/05-reproducible-reports.Rmd
@@ -5,23 +5,19 @@ exercises: 15
source: Rmd
---

::::::::::::::::::::::::::::::::::::::: objectives

- Describe the value of reproducible reporting.
- Create a new Quarto document (`.qmd`) in RStudio.
- Use Markdown syntax to format text.
- Create and run code chunks within a Quarto document.
- Render a Quarto document to an HTML report.

::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::: questions

- How can I combine my code, results, and narrative into a single document?
- How can I automatically update my reports when my data changes?
- What is Quarto and how does it differ from a standard R script?

::::::::::::::::::::::::::::::::::::::::::::::::::
::: objectives
- Describe the value of reproducible reporting.
- Create a new Quarto document (`.qmd`) in RStudio.
- Use Markdown syntax to format text.
- Create and run code chunks within a Quarto document.
- Render a Quarto document to an HTML report.
:::

::: questions
- How can I combine my code, results, and narrative into a single document?
- How can I automatically update my reports when my data changes?
- What is Quarto and how does it differ from a standard R script?
:::

```{r, include=FALSE}
source("files/download_data.R")
@@ -73,7 +69,7 @@ books2 <- read_csv("data/books.csv") %>%

## Introduction to Reproducible Reporting

So far, we have been writing code in `.R` scripts. This is excellent for data analysis, but what happens when you need to share your findings with a colleague or a library director? You might copy a plot into a Word document or an email, then type out your interpretation.
So far, we have been writing code in `.R` scripts. This is excellent for data analysis, but what happens when you need to share your findings with a colleague or a library director? You might copy a plot into a Word document or an email, then type out your interpretation.

But what if the data changes next month? You would have to re-run your script, re-save the plot, copy it back into Word, and update your text. This manual process is prone to errors and tedious.

@@ -83,21 +79,19 @@ But what if the data changes next month? You would have to re-run your script, r

To create a new Quarto document in RStudio:

1. Click the **File** menu.
2. Select **New File** > **Quarto Document...**
3. In the dialog box, give your document a **Title** (e.g., "Library Usage Report") and enter your name as **Author**.
4. Ensure **HTML** is selected as the output format.
5. Click **Create**.
1. Click the **File** menu.
2. Select **New File** \> **Quarto Document...**
3. In the dialog box, give your document a **Title** (e.g., "Library Usage Report") and enter your name as **Author**.
4. Ensure **HTML** is selected as the output format.
5. Click **Create**.

RStudio will open a new file with some example content. Notice the file extension is `.qmd`.

::::::::::::::::::::::::::::::::::::::::: callout

::: callout
## Quarto vs. RMarkdown

If you have used R before, you might be familiar with RMarkdown (`.Rmd`). Quarto (`.qmd`) is the next-generation version of RMarkdown. It works very similarly but supports more languages (like Python and Julia) and has better features for scientific publishing.

:::::::::::::::::::::::::::::::::::::::::
:::

## Anatomy of a Quarto Document

@@ -107,7 +101,7 @@ A Quarto document has three main parts:

At the very top, enclosed between two lines of `---`, is the **YAML Header**. This contains metadata about the document.

```yaml
``` yaml
---
title: "Library Usage Report"
author: "Your Name"
@@ -117,24 +111,22 @@ format: html

### 2. Markdown Text

The white space is where you write your narrative. You use **Markdown** syntax to format text.
The white space is where you write your narrative. You use **Markdown** syntax to format text.

- `**Bold**` for **bold text**
- `*Italics*` for *italics*
- `# Heading 1` for a main title
- `## Heading 2` for a section title
- `- List item` for bullet points
- `**Bold**` for **bold text**
- `*Italics*` for *italics*
- `# Heading 1` for a main title
- `## Heading 2` for a section title
- `- List item` for bullet points

### 3. Code Chunks

Code chunks are where your R code lives. They start with ` ```{r} ` and end with ` ``` `.
Code chunks are where your R code lives. They start with ```` ```{r} ```` and end with ```` ``` ````.

````
```{r}
# This is a code chunk
summary(cars)
```
````

You can insert a new chunk by clicking the **+C** button in the editor toolbar, or by pressing <kbd>Ctrl</kbd>+<kbd>Alt</kbd>+<kbd>I</kbd> (Windows/Linux) or <kbd>Cmd</kbd>+<kbd>Option</kbd>+<kbd>I</kbd> (Mac).

@@ -145,7 +137,7 @@ Let's clean up the example file and create a report using our `books` data.
1. Delete everything in the file *below* the YAML header.
2. Add a new **setup** code chunk to load our libraries and prepare the data.

```{{r}}
```{r}
#| label: setup
#| include: false

@@ -167,29 +159,24 @@ books2 <- read_csv("data/books.csv") %>%
)
```

::::::::::::::::::::::::::::::::::::::::: callout

::: callout
## Chunk Options

Notice the lines starting with `#|`. These are **chunk options**.
- `#| label: setup` gives the chunk a name.
- `#| include: false` runs the code but hides the code and output from the final report. This is great for loading data silently.

:::::::::::::::::::::::::::::::::::::::::
Notice the lines starting with `#|`. These are **chunk options**.

-   `#| label: setup` gives the chunk a name.
-   `#| include: false` runs the code but hides the code and output from the final report. This is great for loading data silently.
:::

### Adding Analysis

Now, let's add a section header and some text.

```markdown
``` markdown
## High Usage Items

We are analyzing items with more than 10 checkouts to understand circulation patterns across sub-collections.
```

Next, insert a new code chunk and paste the plotting code we developed in the previous episode (ggplot2).

````
```{r}
#| label: plot-high-usage
#| echo: false
@@ -211,7 +198,6 @@ ggplot(data = booksHighUsage,
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
```
````

Setting `#| echo: false` will display the *plot* in the report, but hide the R *code* that generated it. This is often preferred for reports intended for non-coders.
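
The difference between the common chunk options can be sketched as follows. This is a minimal illustration, not part of the lesson data: the `checkouts` vector is a made-up example, and in a `.qmd` file these lines would sit inside a `{r}` code chunk.

```r
#| label: checkout-summary
#| echo: true
#| warning: false

# How the options in this lesson compare:
#   echo: false     -> hide the code, keep its output (plots, tables)
#   include: false  -> run the code, hide both code and output
#   warning: false  -> keep warnings out of the rendered report

# Hypothetical checkout counts, used only for illustration
checkouts <- c(12, 15, 30, 11)
summary(checkouts)
```

With `echo: true`, the rendered report shows both the code and the summary. Switching it to `false` leaves only the summary.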

@@ -220,67 +206,61 @@ Setting `#| echo: false` will display the *plot* in the report, but hide the R *
Now comes the magic. Click the **Render** button (blue arrow icon) at the top of the editor pane.

RStudio will:

1. Run all your code chunks from scratch.
2. Generate the plots and results.
3. Combine them with your text.
4. Create a new HTML file in your project folder, named after your `.qmd` file (for example, `library_usage_report.html`).
5. Open a preview of the report.

::::::::::::::::::::::::::::::::::::::: challenge

:::: challenge
## Challenge: Add a Summary Table

1. Add a new header `## Summary Statistics` to your Quarto document.
2. Insert a new code chunk.
3. Write code to calculate the mean checkouts per format (Hint: use `group_by(format)` and `summarize()`).
4. Render the document again to see your new table included in the report.

::::::::::::::: solution

::: solution
## Solution

Add this to your document:

```markdown
``` markdown
## Summary Statistics

The table below shows the average checkouts for each item format.
```

````
```{r}
```{{r}}
#| label: summary-table

books2 %>%
group_by(format) %>%
summarize(mean_checkouts = mean(tot_chkout, na.rm = TRUE)) %>%
arrange(desc(mean_checkouts))
```
````

Render the document to see the updated report.

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::::
:::
::::

## Why This Matters

By using Quarto, your report is now **reproducible**.
By using Quarto, your report is now **reproducible**.

If you download a new version of `books.csv` next month:

1. Save it to your `data/` folder.
2. Open your Quarto document.
3. Click **Render**.

Your report will automatically update with the new data, creating a fresh plot and table without you having to copy-paste a single thing.
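
The same re-render can also be started from the R console rather than the **Render** button. The sketch below assumes the `quarto` R package and the Quarto command-line tool are installed. The helper `render_if_present()` and the file name `library_usage_report.qmd` are hypothetical, so substitute the name you saved your document under.

```r
# Render a Quarto document from R, but only if the file exists.
# Assumption: the quarto R package and the Quarto CLI are installed.
render_if_present <- function(path) {
  if (!file.exists(path)) {
    message("Report not found: ", path)
    return(invisible(FALSE))
  }
  # quarto_render() runs every chunk and writes the output file
  quarto::quarto_render(input = path)
  invisible(TRUE)
}

# Hypothetical file name; use the name of your own .qmd file.
render_if_present("library_usage_report.qmd")
```

Wrapping the call in a small function makes it easy to reuse in a script that refreshes the report each time new data arrives.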

:::::::::::::::::::::::::::::::::::::::: keypoints

- **Quarto** allows you to mix code and text to create reproducible reports.
- Use the **YAML header** to configure document metadata like title and output format.
- **Code chunks** run R code and can display or hide input/output using options like `#| echo: false`.
- **Rendering** the document executes the code and produces the final output (HTML, PDF, etc.).
- This workflow saves time and reduces errors when reporting on data that changes over time.

::::::::::::::::::::::::::::::::::::::::::::::::::
::: keypoints
- **Quarto** allows you to mix code and text to create reproducible reports.
- Use the **YAML header** to configure document metadata like title and output format.
- **Code chunks** run R code and can display or hide input/output using options like `#| echo: false`.
- **Rendering** the document executes the code and produces the final output (HTML, PDF, etc.).
- This workflow saves time and reduces errors when reporting on data that changes over time.
:::
1 change: 1 addition & 0 deletions lc-r.Rproj
@@ -1,4 +1,5 @@
Version: 1.0
ProjectId: 0c1aa31e-0b52-4784-994c-ceebf424a714

RestoreWorkspace: Default
SaveWorkspace: Default