Commit ce7c68f

committed
markdown source builds
Auto-generated via `{sandpaper}`
Source: c085c77
Branch: main
Author: swillerhansen <121032241+swillerhansen@users.noreply.github.com>
Time: 2025-12-02 17:12:12 +0000
Message: Merge pull request #23 from swillerhansen/main
rearranged into more episodes
1 parent 1482d7a commit ce7c68f

5 files changed

+11
-99
lines changed

-4.73 KB
Binary file not shown.

introduction.md

Lines changed: 6 additions & 93 deletions
Original file line numberDiff line numberDiff line change
@@ -18,100 +18,13 @@ exercises: 2
1818
::::::::::::::::::::::::::::::::::::::::::::::::
1919

2020
## Introduction
21+
This is a course in how to use the programming language R to do web scraping. Web scraping is a set of methods to systematically download parts of web pages by writing code in a script instead of copy-pasting from the web page.
2122

22-
This is a lesson created via The Carpentries Workbench. It is written in
23-
[Pandoc-flavored Markdown][pandoc] for static files (with extension `.md`) and
24-
[R Markdown][r-markdown] for dynamic files that can render code into output
25-
(with extension `.Rmd`). Please refer to the [Introduction to The Carpentries
26-
Workbench][carpentries-workbench] for full documentation.
23+
A web page is made up of elements written in the HTML language.
24+
HTML is a hierarchical file format.
25+
Insert example of HTML hierarchy here
2726

28-
What you need to know is that there are three sections required for a valid
29-
Carpentries lesson template:
27+
The advantage of web scraping is that our script can specify exactly which parts of the web page we want to download by referring to those elements' HTML code. This allows for precision in what we download.
3028

31-
1. `questions` are displayed at the beginning of the episode to prime the
32-
learner for the content.
33-
2. `objectives` are the learning objectives for an episode displayed with
34-
the questions.
35-
3. `keypoints` are displayed at the end of the episode to reinforce the
36-
objectives.
37-
38-
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor
39-
40-
Inline instructor notes can help inform instructors of timing challenges
41-
associated with the lessons. They appear in the "Instructor View"
42-
43-
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
44-
45-
::::::::::::::::::::::::::::::::::::: challenge
46-
47-
## Challenge 1: Can you do it?
48-
49-
What is the output of this command?
50-
51-
```r
52-
paste("This", "new", "lesson", "looks", "good")
53-
```
54-
55-
:::::::::::::::::::::::: solution
56-
57-
## Output
58-
59-
```output
60-
[1] "This new lesson looks good"
61-
```
62-
63-
:::::::::::::::::::::::::::::::::
64-
65-
66-
## Challenge 2: how do you nest solutions within challenge blocks?
67-
68-
:::::::::::::::::::::::: solution
69-
70-
You can add a line with at least three colons and a `solution` tag.
71-
72-
:::::::::::::::::::::::::::::::::
73-
::::::::::::::::::::::::::::::::::::::::::::::::
74-
75-
## Figures
76-
77-
You can include figures generated from R Markdown:
78-
79-
80-
``` r
81-
pie(
82-
c(Sky = 78, "Sunny side of pyramid" = 17, "Shady side of pyramid" = 5),
83-
init.angle = 315,
84-
col = c("deepskyblue", "yellow", "yellow3"),
85-
border = FALSE
86-
)
87-
```
88-
89-
<div class="figure" style="text-align: center">
90-
<img src="fig/introduction-rendered-pyramid-1.png" alt="pie chart illusion of a pyramid" />
91-
<p class="caption">Sun arise each and every morning</p>
92-
</div>
93-
Or you can use pandoc markdown for static figures with the following syntax:
94-
95-
`![optional caption that appears below the figure](figure url){alt='alt text for
96-
accessibility purposes'}`
97-
98-
![You belong in The Carpentries!](https://raw.githubusercontent.com/carpentries/logo/master/Badge_Carpentries.svg){alt='Blue Carpentries hex person logo with no text.'}
99-
100-
## Math
101-
102-
One of our episodes contains $\LaTeX$ equations when describing how to create
103-
dynamic reports with {knitr}, so we now use mathjax to describe this:
104-
105-
`$\alpha = \dfrac{1}{(1 - \beta)^2}$` becomes: $\alpha = \dfrac{1}{(1 - \beta)^2}$
106-
107-
Cool, right?
108-
109-
::::::::::::::::::::::::::::::::::::: keypoints
110-
111-
- Use `.md` files for episodes when you want static content
112-
- Use `.Rmd` files for episodes when you need to generate output
113-
- Run `sandpaper::check_lesson()` to identify any issues with your lesson
114-
- Run `sandpaper::build_lesson()` to preview your lesson locally
115-
116-
::::::::::::::::::::::::::::::::::::::::::::::::
29+
Furthermore, as we will do in this course, we can scrape the same element on multiple pages. That is, instead of opening each page separately, marking the parts we want on each one, and copy-pasting them, we can write a script that scrapes the same HTML element on multiple pages. This allows for speed and consistency.
11730
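
The added introduction describes two ideas: HTML is hierarchical, and a script can target one element and scrape it from several pages. A minimal sketch of both in R follows. It assumes the `rvest` package, which this diff does not confirm as the lesson's choice. The two pages are made-up, in-memory documents built with `minimal_html()`, so nothing is downloaded. The element names, the `intro` class, and the `scrape_intro()` helper are hypothetical.

```r
library(rvest)

# Hypothetical page content. The nesting shows the HTML hierarchy:
# a <div> contains one <h1> and two <p> elements.
page_one <- minimal_html('
  <div id="content">
    <h1>Page one</h1>
    <p class="intro">First intro.</p>
    <p>Other text.</p>
  </div>')

page_two <- minimal_html('
  <div id="content">
    <h1>Page two</h1>
    <p class="intro">Second intro.</p>
    <p>Other text.</p>
  </div>')

# With real pages, each entry would instead come from read_html(url).
pages <- list(page_one, page_two)

# Pick out the same element on any page with a single CSS selector.
scrape_intro <- function(page) {
  page |>
    html_element("p.intro") |>  # first <p> with class "intro"
    html_text2()                # its text, with whitespace cleaned up
}

# Apply the same scraping step to every page.
intros <- vapply(pages, scrape_intro, character(1))
intros
```

Because the selector is written once and reused, every page is handled the same way, which is where the speed and consistency mentioned in the introduction come from.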

md5sum.txt

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,12 @@
11
"file" "checksum" "built" "date"
22
"CODE_OF_CONDUCT.md" "c93c83c630db2fe2462240bf72552548" "site/built/CODE_OF_CONDUCT.md" "2025-12-02"
33
"LICENSE.md" "b24ebbb41b14ca25cf6b8216dda83e5f" "site/built/LICENSE.md" "2025-12-02"
4-
"config.yaml" "f9777239524deba0d4e0de629ebdd42d" "site/built/config.yaml" "2025-12-02"
4+
"config.yaml" "dc065c5ad7d706bf0d87ac48544a4e76" "site/built/config.yaml" "2025-12-02"
55
"index.md" "a02c9c785ed98ddd84fe3d34ddb12fcd" "site/built/index.md" "2025-12-02"
66
"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2025-12-02"
7-
"episodes/introduction.Rmd" "2d68fdc4d8fedf307f5a783dccccdd06" "site/built/introduction.md" "2025-12-02"
8-
"episodes/tables.Rmd" "fb87d3366e0ea8ed5c14b9df8555eebc" "site/built/tables.md" "2025-12-02"
9-
"episodes/paragraph.Rmd" "8083247733507a712bfec100b5b77da8" "site/built/paragraph.md" "2025-12-02"
7+
"episodes/introduction.Rmd" "e6e67b8c28338ae5d64877803659fd63" "site/built/introduction.md" "2025-12-02"
8+
"episodes/tables.Rmd" "c0dd7589f5b1813683386595f33ff7c2" "site/built/tables.md" "2025-12-02"
9+
"episodes/paragraph.Rmd" "c5862f89b42dd95bbe30a63c8c4da146" "site/built/paragraph.md" "2025-12-02"
1010
"instructors/instructor-notes.md" "5cf113fd22defb29d17b64597f3c9bc0" "site/built/instructor-notes.md" "2025-12-02"
1111
"learners/reference.md" "527a12e217602daae51c5fd9ef8958df" "site/built/reference.md" "2025-12-02"
1212
"learners/setup.md" "61568b36c8b96363218c9736f6aee03a" "site/built/setup.md" "2025-12-02"

paragraph.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: "Paragraph"
2+
title: "Scraping text and headers"
33
output: html_document
44
date: "2024-12-13"
55
---

tables.md

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,6 @@ library(scales)
4949
```
5050

5151
### Scraping multiple tables on one page
52-
test
5352
One of the formats that data on a website often come in is a table.
5453

5554
Let's look at statistics about students at The University of Copenhagen (UCPH) at this page:
