Commit 86733b3

honzajavorek authored and gullmar committed
style: use dollar variables (saving data) (apify#1843)
As I progressed with apify#1584, I felt the code examples were starting to be more and more complex. Then I remembered that when I was young, us jQuery folks used to lean towards a naming convention where variables holding jQuery selections were prefixed with $. I changed the code examples in all lessons to adhere to this as I feel it makes them more readable and less cluttered.

-----

ℹ️ The changes still use `$.map` and `$.each`, because they were made prior to the facb3c0 commit. It's gonna happen, but not yet.

---------

Co-authored-by: gullmar <[email protected]>
1 parent 1fd325a commit 86733b3

2 files changed: +29 -27 lines changed


sources/academy/webscraping/scraping_basics_javascript2/08_saving_data.md

Lines changed: 28 additions & 26 deletions
@@ -25,7 +25,7 @@ We should use widely popular formats that have well-defined solutions for all th
 
 ## Collecting data
 
-Producing results line by line is an efficient approach to handling large datasets, but to simplify this lesson, we'll store all our data in one variable. This'll take three changes to our program:
+Producing results line by line is an efficient approach to handling large datasets, but to simplify this lesson, we'll store all our data in one variable. This'll take four changes to our program:
 
 ```js
 import * as cheerio from 'cheerio';
@@ -38,16 +38,15 @@ if (response.ok) {
   const $ = cheerio.load(html);
 
   // highlight-next-line
-  const data = [];
-  $(".product-item").each((i, element) => {
-    const productItem = $(element);
+  const $items = $(".product-item").map((i, element) => {
+    const $productItem = $(element);
 
-    const title = productItem.find(".product-item__title");
-    const titleText = title.text().trim();
+    const $title = $productItem.find(".product-item__title");
+    const title = $title.text().trim();
 
-    const price = productItem.find(".price").contents().last();
+    const $price = $productItem.find(".price").contents().last();
     const priceRange = { minPrice: null, price: null };
-    const priceText = price
+    const priceText = $price
       .text()
       .trim()
       .replace("$", "")
@@ -62,17 +61,34 @@ if (response.ok) {
     }
 
     // highlight-next-line
-    data.push({ title: titleText, ...priceRange });
+    return { title, ...priceRange };
   });
-
+  // highlight-next-line
+  const data = $items.get();
   // highlight-next-line
   console.log(data);
 } else {
   throw new Error(`HTTP ${response.status}`);
 }
 ```
 
-Before looping over the products, we prepare an empty array. Then, instead of printing each line, we append the data of each product to the array in the form of a JavaScript object. At the end of the program, we print the entire array at once.
+Instead of printing each line, we now return the data for each product as a JavaScript object. We've replaced `.each()` with [`.map()`](https://cheerio.js.org/docs/api/classes/Cheerio#map-3), which also iterates over the selection but, in addition, collects all the results and returns them as a Cheerio collection. We then convert it into a standard JavaScript array by calling [`.get()`](https://cheerio.js.org/docs/api/classes/Cheerio#call-signature-32). Near the end of the program, we print the entire array.
+
+:::tip Advanced syntax
+
+When returning the item object, we use [shorthand property syntax](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#property_definitions) to set the title, and [spread syntax](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_syntax) to set the prices. It's the same as if we wrote the following:
+
+```js
+{
+  title: title,
+  minPrice: priceRange.minPrice,
+  price: priceRange.price,
+}
+```
+
+:::
+
+The program should now print the results as a single large JavaScript array:
 
 ```text
 $ node index.js
@@ -91,20 +107,6 @@ $ node index.js
 ]
 ```
 
-:::tip Spread syntax
-
-The three dots in `{ title: titleText, ...priceRange }` are called [spread syntax](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_syntax). It's the same as if we wrote the following:
-
-```js
-{
-  title: titleText,
-  minPrice: priceRange.minPrice,
-  price: priceRange.price,
-}
-```
-
-:::
-
 ## Saving data as JSON
 
 The JSON format is popular primarily among developers. We use it for storing data, configuration files, or as a way to transfer data between programs (e.g., APIs). Its origin stems from the syntax of JavaScript objects, but people now use it accross programming languages.
@@ -202,7 +204,7 @@ In this lesson, we created export files in two formats. The following challenges
 
 ### Process your JSON
 
-Write a new Node.js program that reads `products.json`, finds all products with a min price greater than $500, and prints each of them.
+Write a new Node.js program that reads the `products.json` file we created in this lesson, finds all products with a min price greater than $500, and prints each of them.
 
 <details>
   <summary>Solution</summary>

sources/academy/webscraping/scraping_basics_python/08_saving_data.md

Lines changed: 1 addition & 1 deletion
@@ -186,7 +186,7 @@ In this lesson, we created export files in two formats. The following challenges
 
 ### Process your JSON
 
-Write a new Python program that reads `products.json`, finds all products with a min price greater than $500, and prints each one using [`pp()`](https://docs.python.org/3/library/pprint.html#pprint.pp).
+Write a new Python program that reads the `products.json` file we created in this lesson, finds all products with a min price greater than $500, and prints each one using [`pp()`](https://docs.python.org/3/library/pprint.html#pprint.pp).
 
 <details>
   <summary>Solution</summary>
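
The core refactor in the JavaScript lesson (an empty array plus `push()` inside a loop, replaced by returning an object from `.map()`) can be sanity-checked without Cheerio. The sketch below is only an analogue on plain arrays: the `parsed` sample data is hypothetical, and `Array.prototype.map` differs from Cheerio's `.map()`, which per the diff passes `(i, element)` to the callback and needs `.get()` to yield a standard array. It does suggest that the shorthand and spread form builds the same objects as the explicit form.

```js
// Hypothetical sample data standing in for the fields parsed from each product.
const parsed = [
  { title: "Product A", priceRange: { minPrice: 10, price: 10 } },
  { title: "Product B", priceRange: { minPrice: 25, price: null } },
];

// Before: prepare an empty array and append an explicitly written object.
const pushed = [];
parsed.forEach(({ title, priceRange }) => {
  pushed.push({
    title: title,
    minPrice: priceRange.minPrice,
    price: priceRange.price,
  });
});

// After: return the object from map(), using shorthand property and spread syntax.
const mapped = parsed.map(({ title, priceRange }) => {
  return { title, ...priceRange };
});

// Both approaches produce the same data.
console.log(JSON.stringify(pushed) === JSON.stringify(mapped)); // prints true
```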
