Fonts 2025 queries by IvanUkhov · Pull Request #4175 · HTTPArchive/almanac.httparchive.org

IvanUkhov · 2025-07-24T10:57:11Z

Makes progress on #4073

Fonts

Resources

Structure

The queries are split by the section where they are used:

design/ is about foundries and families,
development/ is about tools and technologies, and
performance/ is about hosting and serving.

Each file name starts with one of the following prefixes indicating the primary subject of the corresponding analysis:

fonts_ is about font files,
pages_ is about HTML pages,
scripts_ is about JavaScript scripts, and
styles_ is about CSS style sheets.

The prefix is followed by the property studied, given in the singular and potentially extended by one or several suffixes narrowing down the scope, as in fonts_size_by_table.sql and pages_link_relation.sql.
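
As an illustration of the naming convention, the following minimal sketch splits a query path into its section, subject, and the remainder of the file name. The describe function and the SUBJECTS mapping are hypothetical helpers and are not part of the repository.

```python
from pathlib import PurePosixPath

# Prefixes and their subjects, as listed in the readme.
SUBJECTS = {
    "fonts": "font files",
    "pages": "HTML pages",
    "scripts": "JavaScript scripts",
    "styles": "CSS style sheets",
}


def describe(path: str) -> tuple:
    """Return (section, subject, remainder) for a query path."""
    parsed = PurePosixPath(path)
    section = parsed.parent.name
    # Everything before the first underscore is the prefix.
    prefix, _, remainder = parsed.stem.partition("_")
    if prefix not in SUBJECTS:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return section, SUBJECTS[prefix], remainder


print(describe("design/styles_family.sql"))
# ('design', 'CSS style sheets', 'family')
```

The remainder is left unsplit on purpose, since the file name alone does not say where the property ends and the suffixes begin.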

Content

Each query starts with a preamble indicating the section, question, and normalization type, as illustrated below:

-- Section: Performance
-- Question: What is the distribution of the file size broken down by table?
-- Normalization: Pages
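
A preamble of this shape is easy to read programmatically, for instance to obtain the question used for naming a sheet. The sketch below is only an assumption about how that could be done; parse_preamble is a hypothetical name and may differ from what execute.py does.

```python
import re

# Matches lines such as "-- Question: What is ...?".
FIELD = re.compile(r"^--\s*(Section|Question|Normalization):\s*(.+?)\s*$")


def parse_preamble(query: str) -> dict:
    """Collect the preamble fields from the leading comment lines."""
    fields = {}
    for line in query.splitlines():
        if not line.startswith("--"):
            break  # The preamble ends with the first non-comment line.
        match = FIELD.match(line)
        if match is not None:
            fields[match.group(1).lower()] = match.group(2)
    return fields


query = """-- Section: Performance
-- Question: What is the distribution of the file size broken down by table?
-- Normalization: Pages

SELECT 1
"""
print(parse_preamble(query)["normalization"])
# Pages
```

Other leading comment lines, such as the INCLUDE pseudo-directive, are skipped rather than treated as the end of the preamble.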

Many queries rely on temporary functions for convenience and clarity. The functions that appear in several queries are extracted into a common file called common.sql. Whenever any of the functions defined in common.sql is used by a query, the query has the following pseudo-directive at the top:

-- INCLUDE https://github.com/HTTPArchive/almanac.httparchive.org/blob/main/sql/{year}/fonts/common.sql

The pseudo-directive has to be replaced with the content of common.sql prior to executing the query in question.
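
The replacement can be sketched as a plain text substitution. This is a minimal illustration and not necessarily how execute.py implements it; expand_includes is a hypothetical name, and the content of common.sql is passed in as a string to keep the example free of file access.

```python
import re

# A whole line of the form "-- INCLUDE <url ending in /common.sql>".
INCLUDE = re.compile(r"^-- INCLUDE \S+/common\.sql[ \t]*$", re.MULTILINE)


def expand_includes(query: str, common: str) -> str:
    """Replace each INCLUDE line with the content of common.sql."""
    # A callable replacement is inserted literally, so backslashes in
    # the SQL are not interpreted as regex escapes.
    return INCLUDE.sub(lambda _: common.rstrip("\n"), query)


query = (
    "-- INCLUDE https://github.com/HTTPArchive/almanac.httparchive.org"
    "/blob/main/sql/{year}/fonts/common.sql\n"
    "SELECT F()\n"
)
print(expand_includes(query, "CREATE TEMP FUNCTION F() AS (1);\n"))
```

Queries without the pseudo-directive pass through unchanged.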

In addition, queries are generally parameterized, as with @date, so that they can be run for different configurations. The parameter values have to be supplied upon execution.
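
Before running a query by hand, it can help to check which named parameters it expects. The sketch below is a rough heuristic under stated assumptions: missing_parameters is a hypothetical helper, the @rank parameter and the table name t are made up for the example, and the pattern also matches @name inside string literals and comments.

```python
import re

# A named parameter such as @date; "@@name" (a system variable) is skipped.
PARAMETER = re.compile(r"(?<!@)@([A-Za-z_]\w*)")


def missing_parameters(query: str, values: dict) -> list:
    """List the parameters used in the query that have no value yet."""
    return sorted(set(PARAMETER.findall(query)) - set(values))


query = "SELECT * FROM t WHERE date = @date AND rank <= @rank"
print(missing_parameters(query, {"date": "2025-07-01"}))
# ['rank']
```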

All of the above is taken care of automatically if the queries are executed using execute.py, which we discuss next.

Execution

The queries can be executed using the execute.py script. The results are first saved in local CSV files sitting next to the SQL files and then uploaded to the spreadsheet. In the spreadsheet, for each query, a separate sheet is created and named after the question the query answers, which is given in its preamble. If the CSV file already exists, the corresponding query is not executed. If cell A1 is already populated, the corresponding sheet is not updated.
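
The two skip rules can be summarized in a few lines. This is a sketch of the described behavior only; the function names are hypothetical, and it assumes that the CSV file shares the name of the SQL file apart from the extension.

```python
from pathlib import Path
from typing import Optional


def needs_execution(sql_path: Path) -> bool:
    """A query is run only if its CSV file does not exist yet."""
    # Assumption: design/styles_family.sql -> design/styles_family.csv.
    return not sql_path.with_suffix(".csv").exists()


def needs_upload(cell_a1: Optional[str]) -> bool:
    """A sheet is updated only if its cell A1 is still empty."""
    return not cell_a1
```

Under these rules, deleting a CSV file forces the corresponding query to be run again, and clearing cell A1 forces the sheet to be repopulated.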

First, ensure that the Application Default Credentials authorization strategy is configured, and that the HTTP Archive project is used as the quota project:

gcloud auth application-default login \
  --scopes https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/spreadsheets
gcloud auth application-default set-quota-project httparchive

Second, install the Python prerequisites for the script:

pip install -r requirements.txt

The script can be run for all or a subset of the queries as illustrated below:

python execute.py
python execute.py design/*.sql
python execute.py development/fonts_*.sql

By default, it operates in a dry-run mode: it does not run the queries but prints an estimate of the amount of data that would be processed by each query. To actually run the queries, pass the --no-dry-run option as follows:

python execute.py --no-dry-run
python execute.py --no-dry-run design/*.sql
python execute.py --no-dry-run development/fonts_*.sql
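
For readers curious how a dry-run default with a --no-dry-run switch can be expressed, below is a minimal sketch using argparse from the standard library (Python 3.9 or later). It is an assumption about one possible design and not a copy of execute.py; note that BooleanOptionalAction also accepts --dry-run, which the readme does not mention.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the font queries.")
    # Zero or more SQL files; an empty list stands for all queries.
    parser.add_argument("paths", nargs="*", help="SQL files to process")
    # Generates both --dry-run and --no-dry-run; dry run is the default.
    parser.add_argument(
        "--dry-run",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="only estimate the amount of data to be processed",
    )
    return parser


args = build_parser().parse_args(["--no-dry-run", "design/styles_family.sql"])
print(args.dry_run, args.paths)
# False ['design/styles_family.sql']
```

Defaulting to the dry run means that a mistyped or forgotten option never results in an unintended, billable query.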

IvanUkhov · 2025-08-04T14:39:05Z

@tunetheweb, I think you were the one reviewing the queries last year. If you do not mind, I would like to invite you to review this year, too, but please feel free to assign someone else. This year, we did not change anything. We just migrated the queries to crawl and added a Python script for execution. The instructions are in the readme.

tunetheweb

LGTM with some non-blocking comments.

IvanUkhov · 2025-08-22T04:54:22Z

(The linter is failing due to the code elsewhere.)

tunetheweb · 2025-08-22T07:04:17Z

Fixing in #4196

tunetheweb · 2025-08-22T11:25:52Z

That's fixed in main now, if you can resync this branch, @IvanUkhov.

After that, are you good to merge this?

IvanUkhov · 2025-08-22T11:36:24Z

Thank you. Rebased.

Well, I have not received any feedback from the lead. I would merge, if you are OK with potential follow-up PRs.

tunetheweb · 2025-08-22T11:37:11Z

Yeah, let's do that.

IvanUkhov changed the title from "Fonts 2025" to "Fonts 2025 queries" Aug 2, 2025

github-advanced-security bot found potential problems Aug 4, 2025

sql/2025/fonts/execute.py (fixed)

IvanUkhov marked this pull request as ready for review August 4, 2025 09:36

tunetheweb added the analysis Querying the dataset label Aug 18, 2025

max-ostapenko mentioned this pull request Aug 19, 2025

Change parsed_css.css from STRING to JSON HTTPArchive/httparchive.org#1095

Closed

max-ostapenko reviewed Aug 19, 2025

sql/2025/fonts/design/styles_family.sql (outdated, resolved)

tunetheweb reviewed Aug 20, 2025

max-ostapenko reviewed Aug 21, 2025

sql/2025/fonts/common.sql (resolved)

IvanUkhov added 16 commits August 22, 2025 13:29

Copy the queries from 2024 to 2025

3e101ad

Update the readme

100a649

Update the INCLUDE pseudo-directives

bd194f2

Replace all with crawl

c1bcf92

Update the dates to 2025-07-01

d44598b

Update the usage of custom_metrics

89ec842

Update the usage of summary

51975f2

Update the usage of payload in the common functions

18e5776

Update the usage of JSON_*

bd63581

Fix development/fonts_hinting

702b4ea

Add a Python script for validating the queries

7542b57

Add development/styles_hyphens

7fccfbe

Replace all single dates with a placeholder

cabb64c

Replace all multiple dates with a placeholder

565f37f

Replace 2025 with {year}

a07049e

Mention the parameters in the readme

eeabe04

IvanUkhov added 26 commits August 22, 2025 13:30

Create sheets for query results

2fb4d25

Add a few comments

29a3ce8

Name sheets by the question

a7e01f3

Populate the spreadsheet

013368a

Make a cosmetic adjustment

8c3bf11

Nullify NaNs

cac0adc

Address a lint

a9bf33a

Exclude non-SQL files

fb5bcad

Add a parameter for controlling the number of workers

ffe8e6f

Use SAFE.INT64 for respBodySize

8e3286f

Take the first line of the error

f96c378

Cast file sizes to integers

a5e48f8

Downsample in design/fonts_family_by_script.sql

8874d28

Fix a typo

5fe0e2a

Add rounding in design/fonts_metric.sql

27e2ba0

Fix a typo

9d1efd6

Fix the reporting of failures

e396c87

Update the readme

ee65748

Update the usage of the Chrome UX report

3787dd6

Update the usage of parsed_css

ae31d3a

Use JSON instead of STRING in custom JavaScript functions

ee3a5f0

Make a cosmetic adjustment

dcb8044

Remove JSON_QUERY in favor of direct indexing

14fd419

Simplify SCRIPTS

1e036ff

Simplify HAS_EMOJI

e8f0482

Do no use subsampling

038a4ec

tunetheweb approved these changes Aug 22, 2025

tunetheweb merged commit b6bcddb into HTTPArchive:main Aug 22, 2025
4 checks passed

3 participants
