Add script to delete unused tokens from TYR database by devin-ai-integration[bot] · Pull Request #4518 · hove-io/navitia

devin-ai-integration · 2026-02-24T11:10:11Z

Add script to delete unused tokens from TYR database

Summary

Adds a Python script (source/tyr/delete_unused_tokens.py) that reads a CSV export of unused tokens (no API calls for 365+ days) and generates SQL DELETE statements for the key table.

How it works:

Parses the semicolon-delimited CSV, extracting the "Début du token" (token prefix) column
For 1755 entries with valid 8-char hex prefixes: generates DELETE FROM key WHERE token LIKE '<prefix>%' using a temporary table for efficient matching
For 14 entries where Excel corrupted the prefix (scientific notation like 8,85E+08, or truncated to 7 digits): falls back to DELETE FROM key WHERE id IN (...) using the tyr_id column
Generated SQL is wrapped in BEGIN; with COMMIT commented out so the operator can review row counts before committing
Also supports --execute mode for direct DB execution via SQLAlchemy
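The steps above can be sketched roughly as follows. This is a minimal illustration, not the script from the PR: the function names, the `tyr_id` header name, and the exact SQL layout are assumptions based on the description. The `DELETE ... USING` form assumes PostgreSQL.

```python
import csv
import io
import re

# A valid prefix is exactly 8 hex characters, so it is safe to embed in SQL.
HEX_PREFIX = re.compile(r"^[0-9a-fA-F]{8}$")


def classify_rows(csv_text, id_column="tyr_id", prefix_column="Début du token"):
    """Split CSV rows into valid hex prefixes and fallback key IDs."""
    prefixes, fallback_ids = [], []
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    for row in reader:
        prefix = row[prefix_column].strip()
        if HEX_PREFIX.match(prefix):
            prefixes.append(prefix)
        else:
            # Prefix corrupted by Excel (e.g. 8,85E+08): fall back to the ID
            # column. int() raises ValueError if the ID is not numeric.
            fallback_ids.append(int(row[id_column]))
    return prefixes, fallback_ids


def generate_sql(prefixes, fallback_ids):
    """Build the review-first SQL: BEGIN, deletes, COMMIT left commented out."""
    lines = ["BEGIN;"]
    if prefixes:
        values = ", ".join(f"('{p}')" for p in prefixes)
        lines += [
            "CREATE TEMPORARY TABLE _token_prefixes_to_delete (prefix TEXT NOT NULL);",
            f"INSERT INTO _token_prefixes_to_delete (prefix) VALUES {values};",
            "DELETE FROM key USING _token_prefixes_to_delete p "
            "WHERE key.token LIKE p.prefix || '%';",
        ]
    if fallback_ids:
        ids = ", ".join(str(i) for i in fallback_ids)
        lines.append(f"DELETE FROM key WHERE id IN ({ids});")
    lines.append("-- COMMIT;")
    return "\n".join(lines)
```

Leaving `COMMIT` commented out means an operator who pastes the output into a session can inspect the reported row counts and then either commit or roll back by hand.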

Review & Testing Checklist for Human

Verify that tyr_id in the CSV corresponds to key.id — this assumption was inferred (same login appears with different tyr_id values), but has not been confirmed against the actual production database. If tyr_id is actually user.id, the 14 fallback deletions by ID would be wrong.
Review the 14 corrupted entries (lines 48–61 of the generated SQL) — these will be deleted by ID rather than token prefix. Confirm these are the correct keys to remove.
Run the generated SQL with SELECT COUNT(*) first (already included in the output) to verify the number of matched tokens before committing the DELETE. Check that the count matches expectations (~1769).
Consider LIKE prefix collision risk — deletions match token LIKE '<8-char-hex>%'. If two different tokens share the same first 8 characters, both would be deleted. This is statistically unlikely but worth a quick sanity check.
Note on --execute mode: it calls conn.commit() at the end, which will auto-commit the transaction even though COMMIT is commented out in the SQL. If you plan to use --execute, be aware there is no review step.
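One way to run the prefix-collision sanity check mentioned above is to count how many stored tokens each prefix matches and flag any prefix with more than one hit. The sketch below uses an in-memory SQLite database as a stand-in for the TYR database; the `find_prefix_collisions` helper and the sample tokens are hypothetical, and the query would need adapting to the production engine.

```python
import sqlite3


def find_prefix_collisions(conn, prefixes):
    """Return (prefix, match_count) for prefixes matching more than one token."""
    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE _prefixes (prefix TEXT NOT NULL)")
    # Deduplicate so a prefix listed twice in the CSV is not counted twice.
    cur.executemany(
        "INSERT INTO _prefixes (prefix) VALUES (?)",
        [(p,) for p in sorted(set(prefixes))],
    )
    cur.execute(
        """
        SELECT p.prefix, COUNT(*) AS matches
        FROM _prefixes p
        JOIN "key" k ON k.token LIKE p.prefix || '%'
        GROUP BY p.prefix
        HAVING COUNT(*) > 1
        ORDER BY p.prefix
        """
    )
    rows = cur.fetchall()
    cur.execute("DROP TABLE _prefixes")
    return rows


# Stand-in data: two tokens share the prefix abcdef01, one is unique.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "key" (id INTEGER PRIMARY KEY, token TEXT NOT NULL)')
conn.executemany(
    'INSERT INTO "key" (token) VALUES (?)',
    [("abcdef01-one",), ("abcdef01-two",), ("12345678-only",)],
)
collisions = find_prefix_collisions(conn, ["abcdef01", "12345678"])
print(collisions)
```

A non-empty result does not prove an error, but each listed prefix is worth checking by hand before the DELETE is committed.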

Notes

Requested by: @Louispautasso
Link to Devin run

Python script that reads a CSV file of unused tokens (no API calls for 365+ days) and generates SQL DELETE statements for the 'key' table.

- Handles 1755 valid hex token prefixes via LIKE matching
- Handles 14 corrupted entries (Excel scientific notation) via key ID
- Uses a transaction with manual COMMIT for safety
- Supports both SQL file generation and direct DB execution

Co-Authored-By: unknown <>

devin-ai-integration · 2026-02-24T11:10:17Z

Original prompt from louis.pautasso

From this file, write me a script that deletes from the database all tokens that start with the values listed in the "debut de token" column
ATTACHMENT:"https://app.devin.ai/attachments/450b708c-acd9-46c1-8e05-8ea10a4aea1d/tokens-unsage-365%28result%29+%281%29.csv"

You only need to look in the following repos: hove-io/loki, hove-io/datahub-api-aws-infra, hove-io/navitia

devin-ai-integration · 2026-02-24T11:10:18Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR that start with 'DevinAI' or '@devin'.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


Co-Authored-By: unknown <>

devin-ai-integration

Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.

devin-ai-integration · 2026-02-24T11:23:18Z

source/tyr/delete_unused_tokens.py

+        for statement in sql.split(";"):
+            statement = statement.strip()
+            if statement and not statement.startswith("--"):


🔴 Naive SQL splitting on ; causes BEGIN, CREATE TABLE, and DELETE statements to be skipped in --execute mode

In --execute mode, execute_sql splits the entire generated SQL on ; and then skips any chunk that starts with --. Because generate_sql places comment lines immediately before SQL statements (with no intervening ;), the split merges comments with the following SQL statement into a single chunk. Since the chunk starts with --, the actual SQL statement is silently skipped.

Detailed explanation of which statements are skipped and why

The generated SQL looks like:

-- =============================================================
-- Script de suppression...
-- =============================================================

BEGIN;

-- Partie 1: Suppression par préfixe...

CREATE TEMPORARY TABLE _token_prefixes_to_delete ...;

When split on ;, the first chunk is:

-- =============================================================\n...\n\nBEGIN

This starts with --, so BEGIN is never executed — all operations run without a transaction.

The second chunk is:

\n\n-- Partie 1: ...\n\nCREATE TEMPORARY TABLE _token_prefixes_to_delete (prefix TEXT NOT NULL)

This also starts with -- (after stripping), so CREATE TEMPORARY TABLE is never executed. The subsequent INSERT and DELETE statements referencing _token_prefixes_to_delete will then fail with a "relation does not exist" error.

Similarly, the DELETE FROM key WHERE id IN (...) for corrupted entries is in a chunk starting with comments and is also skipped.

Impact: In --execute mode, the script either crashes (table not found) or silently skips critical DELETE statements, and runs without transaction safety.

Prompt for agents

In source/tyr/delete_unused_tokens.py, the execute_sql function (lines 134-154) splits SQL on semicolons and then checks if each chunk starts with '--' to skip comments. This is fundamentally broken because comments and SQL statements get merged into the same chunk after splitting. The fix should replace the naive split-on-semicolon approach with proper statement-by-statement execution. Options:

1. Instead of generating a single SQL string and splitting it, refactor generate_sql to return a list of individual SQL statements (without comments), and have execute_sql iterate over that list directly.
2. Alternatively, use sqlalchemy's text() to execute the entire SQL script at once if the driver supports it, or use a proper SQL parser.
3. At minimum, filter out comment-only lines from each chunk before checking if it starts with '--'. For example, after splitting on ';', strip each chunk, split it into lines, remove lines that start with '--' or are empty, and then rejoin to get the actual SQL statement.
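A minimal sketch of the comment-filtering approach (the third option) is shown below. The `split_statements` name is hypothetical, and this version drops comment lines before splitting rather than per chunk, which gives the same result for this script's output. It assumes no string literal contains a `;` or starts a line with `--`, which holds here because the generated SQL only embeds hex prefixes and integer IDs.

```python
def split_statements(sql):
    """Drop full-line '--' comments, then split the remainder on ';'."""
    code_lines = [
        line
        for line in sql.splitlines()
        if line.strip() and not line.strip().startswith("--")
    ]
    chunks = "\n".join(code_lines).split(";")
    # Discard empty chunks, such as the one after the final semicolon.
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

With this filter, `BEGIN` and `CREATE TEMPORARY TABLE` are returned as separate statements, and the commented-out `-- COMMIT;` line is dropped rather than executed.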


devin-ai-integration · 2026-02-24T11:23:20Z

source/tyr/delete_unused_tokens.py

+        key_ids = ", ".join(entry[0] for entry in corrupted_entries)
+        lines.append(f"DELETE FROM key WHERE id IN ({key_ids});")


🔴 SQL injection via unsanitized tyr_id from CSV in corrupted entries DELETE statement

The tyr_id values read from the CSV are directly interpolated into a SQL DELETE FROM key WHERE id IN (...) statement at source/tyr/delete_unused_tokens.py:114 without any validation that they are integers. A malicious or malformed CSV could contain arbitrary SQL in the tyr_id column.

Root cause and exploitation path

At source/tyr/delete_unused_tokens.py:43, tyr_id = row[0].strip() reads the raw string from CSV. At line 113-114:

key_ids = ", ".join(entry[0] for entry in corrupted_entries)
lines.append(f"DELETE FROM key WHERE id IN ({key_ids});")

If a CSV row has tyr_id = 1); DROP TABLE key; --, the generated SQL becomes:

DELETE FROM key WHERE id IN (1); DROP TABLE key; --);

This is exploitable both in the generated SQL file (if executed by a DBA) and in --execute mode. Even for the --output mode (generating a .sql file), the injected SQL would be present in the output file.

Impact: Potential for arbitrary SQL execution including data destruction.

Suggested change

-        key_ids = ", ".join(entry[0] for entry in corrupted_entries)
-        lines.append(f"DELETE FROM key WHERE id IN ({key_ids});")
+        key_ids = ", ".join(str(int(entry[0])) for entry in corrupted_entries)
+        lines.append(f"DELETE FROM key WHERE id IN ({key_ids});")
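The `int()` conversion in the suggested change already rejects the injection payload, since `int("1); DROP TABLE key; --")` raises `ValueError`. A slightly stricter variant is sketched below, with hypothetical helper names. It accepts only plain ASCII digits, so values that `int()` would tolerate, such as `"+5"` or `"1_000"`, are also rejected with an explicit error.

```python
import re

# Only plain ASCII digits count as a valid key id.
_INTEGER = re.compile(r"[0-9]+")


def parse_key_id(raw):
    """Convert a CSV cell to an int, rejecting anything but plain digits."""
    value = raw.strip()
    if not _INTEGER.fullmatch(value):
        raise ValueError(f"not a plain integer key id: {raw!r}")
    return int(value)


def build_delete_by_id(raw_ids):
    """Return the DELETE statement for validated ids, or None if there are none."""
    ids = [parse_key_id(raw) for raw in raw_ids]
    if not ids:
        return None
    return "DELETE FROM key WHERE id IN ({});".format(", ".join(str(i) for i in ids))
```

For `--execute` mode, passing the IDs as bound parameters instead of formatting them into the statement would be another option.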


devin-ai-integration · 2026-03-08T17:50:49Z

Closing due to inactivity for more than 7 days.

devin-ai-integration bot assigned Louispautasso Feb 24, 2026

Fix black formatting and flake8 f-string warning

66b3479

Co-Authored-By: unknown <>


devin-ai-integration bot closed this Mar 8, 2026


Add script to delete unused tokens from TYR database #4518
devin-ai-integration[bot] wants to merge 2 commits into dev from devin/1771931306-delete-unused-tokens
