Skip to content

Conversation

Swatinem
Copy link
Member

As the ReleaseFile is cross-project, and has no index on the newly added date columns,
I have duplicated the per-project deletion code to rather run per-organization.

I’m not entirely confident in this TBH, and I wonder if this has any tests?

@Swatinem Swatinem requested review from a team and armenzg July 18, 2025 11:16
@Swatinem Swatinem self-assigned this Jul 18, 2025
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jul 18, 2025
cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

Copy link
Member

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As cursor points out, the --organization flag is only used for per-org model deletes but not for other models that would technically support filtering by organization. I think this can also be added in a follow-up to keep the change small, but we should be clear in the CLI doc about it. For example "Limit truncation to only entries from organization, applies only to release files".

@codecov
Copy link

codecov bot commented Jul 18, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/db/deletion.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #95870      +/-   ##
==========================================
+ Coverage   80.83%   80.84%   +0.01%     
==========================================
  Files       10592    10592              
  Lines      610104   610109       +5     
  Branches    23993    23993              
==========================================
+ Hits       493167   493233      +66     
+ Misses     116227   116166      -61     
  Partials      710      710              

Base automatically changed from swatinem/make-releasefiles-tti to master July 21, 2025 10:21
@Swatinem Swatinem requested review from a team as code owners July 21, 2025 10:21
As the `ReleaseFile` is cross-project, and has no index on the newly added date columns,
I have duplicated the per-project deletion code to rather run per-organization.
Copy link
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Bulk Deletion Ignores Organization Filter

When the cleanup command is executed with the --organization flag, the run_bulk_query_deletes function does not filter deletions by the specified organization. This occurs because run_bulk_query_deletes does not accept or pass the organization_id to the BulkDeleteQuery constructor. As a result, models handled by this function (e.g., UserReport, GroupEmailThread) are deleted from all organizations instead of being limited to the target organization.

src/sentry/runner/commands/cleanup.py#L234-L235

run_bulk_query_deletes(bulk_query_deletes, is_filtered, days, project, project_id)

src/sentry/runner/commands/cleanup.py#L246-L253

q = BulkDeleteQuery(
model=model_tp,
dtfield=dtfield,
days=days,
project_id=project_id,
order_by=order_by,
)

src/sentry/runner/commands/cleanup.py#L478-L520

(GroupEmailThread, "date", None),
] + additional_bulk_query_deletes
return BULK_QUERY_DELETES
def run_bulk_query_deletes(
bulk_query_deletes: list[tuple[type[Model], str, str | None]],
is_filtered: Callable[[type[Model]], bool],
days: int,
project: str | None,
project_id: int | None,
) -> None:
from sentry.db.deletion import BulkDeleteQuery
debug_output("Running bulk query deletes in bulk_query_deletes")
for model_tp, dtfield, order_by in bulk_query_deletes:
chunk_size = 10000
debug_output(f"Removing {model_tp.__name__} for days={days} project={project or '*'}")
if is_filtered(model_tp):
debug_output(">> Skipping %s" % model_tp.__name__)
else:
BulkDeleteQuery(
model=model_tp,
dtfield=dtfield,
days=days,
project_id=project_id,
order_by=order_by,
).execute(chunk_size=chunk_size)
def prepare_deletes_by_project(
project: str | None, project_id: int | None, is_filtered: Callable[[type[Model]], bool]
) -> tuple[QuerySet[Any] | None, list[tuple[Any, str, str]]]:
from sentry.constants import ObjectStatus
from sentry.models.debugfile import ProjectDebugFile
from sentry.models.group import Group
from sentry.models.project import Project
# Deletions that we run per project. In some cases we can't use an index on just the date
# column, so as an alternative we use `(project_id, <date_col>)` instead
DELETES_BY_PROJECT = [

Fix in CursorFix in Web


Was this report helpful? Give feedback by reacting with 👍 or 👎

@Swatinem Swatinem merged commit 9534eef into master Jul 22, 2025
65 checks passed
@Swatinem Swatinem deleted the swatinem/cleanup-releasefiles branch July 22, 2025 08:25
andrewshie-sentry pushed a commit that referenced this pull request Aug 4, 2025
As the `ReleaseFile` is cross-project, and has no index on the newly
added date columns,
I have duplicated the per-project deletion code to rather run
per-organization.

I’m not entirely confident in this TBH, and I wonder if this has any
tests?
@github-actions github-actions bot locked and limited conversation to collaborators Aug 6, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants