Filter out Copr builds without SRPM in SQL #2863
Conversation
Summary of Changes

Hello @m-blaha, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the data retrieval process for Copr build lists by migrating the filtering of incomplete builds from application-level Python code to the database query. This enhancement ensures that the API consistently delivers the expected number of results and improves overall performance by optimizing data processing at its source.
Force-pushed from ab917f1 to fe6327e.
Code Review
This pull request correctly moves the filtering of Copr builds from Python code into the SQL query. This is a good change that resolves an issue with pagination and improves efficiency. The implementation is clean and the reasoning is sound. I have one suggestion for a further performance improvement to address a potential N+1 query problem in the API endpoint, which would make this part of the code even more efficient.
```python
for build in CoprBuildTargetModel.get_merged_chroots(first, last):
    build_info = CoprBuildTargetModel.get_by_build_id(build.build_id, None)
    if build_info.status == BuildStatus.waiting_for_srpm:
        continue
    if (
        build_info.status == BuildStatus.failure
        and not build_info.build_start_time
        and not build_info.build_logs_url
    ):
        # SRPM build failed, it doesn't make sense to list this build
        continue
    project_info = build_info.get_project()
```
This loop currently causes an N+1 query problem. For each build returned by get_merged_chroots, you're making at least two more database queries: get_by_build_id and then get_project (which itself can trigger multiple lazy-loads). This can lead to significant performance degradation, especially with a large number of builds.
To resolve this, I recommend modifying CoprBuildTargetModel.get_merged_chroots to fetch all the necessary information in a single query by using joins and returning all required fields. This would eliminate the need for extra queries inside the loop.
For example, you could extend the query in get_merged_chroots to join with CoprBuildGroupModel, PipelineModel, ProjectEventModel, and GitProjectModel to retrieve fields like project_name, build_submitted_time, web_url, commit_sha, and project details. You would need to use an aggregate function (like min or max) on these additional fields within the group_by clause, since they will be the same for all chroots of a given build.
This would make the API endpoint much more performant.
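A rough sketch of the kind of grouped query being suggested here; every join path, relationship, and extra column below is an assumption about the packit-service models, not the real schema. The idea is to aggregate per `build_id` and pull the project-level columns through joins, so the endpoint needs no per-build lookups:

```python
# Hypothetical sketch only: join targets and column names are assumed, not
# taken from the actual packit-service model definitions.
from sqlalchemy import desc, func

def get_merged_chroots_with_project(session, first: int, last: int):
    return (
        session.query(
            CoprBuildTargetModel.build_id,
            func.array_agg(CoprBuildTargetModel.target).label("chroots"),
            func.array_agg(CoprBuildTargetModel.status).label("statuses"),
            # These columns are identical for all chroots of a build, so an
            # aggregate such as min() simply picks that shared value.
            func.min(CoprBuildTargetModel.build_submitted_time).label("build_submitted_time"),
            func.min(CoprBuildTargetModel.web_url).label("web_url"),
            func.min(GitProjectModel.project_url).label("project_url"),
        )
        # The ON clauses are left to the configured relationships here; in the
        # real models they would go through the group/pipeline/event tables.
        .join(CoprBuildGroupModel)
        .join(PipelineModel)
        .join(ProjectEventModel)
        .join(GitProjectModel)
        .group_by(CoprBuildTargetModel.build_id)
        .order_by(desc(func.max(CoprBuildTargetModel.build_submitted_time)))
        .slice(first, last)
    )
```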
This makes sense, but is out of scope here. If we wanted to optimize the code, it should be tracked as a separate issue.
```python
.filter(
    # Exclude builds without build_id - these are builds waiting for SRPM
    # or where SRPM build failed, so technically they are not actual Copr
    # builds yet.
    CoprBuildTargetModel.build_id.isnot(None),
)
```
Unfortunately, I don't think this will solve the issue: the model already has build_id set even when it is waiting for the SRPM. Example API response:
"anitya_package": null,
"anitya_project_id": null,
"anitya_project_name": null,
"anitya_version": null,
"branch_name": null,
"build_finished_time": null,
"build_id": "9923561",
"build_logs_url": null,
"build_start_time": null,
"build_submitted_time": 1765975741,
"built_packages": null,
"chroot": "rhel-9-x86_64",
"commit_sha": "93e63e21c0401256021ec3a0460a9517d4986267",
"copr_owner": "packit",
"copr_project": "theforeman-foreman-10802",
"issue_id": null,
"non_git_upstream": false,
"pr_id": 10802,
"project_url": "https://github.com/theforeman/foreman",
"release": null,
"repo_name": "foreman",
"repo_namespace": "theforeman",
"run_ids": [
1086847
],
"srpm_build_id": 520981,
"status": "waiting_for_srpm",
"web_url": "https://copr.fedorainfracloud.org/coprs/build/9923561/"
}
We are setting it here right after the build is submitted.
Yeah, you're right :( I'll try to find another way to filter builds directly in the query so that I can keep the removal of the Python-level filtering.
I tried to replicate the additional Python filtering directly in the query.
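For reference, a sketch of what that query-level filter looks like; the `or_` part matches the diff quoted in the review below, while the `waiting_for_srpm` condition and the exact placement inside `get_merged_chroots` are assumed from the removed Python code:

```python
from sqlalchemy import or_

query = query.filter(
    # Builds still waiting for the SRPM are not actual Copr builds yet.
    CoprBuildTargetModel.status != BuildStatus.waiting_for_srpm,
    # A failure without a start time and without build logs means the SRPM
    # build itself failed, so the record should not be listed.
    or_(
        CoprBuildTargetModel.status != BuildStatus.failure,
        CoprBuildTargetModel.build_start_time.isnot(None),
        CoprBuildTargetModel.build_logs_url.isnot(None),
    ),
)
```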
Force-pushed from fe6327e to f7bbf3f.
/gemini review
Code Review
This pull request is a great improvement. It correctly moves the filtering logic for Copr builds from the application layer to the database query. This change effectively resolves the described pagination issue and improves performance by filtering data at the source. The implementation is clean and the logic for excluding certain builds is correctly translated to SQLAlchemy. I have a couple of suggestions to enhance the readability of the new filter condition, but overall, this is a solid contribution.
```python
    desc,
    func,
    null,
    or_,
```
```python
or_(
    CoprBuildTargetModel.status != BuildStatus.failure,
    CoprBuildTargetModel.build_start_time.isnot(None),
    CoprBuildTargetModel.build_logs_url.isnot(None),
),
```
While this or_ condition is logically correct, it represents the inverted logic of what is being excluded. Using not_(and_(...)) would be more readable as it directly translates the condition for an SRPM build failure that should be excluded: status == failure AND build_start_time IS NULL AND build_logs_url IS NULL. This makes the intent of the filter clearer and closer to the original Python logic that was removed.
Suggested change:

```diff
-or_(
-    CoprBuildTargetModel.status != BuildStatus.failure,
-    CoprBuildTargetModel.build_start_time.isnot(None),
-    CoprBuildTargetModel.build_logs_url.isnot(None),
-),
+not_(
+    and_(
+        CoprBuildTargetModel.status == BuildStatus.failure,
+        CoprBuildTargetModel.build_start_time.is_(None),
+        CoprBuildTargetModel.build_logs_url.is_(None),
+    )
+),
```
Build succeeded. ✔️ pre-commit SUCCESS in 1m 47s
Force-pushed from f7bbf3f to 61f869f.
Build succeeded. ✔️ pre-commit SUCCESS in 1m 45s
lbarcziova left a comment
This approach looks good to me! Do you think we could add a test case in https://github.com/packit/packit-service/blob/main/tests_openshift/database/test_models.py and/or https://github.com/packit/packit-service/blob/main/tests_openshift/service/test_api.py? These tests are unfortunately not run in CI (yet), but they are good for this kind of change; they require a compose setup, and there is a Make target for it.
Sure, will try!
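A rough, self-contained sketch of the kind of test case meant here; it does not use packit-service's tests_openshift fixtures and instead rebuilds a minimal table on in-memory SQLite purely to exercise the filter conditions. The table, column, and status names are simplified assumptions, not the real models:

```python
from sqlalchemy import Column, Integer, String, create_engine, or_
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class FakeCoprBuildTarget(Base):
    __tablename__ = "copr_build_targets"
    id = Column(Integer, primary_key=True)
    status = Column(String)
    build_start_time = Column(Integer, nullable=True)
    build_logs_url = Column(String, nullable=True)

def test_waiting_for_srpm_and_srpm_failures_are_filtered_out():
    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add_all(
            [
                FakeCoprBuildTarget(id=1, status="success", build_start_time=1, build_logs_url="url"),
                # still waiting for the SRPM -> must be filtered out
                FakeCoprBuildTarget(id=2, status="waiting_for_srpm"),
                # failure without start time and logs URL -> SRPM build failed
                FakeCoprBuildTarget(id=3, status="failure"),
                # regular failed target build -> must stay in the listing
                FakeCoprBuildTarget(id=4, status="failure", build_start_time=1, build_logs_url="url"),
            ]
        )
        session.commit()
        rows = (
            session.query(FakeCoprBuildTarget)
            .filter(
                FakeCoprBuildTarget.status != "waiting_for_srpm",
                or_(
                    FakeCoprBuildTarget.status != "failure",
                    FakeCoprBuildTarget.build_start_time.isnot(None),
                    FakeCoprBuildTarget.build_logs_url.isnot(None),
                ),
            )
            .all()
        )
        assert {row.id for row in rows} == {1, 4}
```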
Force-pushed from 61f869f to e7fc6d9.
@lbarcziova I've added the test.
Build succeeded. ✔️ pre-commit SUCCESS in 1m 44s
```python
or_(
    CoprBuildTargetModel.status != BuildStatus.failure,
    CoprBuildTargetModel.build_start_time.isnot(None),
    CoprBuildTargetModel.build_logs_url.isnot(None),
),
```
I checked now how we actually handle SRPM failure, and here it looks like we don't touch the Copr build models at all; we only change the status to cancelled if the build is retriggered (by push or similar).
Have you found some code that leaves the DB items in this state? Otherwise we could remove this and stick to filtering only based on status (probably also adding BuildStatus.canceled).
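A sketch of the status-only variant being proposed here; this is not what the PR ended up using, and the exact enum member names are assumptions:

```python
# Hypothetical status-only filter, for comparison with the or_ condition above.
query = query.filter(
    CoprBuildTargetModel.status.notin_(
        [BuildStatus.waiting_for_srpm, BuildStatus.canceled]
    )
)
```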
To be honest, I only reproduced the current Python filtering at the database level. Let me check whether the start_time/logs_url check is actually useful.
@lbarcziova I checked the production database and found many copr_build_targets records missing start_time and build_logs_url and marked as failure. They are coming at a rate of approximately 80 records per day. It appears there was no copr.build.Start event, only copr.build.End for these builds (I checked the logs). I'm not sure whether these events were lost or never issued from the copr side. Some of these builds correspond to srpm build failures (e.g. https://copr.fedorainfracloud.org/coprs/packit/osbuild-image-builder-frontend-3989/build/10010130/), but others are regular target build failures where the srpm succeeded (e.g. https://copr.fedorainfracloud.org/coprs/packit/systemd-systemd-40311/build/10010141/). Is this a known issue, or should I file a report?
FWIW, here is the PR that introduced the start_time/logs_url check: #2396. It seems that the reason was to filter out log links with empty URLs.
@m-blaha thanks a lot for checking this thoroughly. So there are records of Copr builds that ended with failure but still don't have the start_time and build_logs_url? That indeed seems like a bug, so please do file a report. But we can keep the filtering in this PR as it is, since we already did the same filtering at the API level.
I also looked into the Copr build logs and it seems to be an issue on the Copr side. Filed an issue there: fedora-copr/copr#4127
lbarcziova left a comment
thank you!
Build succeeded (gate pipeline). ✔️ pre-commit SUCCESS in 1m 46s
Pull request merge failed: Required status check "Notes are either written, or there are none / release-notes" is expected.
In addition to pagination slicing, the CoprBuildsList class uses Python code to filter out Copr builds that are waiting for an SRPM or whose SRPM build failed. As a result, the API can in some cases return fewer items than the user requested. Moving the filter into SQL resolves the problem. Resolves: packit#2505
Force-pushed from e7fc6d9 to e81528a.
Build succeeded. ✔️ pre-commit SUCCESS in 1m 45s
Build succeeded (gate pipeline). ✔️ pre-commit SUCCESS in 1m 44s
In addition to pagination slicing, the CoprBuildsList class uses Python code to filter out Copr builds that are waiting for an SRPM or whose SRPM build failed. As a result, the API can in some cases return fewer items than the user requested.
Moving the filter into SQL resolves the problem. The SQL filter relies on the fact that the build_id field is NULL until the build is actually created by submitting to Copr.
Resolves: #2505