sanjaysrikakulam
Member

@bgruening
Member

@sj213 this might put some load back onto the PG. Can you keep an eye on this, please?

But we need the cleanup script, so we would need to improve the query or add indices.

@sj213
Contributor

sj213 commented Apr 23, 2025

No argument here. But since this query is likely to run for a week or so and we all agree that this query badly wants some finishing, might I suggest that for the time being the query is not run via cron but rather manually, in a screen/tmux session and prefixed with EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)?

Two advantages:

  • The query is known to run for several days and to block other instances of the same query that are started on the following days. Under cron we would have to forcibly terminate those later instances anyway; we've been there before. A manual run avoids that.
  • By running the query with the full EXPLAIN ANALYZE monty, we obtain valuable data for the Dalibo analyzer. Two birds with one stone.
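
A minimal sketch of the prefixing suggested above, assuming the statement text is already available as a string (for example, copied from a log). The helper name `wrap_with_explain` is hypothetical and only builds the text to paste into psql. Keep in mind that `EXPLAIN ANALYZE` actually executes the statement, so a cleanup query still modifies data.

```python
# Options exactly as quoted in the comment above.
EXPLAIN_PREFIX = "EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)"


def wrap_with_explain(statement: str) -> str:
    """Return the statement prefixed with the EXPLAIN options.

    Hypothetical helper: it does not talk to the database, it only
    normalises the trailing semicolon and prepends the prefix.
    """
    body = statement.strip().rstrip(";").strip()
    if not body:
        raise ValueError("empty statement")
    return f"{EXPLAIN_PREFIX}\n{body};"


if __name__ == "__main__":
    # Placeholder statement, not the real cleanup query.
    print(wrap_with_explain("SELECT 1;"))
```

The resulting text could be saved to a file and run from psql inside the screen/tmux session.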

@sanjaysrikakulam
Member Author

> No argument here. But since this query is likely to run for a week or so and we all agree that this query badly wants some finishing, might I suggest that for the time being the query is not run via cron but rather manually, in a screen/tmux session and prefixed with EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)?
>
> Two advantages:
>
> * Since the query is known to run several days and also block other instances of the same query that are started on the following days, we would have to forcibly terminate those queries started later anyway; we've been there before
> * By running the query with the full EXPLAIN ANALYZE monty, we obtain valuable data for the Dalibo analyzer. Two birds with one stone.

Unfortunately, we cannot wrap it with EXPLAIN... because gxadmin simply calls a Galaxy Python script, and that script uses the Galaxy models (SQLAlchemy) for everything it does.

@bgruening
Member

Stefan should be able to see the slow query on his side, shouldn't he?

@sj213
Contributor

sj213 commented Apr 23, 2025

Once it completes it should show up in the logs as one of the slow queries. And then the problem is which one of all the slow queries in the log it was.

But there may be another option: looking at /opt/galaxy/server/scripts/cleanup_datasets/pgcleanup.py I see this script supports a --debug option which should enable logging of the SQL queries sent to the server.

However, in /opt/gxadmin/partx/25-galaxy.sh the function galaxy_cleanup() does not pass the debug option to pgcleanup.py if the envar GXADMIN_DEBUG is set, so 25-galaxy.sh would need to be hacked a bit to enable query logging in galaxy_cleanup().

It should thus be possible to hack gxadmin, run it with debugging enabled to obtain the query text, abort the submitted query, and then re-execute it from psql(1) with EXPLAIN ANALYZE enabled.

@sj213
Contributor

sj213 commented Apr 23, 2025

Actually, I just remembered that there is an even simpler solution: the query column of pg_stat_activity shows the query text of any running SQL statement. There is, however, a catch: by default this column is limited to 1 KiB, and the text of our queries is often longer. The limit can be raised by adjusting track_activity_query_size, but a change to this parameter only takes effect at a server restart. (Damn, this would have been another bullet item for the last maintenance break...)

Nonetheless, I'll prepare a PR to raise the limit to 4 KiB, so the change will take effect at the next server restart.
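
A rough sketch of the pg_stat_activity approach described above. The SQL uses only standard pg_stat_activity columns; the helper `looks_truncated` and the constant names are hypothetical. The check is a heuristic that assumes the stored text is cut at roughly `track_activity_query_size - 1` bytes, so treat a positive result as a hint, not proof.

```python
# Statement to run from psql; lists active statements, oldest first.
LONG_RUNNING_SQL = """
SELECT pid, now() - query_start AS runtime, state, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY query_start;
"""

# PostgreSQL default for track_activity_query_size, in bytes.
DEFAULT_LIMIT_BYTES = 1024


def looks_truncated(query_text: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Heuristic: text that fills the buffer was probably cut off.

    Assumption: one byte of the buffer is reserved for a terminator,
    so a full buffer holds limit_bytes - 1 bytes of query text.
    """
    return len(query_text.encode("utf-8")) >= limit_bytes - 1


if __name__ == "__main__":
    print(LONG_RUNNING_SQL.strip())
    print(looks_truncated("SELECT 1"))
```

With the limit raised to 4 KiB (`track_activity_query_size = 4096` in postgresql.conf, effective after a restart), the same check would be called with `limit_bytes=4096`.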

@bgruening
Member

Ok, let's run this query tomorrow and see?

@sanjaysrikakulam
Member Author

> Ok, let's run this query tomorrow and see?

The query is now running in a tmux session (for additional details, see the OP's chat).
