8 changes: 7 additions & 1 deletion pysqa/wrapper/slurm.py
@@ -47,7 +47,13 @@ def get_queue_status_command(self) -> list[str]:
     @staticmethod
     def get_job_id_from_output(queue_submit_output: str) -> int:
         """Extracts the job ID from the output of the job submission command."""
-        return int(queue_submit_output.splitlines()[-1].rstrip().lstrip().split()[-1])
+        return int(
+            queue_submit_output.splitlines()[-1]
+            .rstrip()
+            .lstrip()
+            .split()[-1]
+            .split(";")[0]
+        )
Comment on lines +50 to +56
🛠️ Refactor suggestion

Make parsing resilient (blank lines, “Submitted…” form, array jobs).

Current logic can break on:

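  • A blank or whitespace-only final line (IndexError); a single trailing newline is fine, because splitlines() does not produce an empty last element for it.
  • Array IDs: "12345_[1-10]" raises ValueError on int(), while "12345_1" is silently parsed as 123451, because int() accepts underscores between digits.

The logic can also be simplified using strip()/partition() and made future-proof with a regex for leading digits.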
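
The failure modes can be reproduced in isolation. The sketch below copies the parsing expression from the diff into a standalone function; `current_parse` and `outcome` are hypothetical names used only for illustration, and the inputs are assumed sample strings, not captured sbatch output.

```python
def current_parse(queue_submit_output: str) -> int:
    """Standalone copy of the parsing expression from the diff under review."""
    return int(
        queue_submit_output.splitlines()[-1]
        .rstrip()
        .lstrip()
        .split()[-1]
        .split(";")[0]
    )


def outcome(text: str) -> str:
    """Return the parsed value as a string, or the name of the exception raised."""
    try:
        return str(current_parse(text))
    except (IndexError, ValueError) as exc:
        return type(exc).__name__


# Assumed sample inputs: parsable form, blank final line, two array-style IDs.
samples = ["12345;cluster", "12345\n\n", "12345_[1-10]", "12345_1"]
results = {text: outcome(text) for text in samples}

for text, result in results.items():
    print(f"{text!r:>18} -> {result}")
```

The "12345_1" case is the one to watch: it raises nothing and returns a wrong ID, since int() treats the underscore as a digit separator.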

Proposed refactor:

  • Use last non-empty line, take last token, strip semicolon-suffix, then extract leading digits.
  • Raise a clear error when parsing fails.

Apply this diff within the method:

-        return int(
-            queue_submit_output.splitlines()[-1]
-            .rstrip()
-            .lstrip()
-            .split()[-1]
-            .split(";")[0]
-        )
+        lines = [ln.strip() for ln in queue_submit_output.splitlines() if ln.strip()]
+        if not lines:
+            raise ValueError("Empty sbatch output; cannot parse Slurm job ID.")
+        last = lines[-1]
+        token = last.split()[-1]  # works for "Submitted batch job 12345" too
+        token = token.partition(";")[0]  # remove parsable suffix like ";cluster"
+        m = re.match(r"^(\d+)", token)
+        if not m:
+            raise ValueError(f"Unable to parse Slurm job ID from: {last!r}")
+        return int(m.group(1))

Additionally, add the import at the top of the file:

import re

Tests to cover:

  • "12345"
  • "12345;cluster"
  • "Submitted batch job 12345"
  • "12345_1" and "12345_[1-10]"
  • Output ending with a blank line
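
The cases above could be covered with a table-driven check. This is a minimal sketch, assuming the proposed parsing is lifted into a standalone function; `parse_job_id`, `CASES`, `run_cases`, and `raises_on_empty` are hypothetical names, and in the PR the logic would stay inside the `get_job_id_from_output` static method. The array-style inputs are expected to yield the base job ID, which is what the leading-digits regex returns.

```python
import re


def parse_job_id(queue_submit_output: str) -> int:
    """Standalone copy of the proposed parsing (hypothetical name)."""
    # Keep only non-empty lines so a trailing blank line is ignored.
    lines = [ln.strip() for ln in queue_submit_output.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("Empty sbatch output; cannot parse Slurm job ID.")
    last = lines[-1]
    token = last.split()[-1]  # last whitespace-separated token
    token = token.partition(";")[0]  # drop a ";cluster" style suffix
    m = re.match(r"^(\d+)", token)  # leading digits only
    if not m:
        raise ValueError(f"Unable to parse Slurm job ID from: {last!r}")
    return int(m.group(1))


# (input, expected job ID) pairs mirroring the list of tests to cover.
CASES = [
    ("12345", 12345),
    ("12345;cluster", 12345),
    ("Submitted batch job 12345", 12345),
    ("12345_1", 12345),
    ("12345_[1-10]", 12345),
    ("12345\n\n", 12345),
]


def run_cases() -> None:
    """Assert that every sample input parses to the expected ID."""
    for text, expected in CASES:
        assert parse_job_id(text) == expected, text


def raises_on_empty() -> bool:
    """Check that whitespace-only output raises ValueError."""
    try:
        parse_job_id("\n  \n")
    except ValueError:
        return True
    return False
```

In a pytest suite the same table could feed `pytest.mark.parametrize`; plain asserts are used here only to keep the sketch dependency-free.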
🤖 Prompt for AI Agents
In pysqa/wrapper/slurm.py around lines 50 to 56, the current parsing logic for
extracting job IDs from queue_submit_output is fragile and can fail on blank
lines, array job IDs, or different output formats. Refactor the code to first
find the last non-empty line, then extract the last token, strip any
semicolon-delimited suffix, and use a regex to extract only the leading digits from that token.
Add error handling to raise a clear exception if parsing fails. Also, add
"import re" at the top of the file. Write tests covering various output cases
including plain IDs, IDs with semicolons, "Submitted batch job" lines, array job
IDs, and outputs ending with blank lines.


@staticmethod
def convert_queue_status(queue_status_output: str) -> pandas.DataFrame: