Commit 5e38507

nikostr and cmeesters authored
feat: treat sbatch errors as job errors instead of workflow errors (#322)
fixes #320

When I do some very rudimentary testing it seems to work: when sbatch fails, it is handled like a job error rather than a workflow error. The argument for `report_job_error` needs to be a `SubmittedJobInfo`. These jobs are technically never reported as submitted. I'm not sure if that would cause some type of problem? Would this PR need some sort of tests included?

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **Bug Fixes**
  - Improved error handling during SLURM job submission to prevent unnecessary interruptions when an error occurs, ensuring smoother workflow execution.
  - Fixed minor formatting issue related to job status checking for more accurate time calculations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Christian Meesters <[email protected]>
1 parent bf4fcd1 commit 5e38507
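
The behaviour described in the commit message (a failed submission is recorded against the job instead of aborting the whole workflow) can be illustrated with a small standalone sketch. Everything here is hypothetical and is not the plugin's code: `ToyExecutor`, its two lists, and the use of `subprocess.run` with the Python interpreter standing in for `sbatch` are assumptions made only so the example runs anywhere.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class ToyExecutor:
    """Hypothetical stand-in for an executor; not the plugin's real class."""

    failed_jobs: list = field(default_factory=list)
    submitted_jobs: list = field(default_factory=list)

    def report_job_error(self, job_name, msg):
        # Record the failure instead of raising, so other jobs can proceed.
        self.failed_jobs.append((job_name, msg))

    def run_job(self, job_name, call):
        try:
            process = subprocess.run(call, capture_output=True, text=True)
            if process.returncode != 0:
                raise subprocess.CalledProcessError(
                    process.returncode, call, output=process.stderr
                )
        except subprocess.CalledProcessError as e:
            # Job-level error: report it and stop handling this job only.
            self.report_job_error(
                job_name,
                msg=(
                    "Submission failed. "
                    f"The error message was '{e.output.strip()}'.\n"
                ),
            )
            return
        self.submitted_jobs.append(job_name)


# The interpreter plays the role of the submission command (an assumption).
bad_call = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('invalid partition\\n'); sys.exit(1)",
]
ok_call = [sys.executable, "-c", "print('Submitted batch job 1')"]

executor = ToyExecutor()
executor.run_job("job_a", bad_call)  # fails, but does not raise
executor.run_job("job_b", ok_call)   # still gets submitted
```

In this sketch the early `return` matters: without it, a job whose submission failed would fall through and also be appended to `submitted_jobs`.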

1 file changed: +9 -3 lines changed

snakemake_executor_plugin_slurm/__init__.py

Lines changed: 9 additions & 3 deletions
@@ -366,9 +366,15 @@ def run_job(self, job: JobExecutorInterface):
                     process.returncode, call, output=err
                 )
         except subprocess.CalledProcessError as e:
-            raise WorkflowError(
-                f"SLURM sbatch failed. The error message was {e.output}"
+            self.report_job_error(
+                SubmittedJobInfo(job),
+                msg=(
+                    "SLURM sbatch failed. "
+                    f"The error message was '{e.output.strip()}'.\n"
+                    f" sbatch call:\n {call}\n"
+                ),
             )
+            return
         # any other error message indicating failure?
         if "submission failed" in err:
             raise WorkflowError(
@@ -445,7 +451,7 @@ async def check_active_jobs(
 
         # We use this sacct syntax for argument 'starttime' to keep it compatible
         # with slurm < 20.11
-        sacct_starttime = f"{datetime.now() - timedelta(days = 2):%Y-%m-%dT%H:00}"
+        sacct_starttime = f"{datetime.now() - timedelta(days=2):%Y-%m-%dT%H:00}"
         # previously we had
         # f"--starttime now-2days --endtime now --name {self.run_uuid}"
         # in line 218 - once v20.11 is definitively not in use any more,
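
For readers unfamiliar with the format spec in the `sacct_starttime` line, the following sketch shows what the f-string evaluates to. The helper name and the fixed example timestamps are assumptions introduced for illustration; the plugin itself calls `datetime.now()` inline.

```python
from datetime import datetime, timedelta


def sacct_starttime(now: datetime) -> str:
    # Same expression as in the diff, with `now` passed in so the result
    # is reproducible. Everything after the first top-level colon is the
    # strftime-style format spec, including the literal ":00" minutes.
    return f"{now - timedelta(days=2):%Y-%m-%dT%H:00}"


print(sacct_starttime(datetime(2024, 5, 17, 14, 37, 12)))  # 2024-05-15T14:00
```

The result is an absolute timestamp two days in the past, truncated to the full hour. The change from `days = 2` to `days=2` in the diff is whitespace only and does not alter this value.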
