Infinite Judging Loop When Compilation Results Differ Between Judgehosts #3143

@alessiojr

Description

I recently migrated directly from DOMjudge 7.3.3 to version 9, so I'm not sure whether this is a bug in the current codebase or a side effect of the migration. I believe I've identified the root cause and have drafted a fix, but I'd like to confirm first that my understanding matches the intended behavior. If the development team agrees with the analysis and the proposed approach, I'm happy to submit a pull request.

When multiple judgehosts process the same judging and produce different compilation results (one succeeds, another fails), the system detects the inconsistency and creates an InternalError record. However, it fails to properly invalidate the pending judge tasks, causing the judging to remain in an indefinite processing state.

Expected Behavior

When compilation result inconsistencies are detected:

  1. An InternalError should be created
  2. The problematic judgehost(s) should be disabled
  3. All related judge tasks should be marked as invalid
  4. The judging should be marked with the internal error
  5. The system should allow for proper cleanup and potential rejudging

Actual Behavior

When compilation result inconsistencies are detected:

  1. An InternalError is created ✓
  2. The problematic judgehost(s) are disabled ✓
  3. Judge tasks remain valid and active
  4. The judging is not marked with the internal error
  5. The judging continues to be processed indefinitely, creating a loop

Technical Details

The issue occurs in the updateJudgingAction method in JudgehostController.php, specifically in two scenarios:

Scenario 1: Lines ~420-430

When compilation succeeds but a previous result indicated compiler error:

  • The code creates an InternalError and disables other judgehosts
  • Missing: Invalidation of judge tasks from those judgehosts
  • Missing: Marking the judging with setInternalError()

Scenario 2: Lines ~470-480

When compilation fails but a previous result was successful:

  • The code creates an InternalError and disables the current judgehost
  • Missing: Invalidation of judge tasks from the current judgehost
  • Missing: Marking the judging with setInternalError()

Comparison with Similar Code

In the same file, when a compilation error occurs normally (lines ~450-460), the code properly invalidates the judge tasks:

// Invalidate judgetasks
$this->em->getConnection()->executeStatement(
    'UPDATE judgetask SET valid=0 WHERE jobid=:jobid',
    ['jobid' => $judging->getJudgingid()]
);

This pattern is missing in the "compilation results are different" cases.

Impact

  • Judgings remain in perpetual "processing" state
  • Judge tasks continue to be assigned to judgehosts
  • Database accumulates invalid/conflicting records
  • System resources are wasted on un-finishable judgings
  • Manual database intervention may be required to fix affected judgings

Proposed Fix

Add proper cleanup in both inconsistency detection scenarios:

  1. Invalidate judge tasks associated with problematic judgehosts
  2. Mark the judging with the internal error using setInternalError()
  3. Flush the entity manager so the changes are persisted

Example for Scenario 1:

// After creating InternalError
$this->em->getConnection()->executeStatement(
    'UPDATE judgetask SET valid=0'
    . ' WHERE jobid=:jobid AND judgehostid IN'
    . ' (SELECT judgehostid FROM judgehost WHERE hostname=:hostname)',
    [
        'jobid' => $judging->getJudgingid(),
        'hostname' => $hostname,
    ]
);
$judging->setInternalError($error);
$this->em->flush();
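
Both scenarios need the same statement and differ only in which hostname is passed, so the invalidation could be factored into a single helper. Below is a minimal standalone sketch of that idea, not DOMjudge code: it uses plain PDO instead of the Doctrine connection so that it can be tried outside the application, and the function name invalidateJudgeTasksForHost is hypothetical. The table and column names are taken from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: mark as invalid all judge tasks of one judging
 * that belong to the judgehost with the given hostname.
 *
 * Returns the row count reported by the database driver for the UPDATE.
 */
function invalidateJudgeTasksForHost(PDO $db, int $judgingId, string $hostname): int
{
    // Same SQL as in the Scenario 1 example; only the hostname differs
    // between the two scenarios.
    $stmt = $db->prepare(
        'UPDATE judgetask SET valid=0'
        . ' WHERE jobid=:jobid AND judgehostid IN'
        . ' (SELECT judgehostid FROM judgehost WHERE hostname=:hostname)'
    );
    $stmt->execute([
        'jobid'    => $judgingId,
        'hostname' => $hostname,
    ]);

    return $stmt->rowCount();
}
```

Inside JudgehostController the same SQL would presumably go through $this->em->getConnection()->executeStatement(...) as in the example above, followed by $judging->setInternalError($error) and a flush.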

Steps to Reproduce

  1. Set up multiple judgehosts with different configurations
  2. Submit a solution that compiles successfully on one judgehost but fails on another (e.g., due to different compiler versions or system libraries)
  3. Observe that an internal error is created
  4. Check that judge tasks remain valid in the database
  5. Observe that the judging continues to be processed indefinitely
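
For step 4, the check can be made by counting the tasks of the affected judging that are still valid. The following is a small sketch under the same assumptions as the SQL above (judgetask.jobid holds the judging ID and valid is 0 or 1). It uses plain PDO, and the function name countValidJudgeTasks is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: count the judge tasks of one judging that are
 * still marked valid.
 */
function countValidJudgeTasks(PDO $db, int $judgingId): int
{
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM judgetask WHERE jobid=:jobid AND valid=1'
    );
    $stmt->execute(['jobid' => $judgingId]);

    // COUNT(*) always yields exactly one row with one column.
    return (int) $stmt->fetchColumn();
}
```

If the analysis in this report is right, this count stays above zero after the internal error has been created.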

Additional Notes

This appears to be a systemic issue: the error handling paths don't follow the same cleanup procedure as normal compilation error handling. The missing invalidation and error marking prevent the system's normal recovery mechanisms from functioning properly.
