jasondaming
Member

Summary

Adds a comprehensive "Common Causes of Loop Overruns" section to the debugging documentation, explaining what loop overruns are, their symptoms, common causes, and how to avoid/diagnose them.

Changes

  • New section explaining loop overruns (periodic methods taking >20ms)
  • Documents common causes:
    • Blocking operations (Thread.sleep, synchronous I/O, busy-wait loops)
    • Excessive computation (complex calculations, large data structures, unoptimized algorithms)
    • Excessive logging (print statements, NetworkTables updates, Shuffleboard updates)
    • Hardware/sensor issues (CAN timeouts, I2C/SPI timeouts, USB enumeration)
  • Provides tips for avoiding loop overruns (Notifier, profiling, caching, removing print statements)
  • Links to VisualVM profiling documentation

Fixes #2315

Adds comprehensive section explaining what loop overruns are and their common causes including blocking operations, excessive computation, excessive logging, and hardware issues. Includes tips for avoiding and diagnosing loop overruns.

Fixes wpilibsuite#2315

## Common Causes of Loop Overruns

Loop overruns occur when the robot's periodic methods (``robotPeriodic()``, ``teleopPeriodic()``, etc.) take longer than the loop period (20 ms by default) to complete. When this happens, the Driver Station displays a warning and the robot code may behave unpredictably. Here are common causes:
Collaborator

There should be an example of the warning so that people searching will find it.

### Excessive Logging or Print Statements

- **System.out.println() in loops**: Console output is slow, especially when called frequently
- **Verbose NetworkTables updates**: Updating many NetworkTables entries every loop iteration
Collaborator

NetworkTables updates should be fast. What might be slow is getting the data that is being put to NT.

- Profile your code to identify slow sections (see :ref:`docs/software/advanced-gradlerio/profiling-with-visualvm:profiling with visualvm`)
- Remove or reduce print statements, especially in frequently-called code
- Cache values that are expensive to compute rather than recalculating every loop
- Use the Driver Station log to identify which periodic method is causing overruns
Collaborator

Recommend going one level deeper on how to interpret the log in this case.


### Blocking Operations

- **Thread.sleep() or wait()**: Never use blocking sleep or wait calls in periodic methods
Contributor

Suggested change
- **Thread.sleep() or wait()**: Never use blocking sleep or wait calls in periodic methods
- **``Thread.sleep()`` or ``wait()``**: Never use blocking sleep or wait calls in periodic methods
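To illustrate the alternative to a blocking sleep, here is a minimal plain-Java sketch of a delay that is polled once per loop instead of pausing the thread. The class name `NonBlockingDelay` and its methods are hypothetical and not part of WPILib, and the clock is injected only so the sketch can be exercised without real waiting. WPILib's own `Timer` class (for example `hasElapsed()`) can likely fill the same role in robot code.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: a delay that is checked each loop instead of calling Thread.sleep(). */
final class NonBlockingDelay {
  // Injected millisecond clock (assumption: System::currentTimeMillis in real use).
  private final LongSupplier clockMillis;
  // -1 means the delay has not been started yet.
  private long deadlineMillis = -1;

  NonBlockingDelay(LongSupplier clockMillis) {
    this.clockMillis = clockMillis;
  }

  /** Starts (or restarts) the delay and returns immediately. */
  void start(long durationMillis) {
    deadlineMillis = clockMillis.getAsLong() + durationMillis;
  }

  /** Returns true once the delay has been started and its deadline has passed. */
  boolean isDone() {
    return deadlineMillis >= 0 && clockMillis.getAsLong() >= deadlineMillis;
  }
}
```

In a periodic method, the code would call `start(500)` when the action begins and then check `isDone()` on each later iteration, so every pass through the loop returns promptly.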


### Excessive Logging or Print Statements

- **System.out.println() in loops**: Console output is slow, especially when called frequently
Contributor

Suggested change
- **System.out.println() in loops**: Console output is slow, especially when called frequently
- **Print statements (e.g. ``System.out.println``) in loops**: Console output is slow, especially when called frequently
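If a print statement has to stay in frequently called code, one option is to throttle it. The sketch below is a plain-Java illustration with a hypothetical `ThrottledPrinter` class (not a WPILib API) that lets a message through only once every N calls. At the default 20 ms period, N = 50 works out to roughly one line per second.

```java
/** Hypothetical helper: prints a message only once every N calls. */
final class ThrottledPrinter {
  private final int everyNCalls;
  // Position within the current window of N calls; 0 means "print on this call".
  private int callIndex = 0;

  ThrottledPrinter(int everyNCalls) {
    if (everyNCalls < 1) {
      throw new IllegalArgumentException("everyNCalls must be at least 1");
    }
    this.everyNCalls = everyNCalls;
  }

  /** Prints on the first call and on every N-th call after it; returns whether it printed. */
  boolean print(String message) {
    boolean shouldPrint = (callIndex == 0);
    callIndex = (callIndex + 1) % everyNCalls; // wrap so the counter never overflows
    if (shouldPrint) {
      System.out.println(message);
    }
    return shouldPrint;
  }
}
```

Note that the message string is still built by the caller on every call, so expensive formatting should also be moved behind the throttle if it shows up in a profile.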


### Tips to Avoid Loop Overruns

- Use :doc:`Notifier </docs/software/convenience-features/scheduling-functions>` for operations that need precise timing independent of the main loop
Contributor

Perhaps addPeriodic() is simpler?

### Hardware/Sensor Issues

- **Synchronous CAN calls**: Some motor controller methods may block waiting for a response
- **I2C or SPI timeouts**: Faulty sensors or loose connections can cause communication timeouts
Contributor

Suggested change
- **I2C or SPI timeouts**: Faulty sensors or loose connections can cause communication timeouts
- **I²C or SPI timeouts**: Faulty sensors or loose connections can cause communication timeouts


- **System.out.println() in loops**: Console output is slow, especially when called frequently
- **Getting data to publish to NetworkTables**: While NetworkTables updates themselves are fast, retrieving complex data (e.g., vision processing results, large arrays) to publish can be slow
- **Excessive Shuffleboard updates**: Sending large amounts of data to the dashboard
Contributor

Suggested change
- **Excessive Shuffleboard updates**: Sending large amounts of data to the dashboard
- **Excessive dashboard updates**: Sending large amounts of data to the dashboard (Shuffleboard, Elastic, etc.)
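When the slow part is fetching the data rather than publishing it, caching the value and refreshing it only after a minimum interval can reduce how often the cost is paid. The following is a minimal plain-Java sketch under that assumption. `TimedCache` is a hypothetical name, not a WPILib class, and the injected clock is there only so the behavior can be checked without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: caches an expensive value and refreshes it at most once per interval. */
final class TimedCache<T> {
  private final Supplier<T> expensiveSource; // the slow computation or read
  private final LongSupplier clockMillis;    // injected millisecond clock
  private final long maxAgeMillis;           // how long a cached value stays valid
  private T value;
  private long lastRefreshMillis;
  private boolean hasValue = false;

  TimedCache(Supplier<T> expensiveSource, LongSupplier clockMillis, long maxAgeMillis) {
    this.expensiveSource = expensiveSource;
    this.clockMillis = clockMillis;
    this.maxAgeMillis = maxAgeMillis;
  }

  /** Returns the cached value, calling the expensive source only when the cache is empty or stale. */
  T get() {
    long now = clockMillis.getAsLong();
    if (!hasValue || now - lastRefreshMillis >= maxAgeMillis) {
      value = expensiveSource.get();
      lastRefreshMillis = now;
      hasValue = true;
    }
    return value;
  }
}
```

The refresh still runs inside the loop when it does happen, so this only helps if a single refresh fits within the loop period. If it does not, the work probably belongs on a separate thread or periodic callback.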

- Profile your code to identify slow sections (see :ref:`docs/software/advanced-gradlerio/profiling-with-visualvm:profiling with visualvm`)
- Remove or reduce print statements, especially in frequently-called code
- Cache values that are expensive to compute rather than recalculating every loop
- **Check the Driver Station log** to identify which periodic method is causing overruns. The log will show timestamps and which robot mode was active when the overrun occurred. Look for patterns - if overruns only happen during teleop, check ``teleopPeriodic()`` and subsystems used during teleop. If they occur consistently, check ``robotPeriodic()`` for code that runs regardless of mode.
Contributor

Suggested change
- **Check the Driver Station log** to identify which periodic method is causing overruns. The log will show timestamps and which robot mode was active when the overrun occurred. Look for patterns - if overruns only happen during teleop, check ``teleopPeriodic()`` and subsystems used during teleop. If they occur consistently, check ``robotPeriodic()`` for code that runs regardless of mode.
- **Check the Driver Station log** to identify which periodic method is causing overruns. The log will show timestamps and which robot mode was active when the overrun occurred. Look for patterns - if overruns only happen during teleop, check ``teleopPeriodic()`` and subsystems used during teleop. If they occur consistently, check ``robotPeriodic()`` for code that runs regardless of mode. The log also shows the amount of time each subsystem's ``periodic()`` method ran, which can help pinpoint which code is causing the overruns.
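When the log alone does not narrow things down, timing individual sections of a periodic method by hand is another option. The sketch below is a plain-Java stand-in for that idea. `SectionTimer` and its methods are hypothetical names, and the section labels in the usage note are made up for illustration. WPILib's `Tracer` class appears to serve a similar purpose in real robot code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper: records how long each labeled section of one loop iteration took. */
final class SectionTimer {
  private final LongSupplier clockNanos; // injected clock (assumption: System::nanoTime in real use)
  private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();
  private long lastMark;

  SectionTimer(LongSupplier clockNanos) {
    this.clockNanos = clockNanos;
    this.lastMark = clockNanos.getAsLong();
  }

  /** Clears recorded sections; call at the start of each loop iteration. */
  void reset() {
    elapsedNanos.clear();
    lastMark = clockNanos.getAsLong();
  }

  /** Records the time since the previous mark (or reset) under the given name. */
  void mark(String sectionName) {
    long now = clockNanos.getAsLong();
    elapsedNanos.put(sectionName, now - lastMark); // reusing a name overwrites its entry
    lastMark = now;
  }

  /** Returns the name of the longest recorded section, or null if nothing was recorded. */
  String slowestSection() {
    String slowest = null;
    long worst = -1;
    for (Map.Entry<String, Long> entry : elapsedNanos.entrySet()) {
      if (entry.getValue() > worst) {
        worst = entry.getValue();
        slowest = entry.getKey();
      }
    }
    return slowest;
  }

  /** Returns the sum of all recorded section times in nanoseconds. */
  long totalNanos() {
    long total = 0;
    for (long nanos : elapsedNanos.values()) {
      total += nanos;
    }
    return total;
  }
}
```

A periodic method would call `reset()` first, then `mark("drivetrain")`, `mark("vision")`, and so on after each block of work, and report `slowestSection()` only when `totalNanos()` exceeds the loop period.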

- **USB device enumeration**: Plugging/unplugging USB devices during operation

### Tips to Avoid Loop Overruns

Contributor

Suggested change
Use ``Trigger``s to check conditions without a loop.

Development

Successfully merging this pull request may close these issues.

Document common ways loop overruns occur

3 participants