Inconsistent data after calling stop #1546

@asgothian

I am running more and more often into the issue that shepherd.db and nvbackup.json are inconsistent with each other.

What I could observe:
I had a pair of files with significantly different write times (shepherd.db from Nov 9th, nvbackup.json from Nov 7th). When running the adapter with these two files, I would consistently get errors when trying to operate the Zigbee network. I tried a number of options to fix this: a different coordinator location and a different coordinator hardware instance (i.e. two Sonoff USB sticks with identical firmware).

  • Consistently, when starting the network with the 'original' USB stick, it would start up fine, run for a while and then collapse, generating this message:
2025-11-09 20:00:06.024 - error: zigbee.0 (78849) Send command to 0x84b4dbfffead9df4 failed with no error code (SRSP - AF - dataRequest after 6000ms)

At random, I would also get this error when trying to send a message to the network:

2025-11-09 19:54:01.284 - warn: zigbee.0 (78352) ELEVATED:EXSET2 (a9df) caught error ZCL command 0x8cf681fffeee45fc/1 genOnOff.off({}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (Data request failed with error: 'undefined' (25)) when setting value for device 0x8cf681fffeee45fc.

Note that this message wraps the error raised by zigbee-herdsman (ZH), which should be the one below. The failure does not run into the timeout: it appears less than 800 ms after the command is sent.

ZCL command 0x8cf681fffeee45fc/1 genOnOff.off({}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"reservedBits":0,"writeUndiv":false}) failed (Data request failed with error: 'undefined' (25))
  • Consistently, when trying to start with the 'alternate' USB stick, I would get this error:
2025-11-09 20:16:51.561 - error: zigbee.0 (79709) Starting zigbee-herdsman problem : target adapter aps link key data table size insufficient (size=3)

I was able to remove this behaviour by performing the following steps:

  • remove the existing nvbackup.json
  • start the adapter with the 'original' stick.
  • trigger a 'herdsman.stop' without stopping the adapter, so that nvbackup.json is rewritten cleanly

The network has been stable since I did this, and I have the 2 files for comparison; they are significantly different.

I have already identified the main cause:
The ioBroker.zigbee adapter (which I maintain) calls herdsman.stop when it is stopped by ioBroker. The adapter requests 30 s for this to complete, because generating the coordinator backup (if supported) can take up to 20 seconds for large networks, especially when the coordinator is connected via LAN (that is the longest I have observed myself).
Unfortunately, ioBroker sends a kill to the process after 4.5 s, regardless of the status of the stop command (see this issue: backup).

While I am working with the ioBroker team to get this resolved, I was confronted with a valid question: why does the herdsman not keep these files consistent if something goes wrong while it is being stopped?

This is a valid question in my eyes. One possible option would be to delay the rename of both files until both files are written, but this carries the risk of no files being saved at all if something goes wrong at that time.
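To make the "delay the rename" idea concrete, here is a minimal sketch in TypeScript using only Node's `fs` module. It is not how zigbee-herdsman writes its files today; the function name `writeFilesTogether` and the `.tmp` suffix are made up for illustration. Everything is written to temporary files first, and the renames only happen once all writes have succeeded. If a write fails, the old files stay untouched, which is exactly the "nothing gets saved" risk mentioned above. The two renames are still not atomic as a pair, so a kill between them can leave the files out of step, but that window is much shorter than the backup itself.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/**
 * Hypothetical helper: stage every file next to its target, then rename all of them.
 * If any write fails, the staged files are removed and the previous files are left as they were.
 */
function writeFilesTogether(files: Map<string, string>): void {
    const staged: Array<{tmp: string; target: string}> = [];
    try {
        for (const [target, content] of files) {
            const tmp = `${target}.tmp`;
            staged.push({tmp, target});
            fs.writeFileSync(tmp, content);
        }
    } catch (error) {
        // Clean up partial output; `force` ignores staged files that were never created.
        for (const {tmp} of staged) fs.rmSync(tmp, {force: true});
        throw error;
    }
    // Only reached when every write succeeded.
    for (const {tmp, target} of staged) fs.renameSync(tmp, target);
}

// Usage with throwaway files in a temporary directory (contents are placeholders).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "herdsman-sketch-"));
const dbPath = path.join(dir, "shepherd.db");
const backupPath = path.join(dir, "nvbackup.json");
writeFilesTogether(new Map([
    [dbPath, "database placeholder"],
    [backupPath, "backup placeholder"],
]));
```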

I would like to propose 2 stopgap measures for the zigbee-herdsman as additional features:

  • an option to disable the call to backup at herdsman.stop (ideally using a parameter 'nobackup')
  • the ability to change the backup and database save timeouts to a different value by exposing them via the herdsman options.
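A rough sketch of what these two options could look like from the caller's side, in TypeScript. None of these names exist in zigbee-herdsman: `StopOptions`, `nobackup`, `backupTimeoutMs`, `databaseSaveTimeoutMs` and `stopSketch` are assumptions for illustration, and the default timeout values are placeholders rather than the herdsman's real ones. The backup and database steps are passed in as plain functions so the sketch runs without a coordinator.

```typescript
/** Hypothetical options; these fields are not part of the current herdsman API. */
interface StopOptions {
    nobackup?: boolean;
    backupTimeoutMs?: number;
    databaseSaveTimeoutMs?: number;
}

/** Reject if `work` does not settle within `ms` milliseconds. */
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
    });
    return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

/** Stand-in for a stop routine; returns the names of the steps that were completed. */
async function stopSketch(
    steps: {backup: () => Promise<void>; saveDatabase: () => Promise<void>},
    options: StopOptions = {},
): Promise<string[]> {
    const done: string[] = [];
    if (!options.nobackup) {
        // Placeholder default, not the real herdsman value.
        await withTimeout(steps.backup(), options.backupTimeoutMs ?? 30000, "backup");
        done.push("backup");
    }
    // Placeholder default, not the real herdsman value.
    await withTimeout(steps.saveDatabase(), options.databaseSaveTimeoutMs ?? 5000, "database save");
    done.push("database");
    return done;
}
```

An adapter that knows it will be killed after 4.5 s could then call something like `stopSketch(steps, {nobackup: true})` on shutdown and trigger a full backup at a time of its own choosing.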

I can try to provide a PR with both, but would need pointers towards any special actions I need to take after editing TS files before the herdsman is run. So far, I have not done any TS development.

A.
