Skip to content

Dataset results are sometimes just not saved to postgres #510

@jeanetteclark

Description

@jeanetteclark

Rarely, a dataset will seemingly finish processing and save to postgres (according to the logs) but it's row in postgres is not actually updated. I just saw an example of this checking a dataset for ESS-DIVE. Here is some log output:

20251020-21:13:07: [DEBUG]: Completed running quality suite. [edu.ucsb.nceas.mdqengine.Worker:222]
20251020-21:13:07: [DEBUG]: Searching for sequence id for pid: ess-dive-4c1829de1b8a2ec-20251020T211201461 [edu.ucsb.nceas.mdqengine.Worker:277]
20251020-21:13:07: [DEBUG]: Getting all runs (backward) for suiteId: ess-dive.data.center.suite-1.3.0, metadataId: ess-dive-4c1829de1b8a2ec-20251020T211201461, minDate: 2025-10-01T21:12:01.992Z, 2025-10-31T21:12:01.992Z [edu.ucsb.nceas.mdqengine.model.Run:227]
20251020-21:13:07: [DEBUG]: Getting next run: metadataId: ess-dive-4c1829de1b8a2ec-20251020T211201461, suiteId: ess-dive.data.center.suite-1.3.0 [edu.ucsb.nceas.mdqengine.model.Run:74]
20251020-21:13:07: [DEBUG]: Recursion level: 1 [edu.ucsb.nceas.mdqengine.model.Run:75]
20251020-21:13:07: [DEBUG]: Checking for next backward pid (obsoletes) [edu.ucsb.nceas.mdqengine.model.Run:149]
20251020-21:13:07: [DEBUG]: traversing backward to obsoletes: ess-dive-e9a1984660cb28a-20251020T205156799 [edu.ucsb.nceas.mdqengine.model.Run:158]
...unrelated statements for the version chain recursion...
20251020-21:13:07: [DEBUG]: Checking for next forward pid (obsoletedBy) [edu.ucsb.nceas.mdqengine.model.Run:134]
20251020-21:13:07: [DEBUG]: Reached end of forward (obsoletedBy) chain at pid: ess-dive-4c1829de1b8a2ec-20251020T211201461 [edu.ucsb.nceas.mdqengine.model.Run:145]
20251020-21:13:07: [DEBUG]: Shutting down store [edu.ucsb.nceas.mdqengine.model.Run:246]
20251020-21:13:07: [DEBUG]: Done getting all runs (in DataONE obsolescence chain) for : metadata PID: ess-dive-4c1829de1b8a2ec-20251020T211201461, suite id: ess-dive.data.center.suite-1.3.0 [edu.ucsb.nceas.mdqengine.model.Run:247]
20251020-21:13:07: [DEBUG]: Using found sequenceId: ess-dive-8c3d37b0a3679ee-20251020T182116464 [edu.ucsb.nceas.mdqengine.Worker:302]
20251020-21:13:07: [DEBUG]: Saving to persistent storage: metadata PID: ess-dive-4c1829de1b8a2ec-20251020T211201461, suite id: ess-dive.data.center.suite-1.3.0 [edu.ucsb.nceas.mdqengine.model.Run:272]
20251020-21:13:07: [DEBUG]: Done saving to persistent storage: metadata PID: ess-dive-4c1829de1b8a2ec-20251020T211201461, suite id: ess-dive.data.center.suite-1.3.0 [edu.ucsb.nceas.mdqengine.model.Run:277]
20251020-21:13:07: [DEBUG]: Updating sequenceId to ess-dive-8c3d37b0a3679ee-20251020T182116464 [edu.ucsb.nceas.mdqengine.Worker:313]
20251020-21:13:07: [DEBUG]: Updating runs with new sequence id: ess-dive-8c3d37b0a3679ee-20251020T182116464 [edu.ucsb.nceas.mdqengine.model.Run:269]
20251020-21:13:07: [DEBUG]: Updating modified runs to datastore. [edu.ucsb.nceas.mdqengine.model.Run:510]

based on this, the run should have been updated with the status of success and the run result. However, the db still shows:

metadig=> select results, suite_id, timestamp, status from runs where suite_id = 'ess-dive.data.center.suite-1.3.0' and metadata_id='ess-dive-4c1829de1b8a2ec-20251020T211201461';
                                       results                                       |             suite_id             |           timestamp           |   status   
-------------------------------------------------------------------------------------+----------------------------------+-------------------------------+------------
 <?xml version="1.0" encoding="UTF-8"?>                                             +| ess-dive.data.center.suite-1.3.0 | 2025-10-20 21:13:08.037554+00 | processing
 <ns0:run xmlns:ns0="https://nceas.ucsb.edu/mdqe/v1">                               +|                                  |                               | 
    <timestamp>2025-10-20T21:13:08.037Z</timestamp>                                 +|                                  |                               | 
    <objectIdentifier>ess-dive-4c1829de1b8a2ec-20251020T211201461</objectIdentifier>+|                                  |                               | 
    <suiteId>ess-dive.data.center.suite-1.3.0</suiteId>                             +|                                  |                               | 
    <nodeId>urn:node:ESS_DIVE</nodeId>                                              +|                                  |                               | 
    <runStatus>processing</runStatus>                                               +|                                  |                               | 
    <errorDescription></errorDescription>                                           +|                                  |                               | 
    <sequenceId>ess-dive-8c3d37b0a3679ee-20251020T182116464</sequenceId>            +|                                  |                               | 
    <runCount>1</runCount>                                                          +|                                  |                               | 
 </ns0:run>                                                                         +|                                  |                               | 
                                                                                     |                                  |                               | 
(1 row)

So far I haven't been able to see any rhyme or reason to when this happens. Updating the dataset usually resolves the issue because the newer version runs as expected.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions