Skip to content

Check mandatory and unexpected arguments before creating an instance.#219

Open
YooSunYoung wants to merge 5 commits intorefactor-3from
refactor-4
Open

Check mandatory and unexpected arguments before creating an instance.#219
YooSunYoung wants to merge 5 commits intorefactor-3from
refactor-4

Conversation

@YooSunYoung
Copy link
Copy Markdown
Contributor

@YooSunYoung YooSunYoung commented Mar 3, 2026

Fixes #214

This fix comes from @nitrosx 's commnet from the chat.

Can we do the following:

we ingest the file with just the file.
dataset title should be the name of the file
add keyword "Failed ingestion"
ownergroup should be a default one define in configuration
log an error with the filename, so we can alert the curators to check the dataset

This will allow us to have already the record in SciCat, althought I'm not 100% convinced that is the best solution
Thoughts?

TODO:

  • Add fall back default values
  • Add keyword 'Failed Ingestion' for fall-back ingestion
  • Add tests

I have to work on something else for next couple days but I'll come back to it...
Or anyone can pick this up and continue working on them if needed : D...

@YooSunYoung
Copy link
Copy Markdown
Contributor Author

YooSunYoung commented Mar 11, 2026

@nitrosx

I tried implementing just fallback-ingestion simply with the file, but then the mandatory fields like 'owner' will be ambiguous...

So in this PR, I tried using fallback schema instead, that ensures all mandatory fields are retrieved/constructed.

And we should probably have integration test/unit test with filewriter and scicat to make sure this fallback schema never fails.

In the current version, we can't have the keyword of Failed Ingestion because it will not fail unless there is a critical issue. And I think it should just crash if there is a critical issue.

How does this approach sound to you??

@YooSunYoung YooSunYoung marked this pull request as ready for review March 11, 2026 15:00
@YooSunYoung YooSunYoung requested a review from nitrosx March 11, 2026 15:00
@YooSunYoung
Copy link
Copy Markdown
Contributor Author

I thought about having a default fallback schema used by default but then it might not work with other facilities files...

So I added it as a configuration instead.
But the ess-fallback.imsc.yml file is still included in the package and we can use the name prefix as the name easily in the configuration.

Comment on lines 260 to +281
@@ -258,6 +276,9 @@ def main() -> None:
schema_id=metadata_schema.id,
logger=logger,
)
# Merge fallback variable map and the selected schema variable map.
# Order is important so that the selected schema variable configuration is preferred.
variable_map = {**fallback_variable_map, **variable_map}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the logic so the fallback schema is used only if computing variable_map fails.
What if a field is not defined on purpose in the schema?
With this logic, it would get assigned a default value which might not be the intention of the data curator

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable map doesn't always end up in the metadata unless it's used in the metadata definition itself.
So having default values in the variable map should be fine I think.

Comment on lines +291 to +296
# Merge fallback schema definitions and the selected schema definitinos.
# Order is important so that the selected schema schema configuration is preferred.
fallback_schema_definitions = (
{} if fallback_schema is None else fallback_schema.schema
)
schema_definitions = {**fallback_schema_definitions, **metadata_schema.schema}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as before!!!
Please use the fallback only if computing metadata_schema results in an error

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was that the fallback schema definitions will only have the minimum set of definitions that you must have in the metadata schema.
But I guess it's hard to prevent from abusing the fallback schema or/and confuse the users who write metadata schema file...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants