Description
What guide are you proposing revisions for?
- EML Best Practices guide
- Dataset Design For Special Cases
- Domain Specific Guides
Chapter and Revision Information
Chapter Number: 2
Chapter Title: Model-Based Datasets
Current Version/Commit: Version 2 prerelease
File Path: https://github.com/EDIorg/data-package-best-practices/blob/main/guide-special-cases/model-based.qmd
Reviewer(s): James Laundre, Gabe Kamener, Li Kui (GC)
Review Date: 2025-09-08
Revision overview
1. Some recommendations, such as the use of CodeMeta, have not been widely adopted within our IMC (only 3 out of 24 cases). We recommend reframing this from a strong recommendation to an optional practice.
2. The chapter requires revisions to improve the overall logical flow and language clarity.
3. Much of the content in the model chapter references the code chapter and the large dataset chapter. Ensure consistency and alignment across these chapters.
Detailed Content Feedback
Major Issues
- Issue: [Move all content related to CodeMeta into an “Optional” considerations section towards the end of the document]
- Location: [Section/line number]
- Current text: [Quote relevant text if needed]
- Suggested change: [Your recommendation]
- Rationale: [CodeMeta is referenced throughout the document; however, it has not been widely adopted within our community over the past four years since the best practice was introduced. Therefore, it should not be presented as a recommended practice but rather as an optional one.]
- Issue: [Move the “Referencing models in EML” section below the “Model parameters” section]
- Location:
- Current text:
- Suggested change:
- Rationale: [The section on “Referencing models in EML” is of lower priority. The archiving method must be chosen first, before addressing how the source data should be cited in the EML document.]
- Issue: [Change the EML examples in the “Referencing models in EML” section]
- Location:
- Current text:
- Suggested change:
- Rationale: [Currently, the EML example cites the source data package, which is necessary but does not align with the emphasis of the “Referencing models in EML” section. What is needed instead is an EML example that demonstrates how to cite models hosted outside of EDI.]
- Issue: [Add a “Model environment” section below the “Model parameters” section]
- Location:
- Current text:
- Suggested change:
- Rationale: [The model environment (for example, the version of R and the specific R packages or Python libraries used) is critical, because the model may not run successfully without this information. Advanced users are encouraged to include a container image within the package to enable full replication of the model.]
- Issue: [Revise the language to improve logical flow and readability; see James Laundre’s edits.]
- Location:
- Current text:
- Suggested change: See James Laundre's edited version of the chapter: https://docs.google.com/document/d/18-bsn0Q3QY-sADe73FNuRCQhr6tELjgMUfeGhy0b6wc/edit?usp=sharing
- Rationale: [The flow of the chapter needs to be improved.]
- Issue: [Reorganize the “Model components” flow chart]
- Location:
- Current text:
- Suggested change: Model parameters are an integral part of the code and should always be grouped with the “code,” regardless of the chosen model packaging option. The “environment” should be included alongside the “parameters” as part of the model setup process. The code should therefore encompass three key components: script, parameters, and environment.
- Rationale:
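To illustrate the “Model environment” issue above, here is a minimal sketch of how the environment could be recorded for a Python-based model. It is not taken from the chapter: the function names, the output file name `environment.json`, and the package list are all hypothetical, and it only covers the interpreter and package versions, not a container image. R users could capture similar information with `sessionInfo()`.

```python
import json
import platform
from importlib import metadata


def capture_environment(packages):
    """Return interpreter, OS, and package-version details as a plain dict.

    `packages` is a list of distribution names chosen by the data provider
    (hypothetical here); packages that are not installed are recorded as None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": versions,
    }


def write_environment(path, packages):
    """Write the captured environment to `path` as JSON and return the dict."""
    env = capture_environment(packages)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(env, handle, indent=2, sort_keys=True)
    return env
```

A call such as `write_environment("environment.json", ["numpy", "pandas"])` would produce a small file that could be archived next to the model script and parameters, in line with the script, parameters, and environment grouping suggested for the flow chart.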
Minor Issues
- Line/Section [X]: [Line 79: the hyperlink to large-offline.html needs to be updated.]
- Line/Section [X]: [Line 49: add an ORCID iD or organization ID below the email address, per EML best practices.]
- Line/Section [X]: [Suggest adding this example to the “Example data packages in EDI” section: https://portal.edirepository.org/nis/metadataviewer?packageid=knb-lter-mcr.2011.1]