[rrfs-mpas-jedi] Allow MPASJEDI write analysis into init.nc directly at cold-start cycles
#949
base: rrfs-mpas-jedi
…o to write DA increments into init.nc directly
…list.atmosphere.analysis
Convert to a draft while doing retro tests. |
We just updated rrfs-workflow to make it work on newly OS-upgraded GaeaC6. Here is the error message:
I ncdump'ed the diag.nc and history.nc files, and they are indeed missing the global attributes. @SamuelTrahanNOAA FYI, skipping not-found attributes in MPAS-IO does have unexpected results here. We need to examine this issue further. Thanks! |
Do you have sample files to look at? I'm baffled by the absence of that variable. It should be in all the mpas history and diag files. |
Sorry, I tagged another sam. Here are sample files:
I think you should be able to reproduce this by running an MPAS forecast using your updated MPAS-Model. |
I suspect I know the cause and solution, but I must investigate further to be certain. Recall that I added code to only write an attribute if it was defined. That's because JEDI was writing MPAS files that lacked some attributes JEDI wanted to write. MPAS uses the archaic NetCDF 3 format, which cannot support defining attributes in data mode. My fix for that was to not write the attribute unless it was defined. MPAS may be using the same section of code to define attributes when NetCDF is in definition mode. In that case, my "fix" will prevent MPAS from defining attributes. To fix my fix, I need it to allow defining attributes when NetCDF is in definition mode. |
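The attribute policy described in the comment above can be sketched as a toy model. This is not the MPAS-IO code or the NetCDF API: `ToyNetCDF3File`, `put_global_attribute`, and the attribute name `example_attr` are hypothetical names invented for illustration, and the sketch assumes only what the comment states (new attributes may be created in definition mode, while in data mode only already-defined attributes may be written).

```python
class ToyNetCDF3File:
    """Toy stand-in for a classic-format file; NOT the real NetCDF or MPAS-IO API."""

    def __init__(self):
        self.define_mode = True   # assumption: a newly created file starts in definition mode
        self.attributes = {}

    def end_define(self):
        """Leave definition mode and enter data mode."""
        self.define_mode = False


def put_global_attribute(f, name, value):
    """Apply the 'fixed fix' policy and report what happened."""
    if f.define_mode:
        # Definition mode: creating a new attribute is allowed.
        f.attributes[name] = value
        return "defined"
    if name in f.attributes:
        # Data mode: only an attribute that already exists may be rewritten.
        f.attributes[name] = value
        return "updated"
    # Data mode and the attribute was never defined: skip instead of failing.
    return "skipped"


model_file = ToyNetCDF3File()
first = put_global_attribute(model_file, "example_attr", "a")    # definition mode
model_file.end_define()
second = put_global_attribute(model_file, "example_attr", "b")   # data mode, exists
third = put_global_attribute(model_file, "never_defined", "c")   # data mode, missing
print(first, second, third)
```

The original workaround corresponds to always taking the data-mode branches, which is why a model writing a fresh file would end up with no global attributes at all.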
The branch isn't compiling on GAEA. I have a potential fix, but I'm getting an error about a missing target. Is your branch pointing to the correct version of the MPAS-Model?
My clone is here:
I compiled with this command in the sorc directory:
|
@SamuelTrahanNOAA I am sorry that when I updated this PR, I forgot to include the |
@SamuelTrahanNOAA Could you try the latest |
I am attempting this now. The build hasn't finished yet, but I'll update you when I have something useful to report. |
This branch, even without my changes, segfaults. There seems to be something wrong with "scalar 12." It starts at 0.468376E+09 and ends at NaN. Log lines with `global min, max scalar 12`
Here's a log from the failing job:
This is where I built it:
|
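A quick way to find which scalars go non-finite is to scan the model log for the `global min, max scalar` lines mentioned above. The sketch below is a minimal example, assuming each such line has the form `global min, max scalar <index> <min> <max>`; that layout and the sample lines are assumptions made for illustration, not text copied from the failing job's log.

```python
import math
import re

# Assumed line shape: "global min, max scalar <index> <min> <max>"
PATTERN = re.compile(r"global min, max scalar\s+(\d+)\s+(\S+)\s+(\S+)")


def find_bad_scalars(log_lines):
    """Return the sorted scalar indices whose min or max is NaN, infinite, or unparsable."""
    bad = set()
    for line in log_lines:
        match = PATTERN.search(line)
        if not match:
            continue
        index = int(match.group(1))
        for token in match.group(2, 3):
            try:
                value = float(token)   # float() accepts "NaN" and Fortran-style "0.468376E+09"
            except ValueError:
                bad.add(index)         # e.g. a field printed as asterisks
                break
            if not math.isfinite(value):
                bad.add(index)
                break
    return sorted(bad)


# Invented sample lines in the assumed format.
sample_log = [
    " global min, max scalar 11 0.00000 0.123456E-02",
    " global min, max scalar 12 0.00000 0.468376E+09",
    " global min, max scalar 12 NaN NaN",
]
bad_scalars = find_bad_scalars(sample_log)
print(bad_scalars)
```

A scalar index reported here only says where the bad values first show up in the log; it does not by itself identify which input field was uninitialized.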
Variable
EDIT: The job reads another file, which is a symlink to that one: Some NaNs
|
@guoqing-noaa Do you know why the init.nc contains gibberish in |
In the current rrfs-workflow, which is used by a lot of us and runs correctly, the MPAS-Model hash is:
I think you may run 3DVAR using the GSIBEC, which does not work with spack-stack-1.9.x.
Thanks! |
I think this is expected as GFS grib2 files do not have |
@SamuelTrahanNOAA I found that you have been running getkf. |
I reproduced the failure, but I cannot run the full system. It wants these files, which I don't have. When I point to your copy of these files, the workflow runs to the failure point: Missing: Solution: I'll rerun with the fix to my fix and see if it works. |
@SamuelTrahanNOAA Yes, please use: This is a staged copy of 30 ensembles which can be used by anyone so that we don't need to run the ensemble system by ourselves. |
I have a fix that works for the mpas forecast. I'm retesting it on the full workflow with jedivar now. |
Fix is here: I'm still running the full workflow test. |
I closed my MPAS-Model PR since it went to the wrong branch. I'm not sure what branch you're using for MPAS-Model. If you can tell me where, I'll PR to it. Otherwise, you'll find it here: |
The workflow has proceeded further, with global attributes happily present, but now it is failing due to bad data in Any thoughts?
Some NaNs
They're scattered about the file with other meaningless numbers like
|
@SamuelTrahanNOAA |
I checked init.nc, and volg is never initialized. Also, it is invalid in the time 0 output. I suspect nothing in the model ever writes to the volg variable, and it's outputting whatever happened to be in memory. If you used ncks to add volg to init.nc, it would retain that initial value. That's not a fix, it's a kludge. Either the model should write valid data to volg, or it shouldn't write volg at all. I might be wrong, though, and I won't know until I try a version of this workflow that actually works. Can you tell me where I can find the best known working version of this workflow? |
The MPAS-Model log file seems upset about the lack of volg. Apparently it was expected in the input file.
|
I think the error is in your namelist.init_atmosphere, where you disable initialization of the tempo scheme despite using the tempo scheme in namelist.atmosphere:
&preproc_stages
config_static_interp = false
config_native_gwd_static = false
config_native_gwd_gsl_static = false
config_vertical_grid = true
config_met_interp = true
config_input_sst = false
config_frac_seaice = true
config_tempo_rap = false
/
Note the |
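To check such flags without reading the namelist by eye, the group quoted above can be parsed with a few lines of Python. This is a minimal sketch: `parse_namelist_group` is a hypothetical helper written for this example, and it handles only simple `key = value` lines inside one group (no trailing commas, arrays, or comments), which is an assumption about the file rather than a full Fortran namelist parser.

```python
def parse_namelist_group(text, group):
    """Return {key: value} for one '&group ... /' block; logicals become Python bools."""
    values = {}
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.lower() == "&" + group.lower():
            inside = True
        elif inside and line.startswith("/"):
            break                                   # end of the group
        elif inside and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            lowered = value.strip(".").lower()      # accept both 'true' and '.true.'
            if lowered in ("true", "t"):
                values[key] = True
            elif lowered in ("false", "f"):
                values[key] = False
            else:
                values[key] = value                 # keep non-logical values as text
    return values


# The &preproc_stages group as quoted in the comment above.
NAMELIST_TEXT = """\
&preproc_stages
config_static_interp = false
config_native_gwd_static = false
config_native_gwd_gsl_static = false
config_vertical_grid = true
config_met_interp = true
config_input_sst = false
config_frac_seaice = true
config_tempo_rap = false
/
"""

stages = parse_namelist_group(NAMELIST_TEXT, "preproc_stages")
if not stages.get("config_tempo_rap", False):
    print("config_tempo_rap is disabled in &preproc_stages")
```

Comparing this flag against the microphysics choice in namelist.atmosphere would need the name of that option, which the thread does not give, so the sketch stops at reporting the flag.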
Now that I've looked deeper, it appears the init_atmosphere doesn't even know how to initialize volg. It's possible to provide boundary conditions for it, but not initialize it. Inside the model, I don't think volg is initialized either. It remains invalid until after the first time the physics updates it. Developers probably thought this was okay since volg is only a diagnostic quantity. It appears MPAS can't handle uninitialized diagnostic quantities that are expected to be in the initial state. |
@SamuelTrahanNOAA Thanks for digging into this. But this behavior is NOT limited to this PR. I think all GSL MPAS realtime and retro runs have the same issue. The latest rrfs-workflow.v2 does not init |
The only reason why it would cause a crash is if MPASSIT tries to perform floating-point operations on the |
This time it did fail. It appears volg is still in histlist_3d. The obvious fix is to remove it.
|
In a meeting, we decided to modify MPAS-Model's Registry.xml to initialize There are several errors about targets already existing, like this one:
Also, there are some errors about libraries that don't exist.
All but one tell me to read about policy CMP0002. This is the one that doesn't.
|
I tried reverting mpas and mpas-jedi, and explicitly setting |
@SamuelTrahanNOAA There is only one copy of MPAS-Model in rrfs-workflow, i.e. sorc/MPAS-Model. The issue you got may be due to some of your local changes. I can update my branch to include the change in init Registry.xml for your test. Thanks! |
Previously, MPASJEDI could not write analysis into `init.nc` at cold-start cycles due to a global attribute mismatch.
Thanks to @SamuelTrahanNOAA for his great work updating the MPAS-Model so that the MPAS I/O interface can ignore the global attribute mismatch and allow writing the DA analysis into `init.nc` directly. Check with @SamuelTrahanNOAA for more details; Sam's recent model changes can be found at RRFSx/MPAS-Model#5.
This PR incorporates Sam Trahan's model changes into rrfs-workflow and retires the previous temporary `ncks` method.