-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error handling in cmor_CV.c #245
Comments
I used negative number to count the error message. @doutriaux1 used I could not find any return values that would affect the computation error messages. Can you tell me how you encountered this issue? |
My apologies @dnadeau4, I misunderstood the problem. The issue is due to the fact that many of the checks performed in cmor_CV_checkParentExpID do not return an error code (most of the loops that end with |
Do you think that when a cmor_handle_error is indicates an error has been encountered that ierr should be set to -1 or +1, rather than 0? Would that be the right thing to do? Is it possible? |
@taylor13, I'm not sure if your question was directed at me, but the work I did for #171 involved recording (via cmor_handle_error_var) when a critical error ( I hope that's helpful? :) |
The program sends a SIGTERM and the number of error I only work 10%-20% of my time for PCMDI now until next year. |
Can we leave this open as a suggested future change? No need to address this issue, but we might want to in the future. |
My apologies for not being clear. I believe there is a genuine issue here. In some cases where there is a problem in Also, this and this is just a warning, so I don't think I have convinced myself that the functions called in |
Pushing to 3.3.1, cannot be fix in 3.3.0 release time frame. |
@ehogan I have very confused with all this #245 (comment) Some of the place where you want to I will push this to 3.3.2, but I am not sure I can work on this, since I only work 8 hour/week for PCMDI. If you can create a PR, I might be able to merge it if my tests do not fails. |
@ehogan @jonseddon just to let you know that @doutriaux1 is now working on CMOR3. How are you doing with CMIP6? I have not heard from you in a long time. |
This issue is similar to what I observed in #540. Should we have all of the CMOR CV functions return a positive error count rather than just -1? |
It would be really nice to hear why some errors are "-1" and others "1". There must have been a good reason for making the distinction. Anyone remember one? @dnadeau4? @doutriaux1? |
A lot of the functions in cmor_CV.c get called in Lines 2963 to 2989 in 9511753
cmor_setGblAttr accumulates these errors and returns them in cmor_write , which then accumulates them with other errors for this test:Lines 4476 to 4511 in 9511753
It doesn't distinguish between positive and negative errors, just nonzero. This works fine as long as all the functions return negative integers for errors, which all of the functions in the above example do return. However, if you were to mix up positive and negative errors you could wind up making it look like there were no errors. That was one of the issues I encountered while working with PrePARE in #540 where negative errors from Lines 492 to 493 in 9511753
There was one place that returned a positive integer for a warning in cmor_CV.c. Lines 656 to 664 in 9511753
Should the above check be for an error instead of a warning? Another interesting thing to note of Lines 4410 to 4434 in 9511753
However, other errors in cmor_write are 1.Lines 4504 to 4543 in 9511753
The documentation just says that cmor_write returns 0 when successful.https://cmor.llnl.gov/mydoc_cmor3_api/#cmor_write There's a statement in cmor_tables.c that is checking if Lines 871 to 874 in 9511753
However, cmor_CV_set_entry doesn't seem to return anything aside from 0.Lines 2228 to 2285 in 9511753
|
I think to apply the "keep going like a compiler and accumulate the number of errors" method of CMOR, functions like |
@durack1 @taylor13 Lines 1045 to 1063 in 9511753
On line 1048, shouldn't that be PARENT_EXPERIMENT_ID rather than PARENT_ACTIVITY_ID ?
Lines 1049-1051 will make the function return 0 if "no parent" is found in the above attribute from CMIP6_CV.json. Does that mean we can ignore checking the other attributes in |
@mauzey1 there are two identifiers, the |
@durack1 If there is no |
Lines 656 to 664 in 7b59863
Above are lines from the function cmor_CV_checkSourceID in cmor_CV.c. They check whether a "source_id" object in the CV file has a "source" section. If the number of objects in the "source_id" object is -1, then it will give the warning message saying that the "source" section is not defined. However, this part will return 1 even though it is a warning rather than a error. The next lines of code will search through the "source_id" object to find the "source" section, which would give a warning message and break from the "source_id" section search loop if "source" is not found.
Should the above section of code be changed so that it breaks from the loop but not return? The break is already there but is not reached due to the return. Or should the message be an error instead of a warning? |
@mauzey1 just to confirm this is using the cmip6-cmor-tables/Tables/CMIP6_CV.json file isn't it? I ask, as for this file we do the work of accumulating contributions that are registered in CMIP6_CVs/CMIP6_source_id.json and then create each of the |
@durack1 Yes, that is the file I am referring to. |
@mauzey1 so as we create this file, it will always have a |
@durack1 The section I mentioned would only be reached if the "source_id" was not a valid JSON object in cmor_CV.json. For example, if you were processing a file with PrePARE with a "source_id" of "PCMDI-test-1-0"
Should that be considered a critical error since there is something wrong with the table files? |
@mauzey1 as we are generating the CMIP6_CV.json file, and control all the content, the only time that this test (and the result) would be useful is in the case that a user has "hacked" the CV file, and included a json-invalid entry. I think in the case that anything is found wrong with the file(s), then a fatal error should be thrown and the user alerted to the error that was encountered. I make this statement knowing that you have implemented comprehensive testing, so a false positive fatal error is incredibly unlikely. |
@taylor13 Would you like to comment on the logic of the code section posted in #245 (comment)? If the data file doesn't have the attribute |
Apologies for the delay getting to this. Before commenting on what you asked me to comment on, I agree with @durack1 regarding the check on the source information in the CV. Now for experiment_id and parent, etc. First, I'll outline how I would have constructed the logic: Given an experiment_id, the experiment_id CV specifies the only acceptable options for parent_experiment_id and the only acceptable options for parent_activity_id. We need to check that what the user specifies for parent_experiment_id and parent_activity_id is consistent with those options. If one of the options in the CV is for parent_experiment_id and parent_activity_id to be "no parent", then the user may specify "" for the values, but CMOR should change the empty strings to "no parent" before writing the file. If the parent_experiment_id != "no parent" (and != ""), then the user must specify the following attributes (but the values are just examples):
If the user fails to specify these attributes, a fatal error should be raised. Also when parent_experiment_id != "no parent", CMOR should set parent_mip_era = "CMIP6" . (It's probably o.k. for CMOR to invariably set parent_mip_era="CMIP6"). If parent_experiment_id = "no parent", then the first 5 entries should either be set to "" or "no parent"; if the user has specified something else, we should raise an error. (or we could simply override what the user specifies with "no parent" and raise a warning telling the user we have done that.) If the user has specified "", CMOR should replace that with "no parent" before writing the file. If parent_experiment_id = "no parent", then the last two entries above (the times) may or may not be specified by the user. Unless the user specifies a value, we should omit these two attributes for the "no parent" case. If the user does specify values, they should be recorded in the file. It is likely that currently CMOR is less thorough than outlined above, but CMOR should be at least consistent with the above. Perhaps @mauzey1 can say whether there are obvious inconsistencies or things that are unclear before I try to address the specifics. thanks for attempting to address this issue raised so long ago. |
@taylor13 So if the file has parent_experiment_id = "" or doesn't have parent_experiment_id in global attributes, then we should replace it with parent_experiment_id = "no parent" if that is the valid parent experiment in the CV? |
The CV will never have ""; it will have "no parent" if there is no parent experiment. I think the user is allowed to specify for parent "", if there is "no parent", but CMOR should probably write out the attribute with the value "no parent". I guess if CMOR currently just omits the attribute in this case, that would be o.k., but I don't think CMOR should ever specify "" for the value of an attribute written to a file. That seems difficult for a user to interpret. Do you agree? |
Darn I miss these @taylor13 discussions. 😄 |
@taylor13 Here is a change to the code that will replace "" with "no parent" for the parent_experiment_id value. Instead of exiting, it would continue to the next part of
|
@taylor13 @mauzey1 (and why the hell not @doutriaux1) you might want to double check (CMIP6_experiment_id.html)[https://wcrp-cmip.github.io/CMIP6_CVs/docs/CMIP6_experiment_id.html] (or the source json file) to make sure that this assumption holds. I think we cleaned it up, but not 100% sure |
Some of the functions in the cmor_CV.c file return a value of
-1
when an error is encountered, while others return a value of1
.This causes problems at this point in the cmor.c file, since the return values from each call to the
cmor_CV_check*
functions are added together. Depending on the combination of issues with the CVs in thecmor_dataset
JSON file, this can result in the returnedierr
being0
.Would it be possible to use a consistent return value (i.e.
1
so theierr
used in the error message indicates the number of errors) when an error is encountered for the functions in cmor_CV.c file, please?The text was updated successfully, but these errors were encountered: