Regex in CMIP6_CV.json to test *_index attributes #281

@neumannd

The CMIP6_CV.json contains regular expressions to test the global attributes physics_index, initialization_index, forcing_index and realization_index for correctness. These global attributes should be integers (see CMIP6 Global Attributes, DRS, Filenames, Directory Structure, and CV’s). Therefore, the CMOR PrePARE.py script just checks the type of these attributes and does not use the regex from CMIP6_CV.json.

However, the regular expression provided in CMIP6_CV.json seems to allow an arbitrary number of [ in front of and ] behind the integer. I don't understand why this is done. It seems to contradict CMIP6 Global Attributes, DRS, Filenames, Directory Structure, and CV’s.

Evaluation of the regular expression

In the CMIP6_CV.json the regex for testing the *_index attributes is written as:

^\\[\\{0,\\}[[:digit:]]\\{1,\\}\\]\\{0,\\}$

Inside the JSON string, the first \ of each \\ escapes the second \. Without the escapes we have:

^\[\{0,\}[[:digit:]]\{1,\}\]\{0,\}$
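The unescaping step can be checked with a short Python sketch. It assumes the pattern is stored as an ordinary JSON string value; the key name index_pattern is hypothetical and is not taken from CMIP6_CV.json.

```python
import json

# Hypothetical JSON fragment: the key name "index_pattern" is an assumption,
# only the pattern string itself is copied from the issue text.
raw_json = r'{"index_pattern": "^\\[\\{0,\\}[[:digit:]]\\{1,\\}\\]\\{0,\\}$"}'

# json.loads turns every JSON-escaped "\\" into a single backslash.
pattern = json.loads(raw_json)["index_pattern"]
print(pattern)
```

The printed string is the pattern as a regex engine would receive it, with single backslashes.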

I assume that this is a POSIX Basic Regular Expression. That means that \[ and \] are taken literally, and \{n,\} is interpreted as: "the character or expression to the left may appear n or more times". The ^ and $ anchor the beginning and end of a line, respectively. Thus, we have

^                 : beginning of the line
\[\{0,\}          : `[` appears zero to infinite times
[[:digit:]]\{1,\} : a digit between `0` and `9` appears one to infinite times
\]\{0,\}          : `]` appears zero to infinite times
$                 : end of the line

These values would be matched by the regular expression:

1
123
42
53253262

But these values would also be matched by the regular expression:

[1435]
[[123]]
[[123]
[123]]
[123]]]]]]]]]
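The matching behaviour described above can be reproduced with a small Python sketch. Python's re module does not implement POSIX Basic Regular Expressions or the [[:digit:]] class, so the pattern below is a hand translation and therefore an assumption, not the string from CMIP6_CV.json.

```python
import re

# Hand translation of the POSIX BRE into Python `re` syntax (an assumption):
#   \[\{0,\}          -> \[*      zero or more literal "["
#   [[:digit:]]\{1,\} -> [0-9]+   one or more ASCII digits
#   \]\{0,\}          -> \]*      zero or more literal "]"
lenient = re.compile(r"^\[*[0-9]+\]*$")

values = [
    "1", "123", "42", "53253262",
    "[1435]", "[[123]]", "[[123]", "[123]]", "[123]]]]]]]]]",
]

# Collect every value that the translated pattern accepts in full.
accepted = [v for v in values if lenient.fullmatch(v)]
print(accepted)
```

Under this translation, both the plain integers and the bracketed variants are accepted, including the ones with unbalanced brackets.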

I would have expected this regular expression:

^[[:digit:]]\\{1,\\}$

or one of the following (the variants using + are POSIX Extended Regular Expression syntax):

^[0-9]\\{1,\\}$
^[[:digit:]]+$
^[0-9]+$
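A digits-only check of the kind proposed above might look like this in Python. This is a minimal sketch: the helper name is_valid_index is hypothetical, and the pattern is written in Python re syntax rather than POSIX syntax.

```python
import re

# Digits only, no surrounding brackets. [0-9] is used instead of \d because
# \d also matches non-ASCII Unicode digits in Python 3.
strict = re.compile(r"[0-9]+")


def is_valid_index(value: str) -> bool:
    """Return True if `value` consists solely of one or more ASCII digits."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return strict.fullmatch(value) is not None


for candidate in ["1", "53253262", "[1435]", "[123]]", ""]:
    print(repr(candidate), is_valid_index(candidate))
```

With this pattern, the plain integers pass and every bracketed value is rejected.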
