Description
While reading discussions, I kept running across cell values that look like "ranges", written as x:y or a-b, and I could not tell whether there are formal parsing rules for non-trivial values that are not obviously ontology terms or free text. I had assumed that I simply hadn't finished digesting the specification, but after re-reading the section on value types, https://sdrf.quantms.org/specification.html#sdrf-file-standardization, I did not find anything explaining how to interpret this case.
The documentation offers a few special cases explaining how to parse specific numerical values with units in https://sdrf.quantms.org/dev/templates/ms-proteomics.html#field-guidelines. The SDRF Terms page seems to list a set of manually defined parsing rules in a variety of formats. Would it be better to define these as machine-readable regular expressions, or by naming a formal specification as is done for dates (ISO 8601 date), or is there another piece I am missing?
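To make the "machine-readable regular expressions" idea concrete, here is a minimal sketch of what I have in mind for the x:y and a-b range forms. The patterns and the `parse_range` helper are my own assumptions for illustration; they are not taken from the SDRF specification.

```python
import re

# Assumed number syntax: unsigned integer or decimal. The spec may allow more.
NUMBER = r"\d+(?:\.\d+)?"

# Hypothetical patterns for the two range spellings seen in discussions.
RANGE_PATTERNS = [
    re.compile(rf"^(?P<low>{NUMBER}):(?P<high>{NUMBER})$"),  # x:y
    re.compile(rf"^(?P<low>{NUMBER})-(?P<high>{NUMBER})$"),  # a-b
]


def parse_range(value):
    """Return (low, high) as floats, or None if the value is not a range."""
    text = value.strip()
    for pattern in RANGE_PATTERNS:
        match = pattern.match(text)
        if match:
            return float(match["low"]), float(match["high"])
    return None
```

With rules written this way, a validator and a downstream reader could share the same definition instead of each re-implementing the prose description.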
Is the parsing rule process defined as below?
Is the value an object? Then parse key value pairs
Else is the value an ontology term? Then resolve the term
Else the value is free text.
Does the column name have a parsing rule?
Then parse the value accordingly.
Else does the column name have an implied value type from an ontology?
Then parse the value according to that XSD.
Else the value is raw text.
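Written as code, my guess at the process looks roughly like the sketch below. It merges the two lists above into one order (column rule first, then value shape), which is my assumption and not something the documentation states. The `KEY_VALUE` and `ACCESSION` patterns and the `parse_cell` name are hypothetical, and the step that derives a value type from an ontology is left out because it would need an ontology lookup.

```python
import re

# Assumed shape of an "object" value: key=value pairs separated by ';'.
KEY_VALUE = re.compile(r"^\w+=[^;]+(;\w+=[^;]+)*$")
# Assumed shape of an ontology accession: PREFIX:digits.
ACCESSION = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:\d+$")


def parse_cell(column, value, column_rules=None):
    """Classify one cell value and return a (kind, parsed_value) tuple."""
    column_rules = column_rules or {}

    # 1. A column-specific parsing rule wins if one is registered.
    rule = column_rules.get(column)
    if rule is not None:
        return ("column_rule", rule(value))

    # 2. Object: split into key/value pairs.
    if KEY_VALUE.match(value):
        pairs = dict(item.split("=", 1) for item in value.split(";"))
        return ("object", pairs)

    # 3. Ontology term: returned as-is here; a real tool would resolve it.
    if ACCESSION.match(value):
        return ("ontology_term", value)

    # 4. Everything else is treated as free text.
    return ("free_text", value)
```

If the intended order is different, for example value shape before column rule, that is exactly the kind of detail I am hoping the specification can pin down.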