diff --git a/specs/jsonschema-core.md b/specs/jsonschema-core.md
index e4886fd5..df905215 100644
--- a/specs/jsonschema-core.md
+++ b/specs/jsonschema-core.md
@@ -135,14 +135,14 @@ depending on the type:
 - *array*: An ordered list of instances, from the JSON "array" value
 - *number*: An arbitrary-precision, base-10 decimal number value, from the JSON
   "number" value
-- *string*: A string of Unicode code points, from the JSON "string" value
+- *string*: A string of [Unicode] code points, from the JSON "string" value
 
 Whitespace and formatting concerns, including different lexical representations
 of numbers that are equal within the data model, are thus outside the scope of
 JSON Schema. Extensions to JSON Schema that wish to work with such differences
 in lexical representations SHOULD define keywords to precisely interpret
 formatted strings within the data model rather than relying on having the
-original JSON representation Unicode characters available.
+original JSON representation available.
 
 Since an object cannot have two properties with the same key, behavior for a
 JSON document that tries to define two properties with the same key in a single
@@ -335,14 +335,14 @@ considered to be implicitly anchored at either end. All regular expression
 keywords in this specification and its companion documents are un-anchored.
 
 Regular expressions SHOULD be built with the "u" flag (or equivalent) to provide
-Unicode support, or processed in such a way which provides Unicode support as
+[Unicode] support, or processed in such a way which provides Unicode support as
 defined by ECMA-262.
 
 Furthermore, given the high disparity in regular expression constructs support,
 schema authors SHOULD limit themselves to the following regular expression
 tokens:
 
-- individual Unicode characters, as defined by the [JSON
+- individual Unicode code points, as defined by the [JSON
   specification][rfc8259];
 - simple atoms: `.` (any character except line terminator);
 - simple character classes (`[abc]`), range character classes (`[a-z]`);
@@ -2628,3 +2628,4 @@ to the document.
 [rfc8259]: https://www.rfc-editor.org/info/rfc8259
 [rfc8288]: https://www.rfc-editor.org/info/rfc8288
 [application/schema+json]: ../ietf/json-schema-media-types.md
+[Unicode]: https://www.unicode.org/versions/Unicode16.0.0/
diff --git a/specs/jsonschema-validation.md b/specs/jsonschema-validation.md
index e658f3a4..cdad596f 100644
--- a/specs/jsonschema-validation.md
+++ b/specs/jsonschema-validation.md
@@ -218,8 +218,8 @@ The value of this keyword MUST be a non-negative integer.
 A string instance is valid against this keyword if its length is less than, or
 equal to, the value of this keyword.
 
-The length of a string instance is defined as the number of its characters as
-defined by [RFC 8259][rfc8259].
+The length of a string instance is defined as the number of [Unicode] code
+points that make up the string.
 
 #### `minLength`
 
@@ -228,8 +228,8 @@ The value of this keyword MUST be a non-negative integer.
 A string instance is valid against this keyword if its length is greater than,
 or equal to, the value of this keyword.
 
-The length of a string instance is defined as the number of its characters as
-defined by [RFC 8259][rfc8259].
+The length of a string instance is defined as the number of [Unicode] code
+points that make up the string.
 
 Omitting this keyword has the same behavior as a value of 0.
 
@@ -917,3 +917,4 @@ to the document.
 
 [rfc3987]: https://www.rfc-editor.org/info/rfc3987
 [rfc8259]: https://www.rfc-editor.org/info/rfc8259
+[Unicode]: https://www.unicode.org/versions/Unicode16.0.0/
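
Note on the `maxLength` / `minLength` wording: counting Unicode code points can give a different answer than counting UTF-16 code units or user-perceived characters. The sketch below is only an illustration of that difference, assuming a Python implementation; the helper names `code_point_length`, `utf16_length`, and `satisfies_length` are hypothetical and do not come from the specification or from any validator library.

```python
from typing import Any, Optional


def code_point_length(s: str) -> int:
    """Length as the reworded text defines it: the number of Unicode code points.

    A Python str is a sequence of code points, so len() already counts them.
    """
    return len(s)


def utf16_length(s: str) -> int:
    """Length in UTF-16 code units, shown only for contrast.

    This is what a naive `.length` check reports in UTF-16 based runtimes.
    """
    return len(s.encode("utf-16-le")) // 2


def satisfies_length(
    instance: Any, min_length: int = 0, max_length: Optional[int] = None
) -> bool:
    """Hypothetical check combining minLength and maxLength.

    Non-string instances pass, since both keywords constrain string instances.
    """
    if not isinstance(instance, str):
        return True
    n = code_point_length(instance)
    if n < min_length:
        return False
    return max_length is None or n <= max_length


emoji = "\U0001F600"       # one code point outside the Basic Multilingual Plane
accented = "e\u0301"       # "e" followed by a combining acute accent

print(code_point_length(emoji), utf16_length(emoji))        # 1 2
print(code_point_length(accented), utf16_length(accented))  # 2 2
print(satisfies_length(emoji, max_length=1))                # True
```

Under this reading, a single emoji satisfies `maxLength: 1` even though it occupies two UTF-16 code units, while an `e` plus a combining accent counts as two code points although it renders as one character.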