-
Notifications
You must be signed in to change notification settings - Fork 23k
SanitizerConfig - define valid config #41945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
/cc @otherdaniel maybe interesting to you as well. |
5847f78 to
832797f
Compare
| ### Valid configuration | ||
|
|
||
| The configuration object structure allows for the declaration of filter options that are contradictory or redundant, such as specifying an element in both allow and remove lists, or listing an attribute in a list multiple times. | ||
| In order to avoid any ambiguity, methods that take a `SanitizerConfig` instance require that a _valid_ configuration object be passed, and will throw a {{jsxref("TypeError")}} if an invalid configuration is used. | ||
|
|
||
| In a valid sanitizer configuration: | ||
|
|
||
| - Either the `elements` or `removeElements` array may be defined, but not both | ||
| - Either the `attributes` or `removeAttributes` array may be defined, but not both | ||
| - The `replaceWithChildrenElements` array, if defined, may not have any elements in common with `elements` or `removeElements` | ||
| - No array may contain duplicate elements or attributes | ||
| - If the global `attributes` array is defined: | ||
| - An element may define any or none of `attributes` and `removeAttributes` | ||
| - An element's `attributes` must not share any values in common with the global `attributes` array | ||
| - An element's `removeAttributes` array may only contain values that are also present in the global `attributes` array. | ||
| - If `dataAttributes` is `true` the global and element attribute arrays must not contain `data-*` attributes (since these will automatically be allowed). | ||
| - If the global `removeAttributes` array is defined: | ||
| - An element may specify either `attributes` or `removeAttributes`, but not both | ||
| - An element's `attributes` or `removeAttributes` array, depending on which (if either) is defined, must not share any values in common with the global `removeAttributes` array. | ||
| - The global `dataAttributes` array must not be defined. | ||
|
|
||
| The empty object `{}` is a valid configuration. | ||
|
|
||
| > [!NOTE] | ||
| > The conditions above are from the perspective of a web developer. | ||
| > The [validity check defined in the specification](https://wicg.github.io/sanitizer-api/#sanitizerconfig-valid) is slightly different because it is executed after canonicalization of the configuration, such as adding `removeElements` when both are missing, and adding default namespaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@evilpie Thanks very much - extremely helpful. This is now updated following the spec pattern of splitting the element attribute behaviour based on whether the global attributes or removeAttributes are defined.
There is a slight reduction in repetition, and of course {} is valid in this context.
Following reading this, it seems there have been other changes I haven't track, such as most of the sanitizer methods now returning booleans. I've updated those too. You can ignore those changes if you want - this one is the complicated bit.
| The specified element is added to the [`elements`](/en-US/docs/Web/API/SanitizerConfig#elements) list in this sanitizer's configuration. | ||
| If the element is already present in the list, then the existing entry is first removed and the new definition is appended to the end of the list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, for all these methods I have removed comments about what happens under the hood.
A spec author needs to know at this level that the internal configuration might specify either an allow or remove list, and what the update process is different in each case.
A website developer does not - all they need to know at this level is that specifying an element will ensure that it is allowed.
| Note that if you need both per-element add-attribute and remove-attribute lists, they must be added in a single call to this method (since if done in two calls, the second call will replace the element definition added in the first call). | ||
|
|
||
| The specified element is removed from the sanitizer configuration [`removeElements`](/en-US/docs/Web/API/SanitizerConfig#removeelements) or [`replaceWithChildrenElements`](/en-US/docs/Web/API/SanitizerConfig#replacewithchildrenelements) lists if present. | ||
| The element can be specified with arrays of attributes that are allowed or disallowed on elements of that type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I think I agree with your general idea of not leaking too many internals. There is one specific case here that should be worth calling out.
The following does nothing and should produce a warning:
s = new Sanitizer({ removeElements: ["div"] })
s.allowElement({name: "div", attributes: ["class"] })
This is because you can't have removeElements combined with per-element attributes/removeAttributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @evilpie - this is quite a tricky API. Addressed as part of 5aa40c7
I think I missed this "point" because it isn't an invalid configuration (so not in algorithm) - it's an impossible configuration inferred from the fact that if you have removeElements you can't have elements.
To address this I added a note to the relevant point in Valid configuration and in allowElement() covered this in the return value (the warning) and expanded out the description.
I have also tried to capture what the allow and remove configuration means in the overview and that the type doesn't change after construction - see Sanitizer() constructor.
Anyway, I think we're getting there.
-
One thing, is there a recommendation on whether you should use an allow or remove configuration. My thinking would be that an
allowconfiguration would almost always be better, in particular if you start from the default Sanitizer.My reasoning is that it is generally more restrictive and easier to make restrictive. With remove lists if other elements are later identified as risks, you don't need to update the sanitizer.
In addition, it allows this case of per-element attributes, which I think is likely be to useful.Not sure where remove configurations would be better. Perhaps where you are worried about a very particular subset of elements/attributes?
With the current examples, I think it's not clear that you can have an allow list of elements and a remove list of attributes (and vice versa). It might not really be an interesting use case to call out, though.
W.r.t. #41945 (comment) This is mentioned now. I don't know how interesting it is.
|
With the current examples, I think it's not clear that you can have an allow list of elements and a remove list of attributes (and vice versa). It might not really be an interesting use case to call out, though. |
files/en-us/web/api/html_sanitizer_api/using_the_html_sanitizer_api/index.md
Outdated
Show resolved
Hide resolved
| Note that it is not valid to specify both allowed and removed elements in a configuration, or both allowed and removed attributes. | ||
| You can. however. have a configuration that allows elements and removes attributes, or vice versa. | ||
|
|
||
| ##### Allow configurations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI Expanded this out to
- provide better allow/remove links and to be able to provide more information, as these provide different opportunities.
- Not sure if we can provide "which should I use" advice other than what I put in the first paragraph below.
- Wanted to show some cases of using
Sanitizeras this was quite SantizerConfig centric - and most should preferSanitizer
| Note that the transformation function doesn't have to sanitize the input string in this case (although it can), because you can use the sanitizer API for that. | ||
| What trusted types provide in this case is information about where potentially unsafe strings are injected, making it easier to locate them and check that the sanitizer is appropriately configured. | ||
| Note that the behavior of the transformation function in this case will depend on the website policy (which might be to reject all use of the unsafe methods). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, trusted type feedback on how trusted types and unsafe methods was not very "precise". What I took from it was that you should always use Trusted Types and that the policy for trusted types is whatever the website says it is. In other words, we shouldn't really offer guidance on to use the transformation, or a sanitizer or whatever.
files/en-us/web/api/html_sanitizer_api/using_the_html_sanitizer_api/index.md
Outdated
Show resolved
Hide resolved
| You can also use set the attribute properties using the same element object passed to {{domxref("Sanitizer.allowElement()")}}. | ||
|
|
||
| Note however that you can't specify both element `attributes` and `removeAttributes` in one call. Attempting to do so will raise an exception. | ||
| Note that it is impossible to define per-element attribute behavior on a Sanitizer with a remove configuration, as the (needed) `elements` array is not present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI another note hat you can't define per-element attributes with a remove configuraiton
| ### Allow and remove configurations | ||
|
|
||
| One of the main implications of the previous section is that a valid configuration can specify either `elements` or `removeElements` arrays (but not both) and either the `attributes` or `removeAttributes` arrays (but not both). | ||
|
|
||
| A configuration that has the `elements` and/or `attributes` arrays is referred to as an [allow configuration](/en-US/docs/Web/API/HTML_Sanitizer_API#allow_configurations), as it defines the sanitization behavior in terms of the values that are allowed to be present in the output. | ||
| A [remove configuration](/en-US/docs/Web/API/HTML_Sanitizer_API#remove_configurations) is one that has either of `removeElements` and/or `removeAttributes`, and defines the behavior in terms of the values that will be removed from the output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought worth having a "thin" section here on the difference, since this is where you will create the configurations.
Fixes #41634
Defines what a valid SanitizerConfig is more precisely and links to this from the places where an invalid config can throw a TypeError.
Related docs work can be tracked in #41649