Description
The regex pattern here:
Line 253 in bc5843b
    re_url = r'((?:%s)(?::\/\/\S+))' % schemes_patterns
doesn't ignore all IRC formatting characters. Bold and monospace formatting, for example, will be included in the match if it's right up against the end of the URL.
Additionally, text immediately before the protocol (with no whitespace or other word boundary) doesn't appear to be ignored either. For example, "bold of youhttps://github.com/sopel-irc/sopel" (note the lack of a space between "you" and "https") resulted in the url plugin trying to fetch a page, when the match should have been ignored because "youhttps" is not a valid protocol. Hypothetically, there could be custom protocols that end with one of the known strings, so Sopel should make sure the whole protocol string matches what it's looking for.
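To make both problems concrete, here is a minimal sketch that runs the pattern quoted above against the two cases, next to one possible stricter variant. The scheme list, the names `current_re`, `strict_re`, and `find_urls`, and the stricter pattern itself are assumptions for illustration only, not Sopel's code or a proposed patch.

```python
import re

# Assumed scheme list for illustration; Sopel builds its own from `schemes`.
schemes_patterns = '|'.join(re.escape(s) for s in ('http', 'https', 'ftp'))

# The pattern quoted in this issue.
current_re = re.compile(r'((?:%s)(?::\/\/\S+))' % schemes_patterns)

# Hypothetical stricter variant (an assumption, not Sopel's implementation):
# - the lookbehind rejects a scheme glued to a preceding word or scheme character
# - the character class stops at whitespace or an IRC formatting control code
#   (bold, color, hex color, reset, monospace, reverse, italic,
#   strikethrough, underline)
strict_re = re.compile(
    r'(?<![\w.+-])((?:%s)://[^\s\x02\x03\x04\x0f\x11\x16\x1d\x1e\x1f]+)'
    % schemes_patterns
)


def find_urls(pattern, text):
    """Return every URL the given compiled pattern finds in text."""
    return pattern.findall(text)


glued = 'bold of youhttps://github.com/sopel-irc/sopel'
bolded = 'see \x02https://github.com/sopel-irc/sopel\x02 now'

# Current pattern: matches the glued URL and keeps the trailing bold code.
print(find_urls(current_re, glued))
print(find_urls(current_re, bolded))

# Stricter sketch: skips the glued URL and drops the trailing bold code.
print(find_urls(strict_re, glued))
print(find_urls(strict_re, bolded))
```

One known limitation of this sketch: a color code followed by digits directly before the scheme (for example `\x0304https://...`) would be rejected by the lookbehind, so a real fix would probably need to strip or account for formatting before matching.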
This leniency affects core message dispatch because the pattern is used, via web.search_urls(), to build PreTrigger.urls:
Lines 269 to 271 in bc5843b
    # Search URLs after CTCP parsing
    self.urls = tuple(
        web.search_urls(self.args[-1], schemes=url_schemes))
Of note: web.search_urls() has a clean parameter that causes it to run web.trim_urls() on the found matches, which the PreTrigger code doesn't make use of.