Commit 6df0c85

Merge pull request #119 from ampproject/add/20-rewrite-amp-urls-transformer
2 parents fe02ccc + b2a8478 commit 6df0c85

46 files changed: +2316 -206 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -144,6 +144,7 @@ Note that this only lets you check whether an error "category" popped up. It can
 | [`AmpRuntimeCss`](https://github.com/ampproject/amp-toolbox-php/blob/main/src/Optimizer/Transformer/AmpRuntimeCss.php) | Transformer adding `https://cdn.ampproject.org/v0.css` if server-side-rendering is applied (known by the presence of the `<style amp-runtime>` tag). AMP runtime css (`v0.css`) will always be inlined as it'll get automatically updated to the latest version once the AMP runtime has loaded. |
 | [`PreloadHeroImage`](https://github.com/ampproject/amp-toolbox-php/blob/main/src/Optimizer/Transformer/PreloadHeroImage.php) | Transformer that optimizes image rendering times for hero images by adding preload and serverside-rendered `<img>` tags when possible. Viable hero images are `<amp-img>` tags, `<amp-video>` tags with a `poster` attribute as well as `<amp-iframe>` and `<amp-video-iframe>` tags with a `placeholder` attribute. The first viable image that is encountered is used by default, but this behavior can be overridden by adding the `data-hero` attribute to a maximum of two images. The preloads only work for images that don't use `srcset`, as that is not supported as a preload in most browsers. The serverside-rendered image will not be created for `<amp-video>` tags. |
 | [`ReorderHead`](https://github.com/ampproject/amp-toolbox-php/blob/main/src/Optimizer/Transformer/ReorderHead.php) | Transformer applying the head reordering transformations to the HTML input. `ReorderHead` reorders the children of `<head>`. Specifically, it orders the `<head>` like so:<br>(0) `<meta charset>` tag<br>(1) `<style amp-runtime>` (inserted by `AmpRuntimeCss`)<br>(2) remaining `<meta>` tags (those other than `<meta charset>`)<br>(3) AMP runtime `.js` `<script>` tag<br>(4) AMP viewer runtime `.js` `<script>`<br>(5) `<script>` tags that are render delaying<br>(6) `<script>` tags for remaining extensions<br>(7) `<link>` tag for favicons<br>(8) `<link>` tag for resource hints<br>(9) `<link rel=stylesheet>` tags before `<style amp-custom>`<br>(10) `<style amp-custom>`<br>(11) any other tags allowed in `<head>`<br>(12) AMP boilerplate (first `<style>` boilerplate, then `<noscript>`) |
+| [`RewriteAmpUrls`](https://github.com/ampproject/amp-toolbox-php/blob/main/src/Optimizer/Transformer/RewriteAmpUrls.php) | Transformer that rewrites AMP runtime URLs to decide what version of the runtime to use. This allows you to do such things as switching to the LTS version or disabling ES modules.|
 | [`ServerSideRendering`](https://github.com/ampproject/amp-toolbox-php/blob/main/src/Optimizer/Transformer/ServerSideRendering.php) | Transformer applying the server-side rendering transformations to the HTML input. This does immediately on the server what would normally be done on the client _after_ the runtime was downloaded and executed to process the DOM. As such, it allows for the removal of the boilerplate CSS that _hides_ the page while it has not yet been processed on the client, drastically improving time it takes for the First Contentful Paint (FCP).|
 | [`TransformedIdentifier`](https://github.com/ampproject/amp-toolbox-php/blob/main/src/Optimizer/Transformer/TransformedIdentifier.php) | Transformer applying the transformed identifier transformations to the HTML input. This is what marks an AMP document as "already optimized", so that the AMP runtime does not need to process it anymore. |
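As a rough illustration of the kind of rewrite the `RewriteAmpUrls` entry describes, the sketch below switches an AMP CDN URL to the LTS channel by inserting `lts/` after the host. The function name `rewriteToLts` and its `$host` parameter are hypothetical and are not taken from the transformer's actual implementation, which this commit page does not show.

```php
<?php
// Hypothetical helper, NOT part of amp-toolbox-php: rewrites an AMP CDN URL
// so that it points at the LTS release channel.
function rewriteToLts($url, $host = 'https://cdn.ampproject.org/')
{
    // Leave URLs on other hosts, and URLs that already use LTS, untouched.
    if (strpos($url, $host) !== 0 || strpos($url, $host . 'lts/') === 0) {
        return $url;
    }

    // Insert the "lts/" path segment directly after the host.
    return $host . 'lts/' . substr($url, strlen($host));
}

echo rewriteToLts('https://cdn.ampproject.org/v0.js'), PHP_EOL;
echo rewriteToLts('https://cdn.ampproject.org/v0/amp-carousel-0.1.js'), PHP_EOL;
```

A real transformer would also have to deal with `.mjs` module scripts and their `nomodule` fallbacks, which this sketch ignores.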

bin/sync-amp-toolbox-test-suite.php

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ function ensureDirExists($directory)
 for ($index = 0; $index < $zip->numFiles; $index++) {
     $archivedPath = $zip->statIndex($index)['name'];

-    if (substr($archivedPath, -5) !== '.html') {
+    if (substr($archivedPath, -5) !== '.html' && substr($archivedPath, -11) !== 'config.json') {
         continue;
     }

composer.json

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@
     "require": {
         "php": "^5.6 || ^7.0 || ^8.0",
         "ext-dom": "*",
+        "ext-filter": "*",
         "ext-iconv": "*",
         "ext-json": "*",
         "ext-libxml": "*"

src/Amp.php

Lines changed: 3 additions & 0 deletions
@@ -150,8 +150,11 @@ public static function isRuntimeScript(DOMNode $node)
         }

         if (
+            // @TODO Compare performance against single regex.
             substr($src, -6) !== '/v0.js'
+            && substr($src, -7) !== '/v0.mjs'
             && substr($src, -14) !== '/amp4ads-v0.js'
+            && substr($src, -15) !== '/amp4ads-v0.mjs'
         ) {
             return false;
         }
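
The `@TODO` in this hunk mentions comparing the `substr()` chain against a single regex. A minimal sketch of what such a regex could look like is given below. Both function names are hypothetical, and the pattern is one possible equivalent rather than anything taken from the library.

```php
<?php
// Mirrors the four suffix comparisons from the hunk above.
function endsWithRuntimeSuffixSubstr($src)
{
    return substr($src, -6) === '/v0.js'
        || substr($src, -7) === '/v0.mjs'
        || substr($src, -14) === '/amp4ads-v0.js'
        || substr($src, -15) === '/amp4ads-v0.mjs';
}

// Hypothetical single-regex equivalent: optional "amp4ads-" prefix,
// optional "m" before "js", anchored at the very end of the string (\z).
function endsWithRuntimeSuffixRegex($src)
{
    return preg_match('#/(?:amp4ads-)?v0\.m?js\z#', $src) === 1;
}

$samples = [
    'https://cdn.ampproject.org/v0.js',
    'https://cdn.ampproject.org/v0.mjs',
    'https://cdn.ampproject.org/amp4ads-v0.js',
    'https://cdn.ampproject.org/amp4ads-v0.mjs',
    'https://cdn.ampproject.org/v0/amp-carousel-0.1.js',
];

foreach ($samples as $src) {
    // Both variants are expected to agree on every sample.
    var_dump(endsWithRuntimeSuffixSubstr($src) === endsWithRuntimeSuffixRegex($src));
}
```

Which variant is faster is exactly what the `@TODO` leaves open; this sketch only shows that the two checks can be made to accept the same suffixes.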

src/Attribute.php

Lines changed: 17 additions & 11 deletions
@@ -35,6 +35,7 @@ interface Attribute
     const CHARSET = 'charset';
     const CLASS_ = 'class'; // Underscore needed because 'class' is a PHP keyword.
     const CONTENT = 'content';
+    const CROSSORIGIN = 'crossorigin';
     const CUSTOM_ELEMENT = 'custom-element';
     const CUSTOM_TEMPLATE = 'custom-template';
     const DECODING = 'decoding';
@@ -59,6 +60,7 @@ interface Attribute
     const MEDIA = 'media';
     const NAME = 'name';
     const NOLOADING = 'noloading';
+    const NOMODULE = 'nomodule';
     const OBJECT_FIT = 'object-fit';
     const OBJECT_POSITION = 'object-position';
     const ON = 'on';
@@ -89,21 +91,25 @@ interface Attribute
     const TYPE_HTML = 'text/html';
     const TYPE_JSON = 'application/json';
     const TYPE_LD_JSON = 'application/ld+json';
+    const TYPE_MODULE = 'module';
     const TYPE_TEXT_PLAIN = 'text/plain';

-    const REL_AMPHTML      = 'amphtml';
-    const REL_CANONICAL    = 'canonical';
-    const REL_DNS_PREFETCH = 'dns-prefetch';
-    const REL_ICON         = 'icon';
-    const REL_NOAMPHTML    = 'noamphtml';
-    const REL_NOFOLLOW     = 'nofollow';
-    const REL_PRECONNECT   = 'preconnect';
-    const REL_PREFETCH     = 'prefetch';
-    const REL_PRELOAD      = 'preload';
-    const REL_PRERENDER    = 'prerender';
-    const REL_STYLESHEET   = 'stylesheet';
+    const REL_AMPHTML       = 'amphtml';
+    const REL_CANONICAL     = 'canonical';
+    const REL_DNS_PREFETCH  = 'dns-prefetch';
+    const REL_ICON          = 'icon';
+    const REL_MODULEPRELOAD = 'modulepreload';
+    const REL_NOAMPHTML     = 'noamphtml';
+    const REL_NOFOLLOW      = 'nofollow';
+    const REL_PRECONNECT    = 'preconnect';
+    const REL_PREFETCH      = 'prefetch';
+    const REL_PRELOAD       = 'preload';
+    const REL_PRERENDER     = 'prerender';
+    const REL_STYLESHEET    = 'stylesheet';

     const DATA_AMP_STORY_PLAYER_POSTER_IMG = 'data-amp-story-player-poster-img';
     const DATA_HERO = 'data-hero';
     const DATA_HERO_CANDIDATE = 'data-hero-candidate';
+
+    const CROSSORIGIN_ANONYMOUS = 'anonymous';
 }

src/Dom/Document.php

Lines changed: 30 additions & 3 deletions
@@ -346,6 +346,7 @@ public function __construct($version = '', $encoding = null)
         $this->originalEncoding = (string)$encoding ?: Encoding::UNKNOWN;
         parent::__construct($version ?: '1.0', Encoding::AMP);
         $this->registerNodeClass(DOMElement::class, Element::class);
+        $this->options = Option::DEFAULTS;
     }

     /**
@@ -500,7 +501,7 @@ public function loadHTMLFragment($source, $options = [])
             $options = [Option::LIBXML_FLAGS => $options];
         }

-        $this->options = array_merge(Option::DEFAULTS, $options);
+        $this->options = array_merge($this->options, $options);

         $this->reset();
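
Read together, these two hunks appear to make options sticky: the constructor seeds `$this->options` with the defaults, and each later load merges into whatever is already set instead of starting again from `Option::DEFAULTS`. The sketch below shows that difference with a stand-in class. `OptionsHolder` and the `encoding` key are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical stand-in for the options handling in Dom\Document.
final class OptionsHolder
{
    // Illustrative defaults; the real Option::DEFAULTS is not shown in the diff.
    const DEFAULTS = ['libxml_flags' => 0, 'encoding' => null];

    private $options;

    public function __construct()
    {
        // New behavior: seed the options once, at construction time.
        $this->options = self::DEFAULTS;
    }

    // New behavior: merge into the current options, so earlier values persist.
    public function load(array $options = [])
    {
        $this->options = array_merge($this->options, $options);

        return $this->options;
    }

    // Old behavior: every call starts again from the defaults.
    public static function loadFromDefaults(array $options = [])
    {
        return array_merge(self::DEFAULTS, $options);
    }
}

$holder = new OptionsHolder();
$holder->load(['encoding' => 'utf-8']);

// The earlier 'encoding' value survives the second call.
var_dump($holder->load(['libxml_flags' => 4]));

// With the old merge, 'encoding' falls back to its default.
var_dump(OptionsHolder::loadFromDefaults(['libxml_flags' => 4]));
```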

@@ -2022,7 +2023,7 @@ public function __get($name)
         }

         // Mimic regular PHP behavior for missing notices.
-        trigger_error(self::PROPERTY_GETTER_ERROR_MESSAGE . $name, E_USER_NOTICE); // phpcs:ignore WordPress.PHP.DevelopmentFunctions,WordPress.Security.EscapeOutput,Generic.Files.LineLength.TooLong
+        trigger_error(self::PROPERTY_GETTER_ERROR_MESSAGE . $name, E_USER_NOTICE);
         return null;
     }

@@ -2059,6 +2060,32 @@ public function createElement($name, $value = null)
         return $element;
     }

+    /**
+     * Create new element node.
+     *
+     * @link https://php.net/manual/domdocument.createelement.php
+     *
+     * This override only serves to provide the correct object type-hint for our extended Dom/Element class.
+     *
+     * @param string $name The tag name of the element.
+     * @param array $attributes Attributes to add to the newly created element.
+     * @param string $value Optional. The value of the element. By default, an empty element will be created.
+     *                      You can also set the value later with Element->nodeValue.
+     * @return Element|false A new instance of class Element or false if an error occurred.
+     */
+    public function createElementWithAttributes($name, $attributes, $value = null)
+    {
+        $element = parent::createElement($name, $value);
+
+        if (!$element instanceof Element) {
+            return false;
+        }
+
+        $element->addAttributes($attributes);
+
+        return $element;
+    }
+
     /**
      * Check whether the CSS maximum byte count is enforced.
      *
@@ -2074,7 +2101,7 @@ public function isCssMaxByteCountEnforced()
      *
      * @param int $maxByteCount Maximum number of bytes to limit the CSS to. A negative number disables the limit.
      */
-    public function enforceCssMaxByteCount($maxByteCount = AMP::MAX_CSS_BYTE_COUNT)
+    public function enforceCssMaxByteCount($maxByteCount = Amp::MAX_CSS_BYTE_COUNT)
     {
         $this->cssMaxByteCountEnforced = $maxByteCount;
     }
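
To show how `createElementWithAttributes()` and `Element::addAttributes()` fit together, here is a self-contained sketch that reproduces the same flow using only `ext-dom`. The free function below is a hypothetical stand-in for the two methods in this commit, not the library's API. Like `addAttributes()`, it skips attribute names that the DOM refuses to set.

```php
<?php
// Hypothetical free-function version of the create-then-add-attributes flow,
// written against plain DOMDocument/DOMElement instead of the library classes.
function createElementWithAttributes(DOMDocument $document, $name, array $attributes, $value = null)
{
    $element = $document->createElement($name, (string) $value);

    if (! $element instanceof DOMElement) {
        return false;
    }

    foreach ($attributes as $attributeName => $attributeValue) {
        try {
            $element->setAttribute($attributeName, $attributeValue);
        } catch (DOMException $e) {
            // Names such as '...this' are rejected with an
            // "Invalid Character Error"; skip them and carry on.
            continue;
        }
    }

    return $element;
}

$document = new DOMDocument();
$script   = createElementWithAttributes(
    $document,
    'script',
    [
        'src'     => 'https://cdn.ampproject.org/v0.js',
        'async'   => '',
        '...this' => 'ignored', // Invalid name: silently dropped.
    ]
);

echo $script->getAttribute('src'), PHP_EOL;
```

Swallowing the exception keeps one malformed attribute from aborting the whole element, at the cost of dropping that attribute without any notice.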

src/Dom/Element.php

Lines changed: 180 additions & 4 deletions
@@ -7,6 +7,7 @@
 use AmpProject\Optimizer\CssRule;
 use DOMAttr;
 use DOMElement;
+use DOMException;

 /**
  * Class AmpProject\Dom\Element.
@@ -19,6 +20,20 @@
 final class Element extends DOMElement
 {

+    /**
+     * Regular expression pattern to match events and actions within an 'on' attribute.
+     *
+     * @var string
+     */
+    const AMP_EVENT_ACTIONS_REGEX_PATTERN = '/((?<event>[^:;]+):(?<actions>(?:[^;,\(]+(?:\([^\)]+\))?,?)+))+?/';
+
+    /**
+     * Regular expression pattern to match individual actions within an event.
+     *
+     * @var string
+     */
+    const AMP_ACTION_REGEX_PATTERN = '/(?<action>[^(),\s]+(?:\([^\)]+\))?)+/';
+
     /**
      * Error message to use when the __get() is triggered for an unknown property.
      *
@@ -77,6 +92,170 @@ public function setAttribute($name, $value)
         return parent::setAttribute($name, $value);
     }

+    /**
+     * Adds a boolean attribute without value.
+     *
+     * @param string $name The name of the attribute.
+     * @return DOMAttr|false The new or modified DOMAttr or false if an error occurred.
+     * @throws MaxCssByteCountExceeded If the allowed max byte count is exceeded.
+     */
+    public function addBooleanAttribute($name)
+    {
+        $attribute = new DOMAttr($name);
+        $result = $this->setAttributeNode($attribute);
+
+        if (!$result instanceof DOMAttr) {
+            return false;
+        }
+
+        return $result;
+    }
+
+    /**
+     * Copy one or more attributes from this element to another element.
+     *
+     * @param array|string $attributes Attribute name or array of attribute names to copy.
+     * @param Element $target Target Dom\Element to copy the attributes to.
+     * @param string $defaultSeparator Default separator to use for multiple values if the attribute is not known.
+     */
+    public function copyAttributes($attributes, Element $target, $defaultSeparator = ',')
+    {
+        foreach ((array) $attributes as $attribute) {
+            if ($this->hasAttribute($attribute)) {
+                $values = $this->getAttribute($attribute);
+                if ($target->hasAttribute($attribute)) {
+                    switch ($attribute) {
+                        case Attribute::ON:
+                            $values = self::mergeAmpActions($target->getAttribute($attribute), $values);
+                            break;
+                        case Attribute::CLASS_:
+                            $values = $target->getAttribute($attribute) . ' ' . $values;
+                            break;
+                        default:
+                            $values = $target->getAttribute($attribute) . $defaultSeparator . $values;
+                    }
+                }
+                $target->setAttribute($attribute, $values);
+            }
+        }
+    }
+
+    /**
+     * Register an AMP action to an event.
+     *
+     * If the element already contains one or more events or actions, the method
+     * will assemble them in a smart way.
+     *
+     * @param string $event Event to trigger the action on.
+     * @param string $action Action to add.
+     */
+    public function addAmpAction($event, $action)
+    {
+        $eventActionString = "{$event}:{$action}";
+
+        if (! $this->hasAttribute(Attribute::ON)) {
+            // There's no "on" attribute yet, so just add it and be done.
+            $this->setAttribute(Attribute::ON, $eventActionString);
+            return;
+        }
+
+        $this->setAttribute(
+            Attribute::ON,
+            self::mergeAmpActions(
+                $this->getAttribute(Attribute::ON),
+                $eventActionString
+            )
+        );
+    }
+
+    /**
+     * Merge two sets of AMP events & actions.
+     *
+     * @param string $first First event/action string.
+     * @param string $second Second event/action string.
+     * @return string Merged event/action string.
+     */
+    public static function mergeAmpActions($first, $second)
+    {
+        $events = [];
+        foreach ([$first, $second] as $eventActionString) {
+            $matches = [];
+            $results = preg_match_all(self::AMP_EVENT_ACTIONS_REGEX_PATTERN, $eventActionString, $matches);
+
+            if (! $results || ! isset($matches['event'])) {
+                continue;
+            }
+
+            foreach ($matches['event'] as $index => $event) {
+                $events[$event][] = $matches['actions'][ $index ];
+            }
+        }
+
+        $valueStrings = [];
+        foreach ($events as $event => $actionStringsArray) {
+            $actionsArray = [];
+            array_walk(
+                $actionStringsArray,
+                static function ($actions) use (&$actionsArray) {
+                    $matches = [];
+                    $results = preg_match_all(self::AMP_ACTION_REGEX_PATTERN, $actions, $matches);
+
+                    if (! $results || ! isset($matches['action'])) {
+                        $actionsArray[] = $actions;
+                        return;
+                    }
+
+                    $actionsArray = array_merge($actionsArray, $matches['action']);
+                }
+            );
+
+            $actions = implode(',', array_unique(array_filter($actionsArray)));
+            $valueStrings[] = "{$event}:{$actions}";
+        }
+
+        return implode(';', $valueStrings);
+    }
+
+    /**
+     * Extract this element's HTML attributes and return as an associative array.
+     *
+     * @return string[] The attributes for the passed node, or an empty array if it has no attributes.
+     */
+    public function getAttributesAsAssocArray()
+    {
+        $attributes = [];
+        if (! $this->hasAttributes()) {
+            return $attributes;
+        }
+
+        foreach ($this->attributes as $attribute) {
+            $attributes[ $attribute->nodeName ] = $attribute->nodeValue;
+        }
+
+        return $attributes;
+    }
+
+    /**
+     * Add one or more HTML element attributes to this element.
+     *
+     * @param string[] $attributes One or more attributes for the node's HTML element.
+     */
+    public function addAttributes($attributes)
+    {
+        foreach ($attributes as $name => $value) {
+            try {
+                $this->setAttribute($name, $value);
+            } catch (DOMException $e) {
+                /*
+                 * Catch an "Invalid Character Error" when libxml is able to parse attributes with invalid characters,
+                 * but it throws an error when attempting to set them via DOM methods. For example, '...this' can be parsed
+                 * as an attribute but it will throw an exception when attempting to setAttribute().
+                 */
+                continue;
+            }
+        }
+    }
+
     /**
      * Magic getter to implement lazily-created, cached properties for the element.
      *
@@ -95,10 +274,7 @@ public function __get($name)
         }

         // Mimic regular PHP behavior for missing notices.
-        trigger_error(
-            self::PROPERTY_GETTER_ERROR_MESSAGE . $name,
-            E_USER_NOTICE
-        ); // phpcs:ignore WordPress.PHP.DevelopmentFunctions,WordPress.Security.EscapeOutput,Generic.Files.LineLength.TooLong
+        trigger_error(self::PROPERTY_GETTER_ERROR_MESSAGE . $name, E_USER_NOTICE);

         return null;
     }
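
The merging logic in `mergeAmpActions()` is easier to follow with concrete input. The sketch below lifts the two regex patterns verbatim from the diff into a standalone function, so it can be run without the library. The function and constant names are hypothetical, and the closure is replaced by a plain loop, which is a simplification of the committed code rather than a copy of it.

```php
<?php
// Patterns copied verbatim from Element::AMP_EVENT_ACTIONS_REGEX_PATTERN
// and Element::AMP_ACTION_REGEX_PATTERN in the diff above.
const EVENT_ACTIONS_PATTERN = '/((?<event>[^:;]+):(?<actions>(?:[^;,\(]+(?:\([^\)]+\))?,?)+))+?/';
const ACTION_PATTERN        = '/(?<action>[^(),\s]+(?:\([^\)]+\))?)+/';

// Hypothetical standalone variant of mergeAmpActions().
function mergeActions($first, $second)
{
    // Step 1: group the raw action strings by event name.
    $events = [];
    foreach ([$first, $second] as $eventActionString) {
        $matches = [];
        if (! preg_match_all(EVENT_ACTIONS_PATTERN, $eventActionString, $matches)) {
            continue;
        }
        foreach ($matches['event'] as $index => $event) {
            $events[$event][] = $matches['actions'][$index];
        }
    }

    // Step 2: split each group into single actions, drop duplicates, rejoin.
    $valueStrings = [];
    foreach ($events as $event => $actionStrings) {
        $actions = [];
        foreach ($actionStrings as $actionString) {
            $matches = [];
            if (preg_match_all(ACTION_PATTERN, $actionString, $matches)) {
                $actions = array_merge($actions, $matches['action']);
            } else {
                $actions[] = $actionString;
            }
        }
        $valueStrings[] = $event . ':' . implode(',', array_unique(array_filter($actions)));
    }

    return implode(';', $valueStrings);
}

// Actions for the same event are combined; new events are appended.
echo mergeActions('tap:a.show', 'tap:b.hide;submit:c.go'), PHP_EOL;
// prints tap:a.show,b.hide;submit:c.go
```

Grouping by event first means an element that already reacts to `tap` gains the new action on that event, instead of ending up with two `tap:` entries in its `on` attribute.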

src/Exception/FailedToGetCachedResponse.php

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ final class FailedToGetCachedResponse extends RuntimeException implements Failed
 {

     /**
-     * Instantiate a FailedToGetCachedResponseData exception for a URL if the cached response data could not be
+     * Instantiate a FailedToGetCachedResponse exception for a URL if the cached response data could not be
      * retrieved.
      *
      * @param string $url URL that failed to be fetched.
