Problem/Motivation
The filter system should be upgraded to HTML5 to match modern standards. All modern browsers parse HTML5 and our themes use an HTML5 doctype, so Drupal should parse and output HTML the same way.
Steps to reproduce
Currently the filter system parses and outputs XHTML only. This causes PHP warnings when trying to parse modern tags:
>>> use \Drupal\Component\Utility\Html;
>>> Html::normalize("<p><figure><figcaption>Caption</figcaption></figure><video></video></p>");
PHP Warning: DOMDocument::loadHTML(): Tag figure invalid in Entity, line: 1 in /var/www/html/drupal/core/lib/Drupal/Component/Utility/Html.php on line 289
PHP Warning: DOMDocument::loadHTML(): Tag figcaption invalid in Entity, line: 1 in /var/www/html/drupal/core/lib/Drupal/Component/Utility/Html.php on line 289
PHP Warning: DOMDocument::loadHTML(): Tag video invalid in Entity, line: 1 in /var/www/html/drupal/core/lib/Drupal/Component/Utility/Html.php on line 289
=> "<p><figure><figcaption>Caption</figcaption></figure><video></video></p>"
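The warnings come from DOMDocument::loadHTML(), so they can be observed outside Drupal as well. The following is a minimal sketch using only PHP's DOM and libxml extensions; it is not the code in Html.php, and whether any errors are reported depends on the installed libxml2 version.

```php
<?php
// Collect libxml parse errors in memory instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML('<p><figure><figcaption>Caption</figcaption></figure><video></video></p>');

// Gather whatever the installed libxml2 reported. The list may be empty on
// versions that no longer flag unknown tags.
$messages = [];
foreach (libxml_get_errors() as $error) {
  $messages[] = trim($error->message);
}
libxml_clear_errors();

print_r($messages);
```

Suppressing the errors this way only hides the symptom: the parser still does not apply HTML5 parsing rules to the input.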
Also, <br /> and <img /> tags are output in self-closing XHTML form. In HTML5, these are void elements and should be output simply as <br> and <img>.
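The difference between the two serializations can be illustrated with PHP's built-in DOM extension. This is only a sketch of XML-style versus HTML-style output for a void element; it does not claim to mirror how Html::serialize() is implemented.

```php
<?php
// Build a document containing a single <br> element.
$doc = new DOMDocument();
$br = $doc->createElement('br');
$doc->appendChild($br);

// XML serialization writes an empty element in self-closing form.
$xml = trim($doc->saveXML($br));
// HTML serialization treats <br> as a void element with no closing slash.
$html = trim($doc->saveHTML($br));

echo $xml, "\n";   // <br/>
echo $html, "\n";  // <br>
```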
Proposed resolution
Use masterminds/html5 instead of DOMDocument to parse and output HTML in the \Drupal\Component\Utility\Html utility class.
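As a rough illustration of the library on its own, the sketch below parses the markup from the steps to reproduce and serializes it again. It assumes masterminds/html5 has been installed with Composer and that the autoloader is at the path shown (a hypothetical location); it is not the proposed patch.

```php
<?php
// Assumption: Composer's autoloader is available at this path.
require __DIR__ . '/vendor/autoload.php';

use Masterminds\HTML5;

$html5 = new HTML5();

// Parse a fragment that contains HTML5-only elements.
$fragment = $html5->loadHTMLFragment('<figure><figcaption>Caption</figcaption></figure><video></video>');

// Serialize the fragment back to an HTML5 string.
$output = $html5->saveHTML($fragment);
echo $output, "\n";

// Parse problems, if any, are returned as an array of messages.
$errors = $html5->getErrors();
print_r($errors);
```

The library still returns standard DOM objects, so code that walks the tree with the DOM extension should need little change; only the parsing and serialization steps are swapped.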
Remaining tasks
User interface changes
None
API changes
\Drupal\Component\Utility\Html::load(), \Drupal\Component\Utility\Html::serialize() and \Drupal\Component\Utility\Html::normalize() will parse and output HTML5 instead of XHTML. Input filters or other code that relies on this utility class will also now parse and output HTML5 instead of XHTML.
This means there are some minor changes to output, but these should not affect valid HTML documents. Invalid HTML may be corrected differently, e.g. <br></br> was previously corrected to <br> alone, but is now corrected to <br><br>, following the HTML5 parsing model.
Data model changes
None
Release notes snippet
The filter system has been upgraded to output HTML5.