CKEditor 4.1 RC was just released, which announces the availability of the new Advanced Content Filter (ACF) functionality. This functionality was introduced to address
Leverage CKEditor's Advanced Content Filter
Problem/Motivation
http://buytaert.net/from-aloha-to-ckeditor mentions the 8 functional gaps that I identified in CKEditor. This issue is about addressing one of those: "pasting text should match the current text format" (see #1260052-148: Candidate WYSIWYG editors, point 2).
(It also lets sites that need it take further control over user-generated content; see point 8 in the aforementioned issue comment.)
- The main user-facing problem: inaccurate WYSIWYG
- users copy/paste content with specific formatting from other web pages (or text processing applications like Word)
- in every WYSIWYG in Drupal until today, upon pasting this content, it would very closely resemble the original in the WYSIWYG editor
- upon saving, and thus viewing the (filtered) end result, it suddenly looks completely different
The WYSIWYG editor was lying to the end user — it was not actually WYSIWYG!
CKEditor's Advanced Content Filter solves that. All we have to do is set CKEditor's allowedContent setting to indicate which tags and attributes are allowed by the text format, and everything else will happen automatically.
- The main site-level problem: anonymous users can attack admin users
- If an anonymous or authenticated user (with no permission to use a text format that allows <iframe> or <script> tags, for obvious reasons) disables the WYSIWYG editor (e.g. by disabling JS in their browser) and inserts e.g. an <iframe src="javascript:alert(0)"></iframe> tag, no harm will be done to end users.
But … harm can be done to admin users (e.g. when an admin modifies a comment), because the WYSIWYG editor will obediently preview whatever HTML exists, instead of first making it match the restrictions defined in the text format.
By using ACF, that security problem is a thing of the past.
(The CKEditor module in Drupal 7 solves this by doing a round trip to the server to do XSS filtering when necessary, but this extra round trip is very bad for performance. With ACF, no round trips or complex logic are necessary.)
- The secondary problem: the UI allows configuring attributes that are not allowed by the text format
- Assume the <a> tag is allowed by a text format, but the target attribute on this tag is specifically disallowed. In pretty much every WYSIWYG editor out there today, you would have to mess with very detailed/deep UI configuration settings to get rid of the parts of the UI that allow you to set the target attribute value.
With ACF, this happens automatically.
(Or rather, this will happen automatically once we upgrade to the latest CKEditor.)
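All three problems above come down to the same operation: making whatever HTML reaches the editor match what the text format allows. The sketch below is a conceptual model of that operation only, not CKEditor's actual ACF implementation. The node shape ({ name, attributes, children }), the `allowed` map and the `filterNode` function are all hypothetical names made up for illustration.

```javascript
// Conceptual model only: NOT how CKEditor implements ACF internally.
// Hypothetical rules: tag name -> attributes permitted on that tag.
const allowed = new Map([
  ['p', []],
  ['a', ['href']],
]);

// A node is either a text string or { name, attributes, children }.
// Returns an array, because a disallowed element is replaced by its
// (filtered) children rather than by a single node.
function filterNode(node, rules) {
  if (typeof node === 'string') {
    return [node];
  }
  const children = node.children.flatMap((child) => filterNode(child, rules));
  if (!rules.has(node.name)) {
    // Disallowed tag (e.g. <iframe>): drop the element, keep its content.
    return children;
  }
  const permitted = rules.get(node.name);
  const attributes = {};
  for (const [key, value] of Object.entries(node.attributes)) {
    // Disallowed attributes (e.g. target on <a>) are silently dropped.
    if (permitted.includes(key)) {
      attributes[key] = value;
    }
  }
  return [{ name: node.name, attributes, children }];
}

// Pasted or previewed content containing a disallowed attribute and tag.
const pasted = {
  name: 'p',
  attributes: {},
  children: [
    { name: 'a', attributes: { href: 'http://example.com', target: '_blank' }, children: ['link'] },
    { name: 'iframe', attributes: { src: 'javascript:alert(0)' }, children: [] },
  ],
};

const filtered = filterNode(pasted, allowed);
```

In this model, `filtered` holds the paragraph with the link reduced to its href attribute and the iframe removed. That is the state the editor should show in both the paste case and the admin-preview case.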
Proposed resolution
Use CKEditor's Advanced Content Filter. We don't have to do anything on the CKEditor side of things, besides just generating and setting the proper value for the allowedContent setting.
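As a rough illustration of what "generating the proper value" could look like, here is a minimal sketch that turns a map of allowed tags and attributes into a string in CKEditor's allowed content rule syntax (rules separated by semicolons, attributes in square brackets). The input shape and the `buildAllowedContent` helper are assumptions for this example, not part of the patch or of CKEditor.

```javascript
// Hypothetical helper: builds an allowedContent rule string from a map of
// tag name -> attributes that the text format allows on that tag.
function buildAllowedContent(allowedTags) {
  return Object.entries(allowedTags)
    .map(([tag, attributes]) =>
      attributes.length > 0 ? `${tag}[${attributes.join(',')}]` : tag)
    .join('; ');
}

// Example: a text format that allows <p>, <strong>, <em>, and <a> with only
// an href attribute (so no target attribute).
const allowedContent = buildAllowedContent({
  p: [],
  strong: [],
  em: [],
  a: ['href'],
});

// On a page where CKEditor is loaded, the value would then go into the
// editor configuration. 'editor1' is a placeholder element ID:
// CKEDITOR.replace('editor1', { allowedContent: allowedContent });
```

Because target is not listed for the a tag, the editor would have no reason to offer a UI for setting it, which is the behavior described under the secondary problem.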
However, to be able to generate the proper value for that setting, we must know which HTML tags and attributes are allowed by the text format. It is possible in theory to employ blackbox testing, but in practice that's not realistic, because we'd need to test for every possible HTML tag + attribute combination. That's … nigh impossible, and at the very least computationally expensive.
That's why I'm proposing to introduce filter_get_allowed_tags_by_format() and an "allowed tags callback" for filters of the type FILTER_TYPE_HTML_RESTRICTOR. (This was originally proposed over at #1782838: WYSIWYG in core: round one — filter types, but back then it was not something you could touch and see; now you can.)
Remaining tasks
All done; needs reviews!
User interface changes
None.
API changes
- Filters of the FILTER_TYPE_HTML_RESTRICTOR type may now define an "allowed tags callback".
- New utility function: filter_get_allowed_tags_by_format(). Allows modules to gain insight about which HTML tags are allowed by a certain text format.
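To make the relationship between the two API changes concrete, here is a small model of how the per-filter callbacks might combine into a per-format answer. It is written in JavaScript purely for illustration; the real functions are PHP and live in the patch. The data shapes, the `getAllowedTagsByFormat` name, and in particular the rule that a tag must be allowed by every restricting filter (an intersection) are assumptions of this sketch, not something the summary above specifies.

```javascript
// Placeholder value for the sketch; the real constant is defined in Drupal's PHP code.
const FILTER_TYPE_HTML_RESTRICTOR = 'html_restrictor';

// Hypothetical model of filter_get_allowed_tags_by_format().
// Assumption: a tag survives only if every restricting filter allows it.
function getAllowedTagsByFormat(format) {
  const restrictors = format.filters.filter(
    (filter) =>
      filter.type === FILTER_TYPE_HTML_RESTRICTOR &&
      typeof filter.allowedTagsCallback === 'function'
  );
  if (restrictors.length === 0) {
    // No filter reports restrictions, so nothing is known to be disallowed.
    return null;
  }
  const intersection = restrictors
    .map((filter) => new Set(filter.allowedTagsCallback()))
    .reduce((acc, tags) => new Set([...acc].filter((tag) => tags.has(tag))));
  return [...intersection].sort();
}

// Example format with two restricting filters and one unrelated filter.
const exampleFormat = {
  filters: [
    { type: FILTER_TYPE_HTML_RESTRICTOR, allowedTagsCallback: () => ['a', 'em', 'strong', 'iframe'] },
    { type: FILTER_TYPE_HTML_RESTRICTOR, allowedTagsCallback: () => ['a', 'em', 'strong', 'p'] },
    { type: 'some_other_type' },
  ],
};

const allowedTags = getAllowedTagsByFormat(exampleFormat);
```

Here `allowedTags` contains only the tags both restricting filters agree on. A module such as the CKEditor integration could then translate that list into the allowedContent setting.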