From #1260052-93: Candidate WYSIWYG editors:
Following up on my comment at http://drupal.org/node/1260052#comment-6743142, I had a look at how these two editors compare in terms of pasting from Word. I didn't use the examples in #89 because both look to have been set up to strip pretty much everything, so it wouldn't be much of a comparison.
I used:
http://www.aloha-editor.org/demos/3col/ and
http://nightly-v4.ckeditor.com/3952/samples/inlinebycode.html
I used http://nicedit.com/demos.php as a control, since I know it does no stripping and leaves in the weird classes, spans, styles, tags, measurements etc., just to check that Word wasn't giving me good markup in the first place ;-)
Both seemed to work pretty well, stripping out the worst of it. A few things I did notice:
- CKEditor didn't strip out the 'align' attribute.
- CKEditor allowed various inline styles through, the most prevalent being margin-left with pt measurements for Word indentations
- Aloha allowed but CKeditor converted to which seems better
- Aloha performed slightly better at pasting lists from older documents, yet both regularly failed: they pasted a disc character instead of a list item and filled the space between the disc and the text with nbsps, and CKEditor applied margin styles as well
- CKEditor occasionally allowed DIVs to be pasted in; coming from Word, they never made any sense
- Aloha performed better with tables. CKEditor applied a width to every single cell, and applied border, cell padding, cell spacing etc. to tables, which didn't seem desirable in this context. Aloha basically took tables over completely with its plugin, removing/replacing all such stuff.
I get the feeling that many of the above observations about CKEditor may just be down to configuration. In any case, in my testing both clean up the major cruft very well, with Aloha performing slightly better at retaining formatting where it can and at cleaning up tables, which gives it a few extra points for me currently in terms of 'paste from word'.
Upon switching from Aloha Editor to CKEditor, we identified 8 functional gaps (#1260052-148: Candidate WYSIWYG editors), one of which was improved pasting from Word, based on the above feedback.
The key way this was going to be addressed in CKEditor is via https://dev.ckeditor.com/ticket/9829. That's now part of CKEditor 4.1 (already part of Drupal 8 right now). They called it "ACF" (Advanced Content Filter): http://ckeditor.com/blog/CKEditor-4.1-Released.
However, to fully leverage that in their "PFW" (Paste From Word) feature, they still have to tackle this ticket: http://dev.ckeditor.com/ticket/9991. It will only be done after CKEditor 4.2, which still leaves plenty of time before the Drupal 8 release, but this issue exists to ensure it gets tackled before release.
Reference document to test?
Ideally, we would have a reference Word/Pages document, paste that, and have a reference target (cleaned up) markup to compare the result against.
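As a rough illustration of what such a reference test could look like, here is a minimal sketch in Python. It is not part of Drupal or CKEditor; the function names `normalize` and `compare_to_reference` are hypothetical, and it assumes the pasted result and the reference markup are both available as plain strings. It only normalizes whitespace and diffs the two, so insignificant formatting differences don't count as failures.

```python
import difflib
import re
from typing import List


def normalize(markup: str) -> str:
    """Collapse whitespace and put adjacent tags on separate lines (hypothetical helper)."""
    collapsed = re.sub(r"\s+", " ", markup).strip()
    # Drop whitespace that sits purely between two tags.
    collapsed = re.sub(r">\s+<", "><", collapsed)
    # One tag boundary per line keeps the diff readable.
    return collapsed.replace("><", ">\n<")


def compare_to_reference(pasted: str, reference: str) -> List[str]:
    """Return unified-diff lines; an empty list means the paste matches the reference."""
    return list(
        difflib.unified_diff(
            normalize(reference).splitlines(),
            normalize(pasted).splitlines(),
            fromfile="reference",
            tofile="pasted",
            lineterm="",
        )
    )
```

Any line in the returned diff that starts with "+" is markup the editor let through that the reference does not contain, which is exactly the cruft we would want to track.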
Current status
Over at #1936392: Ensure CKEditor only allows what the text format allows (to guarantee accurate WYSIWYG & security), we're working on configuring ACF to closely match what Drupal's text format filters allow. So, to evaluate Paste From Word, you currently should apply the patch over there. You can easily test it on simplytest.me: http://simplytest.me/project/drupal/357ac577dfd1817ad3a72dabee9cb01fe7aad577?patch[]=http://drupal.org/files/ckeditor_acf-1936392-6.patch
With Basic HTML, you'll note that a lot of cruft is stripped out (class attributes, for example), but there are still empty span tags.
With Full HTML, you'll note that a lot of cruft is still allowed: class attributes, and so on. But this is because ACF also allows that for this text format.
The thing is that even when using Full HTML, we'd expect e.g. the class attributes to be stripped away, even though class attributes are allowed by the text format.
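To make the expectation concrete, here is a minimal sketch in Python of the kind of Word-specific cleanup we have in mind. This is not how ACF or CKEditor's Paste From Word works; it is only an illustration under stated assumptions: that Word's own classes can be recognized by an "Mso" prefix (e.g. MsoNormal), that inline style attributes from Word are never wanted, and that a span left without attributes should be unwrapped. The names `WordCruftFilter` and `strip_word_cruft` are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Elements without a closing tag; only a small illustrative subset.
VOID_TAGS = {"br", "hr", "img"}


class WordCruftFilter(HTMLParser):
    """Drops Word-generated classes and styles, and unwraps attribute-less spans."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.span_stack = []  # one bool per open <span>: True if it was kept

    def handle_starttag(self, tag, attrs):
        kept = []
        for name, value in attrs:
            if name == "style":
                continue  # assumption: inline styles from Word are never wanted
            if name == "class":
                # Assumption: classes starting with "Mso" come from Word.
                classes = [c for c in (value or "").split() if not c.startswith("Mso")]
                if not classes:
                    continue
                value = " ".join(classes)
            kept.append((name, value))
        if tag == "span":
            self.span_stack.append(bool(kept))
            if not kept:
                return  # unwrap: emit the span's content but not the tag
        rendered = "".join(
            f" {n}" if v is None else f' {n}="{escape(v, quote=True)}"' for n, v in kept
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag == "span" and self.span_stack and not self.span_stack.pop():
            return  # matching start tag was dropped
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def strip_word_cruft(markup):
    """Return markup with Word-specific cruft removed (illustrative only)."""
    parser = WordCruftFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Note that a class such as "intro" survives in this sketch, since only the Word-generated ones are dropped. That is the distinction the Full HTML case needs: the text format allows class attributes in general, but the ones Word adds are still cruft.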