Hi all,
I looked again into the issue of content pasted from Word or OpenOffice/LibreOffice. This is how it is supposed to work:
1. The paste plugin creates an off-screen contenteditable div.
2. The paste plugin adds a handler for the paste event, which redirects pastes into the off-screen div.
3. When something is pasted, the content is collected from this div, and the insertHtml command is used to insert it into the real document.
4. Aloha calls any content handlers registered for insertHtml on the content, and each of them gets a chance to modify the content.
5. The result is inserted into the document.
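
As a rough sketch of step 4, the handler chain could look something like the TypeScript below. The type, the function name and the two toy handlers are made up for illustration; this is not Aloha's actual API, just the idea that each handler receives the content and returns a possibly modified version.

```typescript
// Hypothetical names for illustration only; not Aloha's real API.
type ContentHandler = (html: string) => string;

// Pass the pasted HTML through each registered handler in order.
// Every handler receives the output of the previous one.
function runContentHandlers(html: string, handlers: ContentHandler[]): string {
  return handlers.reduce((content, handler) => handler(content), html);
}

// Two toy handlers: one removes HTML comments, one trims surrounding whitespace.
const stripComments: ContentHandler = (html) =>
  html.replace(/<!--[\s\S]*?-->/g, "");
const trimWhitespace: ContentHandler = (html) => html.trim();

const pasted = "  <p>Hello</p><!-- StartFragment -->  ";
const cleaned = runContentHandlers(pasted, [stripComments, trimWhitespace]);
```

Because the handlers run in sequence, their order matters: a later handler only ever sees what the earlier ones left behind.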
Item 4 needs a bit of work. If you paste some Word content, for example from http://www.houstonmethodist.org/WordPasteTest, you'll see that it does clean up nicely, except that the MsoNormal class remains. It also doesn't handle OpenOffice content, and it should clean up the use of named entities (such as &nbsp;), which aren't valid in XHTML5.
It therefore looks like we need to write our own content handler.
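
To give an idea of what such a handler might do, here is a minimal TypeScript sketch. The function name cleanOfficeHtml is hypothetical, the regex-based approach is an assumption (a real handler would more likely walk the DOM), and &nbsp; is assumed as the example of a problematic named entity. It covers only the MsoNormal class and that one entity, not OpenOffice markup.

```typescript
// Hypothetical handler for illustration; not part of Aloha.
// Assumes class attributes are written with double quotes.
function cleanOfficeHtml(html: string): string {
  return html
    // Remove the MsoNormal token from class attributes; drop the
    // attribute entirely when no other class names are left.
    .replace(/\sclass="([^"]*)"/g, (_match: string, classes: string) => {
      const kept = classes
        .split(/\s+/)
        .filter((name) => name !== "" && name !== "MsoNormal");
      return kept.length > 0 ? ` class="${kept.join(" ")}"` : "";
    })
    // Replace the named entity with its numeric character reference,
    // which is valid in XML-based syntaxes.
    .replace(/&nbsp;/g, "&#160;");
}

const wordParagraph = '<p class="MsoNormal">a&nbsp;b</p>';
const result = cleanOfficeHtml(wordParagraph);
```

Other class names on the same element are kept, so a paragraph with class="MsoNormal intro" would end up with class="intro".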
I'll look into this during the day so I can report on it this afternoon (my time; for the Texas people it will be morning :-).
regards,
Izak