Sharing non-form Elements

Jeremy Whitlock

Sep 4, 2009, 1:56:50 AM
to MobWrite
Hi All,
I'm new to mobwrite but I was thinking how cool it would be to
take a WYSIWYG editor, like YUI's editor, and hook up mobwrite to
it. The problem is that from what I can tell looking at the
documentation and the sources, mobwrite only shares forms and form
objects. I don't mind writing my own implementation that works with
other non-form elements but before I went down that path, I figured I
would ask here.

Take care,

Jeremy

Joe Walker

Sep 4, 2009, 5:39:52 AM
to mobw...@googlegroups.com

The essence of why it's hard is that with text, the characters are not related to each other in any way. With rich text, characters follow each other in sequences such as "<p>" that have meaning only as a whole and break when split apart. Sequences are also related to each other (e.g. "<p>" and "</p>"). Diff operations which chop text around would need to be aware of these semantic links as they decide what to accept and reject. I think this means a re-write of diff-match-patch to be based on a tree rather than a stream of characters.
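
To make the point concrete, here is a minimal sketch (not diff-match-patch itself, just a toy character-level diff that only handles equal-length strings; the function names charEdits and applyEdits are made up for illustration). Changing a <b> element into an <i> element shows up as two unrelated one-character edits, and accepting only one of them leaves mismatched tags:

```javascript
// Toy character-level diff: lists the positions where two equal-length
// strings differ. Hypothetical helper, not part of diff-match-patch.
function charEdits(oldText, newText) {
  if (oldText.length !== newText.length) {
    throw new Error('toy diff only handles equal-length strings');
  }
  const edits = [];
  for (let i = 0; i < oldText.length; i++) {
    if (oldText[i] !== newText[i]) {
      edits.push({ index: i, char: newText[i] });
    }
  }
  return edits;
}

// Apply any subset of the edits, the way a best-effort patcher might
// accept some hunks and reject others.
function applyEdits(text, edits) {
  const chars = text.split('');
  for (const edit of edits) {
    chars[edit.index] = edit.char;
  }
  return chars.join('');
}

const before = '<b>x</b>';
const after = '<i>x</i>';

// The diff sees two independent single-character changes; nothing ties
// the opening tag's edit to the closing tag's edit.
const edits = charEdits(before, after);

// Accepting only the first edit yields an <i> opened and a </b> closed.
const partial = applyEdits(before, [edits[0]]);
console.log(edits.length, partial);
```

A tree-based diff would treat the element rename as one operation, so it could not be half-applied in this way.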

Joe.

Neil Fraser

Sep 4, 2009, 11:06:30 AM
to MobWrite
On Fri, Sep 4, 2009 at 6:56 AM, Jeremy Whitlock
<jcscoob...@gmail.com> wrote:
> I'm new to mobwrite but I was thinking how cool it would be to
> take a WYSIWYG editor, like YUI's editor, and hook up mobwrite to
> it. The problem is that from what I can tell looking at the
> documentation and the sources, mobwrite only shares forms and form
> objects. I don't mind writing my own implementation that works with
> other non-form elements but before I went down that path, I figured I
> would ask here.

Well, if you aren't too worried about pathological cases where HTML
may get mangled (as Joe pointed out), all you need to do is create
your own getClientText and setClientText functions. See
mobwrite_form.js for five different examples of this. This is trivial
(~10 minutes of programming) and will get your editor synchronizing.
But every time there's a sync, your cursor will disappear since
setClientText is replacing the entire content. To fix this, you'll
need to write a patchClientText function which either uses ranges to
gently insert the content without disturbing the cursor, or else
captures the cursor location, makes the edits then restores the cursor
location. This function can get 'interesting', there's an example of
one in mobwrite_form.js
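
As a rough illustration of the second approach (capture the cursor, make the edits, restore the cursor), here is a minimal sketch. Only the names getClientText, setClientText and patchClientText come from the discussion above; FakeEditor, SharedEditor and mapCursor are hypothetical, the cursor is modelled as a plain character offset, and patchClientText is simplified to take the already-patched text rather than a list of patches, so the real code in mobwrite_form.js will look different.

```javascript
// Hypothetical stand-in for a rich-text editor: content as an HTML string
// plus a caret offset into that string. A real editor would expose these
// through its own API.
class FakeEditor {
  constructor(html) {
    this.html = html;
    this.cursor = 0;
  }
}

// Length of the prefix shared by two strings.
function commonPrefixLength(a, b) {
  const n = Math.min(a.length, b.length);
  let i = 0;
  while (i < n && a[i] === b[i]) i++;
  return i;
}

// Length of the shared suffix, not overlapping the first `skip` characters.
function commonSuffixLength(a, b, skip) {
  const n = Math.min(a.length, b.length) - skip;
  let i = 0;
  while (i < n && a[a.length - 1 - i] === b[b.length - 1 - i]) i++;
  return i;
}

// Work out where a caret at `cursor` in oldText should sit in newText,
// treating everything between the shared prefix and suffix as one change.
function mapCursor(oldText, newText, cursor) {
  const prefix = commonPrefixLength(oldText, newText);
  const suffix = commonSuffixLength(oldText, newText, prefix);
  const oldChangeEnd = oldText.length - suffix;
  if (cursor <= prefix) return cursor;            // before the change
  if (cursor >= oldChangeEnd) {                   // after the change: shift
    return cursor + (newText.length - oldText.length);
  }
  return newText.length - suffix;                 // inside: end of the change
}

// Sketch of a shareable object using the three function names from above.
class SharedEditor {
  constructor(editor) {
    this.editor = editor;
  }
  getClientText() {
    return this.editor.html;
  }
  setClientText(text) {
    this.editor.html = text;
    this.editor.cursor = 0;  // replacing everything loses the caret
  }
  patchClientText(newText) {
    const oldText = this.editor.html;
    this.editor.cursor = mapCursor(oldText, newText, this.editor.cursor);
    this.editor.html = newText;
  }
}

// Usage: the caret sits just after "world" when a remote edit inserts "big ".
const editor = new FakeEditor('<p>Hello world</p>');
editor.cursor = 14;
const shared = new SharedEditor(editor);
shared.patchClientText('<p>Hello big world</p>');
console.log(editor.cursor);  // still just after "world"
```

In a real browser the offset would have to be translated to and from a DOM selection, which is where most of the 'interesting' part lives.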

On Sep 4, 11:39 am, Joe Walker <jwal...@mozilla.com> wrote:
> Diff operations which chop text around would need to be aware of these
> semantic links as they decide what to accept and reject. I think this means
> a re-write of diff-match-patch to be based on a tree rather than a stream of
> characters.

Correct, best-effort patching is not good enough when accidentally
mangling a </TABLE> tag would mean the rest of the page becomes
invisible. There's a wiki page about this with a possible workaround:
http://code.google.com/p/google-diff-match-patch/wiki/Plaintext
As for tree-based differencing, I've looked into it very seriously.
It's a completely different set of algorithms, and a very tough problem
to handle efficiently. If a tree-based version of DMP were written,
it could be plugged into MobWrite and rich text could be synced
without any modifications.

Jeremy Whitlock

Sep 4, 2009, 12:23:25 PM
to MobWrite
> The essence of why it's hard is that with text, the characters are not
> related to each other in any way. With rich text, characters follow each
> other in sequences such as "<p>" that have meaning only as a whole and
> break when split apart. Sequences are also related to each other (e.g.
> "<p>" and "</p>").
> Diff operations which chop text around would need to be aware of these
> semantic links as they decide what to accept and reject. I think this means
> a re-write of diff-match-patch to be based on a tree rather than a stream of
> characters.

Agreed. That's what I was expecting. So does Bespin just do text
then? From what I saw, Bespin mimics a textarea, just like all other
rich text editors, yet still enables this multi-user synchronization
and usage. How did you guys go about doing this?

Jeremy Whitlock

Sep 4, 2009, 12:34:07 PM
to MobWrite
> Well, if you aren't too worried about pathological cases where HTML
> may get mangled (as Joe pointed out), all you need to do is create
> your own getClientText and setClientText functions.  See
> mobwrite_form.js for five different examples of this.  This is trivial
> (~10 minutes of programming) and will get your editor synchronizing.
> But every time there's a sync, your cursor will disappear since
> setClientText is replacing the entire content.  To fix this, you'll
> need to write a patchClientText function which either uses ranges to
> gently insert the content without disturbing the cursor, or else
> captures the cursor location, makes the edits then restores the cursor
> location.  This function can get 'interesting', there's an example of
> one in mobwrite_form.js

This is what I gathered by looking at the sources. I'll give this a
try just to see if I can get a somewhat working version.

> Correct, best-effort patching is not good enough when accidentally
> mangling a </TABLE> tag would mean the rest of the page becomes
> invisible.  There's a wiki page about this with a possible workaround:
>  http://code.google.com/p/google-diff-match-patch/wiki/Plaintext
> As for tree-based differencing, I've looked into it very seriously.
> It's a completely different set of algorithms, and a very tough problem
> to handle efficiently.  If a tree-based version of DMP were written,
> it could be plugged into MobWrite and rich text could be synced
> without any modifications.

I'll look into this. Since a tree-based DMP might be useful to the
masses, why don't we try to write one? I'd like to offer to help.

(P.S. - I tried replying earlier to this but Google Groups either lost
it or I clicked the wrong button. If I appear to double post, it's
because of that reason.)

Joe Walker

Sep 4, 2009, 8:19:31 PM
to mobw...@googlegroups.com

It's all just text, and meta-data, and smoke, and mirrors. We're not doing any tree-diffs.

The biggest challenge so far is author colorization. According to the roadmap I'm supposed to be annotating the text with colors for who made what change. We're taking a biiig step towards rich-text here, and I'm resisting for all of the reasons above, although in this case I think it can be done without tree-diff, it's not easy.

Joe.

Joe Walker

Sep 5, 2009, 5:34:12 AM
to mobw...@googlegroups.com
On Fri, Sep 4, 2009 at 4:06 PM, Neil Fraser <neil....@gmail.com> wrote:

> On Fri, Sep 4, 2009 at 6:56 AM, Jeremy Whitlock
> <jcscoob...@gmail.com> wrote:
>> I'm new to mobwrite but I was thinking how cool it would be to
>> take a WYSIWYG editor, like YUI's editor, and hook up mobwrite to
>> it.  The problem is that from what I can tell looking at the
>> documentation and the sources, mobwrite only shares forms and form
>> objects.  I don't mind writing my own implementation that works with
>> other non-form elements but before I went down that path, I figured I
>> would ask here.
>
> Well, if you aren't too worried about pathological cases where HTML
> may get mangled (as Joe pointed out), all you need to do is create
> your own getClientText and setClientText functions.  See
> mobwrite_form.js for five different examples of this.  This is trivial
> (~10 minutes of programming) and will get your editor synchronizing.
> But every time there's a sync, your cursor will disappear since
> setClientText is replacing the entire content.  To fix this, you'll
> need to write a patchClientText function which either uses ranges to
> gently insert the content without disturbing the cursor, or else
> captures the cursor location, makes the edits then restores the cursor
> location.  This function can get 'interesting', there's an example of
> one in mobwrite_form.js

When dealing with the formedness of HTML, surely the pathological case is hugely worrying. If you have a way to trick mobwrite into 'mis-spelling' an element, you've likely got yourself an XSS flaw, and one of the worst kind - an XSS flaw in a potentially social application, with comet-like updates == fast reproducing web-worm.

Joe.

Neil Fraser

Sep 5, 2009, 10:58:47 AM
to MobWrite
On Sep 5, 11:34 am, Joe Walker <jwal...@mozilla.com> wrote:
> When dealing with the formedness of HTML, surely the pathological case is
> hugely worrying. If you have a way to trick mobwrite into 'mis-spelling' an
> element, you've likely got yourself an XSS flaw, and one of the worst kind -
> an XSS flaw in a potentially social application, with comet-like updates ==
> fast reproducing web-worm.

Fortunately if you execute:
document.body.innerHTML = "<script>alert('hello')</script>";
where the document is an editable frame (Midas) no browser will
execute the script. It has been a couple of years since I looked at
this in detail, but I believe that:
<P onmouseover="alert('hello')">World</P>
also does not execute in editable mode. This is one of the few times
that I've known browsers to uniformly do the right thing. :)

Indeed at one point (maybe to this day) Google Docs prepended every
document with <script>location="/";</script> so that if your browser
failed to make it into editable mode, you'd get harmlessly redirected
away before any XSS evil got executed.

The real risk is that a </table> tag gets corrupted and a chunk of the
document either disappears or enters some weird state where the
browser (rightly) doesn't have a clue how to edit it.
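
One possible guard against that kind of corruption, sketched under loud assumptions: before handing patched HTML to the editor, check that its tags still pair up, and keep the current content if they don't. This is not something MobWrite does as far as this thread says; isBalanced and acceptIfBalanced are made-up names, the regex is not a real HTML parser (it would trip over a ">" inside an attribute value), and it is stricter than HTML itself, which allows some closing tags to be omitted.

```javascript
// Elements that never take a closing tag, so they are skipped when pairing.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Returns true when every opening tag has a matching closing tag in the
// right order. A rough approximation, not a real HTML parser.
function isBalanced(html) {
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  const stack = [];
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const isClosing = match[1] === '/';
    const name = match[2].toLowerCase();
    if (VOID_TAGS.has(name)) continue;
    if (!isClosing) {
      stack.push(name);
    } else if (stack.pop() !== name) {
      return false;  // closing tag does not match the innermost open tag
    }
  }
  return stack.length === 0;  // anything left open means unbalanced
}

// Keep the current content when a patch would leave the markup unbalanced.
function acceptIfBalanced(currentHtml, patchedHtml) {
  return isBalanced(patchedHtml) ? patchedHtml : currentHtml;
}

const good = '<table><tr><td>a</td></tr></table>';
const mangled = '<table><tr><td>a</td></tr></tabl>';
console.log(isBalanced(good), isBalanced(mangled));
```

Rejecting a patch this way trades one problem for another (the clients stay out of sync until a later patch applies cleanly), so it is a stopgap rather than a substitute for tree-aware diffing.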

Joe Walker

Sep 5, 2009, 12:39:13 PM
to mobw...@googlegroups.com

So you're saying there's no way to inject any active-content, misappropriate any css, overlay any elements, trick any event handlers, or any of the other tricks I haven't thought of, on any supported web browser?

I'm not saying you're wrong but my concern is that this is security by practice rather than security by design, and that's what's got us into this mess. In reality, I guess you need to work out what the risks are to your website. It could be adequate in some places.

Joe.

Neil Fraser

Sep 5, 2009, 11:58:09 PM
to MobWrite
On Sep 5, 6:39 pm, Joe Walker <j...@getahead.org> wrote:
> So you're saying there's no way to inject any active-content, misappropriate
> any css, overlay any elements, trick any event handlers, or any of the other
> tricks I haven't thought of, on any supported web browser?

A couple of years ago I did an analysis of the client-side issues and
found that if content-editable was on, then the user was safe.
However, as you said that's "security by practice rather than by
design". Thus since the application I was working on had a fairly
high profile, an independent server-side filter was written which
stripped out all known script embedding. Thus we had two independent
layers of defense.