Here is my experimental code, run in the Extension Developer JavaScript
console:
var composer = document.getElementById('msgcomposeWindow');
var frame = composer.getElementsByAttribute('id', 'content-frame').item(0);
if (frame.editortype != 'textmail') {
  print('Sorry, you are not composing in plain text.');
  return;
}
var doc = frame.contentDocument.documentElement;

// XXX: This does not work because newlines are not in the string!
var text = doc.textContent;
print('Message content:');
print(text);
print('');

// Do a TreeWalker through the composition window DOM instead.
var body = doc.getElementsByTagName('body').item(0);
var acceptAllNodes = function(node) { return NodeFilter.FILTER_ACCEPT; };
var walker = document.createTreeWalker(
  body,
  NodeFilter.SHOW_TEXT | NodeFilter.SHOW_ELEMENT,
  { acceptNode: acceptAllNodes },
  false);

// Collect one entry per text node, plus an empty entry for each blank line.
var lines = [];
var justDidNewline = false;
while (walker.nextNode()) {
  if (walker.currentNode.nodeName == '#text') {
    lines.push(walker.currentNode.nodeValue);
    justDidNewline = false;
  }
  else if (walker.currentNode.nodeName == 'BR') {
    if (justDidNewline)
      // This indicates back-to-back newlines in the message text.
      lines.push('');
    justDidNewline = true;
  }
}
for (var a = 0; a < lines.length; a++) {
  print(a + ': ' + lines[a]);
}
I would appreciate any feedback as to whether I'm on the right track.
I also have some specific questions:
* Does `doc.textContent` really not have newlines? How stupid is
that? I'm hoping it's just a bug in the JavaScript console, but I
suspect not. A previous message on this board suggests using it, but
without the newlines it's useless to me.
* Is the TreeWalker correct? I first tried `NodeFilter.SHOW_TEXT`
but it did not traverse into the `<SPAN>`s which contain the quoted
material in a reply. Similarly, it seems funny to `FILTER_ACCEPT`
every node and then manually cherry-pick it later, but I had the same
problem where if I rejected a `SPAN` node, the walker would not step
inside.
* Consecutive `<BR>`s break the naive implementation because there
is no `#text` node in between them. So I manually detect them and
push empty lines on my array. Is it really necessary to do that much
manual work to access the message content?
Thanks very much.
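
On the TreeWalker question: a filter that returns FILTER_REJECT prunes the
node together with its whole subtree, while FILTER_SKIP leaves out only
that node and still descends into its children. Below is a minimal sketch
of the difference, using plain objects in place of real DOM nodes so it
can run outside Thunderbird. The helpers walk, text, element and
textAndBreaks are made up for illustration; only the three constant
values mirror NodeFilter.

```javascript
// Stand-ins with the same numeric values as the DOM NodeFilter constants.
var FILTER_ACCEPT = 1, FILTER_REJECT = 2, FILTER_SKIP = 3;

// Hypothetical helper: depth-first walk over objects shaped like DOM nodes
// ({ nodeName, nodeValue, childNodes }), applying the filter verdicts the
// way a TreeWalker does.
function walk(node, filter, visit) {
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var verdict = filter(children[i]);
    if (verdict === FILTER_ACCEPT) visit(children[i]);
    // ACCEPT and SKIP both descend; REJECT drops the whole subtree.
    if (verdict !== FILTER_REJECT) walk(children[i], filter, visit);
  }
}

// Hypothetical constructors for the fake nodes.
function text(value) {
  return { nodeName: '#text', nodeValue: value, childNodes: [] };
}
function element(name, kids) {
  return { nodeName: name, nodeValue: null, childNodes: kids || [] };
}

// Roughly what a plain-text reply looks like: quoted material inside a SPAN.
var body = element('BODY', [
  text('Hello'),
  element('BR'),
  element('SPAN', [text('> quoted line')])
]);

// Accept text nodes and BRs; give every other node the supplied verdict.
function textAndBreaks(verdictForOthers) {
  var seen = [];
  walk(body, function (node) {
    if (node.nodeName === '#text' || node.nodeName === 'BR') {
      return FILTER_ACCEPT;
    }
    return verdictForOthers;
  }, function (node) {
    seen.push(node.nodeName === 'BR' ? '<BR>' : node.nodeValue);
  });
  return seen;
}

var withSkip = textAndBreaks(FILTER_SKIP);     // still reaches the quoted text
var withReject = textAndBreaks(FILTER_REJECT); // never enters the SPAN
```

With a real TreeWalker the same idea would presumably mean returning
NodeFilter.FILTER_SKIP for the SPAN instead of rejecting it.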
>Does `doc.textContent` really not have newlines? How stupid is that?
>
That's 100% per spec. The text content of `foo<br>bar` has no newlines,
although it displays as two lines.
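
To make that concrete, here is a small sketch that models `foo<br>bar` with
plain objects (not real DOM nodes) and compares a textContent-style
concatenation with a variant that emits a newline per BR. Both function
names are hypothetical, and the second one is only an illustration of the
idea, not something the editor provides.

```javascript
// Plain objects shaped like DOM nodes, standing in for "foo<br>bar".
var fragment = {
  nodeName: 'BODY', nodeValue: null, childNodes: [
    { nodeName: '#text', nodeValue: 'foo', childNodes: [] },
    { nodeName: 'BR', nodeValue: null, childNodes: [] },
    { nodeName: '#text', nodeValue: 'bar', childNodes: [] }
  ]
};

// What textContent amounts to for an element: the descendant text nodes
// concatenated in document order. A BR contributes nothing.
function textContentOf(node) {
  if (node.nodeName === '#text') return node.nodeValue;
  var out = '';
  for (var i = 0; i < node.childNodes.length; i++) {
    out += textContentOf(node.childNodes[i]);
  }
  return out;
}

// Hypothetical variant: same walk, but each BR becomes a line break.
function textWithBreaks(node) {
  if (node.nodeName === '#text') return node.nodeValue;
  if (node.nodeName === 'BR') return '\n';
  var out = '';
  for (var i = 0; i < node.childNodes.length; i++) {
    out += textWithBreaks(node.childNodes[i]);
  }
  return out;
}

var flat = textContentOf(fragment);     // "foobar"
var broken = textWithBreaks(fragment);  // "foo\nbar"
```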
>Is it really necessary to do that much manual work to access the message content?
>
>
I think the easiest way is to create a range over the body and convert
that to a string, which should convert <br> to newlines. It also gives
you a textual representation of HTML content, which you may find useful.
--
Warning: May contain traces of nuts.
Of course, you're right. For some reason I had assumed that
Thunderbird would have had a convenient hook to access the message
content and I was just venting.
> >Is it really necessary to do that much manual work to access the message content?
>
> I think the easiest way is to create a range over the body and convert
> that to a string, which should convert <br> to newlines. It also gives
> you a textual representation of HTML content, which you may find useful.
Thank you very much. I hadn't considered a range. I will look into
that.
In the composer window:
window.gMsgCompose.editor.outputToString('text/plain', FLAGS);
See MDC for the possible FLAGS values.
Another valid MIME type to use with outputToString is 'text/html'.
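
A rough sketch of that call follows. The helper name composeBodyAsText
and the choice of OutputFormatted | OutputLFLineBreak are assumptions for
illustration only; the flag names are taken from nsIDocumentEncoder and
should be checked against MDC before relying on them.

```javascript
// Hypothetical helper. "editor" is expected to behave like
// window.gMsgCompose.editor, and "encoder" like
// Components.interfaces.nsIDocumentEncoder, which holds the Output* flags.
function composeBodyAsText(editor, encoder) {
  // Assumed flag choice: formatted plain text with LF line endings.
  var flags = encoder.OutputFormatted | encoder.OutputLFLineBreak;
  return editor.outputToString('text/plain', flags);
}

// Inside a compose window this would presumably be called as:
//   composeBodyAsText(window.gMsgCompose.editor,
//                     Components.interfaces.nsIDocumentEncoder);
```

Passing the editor and the flag holder in as arguments keeps the helper
easy to try out with stand-in objects outside a compose window.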
> I would appreciate any feedback as to whether I'm on the right track.
> I also have some specific questions:
>
> * Does `doc.textContent` really not have newlines?
Yes and no. There are no CR and/or LF characters, but there are a lot of
<BR> elements. The editor internally uses a DOM document (HTML), so in
most cases it just ignores CR and LF.
If you want CR and/or LF, use outputToString with the proper flags.
BTW, what kind of extension are you working on?
Anyway, for most tasks it is better to use existing high-level objects,
like gMsgCompose, instead of fighting with the DOM tree.
--
Arivald
That sounds great. Thank you very much.
> > I would appreciate any feedback as to whether I'm on the right track.
> > I also have some specific questions:
>
> > * Does `doc.textContent` really not have newlines?
>
> Yes and no. There are no CR and/or LF characters, but there are a lot of
> <BR> elements. The editor internally uses a DOM document (HTML), so in
> most cases it just ignores CR and LF.
> If you want CR and/or LF, use outputToString with the proper flags.
Yes, I have come to realize more about Mozilla and XUL and it makes
sense that the composer is much more sophisticated than just a
<textarea>.
> BTW, what kind of extension are you working on?
> Anyway, for most tasks it is better to use existing high-level objects,
> like gMsgCompose, instead of fighting with the DOM tree.
Thanks. I've been slightly frustrated at the lack of introductory
material on Thunderbird extensions, especially in task-oriented
format. The API coverage is nice but it's a bit of a learning curve
to start. So I fall back on my bad Firebug+DOM-browser habits of just
querying the DOM with JavaScript until I get something that works!
I am trying to make an extension where I can compose in Markdown
format, but the mail will be sent multipart/alternative with the HTML
version being rendered from the Markdown, perhaps by the Showdown
library. If I can accomplish this task, or even get close, I'd like
to get it into the extension registry, but first things first!