BDT Put processing

Ian C

Apr 4, 2016, 8:09:55 AM

to CorinthiaTeam

Hi Peter,

I have been tracing some behaviour in the BDT Put processing.

Namely, if I start with a document that has a single paragraph and append a paragraph, the resulting internal nodes are switched around compared to the original.

That is, within the text node we start with the nodes

text-sequence-decls

p

After the round trip I get

p

p
text-sequence-decls

If however I start with a document that has two paragraphs

text-sequence-decls
p

p

I end with

text-sequence-decls
p

p

The new paragraph is appended as I would expect.

So in the two-paragraph case my tests pass. In the single-paragraph case they do not, because of the swapped nodes.

The end document looks the same in OO of course.

Poking around in the BDT put code, the difference comes from this piece of processing:

    // Find the last node
    DFNode *last = NULL;
    if (concrete->last != NULL) {
        last = concrete->last;
        while ((last->prev != NULL) && !isVisible(ctx,last->prev))
            last = last->prev;
    }

In the single-paragraph case text-sequence-decls ends up as the last node, whereas in the two-paragraph case it is one of the p nodes.

This is a function of isVisible: the text-sequence-decls are not visible.
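To make the difference between the two cases concrete, here is a minimal sketch of the same walk-back loop run against a stand-in list. The Node struct, its visible flag, and the linkSiblings and findLast helpers are hypothetical; they only mimic the prev pointer and the isVisible check used in the quoted code and are not the DocFormats types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for DFNode: only what the loop touches. */
typedef struct Node {
    const char *name;
    bool visible;          /* stands in for isVisible(ctx, node) */
    struct Node *prev;
    struct Node *next;
} Node;

/* Link an array of nodes into one sibling list, in array order. */
static void linkSiblings(Node *nodes, size_t count)
{
    assert(nodes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        nodes[i].prev = (i > 0) ? &nodes[i - 1] : NULL;
        nodes[i].next = (i + 1 < count) ? &nodes[i + 1] : NULL;
    }
}

/* Same walk as the quoted loop: start at the final child and keep
   stepping back for as long as the previous sibling is invisible. */
static Node *findLast(Node *lastChild)
{
    Node *last = lastChild;
    if (last != NULL) {
        while ((last->prev != NULL) && !last->prev->visible)
            last = last->prev;
    }
    return last;
}
```

With the children text-sequence-decls, p the walk steps back from p onto the invisible text-sequence-decls and stops there, so that node is treated as the last one. With text-sequence-decls, p, p the previous sibling of the final p is visible, so the walk never moves.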

Can you shed any light on why it is the way it is? I assume isVisible is correct in returning false for the text-sequence-decls, since they cannot be seen in the editor?

I also see a degree of other juggling happening in the put processing, but generally it seems OK and I think I get what is going on.

With perhaps a query on the section marked
// Fixup stage - move nodes backwards as much as possible to their previous prevHidden

--

Cheers,

Ian C

Peter Kelly

Apr 8, 2016, 10:34:58 PM

to corinthiateam

Hi Ian, sorry for the delay in replying… I am busy traveling at the moment and will be so for another few days.

I’ve had a look at the issue and while you’re correct in assuming isVisible should return false for text-sequence-decls, the BDT algorithm is not behaving as expected in the one-paragraph case. In general, when you modify an element next to an invisible element, it may interpret the invisible element as being either before or after. I’m a bit fuzzy on the details of exactly what it does here and will have to investigate it further and get back to you.

However, what I recommend in all situations where the schema requires elements to be in a specific order (like having text-sequence-decls appear before any paragraphs), is to add code to explicitly fix this up after the BDT algorithm has run for the parent node in question (in this case, the <office:text> element). This “fixup” process should be done wherever you have some invisible elements that are not a form of content, and have to appear either at the beginning or end of the list of children for that node (possibly in conjunction with other such children).

So in the ODFTextPut function, right after the call to ODFContainerPut, you should add code to move the TEXT_SEQUENCE_DECLS element, if it exists, to the start of the child list.
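A rough sketch of what such a fixup could look like is below. It is self-contained and uses a hypothetical Node struct and hypothetical tag constants (TAG_PARAGRAPH, TAG_SEQUENCE_DECLS) in place of DFNode and the real tag ids, so the helper names here are assumptions and not the DocFormats API. In the real code the same idea would be applied to the <office:text> node after ODFContainerPut returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag ids, standing in for the real ones. */
typedef enum { TAG_PARAGRAPH, TAG_SEQUENCE_DECLS } Tag;

/* Hypothetical stand-in for DFNode with parent/sibling links. */
typedef struct Node {
    Tag tag;
    struct Node *parent, *first, *last, *prev, *next;
} Node;

/* Detach child from its parent's child list. */
static void unlinkNode(Node *child)
{
    Node *parent = child->parent;
    assert(parent != NULL);
    if (child->prev != NULL) child->prev->next = child->next;
    else parent->first = child->next;
    if (child->next != NULL) child->next->prev = child->prev;
    else parent->last = child->prev;
    child->parent = child->prev = child->next = NULL;
}

/* Insert a detached child at the start of parent's child list. */
static void prependChild(Node *parent, Node *child)
{
    child->parent = parent;
    child->prev = NULL;
    child->next = parent->first;
    if (parent->first != NULL) parent->first->prev = child;
    else parent->last = child;
    parent->first = child;
}

/* Insert a detached child at the end of parent's child list. */
static void appendChild(Node *parent, Node *child)
{
    child->parent = parent;
    child->next = NULL;
    child->prev = parent->last;
    if (parent->last != NULL) parent->last->next = child;
    else parent->first = child;
    parent->last = child;
}

/* Fixup: move the first child carrying the given tag, if there is
   one, to the start of the child list. Other children keep their
   relative order. */
static void moveToStart(Node *parent, Tag tag)
{
    for (Node *child = parent->first; child != NULL; child = child->next) {
        if (child->tag == tag) {
            if (child != parent->first) {
                unlinkNode(child);
                prependChild(parent, child);
            }
            return;
        }
    }
}
```

Calling moveToStart on a parent whose children are p, p, text-sequence-decls leaves them as text-sequence-decls, p, p. Calling it again changes nothing, so it is safe to run after every put.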

--
Dr. Peter M. Kelly
kell...@gmail.com

https://www.pmkelly.net/

PGP key: https://www.pmkelly.net/pgp-key

(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)


Peter Kelly

Apr 8, 2016, 11:04:26 PM

to corinthiateam

BTW, I noticed that dfconvert was crashing on OS X and was only able to get it successfully running on Linux. I found out that this is due to the use of the translateXMLEnumName array when writing to the concrete.json file.

For some reason or another, this array has become out of sync with the set of tag ids that are defined. Actually, translateXMLEnumName isn’t necessary at all; you can just use the DFTagName function to translate a numeric element/attribute tag into a string. This must be done with respect to a particular document, since the mappings may differ between documents (in the case where the document contains elements or attributes that are not pre-defined).

So for example where you have code like this:

fprintf(jsonFile, "{ \"%s\": \"%s\"}", translateXMLEnumName[n->attrs[i].tag], escStr);

Replace it with this:

fprintf(jsonFile, "{ \"%s\": \"%s\"}", DFTagName(n->doc,n->attrs[i].tag), escStr);

And then get rid of the translateXMLEnumName array altogether.

I assume the reason this was working on Linux and not OS X was that the tag name it was trying to look up was just beyond the end of the array, and the compiler or linker on Linux happened to allocate a bit of extra space at the end of the array which prevented a crash when an attempt was made to read beyond the end, whereas this presumably was not the case under OS X. Either way, it’s undefined behaviour, so it was only a matter of luck that this worked (some of the time).
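As an illustration of why the unchecked array index is fragile, here is a minimal sketch of a bounds-checked table lookup. The tagNames table and both helper functions are hypothetical and exist only for this example; it is not a substitute for DFTagName, which also handles names that are specific to a document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name table; the real set of tags is much larger. */
static const char *const tagNames[] = { "text:p", "text:span", "office:text" };
#define TAG_NAME_COUNT (sizeof(tagNames) / sizeof(tagNames[0]))

/* Return the name for tag, or NULL when tag lies outside the table.
   Indexing tagNames[tag] without this check is undefined behaviour
   for any tag >= TAG_NAME_COUNT. */
static const char *lookupTagName(unsigned int tag)
{
    if (tag >= TAG_NAME_COUNT)
        return NULL;
    return tagNames[tag];
}

/* Debug-build variant: fail loudly at the point of the bad lookup
   instead of returning NULL to the caller. */
static const char *requireTagName(unsigned int tag)
{
    assert(tag < TAG_NAME_COUNT);
    return tagNames[tag];
}
```

A caller passing the result to fprintf with a %s format would need to check for NULL first, since printing a NULL string is itself undefined.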

Ian C

Apr 9, 2016, 6:43:39 AM

to Peter Kelly, corinthiateam

Hi Peter,

I just pushed a change that uses DFTagName.

I was already using that in other spots.

And re the BDT, it works happily for my cases if I have

     // Find the last node
     DFNode *last = NULL;
     if (concrete->last != NULL) {
         last = concrete->last;
     }

     // Reinsert all the nodes in the correct order
     for (int i = count-1; i >= 0; i--) {
         DFNode *con = conChildren[i];

That is, it skips the walk back that tries to step over the invisible nodes.

--

Cheers,

Ian C
