Error when processing table with split instrText

Eduardo Fernandes

Jan 2, 2017, 1:58:30 PM

to xdocreport

Hi.

First of all, thank you very much for such a useful piece of software. Nice job, guys.

I'm using XDocReport version 1.0.6, on Windows 10 with JDK8.

I just picked the example from your distribution. It works very well.

The problem arises when I manually edit the mergefield inside the table. Then, as you well know, the field is split into several instrText elements, which XDocReport apparently doesn't handle correctly. If I do the same in the body text (outside the table), everything works fine.

I can't edit the templates because they are sent by an external system and I just have to process them.

Do you have any suggestions? Is this a known issue? I couldn't find it when searching the issue tracker.

Many thanks in advance.

/Eduardo

DocxProjectWithFreemarkerAndImageList.docx

DocxProjectWithFreemarkerAndImageList-Split.docx

DocxProjectWithFreemarkerAndImageList.java

Angelo zerr

Jan 2, 2017, 4:24:57 PM

to xdocr...@googlegroups.com

Hi Eduardo,

XDocReport should support split mergefields, but it seems that it doesn't support split hyperlinks.

It's a bug. I suggest that you create an issue at https://github.com/opensagres/xdocreport/issues but, to be honest with you, I have no time to fix it.

Regards, Angelo

--
You received this message because you are subscribed to the Google Groups "xdocreport" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xdocreport+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Eduardo Fernandes

Jan 13, 2017, 8:18:14 AM

to xdocreport

Hi.

Sorry for the delay in answering. I took a short vacation :)

Thanks for your response. What I've observed is that, indeed, standard MERGEFIELD fields are processed correctly but HYPERLINK fields are not, when they are outside the table.

Nevertheless, if you edit (and thereby split) any of the fields (MERGEFIELD or HYPERLINK) inside the table, we get an error. Please find attached another docx example I've used to reproduce the error. In this case the problem is not related to the HYPERLINK but to the developer.lastName field inside the table.

I'll try to build a fix that cleans the document.xml on the fly and let you know.

Best regards and thanks again.

DocxProjectWithFreemarkerAndImageList.docx

Eduardo Fernandes

Jan 13, 2017, 8:47:32 AM

to xdocreport

Hello again Angelo.

I think I could create a simple filter to wipe out the elements which interfere with your parser. I would apply the same logic to ALL instrText fields: join them together and ignore the splits. I've created a simple prototype and it works smoothly with the previous template, with splits on both sides of the table, and it also fixes the hyperlink field.

The filter creates a very small temporary linear DOM of nodes and, after encountering the special "separate" field element, removes any intermediate nodes and merges the characters into a single instrText field. The filter also converts Word format (MERGEFIELD field_name) to FreeMarker syntax (MERGEFIELD ${field_name}) and things like that; I suppose this is not interesting for you. The filter joins any instrText elements, so it also works with [#if....] etc. FreeMarker elements.
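
As a rough illustration of the joining step described here, the following is a minimal SAX sketch. It is not the prototype from this thread and not XDocReport's preprocessor: the class name InstrTextJoiner and its join helper are hypothetical. Instead of keeping a list of pending nodes, it simply drops everything that sits between the pieces of an instruction, so formatting (w:rPr) of the later runs is lost. It assumes well-formed WordprocessingML in which every w:fldChar is a direct child of a w:r.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical sketch: merges the pieces of a split field instruction into the first w:instrText. */
final class InstrTextJoiner extends XMLFilterImpl {
    private static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    private StringBuilder text;      // non-null while an instruction is being collected
    private boolean inInstrText;     // currently inside a w:instrText element
    private boolean firstRunClosed;  // the run holding the first w:instrText has ended
    private boolean pendingRun;      // a swallowed w:r is currently open
    private int open;                // number of swallowed elements still open
    private String instrQName, runQName;
    private Attributes runAtts;

    private static boolean isW(String uri, String local, String name) {
        return W.equals(uri) && name.equals(local);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        if (text == null) {                       // pass-through mode
            if (isW(uri, local, "instrText")) {   // first piece: start collecting
                text = new StringBuilder();
                inInstrText = true;
                instrQName = qName;
            }
            super.startElement(uri, local, qName, atts);
        } else if (isW(uri, local, "fldChar")) {  // separate/end marker: instruction complete
            flush();
            super.startElement(uri, local, qName, atts);
        } else {                                  // swallow whatever sits between the pieces
            open++;
            if (isW(uri, local, "instrText")) inInstrText = true;
            if (isW(uri, local, "r")) {
                pendingRun = true;
                runQName = qName;
                runAtts = new AttributesImpl(atts);  // copy: the parser may reuse atts
            }
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (text == null) {
            super.endElement(uri, local, qName);
        } else if (open > 0) {                    // closes an element we swallowed
            open--;
            if (isW(uri, local, "instrText")) inInstrText = false;
            if (isW(uri, local, "r")) pendingRun = false;
        } else if (isW(uri, local, "instrText")) {
            inInstrText = false;                  // first w:instrText: closed in flush()
        } else if (isW(uri, local, "r") && !firstRunClosed) {
            firstRunClosed = true;                // first run: closed in flush()
            runQName = qName;
        } else {                                  // enclosing element ends: unterminated field
            flush();
            super.endElement(uri, local, qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (text == null) super.characters(ch, start, length);
        else if (inInstrText) text.append(ch, start, length);
    }

    /** Emits the joined text, closes the first run and reopens the run we were swallowing. */
    private void flush() throws SAXException {
        char[] joined = text.toString().toCharArray();
        super.characters(joined, 0, joined.length);
        super.endElement(W, "instrText", instrQName);
        if (firstRunClosed) super.endElement(W, "r", runQName);
        if (pendingRun) super.startElement(W, "r", runQName, runAtts);
        text = null;
        firstRunClosed = pendingRun = false;
        open = 0;
    }

    /** Runs the filter over an XML string and returns the serialized result. */
    static String join(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        InstrTextJoiner joiner = new InstrTextJoiner();
        joiner.setParent(factory.newSAXParser().getXMLReader());
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer().transform(
            new SAXSource(joiner, new InputSource(new StringReader(xml))), new StreamResult(out));
        return out.toString();
    }
}
```

Given a paragraph where " MERGEFIELD dev" and "eloper.lastName " sit in two separate runs, the filter forwards a single w:instrText containing " MERGEFIELD developer.lastName " followed by the run that holds the separate marker. A real fix would also have to decide what to do with the dropped run properties.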

Would you be interested in the code? It would be nice if it could be integrated in your library so I could delete some lines of code from our side.

Best regards.

Angelo zerr

Jan 13, 2017, 9:09:26 AM

to xdocr...@googlegroups.com

Hi Eduardo,

We will be happy to accept your PR, but handling merged instrText is the main problem that XDocReport tries to manage, and we need to be careful when this code changes. That's why we have created a lot of JUnit tests. I know the code is very hard to understand, but if you wish to contribute to XDocReport:

* please follow the same idea as the existing code (don't use a DOM preprocessor, but update the existing SAX preprocessor code)

* please support both Freemarker and Velocity

* please add several JUnit tests like https://github.com/opensagres/xdocreport/tree/master/integrationtests/fr.opensagres.xdocreport.core.test/src/test/java/fr/opensagres/xdocreport/document/docx/preprocessor

Here is a sample JUnit test that merges 2 instrText elements: https://github.com/opensagres/xdocreport/blob/master/integrationtests/fr.opensagres.xdocreport.core.test/src/test/java/fr/opensagres/xdocreport/document/docx/preprocessor/DocxPreprocessorWith2InstrText.java

I know it's a lot of work and I hope you will not be discouraged, but please understand that XDocReport must continue to be stable, and I would not find time to fix your contribution if there are problems. I hope you will understand.

Regards, Angelo

Eduardo Fernandes

Jan 13, 2017, 9:26:24 AM

to xdocreport

Hi.

Thanks for the info.

Comments: I'm not using DOM to fix it. I'm using SAX. I build a temporary look-ahead list of elements so that I can remove them later if needed. This list is very small (typically 20 nodes) and never extends beyond the current run element, so the memory footprint does not change. The reason for doing it this way is to keep the code clean: I don't have to add extra states to the SAX processing, and at the end of the separate element we just flush the pending nodes, removing the unnecessary entries. You have to store the pending entries somewhere, whatever your solution; in fact, your code already does that when processing. I think the two approaches are philosophically compatible.

I already have specific JUnit tests, and of course any existing test will also validate the change, since everything will go through the new code.

In your case I'll remove the automatic FreeMarker translation, so don't worry about it. In our case it is mandatory, but in your case it is useless.

We already have 500 tests specific to Word merge generation (written in a proprietary framework) and all of them work fine with the new code. I'll polish it a bit before sending you any fix.

Many many thanks for your time on this.

Regards.

Angelo zerr

Jan 13, 2017, 9:32:40 AM

to xdocr...@googlegroups.com

Eduardo

It's fantastic news, and we will be happy to accept your PR. If the XDocReport tests are passing, that's very good news! Please add as many tests as you can.

Are you interested in becoming an XDocReport committer? If you are, don't hesitate to tell us.

Thanks

Regards, Angelo

Eduardo Fernandes

Jan 13, 2017, 10:24:52 AM

to xdocreport

Many thanks. I'll try to find some time to merge our fix into your code and send you a pull request. Right now it is implemented as a stream filter, so I didn't make any change in the library. The filter receives the docx input stream and generates a new InputStream, which is the one sent to XDocReport.
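
To make the stream-filter idea concrete, here is a small sketch of how a docx InputStream could be rewritten into a new InputStream before it reaches the library. It is an assumption about the general shape, not the code used in this thread: the class DocxStreamFilter and the rewrite callback are hypothetical names, only word/document.xml is touched (headers and footers live in other parts), and the whole archive is buffered in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.UnaryOperator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch of a docx-to-docx stream filter; not part of XDocReport. */
final class DocxStreamFilter {

    /** Copies the docx archive, passing word/document.xml through {@code rewrite}. */
    static InputStream filter(InputStream docx, UnaryOperator<byte[]> rewrite) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(docx);
             ZipOutputStream out = new ZipOutputStream(result)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                byte[] data = readAll(in);  // reads only the current entry
                if ("word/document.xml".equals(entry.getName())) {
                    data = rewrite.apply(data);  // e.g. join the split instrText here
                }
                // A fresh ZipEntry lets the output stream recompute sizes and CRC.
                out.putNextEntry(new ZipEntry(entry.getName()));
                out.write(data);
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ByteArrayInputStream(result.toByteArray());
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        for (int n; (n = in.read(chunk)) != -1; ) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

The returned stream is what would be handed to the report loader in place of the original template stream, so the library itself stays untouched.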

Thanks again.

Regards

/Eduardo

Eduardo Fernandes

Jan 13, 2017, 10:26:21 AM

to xdocreport

For sure, thanks for the invitation. I'll think about that. Right now I have enough trouble with my own libraries :)

On Friday, January 13, 2017, 15:32:40 (UTC+1), Angelo wrote:

Eduardo Fernandes

Jan 13, 2017, 2:07:09 PM

to xdocreport

Hi Angelo.

I was reading your code a bit. To implement the change we have two options:

1) merge my changes into your DocXBufferedDocumentContentHandler, or

2) implement a preceding handler which sends already-filtered events to your handler

Advantages of 2 over 1: the instrText processing gets completely isolated from your core processing. You could later remove any instrText merging logic, since every instrText will arrive at the next handler already joined. To me it looks much clearer than merging the two kinds of processing together. Of course the performance will be exactly the same, in pure SAX style.

Normally I prefer specialized handlers to avoid this kind of mixing, and handler piping is very useful in this context.
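
Option 2 can be pictured with the standard SAX filter mechanism. The sketch below is only an illustration of the piping, under the assumption that the existing handler is an ordinary ContentHandler: DropFilter, Recorder and the element name "noise" are hypothetical stand-ins, with Recorder playing the role of the unchanged downstream handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class HandlerPiping {

    /** Stage 1 (hypothetical): hides one element, with its whole subtree, from later stages. */
    static final class DropFilter extends XMLFilterImpl {
        private final String dropped;
        private int depth;  // > 0 while inside a dropped element

        DropFilter(String dropped) { this.dropped = dropped; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
            if (depth > 0 || qName.equals(dropped)) depth++;
            else super.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth > 0) depth--;
            else super.endElement(uri, local, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (depth == 0) super.characters(ch, start, length);
        }
    }

    /** Stage 2: stands in for the existing handler, which needs no changes. */
    static final class Recorder extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Wires parser -> filter -> downstream handler and returns what the handler saw. */
    static List<String> run(String xml) {
        try {
            DropFilter filter = new DropFilter("noise");
            filter.setParent(SAXParserFactory.newInstance().newSAXParser().getXMLReader());
            Recorder downstream = new Recorder();
            filter.setContentHandler(downstream);
            filter.parse(new InputSource(new StringReader(xml)));
            return downstream.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The downstream handler only ever sees events that the first stage chose to forward, which is the isolation argued for here; events still stream through one at a time, so nothing is buffered beyond what a stage keeps for itself.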

What do you think?

Regards.

On Friday, January 13, 2017, 15:32:40 (UTC+1), Angelo wrote:

Eduardo Fernandes

Jan 13, 2017, 2:52:38 PM

to xdocreport

For sure, option 2 looks a lot more object-oriented.

Also, if you think it is a good idea, I would like to add a setter for an instrText modifier, to enable programmers to dynamically change variable names, syntax, etc. of each mergefield in the very same parsing cycle. This enables, for example, adding or removing loop cycles by replacing variable with "container.variable", adding manual support for extra syntax in fields, and so on.
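
One possible shape for such a hook is sketched below. Everything in it is hypothetical: the class InstrTextRewriter, the setter name setInstrTextModifier and the helper prefixVariable are illustrative, not an existing XDocReport API. The example modifier assumes plain Word-style instructions of the form "MERGEFIELD name" and leaves every other instruction unchanged.

```java
import java.util.Objects;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical hook: lets callers rewrite each already-joined field instruction. */
final class InstrTextRewriter {
    // Leading part up to the field name, the name itself, then any switches.
    private static final Pattern MERGEFIELD =
        Pattern.compile("(\\s*MERGEFIELD\\s+)(\\S+)(.*)", Pattern.DOTALL);

    private UnaryOperator<String> modifier = UnaryOperator.identity();

    /** Illustrative setter; by default instructions pass through unchanged. */
    void setInstrTextModifier(UnaryOperator<String> modifier) {
        this.modifier = Objects.requireNonNull(modifier);
    }

    /** Would be called once per field, after the split pieces have been joined. */
    String rewrite(String instruction) {
        return modifier.apply(instruction);
    }

    /** Example modifier: turns "MERGEFIELD name" into "MERGEFIELD prefix.name". */
    static UnaryOperator<String> prefixVariable(String prefix) {
        return instruction -> {
            Matcher m = MERGEFIELD.matcher(instruction);
            if (!m.matches()) {
                return instruction;  // HYPERLINK and other fields are left alone
            }
            return m.group(1) + prefix + "." + m.group(2) + m.group(3);
        };
    }
}
```

With `setInstrTextModifier(InstrTextRewriter.prefixVariable("container"))`, the instruction " MERGEFIELD variable " is rewritten to " MERGEFIELD container.variable ". Because the modifier runs on the joined text, it never has to deal with an instruction that Word has split across runs.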

Regards.

Angelo zerr

Jan 13, 2017, 3:44:30 PM

to xdocr...@googlegroups.com

Eduardo,

It's a little hard for me to understand your idea. It has been a long time since I developed this feature.