How to preserve line breaks in form Text multiline fields?


Skip Gaede

Oct 13, 2014, 7:37:05 PM
to pdfnet-w...@googlegroups.com
Hi,

I have done some web searches about this topic (How to preserve line breaks in text fields on a form) and the answer was to set the field as rich text. A <return> is treated as a paragraph break <p>, and a shift+<return> is treated as a line break <br/>. Creating a multiline Text field with and without rich text enabled revealed no difference in the annotation file. Is there a way to modify the exported annotations so that when the annotations are read back in line breaks will be inserted at the correct spots?

Original Input:
[code]
1. Line 1<return>
2. Line 2<return>
3. Line 3<return>
[/code]

Reload Annotations:
[code]
1. Line 1 2. Line 2 3. Line 3
[/code]

--skip

Matt Parizeau

Oct 14, 2014, 1:57:32 PM
to pdfnet-w...@googlegroups.com
Hi Skip,

When you say set the field as rich text do you mean setting the contenteditable attribute? I remember playing around with this attribute and it would add <p> and <br/> when editing the element, though it didn't seem to be exactly consistent across browsers.

Is the problem that the text doesn't display correctly in a normal element? If you set the CSS styles word-wrap: break-word and white-space: pre-wrap, that may be what you want. For example:
<div style="word-wrap:break-word; white-space:pre-wrap">
multi
line
text
</div>

And this should preserve the line breaks.

Matt Parizeau
Software Developer
PDFTron Systems Inc.

Skip Gaede

Jan 21, 2015, 10:26:08 AM
to pdfnet-w...@googlegroups.com
Hi Matt,

I discovered that the field data was stored in the annotation file twice, with two encodings. Looking at the second encoding, the <return>s were encoded as 0xA. Changing them to &#xA; (an encoded <return>) caused the data to be displayed correctly. (I did not figure this out myself; I found a post by someone else who had figured it out first.)  This substitution is done only within the data for multiline entries, and is done during the GET transaction.

--Skip


Skip Gaede

Jan 21, 2015, 1:54:03 PM
to pdfnet-w...@googlegroups.com
Hi Matt,

I'm not at all familiar with the approach you described. I am creating the form document in Microsoft Word, saving it as a PDF, and bringing it into Adobe Acrobat Pro, where I then add fields to the document. The multiline fields have the properties shown in the attached file.

The technique of substituting a quoted line break for the 0xA character works from a visual standpoint. However, if I then bring up the form and try to edit the contents, the experience is less than fantastic. Placing the cursor at the end of the line and hitting backspace moves the cursor back, but the characters are not replaced with whitespace. After backing up, typing new characters appears to occur on a new layer, overlaying the old characters (which are still visible). Saving the file and reloading it produces the correct results, though. If this is expected behavior, I can live with it, but it would be nice if there were a way to make it work more cleanly.

Thanks, Skip

Matt Parizeau

Jan 22, 2015, 12:45:44 PM
to pdfnet-w...@googlegroups.com
Hi Skip,

In my previous response I think I was confused because you mentioned <p> and <br/> so I had thought you wanted a multiline div element in WebViewer.

Can you clarify a little about what you're doing so we're on the same page? If I understand correctly, you're creating a multiline text field in the PDF and then converting it to XOD. Do you see the problem when editing the form field in the XOD file? Or does it appear when you edit in WebViewer, export the data, and import it back into the PDF, so that the issues show up when editing the merged PDF?

Could you send an example PDF and XOD that has this problem?

Thanks,

Matt Parizeau
Software Developer
PDFTron Systems Inc.

Skip Gaede

Jan 23, 2015, 11:09:39 AM
to PDFTron WebViewer on behalf of Matt Parizeau
Hi Matt,

Thank you for responding. I had not considered seeing how the PDF Form was handled by Acrobat Reader and comparing the two experiences (xod vs PDF). I’m also fairly new to PDF Forms if that makes any difference.

What am I trying to do? We are creating a training program. Until recently, the course used Word documents that were submitted via email. A PDF form would let the student fill it out online and let the submission be processed by computer. One of the forms is filled out at home, possibly over more than one session. Right now we have a prototype running under WordPress.

Back to findings:

Filling out the XOD form: 
Starting with an empty form, I can fill out both single line and multiline fields and editing of the data in either type of field works as expected. When the form is saved, the data is presented twice in the annotation file: once in plain text and once encoded in XML. In both cases, the end-of-line is represented with a literal LF (0xA).

A snippet of the XML encoding looks like this:

v="1. Line 1
2. Line 2
3. Line 3"

The end-of-line character when read back in is converted to a single space so we see

1. Line 1 2. Line 2 3. Line 3

As reported earlier, I tweaked the AnnotationHandler code to replace <LF> inside quoted strings with &#xA; resulting in

v="1. Line 1&#xA;2. Line 2&#xA;3. Line 3"
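
The substitution described above can be sketched as follows. This is only an illustration in Python, not the actual AnnotationHandler code (which is not shown in this thread). The function name escape_line_feeds is hypothetical, and the regex assumes that values are double-quoted and never contain a raw double quote. It also treats every double-quoted run in the text as a value, so it is only suitable for a fragment that contains nothing but tags and attributes.

```python
import re

# Matches one double-quoted string. Assumption: a value never contains a raw '"'.
QUOTED_VALUE = re.compile(r'"[^"]*"')


def escape_line_feeds(annotation_xml):
    """Replace literal LF characters inside double-quoted values with &#xA;.

    Line feeds outside of quoted values (e.g. between attributes) are left alone.
    """
    def encode(match):
        return match.group(0).replace("\n", "&#xA;")

    return QUOTED_VALUE.sub(encode, annotation_xml)


# Hypothetical fragment; the element name "field" is made up for this example.
raw = '<field\n  v="1. Line 1\n2. Line 2\n3. Line 3"/>'
print(escape_line_feeds(raw))
```

A serializer that escapes attribute values itself would be a more robust choice than post-processing the text; the regex only mirrors the workaround described in this post.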

On input, the quoted end-of-line is replaced by <LF> and the form looks the way it should. This morning I realized, and confirmed, that according to the XML spec, “real” control characters inside quoted attribute values are supposed to be encoded, the way I did it in the code, and that bare line feeds inside quoted attribute values are replaced by spaces when the file is parsed. I believe this needs to be fixed in the PDFTron code.
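
The normalization behavior described above can be checked with a conforming XML parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree module and a hypothetical <field> element (the real element names in the annotation file are not shown in this thread):

```python
import xml.etree.ElementTree as ET

# Attribute value with bare line feeds, as originally written to the annotation file.
bare = ET.fromstring('<field v="1. Line 1\n2. Line 2\n3. Line 3"/>')

# Same value with each line feed written as the character reference &#xA;.
encoded = ET.fromstring('<field v="1. Line 1&#xA;2. Line 2&#xA;3. Line 3"/>')

# The parser normalizes each bare line feed in an attribute value to a space.
print(repr(bare.get("v")))     # '1. Line 1 2. Line 2 3. Line 3'

# Character references survive normalization and come back as real line feeds.
print(repr(encoded.get("v")))  # '1. Line 1\n2. Line 2\n3. Line 3'
```

This matches what was observed on reload: the bare form collapses to a single line, while the encoded form keeps its three lines.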

When the “fixed” annotations are read back in, attempting to edit either single line or multiline data results in the cursor moving backward as the backspace key is pressed, but the backspaced characters are not visibly erased, and typing replacement characters results in the new characters being overlaid on top of the old characters. The backspaced characters are actually erased: I can backspace over three characters in the middle of a word and replace them with one character and the result is a word with 2 fewer characters. (I need to save the change and refresh the form to see this, however.) While typing this, I had an “Aha Moment.” I have been using forms created with the PDFTron Watermark. Is it possible that the Watermark is responsible for the atypical behavior?

Editing PDF Form in Acrobat Reader works as expected. 

Thanks,
Skip

Archive.zip

Matt Parizeau

Jan 23, 2015, 1:46:29 PM
to pdfnet-w...@googlegroups.com
Hi Skip,

Thanks for the detailed explanation and the files.

I believe you're running into a bug where a duplicate field is being added to the page. So when you edit one of the fields you can see the other field underneath. Also you're right, the field values aren't being escaped correctly.

I've created a new build that includes a fix for the duplicate elements and should properly escape the values. Can you try out this build and let me know if it fixes your issues: http://pdftron.com/ID-zJWLuhTffd3c/WebViewer/WebViewer_1.8.1.29238.zip

Matt Parizeau
Software Developer
PDFTron Systems Inc.

Skip Gaede

Jan 26, 2015, 12:39:20 PM
to PDFTron WebViewer on behalf of Matt Parizeau
Hi Matt,

You’re fantastic. Both problems fixed as slick as a whistle.

Now I know why they pay you big bucks,
—skip
