ContentReplacer doesn't replace text with colon


Ruslan

Mar 28, 2019, 12:41:48 PM
to PDFTron PDFNet SDK
Hello,
I have a PDF document with three text strings:
[tag]
[:tag:]
[ta:g]

I am trying to replace them with this code:
public static byte[] ReplaceContent(byte[] content) {
    ContentReplacer replacer = CreateContentReplacer();
    using (var doc = new PDFDoc(content, content.Length)) {
        doc.InitSecurityHandler();

        PageIterator itr = doc.GetPageIterator();
        while (itr.HasNext()) {
            Page page = itr.Current();
            replacer.Process(page);
            itr.Next();
        }
        return doc.Save(SDFDoc.SaveOptions.e_remove_unused);
    }
}

private static ContentReplacer CreateContentReplacer() {
    var replacer = new ContentReplacer();
    replacer.AddString("tag", "!REPLACED!");
    replacer.AddString(":tag:", "!REPLACED!");
    replacer.AddString("ta:g", "!REPLACED!");
    return replacer;
}

The problem is: [:tag:] and [ta:g] are not replaced.

How can I replace them?

Ryan

Apr 1, 2019, 5:29:31 PM
to PDFTron PDFNet SDK
Hello, thank you for the detailed question.

Could you tell us why replacing text that contains a colon is important for you?

Are you creating these PDF files with the text "[ta:g]" in them? If not, who is?
In either case, can this key value be changed to not include colons?

rbat...@appulate.com

Apr 2, 2019, 12:59:45 PM
to PDFTron PDFNet SDK
Hello,
Yes, it's important.

We do not control the creation of the document containing "[ta:g]". We receive it from another system, so we cannot change the keys to exclude the colon, and we still need to replace the tag.

Is it a bug? Is it a known but undocumented feature? How can I replace such a tag?

P.S. I'm not able to replace the text with ElementReader and ElementWriter because "[:tag:]" is actually not a single text run but a sequence of text runs. That is why I need ContentReplacer to work in my case.
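
To illustrate the text-run point, here is a minimal sketch in plain C#. It does not use any PDFNet API, and the split of "[:tag:]" into three runs is an assumed example rather than the actual layout of the document. It only shows why searching each run separately misses a tag that appears once the runs are joined.

```csharp
using System.Linq;

public static class TextRunDemo
{
    // True if any single run contains the needle on its own.
    public static bool AnyRunContains(string[] runs, string needle) =>
        runs.Any(r => r.Contains(needle));

    // True if the needle appears once all runs are concatenated in order.
    public static bool JoinedContains(string[] runs, string needle) =>
        string.Concat(runs).Contains(needle);
}
```

With the assumed runs "[:", "tag", and ":]", a per-run search for "[:tag:]" finds nothing, while a search over the joined text finds it.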


Ryan

Apr 3, 2019, 6:18:48 PM
to PDFTron PDFNet SDK
Thank you for the clarifications.

We use a regex internally, and right now it only accepts 0-9, a-z, A-Z, underscore, hyphen, and whitespace in the key between the square brackets.
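
As a rough illustration of that restriction, the sketch below checks a key against a character class built from the list above. The pattern and the `KeyCheck` helper are hypothetical, modeled only on this description. The actual regex inside the SDK is not shown in this thread and may differ.

```csharp
using System.Text.RegularExpressions;

public static class KeyCheck
{
    // Hypothetical pattern: digits, ASCII letters, underscore, hyphen, whitespace.
    // This is an assumption based on the description, not the SDK's real regex.
    private static readonly Regex AcceptedKey =
        new Regex(@"^[0-9a-zA-Z_\-\s]+$");

    // Returns true if every character of the key is in the assumed accepted set.
    public static bool IsAccepted(string key) => AcceptedKey.IsMatch(key);
}
```

Under this assumed pattern, `KeyCheck.IsAccepted("tag")` passes, while `":tag:"` and `"ta:g"` are rejected because of the colon. A check like this could be run over keys before calling `AddString` to flag ones that may be silently skipped.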

I see that this is not documented at this time.

We are open to adding colon as accepted text in the key, but we would have to expand our unit tests first to make sure that there are no regressions.

In the meantime, are you aware of any other characters that might appear in your keys, such as & ; % $?

rbat...@appulate.com

Apr 4, 2019, 1:38:28 PM
to PDFTron PDFNet SDK
Thanks!
As far as I know, we only need colon and underscore.
Could you give me an approximate time frame for the fix (a week, a month, three months, etc.), assuming there are no regressions?

Ryan

Oct 21, 2019, 5:16:17 PM
to PDFTron PDFNet SDK
Thank you for your patience.

Starting tomorrow morning (Vancouver, Canada time), our stable production builds will support colons and underscores.

Please let us know how this new build works for you.