Impossible to get search results quads using TextSearch

Dhaya Benmessaoud

Apr 7, 2017, 11:28:26 AM
to PDFTron WebViewer
Hi,

I'm trying to implement a way of searching text in a document that ignores diacritics. With TextSearch, I'm able to get the results I want, but I cannot retrieve the quads for any of them (I get an empty array). That means I cannot focus the document on a result by selecting the relevant part.

Is there something I'm missing, or is this a bug in the SDK? For information, I have the same issue with the PDFNet iOS SDK.


function* search() {
    const pattern = 'tâche';
    const doc = yield pdf.document.getPDFDoc();
    const search = yield PDFNet.TextSearch.create();
    const modes = PDFNet.TextSearch.Mode.e_highlight | PDFNet.TextSearch.Mode.e_reg_expression | PDFNet.TextSearch.Mode.e_ambient_string;
    
    yield search.begin(doc, replaceDiacritics(pattern), modes);
    const firstResult = yield search.run();
    
    console.log(firstResult); // successfully displays the first occurrence
    console.log(yield firstResult.highlights.hasNext()); // false
    console.log(yield firstResult.highlights.getCurrentQuads()); // [] - no quads :(
}

PDFNet.runGeneratorWithCleanup(search());
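
The replaceDiacritics helper called above is not shown in the thread. Since the search runs in e_reg_expression mode, one plausible shape for it is a function that turns each letter into a character class covering its accented variants. The sketch below is only an assumption about what such a helper might look like: the VARIANTS table is hypothetical and incomplete, and whether TextSearch's regular-expression engine treats these classes the same way JavaScript's RegExp does is not confirmed here.

```javascript
// Hypothetical helper: NOT taken from the thread or from PDFNet.
// VARIANTS is an illustrative, incomplete table of accented forms.
const VARIANTS = {
    a: 'aàáâãäå',
    c: 'cç',
    e: 'eèéêë',
    i: 'iìíîï',
    n: 'nñ',
    o: 'oòóôõö',
    u: 'uùúûü',
    y: 'yýÿ',
};

function replaceDiacritics(pattern) {
    // Decompose accented letters (NFD), then drop the combining marks.
    const stripped = pattern.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
    // Escape regex metacharacters so the input is treated as plain text.
    const escaped = stripped.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    // Swap each known base letter for a character class of its variants,
    // keeping the case of the original letter.
    return escaped.replace(/[a-z]/gi, (ch) => {
        const variants = VARIANTS[ch.toLowerCase()];
        if (!variants) {
            return ch;
        }
        const cased = ch === ch.toLowerCase() ? variants : variants.toUpperCase();
        return `[${cased}]`;
    });
}
```

With this sketch, the pattern built from 'tâche' matches both 'tache' and 'tâche' when tested with a JavaScript RegExp.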


Regards,
Dhaya

Justin Jung

Apr 7, 2017, 4:46:08 PM
to PDFTron WebViewer on behalf of Dhaya Benmessaoud
Hello Dhaya,

Can you contact us via sup...@pdftron.com with the document?

Justin Jung
Software Developer
PDFTron Systems Inc.


Justin Jung

May 2, 2017, 8:04:55 PM
to PDFTron WebViewer
Hello Dhaya,

It appears the problem is that you need to call begin() on the Highlights object before calling hasNext() and getCurrentQuads(). For example, the following code appears to do what you would expect:

const pattern = 'tâche';
const doc = yield PDFNet.PDFDoc.createFromURL(tache_sample_url);
const search = yield PDFNet.TextSearch.create();
const modes = PDFNet.TextSearch.Mode.e_highlight | PDFNet.TextSearch.Mode.e_reg_expression | PDFNet.TextSearch.Mode.e_ambient_string;
    
yield search.begin(doc, pattern, modes);
const firstResult = yield search.run();
    
console.log(firstResult); // successfully displays the first occurrence
yield firstResult.highlights.begin(doc); // must be called before hasNext() and getCurrentQuads()
console.log(yield firstResult.highlights.hasNext()); // true
console.log(yield firstResult.highlights.getCurrentQuads()); // returns a valid quad

Note that a Highlights object acts as an iterator so that it can handle results spanning multiple pages. To obtain every highlight for a result, you will need to loop over the Highlights object in the following way:

for (yield firstResult.highlights.begin(doc); yield firstResult.highlights.hasNext(); yield firstResult.highlights.next()) {
    // call getCurrentQuads() and getCurrentPageNumber() here
}
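
As a rough illustration of the loop above, the following sketch collects the page number and quads of every highlight into an array. The helper name collectHighlightQuads and the shape of the returned objects are assumptions made for this example; it only relies on the Highlights methods mentioned in this thread (begin, hasNext, next, getCurrentQuads, getCurrentPageNumber).

```javascript
// Sketch only: collectHighlightQuads is a hypothetical helper name.
// `highlights` is assumed to behave like the Highlights object of a search result.
function* collectHighlightQuads(highlights, doc) {
    const collected = [];
    // begin() has to run first, otherwise hasNext() reports false.
    yield highlights.begin(doc);
    while (yield highlights.hasNext()) {
        const pageNumber = yield highlights.getCurrentPageNumber();
        const quads = yield highlights.getCurrentQuads();
        collected.push({ pageNumber, quads });
        // Advance the iterator to the next highlight.
        yield highlights.next();
    }
    return collected;
}
```

Inside a generator passed to PDFNet.runGeneratorWithCleanup, it could be delegated to with `const all = yield* collectHighlightQuads(firstResult.highlights, doc);`, so that every yielded call is resolved by the same runner.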

Dhaya Benmessaoud

May 4, 2017, 11:47:22 AM
to PDFTron WebViewer
Nice, that works now! Thanks for your answer :)