Highlighting search query in pdf files

victor

Apr 1, 2010, 10:28:11 AM
to XTF Developer list
Hi,
I am currently working with the XTF search engine on XML-TEI and PDF files.
When TEI files are displayed, the search terms are highlighted in the
display.

In PDF files, however, the search terms are not highlighted. Can
anyone suggest how to make this work for PDF files as well? Can PDF
files be converted to XML-TEI files to make it feasible?

Thanks in advance,
Victor
vidy...@gmail.com

dan haig

Apr 1, 2010, 11:35:37 AM
to xtf-...@googlegroups.com
We're currently working on a solution that provides hit highlighting
in PDFs, even for advanced search results. For it to work, the search
result snippets need to come out in the order in which they occur in
the text, so that the PDF highlighting can do the math on the
coordinates and find the right instances in the file. We were
disappointed to see that xtf-2.2 has not addressed this, as far as I
could tell from my initial forays into it, so we're hacking it out in
the Java ourselves from here.

We have, though, pushed the xtf-2.2 jars out to production and they're
working fine, with nary a hiccup, so here's to a seriously bug-free
release! Thanks, XTF!

.d

> --
> You received this message because you are subscribed to the Google Groups "XTF Developer list" group.
> To post to this group, send email to xtf-...@googlegroups.com.
> To unsubscribe from this group, send email to xtf-devel+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/xtf-devel?hl=en.
>
>

Martin Haye

Apr 1, 2010, 1:01:10 PM
to xtf-...@googlegroups.com
Hi Dan,

Great to hear the XTF 2.2 code is working well for everyone.

Just a quick note: you can traverse the hits in document order from dynaXML
simply by doing this:

<xsl:for-each select="//xtf:hit"> ...

--Martin
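
As a minimal sketch of the suggestion above, the stylesheet below walks the hits in document order and numbers them. It assumes the dynaXML input marks hits with xtf:hit elements bound to the namespace http://cdlib.org/xtf; the output element names (hits, hit) and the order attribute are made up for illustration and are not part of XTF.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: lists search hits in document order.
     Assumption: hits are marked with xtf:hit elements in the
     namespace http://cdlib.org/xtf. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xtf="http://cdlib.org/xtf"
    exclude-result-prefixes="xtf">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- "hits" and "hit" are hypothetical output names -->
    <hits>
      <!-- //xtf:hit returns nodes in document order, so position()
           is the 1-based ordinal of each hit within the text -->
      <xsl:for-each select="//xtf:hit">
        <hit order="{position()}">
          <xsl:value-of select="normalize-space(.)"/>
        </hit>
      </xsl:for-each>
    </hits>
  </xsl:template>
</xsl:stylesheet>
```

Because the path expression already yields document order, no explicit xsl:sort is needed; the ordinal from position() could then be handed to whatever code maps hits to coordinates in the PDF.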

Martin Haye

Apr 1, 2010, 1:49:39 PM
to xtf-...@googlegroups.com
Hello Victor,

That is a pretty tricky thing to do. I don't know how to get Acrobat to
highlight the hits, and even if I could, how would one handle people who
don't use Acrobat (e.g. Mac users)?

What we do here at CDL is pretty involved. We have a separate tool that
renders the PDF pages as images and overlays the yellow hit boxes. It would
be a large and complex mechanism to include in XTF, the foremost problem
being that it would move XTF away from being pure Java (the rendering tool
is based on Poppler, which is written in C++).

Still, you can see what we've done if you like. A more in-depth description
is here:
<http://www.cdlib.org/services/publishing/tools/display_technology.html> and
the interface itself is here: <http://escholarship.org>

--Martin


vidya nand

Apr 1, 2010, 2:06:17 PM
to xtf-...@googlegroups.com
Sir, thanks a lot for the reply!!!

Victor
vidy...@gmail.com

Brouk

Apr 2, 2010, 5:56:30 AM
to XTF Developer list
Hi,
I use the PDF open parameters described here:
<http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf>
It's a little heavy when opening large PDFs, but it works fine on small
ones.


----------------------------
<xsl:template match="term" mode="text ......>
  -----
  <xsl:call-template name="rawDisplay.url">
    <xsl:with-param name="path"
        select="concat($path, '#toolbar=0&amp;search=', $keyword)"/>
  </xsl:call-template>

  .........

</xsl:template>
--------------------------
Hope this helps.

brouk
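
To make the fragment above easier to try in isolation, here is a minimal, self-contained XSLT 2.0 sketch of the same URL construction. The template name pdfSearchUrl and the default values of $path and $keyword are made up for illustration. The call to encode-for-uri() is an addition that is not in the original fragment: it keeps spaces and reserved characters in the keyword from breaking the URL, but whether a given PDF viewer decodes percent-escapes in the fragment is an assumption worth checking.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: builds a PDF URL that carries the search term as
     open parameters. pdfSearchUrl and the default parameter values
     are hypothetical. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Illustrative defaults; in XTF these would come from the request -->
  <xsl:param name="path" select="'/data/sample.pdf'"/>
  <xsl:param name="keyword" select="'hit highlighting'"/>

  <xsl:template name="pdfSearchUrl">
    <xsl:param name="path"/>
    <xsl:param name="keyword"/>
    <!-- toolbar=0 hides the viewer toolbar; search= passes the term.
         encode-for-uri() (XPath 2.0) percent-encodes the keyword. -->
    <xsl:value-of
        select="concat($path, '#toolbar=0&amp;search=', encode-for-uri($keyword))"/>
  </xsl:template>

  <xsl:template match="/">
    <xsl:call-template name="pdfSearchUrl">
      <xsl:with-param name="path" select="$path"/>
      <xsl:with-param name="keyword" select="$keyword"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML input with the defaults above, this writes /data/sample.pdf#toolbar=0&search=hit%20highlighting. Since the highlighting is done by the viewer itself, this approach only works where the PDF is opened in a viewer that honors these parameters.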

