
Extracting HTML


Jordi Aragones vilella

May 11, 2008, 2:45:08 PM
Hi!

I have a problem and I can't find the solution.

I'm developing an app (with Ruby on Rails) and I'm trying to create a
PDF from some information that I have in my database.

The situation is this: my database has some fields that contain HTML
source, which was entered using FCKeditor. I'm now trying to build a PDF
using PDF::Writer, but when I insert that text into the PDF, the HTML
tags appear in the output (as is logical). My question is: is there any
function or plugin that lets me strip that markup, or convert from HTML
to plain text?

Thank you very much for your time.

Jordi
--
Posted via http://www.ruby-forum.com/.

Dan Diebolt

May 11, 2008, 2:50:23 PM
[Note: parts of this message were removed to make it a legal post.]

require 'hpricot'

# Parse the HTML, then ask for the text content with the tags stripped.
doc = Hpricot("<h1>Hello World</h1>")
doc.inner_text
=> "Hello World"

Jordi Aragones vilella

May 11, 2008, 3:21:23 PM
Dan Diebolt wrote:
> require 'hpricot'
>
> doc=Hpricot("<h1>Hello World</h1>")
> doc.inner_text
> => "Hello World"

Hi!! Thanks a lot for your answer. I'm a newbie and I still need to
learn a lot about Ruby and its libraries...

It worked fine!! Now I will dig a bit deeper into this library, because I
suppose that with it I will be able to keep some tags for my PDF
code, right? (For example, bold, underline...)

Thanks again for your answer! :)

Phlip

May 11, 2008, 3:55:10 PM
Jordi Aragones vilella wrote:

>> require 'hpricot'

> Hi!! Thanks a lot for your answer. I'm a newbie and I still need to
> learn a lot about Ruby and its libraries...
>
> It worked fine!! Now I will dig a bit deeper into this library, because I
> suppose that with it I will be able to keep some tags for my PDF
> code, right? (For example, bold, underline...)

Use a SAX-style XML parser to parse your strings as XHTML. SAX means the
parser calls a method for each tag it finds, so if you handle the <b> and
<u> tags you can stream their contents into the PDF with the matching
formatting.

Now google for [ruby xml sax], because I don't know what Ruby's SAX solution
is!

--
Phlip
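
For what it's worth, one way to try the SAX-style idea above is REXML's
stream parser, which ships with Ruby. The following is only a minimal
sketch, assuming the stored field is well-formed XHTML (FCKeditor output
may need cleaning first, e.g. HTML-only entities such as &nbsp;). The
names StyledTextListener, STYLE_TAGS and styled_runs are made up for this
example, and the step that writes each run into the PDF is left out.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects text runs together with the inline
# styles that are open at the moment each run is seen.
class StyledTextListener
  include REXML::StreamListener

  # Assumed mapping from tag names to style symbols (illustrative only).
  STYLE_TAGS = {
    'b' => :bold, 'strong' => :bold,
    'u' => :underline,
    'i' => :italic, 'em' => :italic
  }

  attr_reader :runs

  def initialize
    @runs = []    # array of [text, [styles]]
    @active = []  # stack of currently open style tags
  end

  # Called by the parser for every opening tag.
  def tag_start(name, _attrs)
    style = STYLE_TAGS[name.downcase]
    @active.push(style) if style
  end

  # Called for every closing tag; well-formed input keeps the stack balanced.
  def tag_end(name)
    @active.pop if STYLE_TAGS[name.downcase]
  end

  # Called for every run of character data between tags.
  def text(data)
    @runs << [data, @active.uniq] unless data.empty?
  end
end

# Hypothetical helper: returns the styled runs for an XHTML fragment.
def styled_runs(xhtml)
  listener = StyledTextListener.new
  # Wrap the fragment in one root element, since a fragment may contain
  # several top-level nodes and XML allows only one.
  REXML::Document.parse_stream("<root>#{xhtml}</root>", listener)
  listener.runs
end

styled_runs("Hello <b>bold</b> and <u>under</u>").each do |text, styles|
  puts "#{text.inspect} #{styles.inspect}"
end
```

Each [text, styles] pair could then be handed to whatever call writes
formatted text into the PDF; tags that are not in STYLE_TAGS are simply
dropped while their text is kept.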

