Fetching cdata content of tags with Mojo::DOM?

Charlie Brady

Jul 21, 2014, 3:38:31 PM
to mojol...@googlegroups.com

I have some translation files which are XML documents, some of which
contain CDATA sections:

...
<entry>
<base>ACTIVATE</base>
<trans>Activate</trans>
</entry>
<entry>
<base>UNREG_DESC</base>
<trans>
<![CDATA[
To obtain more information about blah, please visit our website
<A HREF="http://www.domain.com/" TARGET=_top>
http://www.domain.com/</A>
]]>
</trans>
</entry>
...

I'm trying to turn this into a hash from which I can look up the
translations:

my %lexicon = map
{ $entry->base->content => $entry->trans->content}
$dom->find('lexicon entry')->each;

but then I find that $lexicon{UNREG_DESC} gives me '<![CDATA[ ...' when I
want 'To obtain more information ...'. I think I need to initialise a new
Mojo::DOM object with '<![CDATA[ ...' but I can't figure out what to do
next. Hints please.

Charlie Brady

Jul 21, 2014, 3:59:31 PM
to mojol...@googlegroups.com

On Mon, 21 Jul 2014, Charlie Brady wrote:

> I'm trying to turn this into a hash from which I can look up the
> translations:
>
> my %lexicon = map
> { $entry->base->content => $entry->trans->content}
> $dom->find('lexicon entry')->each;
>
> but then I find that $lexicon{UNREG_DESC} gives me '<![CDATA[ ...' when I
> want 'To obtain more information ...'. I think I need to initialise a new
> Mojo::DOM object with '<![CDATA[ ...' but I can't figure out what to do
> next. Hints please.

I can give my own hint:

http://mojolicio.us/perldoc/Mojo/DOM#contents

So I can do:

my %lexicon = map
{ $entry->base->content => $entry->trans->contents->first->content}

Charlie Brady

Jul 21, 2014, 4:02:41 PM
to mojol...@googlegroups.com

On Mon, 21 Jul 2014, Charlie Brady wrote:

> So I can do:

Without the stupid c&p error:

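my %lexicon = map
{ my $entry = $_ ;
$entry->base->content => $entry->trans->contents->first->content }
$dom->find('lexicon entry')->each;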

Charlie Brady

Jul 21, 2014, 4:33:31 PM
to mojol...@googlegroups.com
No, that doesn't work for me, because of whitespace.

cf.

$dom->parse('<!-- test --><b>123</b>')

and:

$dom->parse(' <!-- test --><b>123</b>')

Can anybody suggest anything, should of using s/// and parsing a new
Mojo::DOM object? I can see Mojo::DOM::text, but don't see how I could use
it.
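
The difference between those two inputs can be made visible by asking for the type of the first child node. This is only a minimal sketch, and it assumes a recent Mojolicious release in which the methods are named child_nodes and type (the 2014 API discussed in this thread used contents instead); the @first_types array is an illustrative name.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Record the type of the first child node for each input.
my @first_types;
for my $markup ('<!-- test --><b>123</b>', ' <!-- test --><b>123</b>') {
  my $first = Mojo::DOM->new($markup)->child_nodes->first;
  push @first_types, $first->type;
  printf "%-7s [%s]\n", $first->type, $first->content;
}
```

With the leading space, the first child is a whitespace-only text node rather than the comment, which is why picking the first child by position is fragile.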


Charlie Brady

Jul 21, 2014, 4:47:43 PM
to mojol...@googlegroups.com
Sorry, everyone. I need more sleep. That should say:

.. short of using s/// and parsing a new Mojo::DOM object.

I have my code working that way now, but it seems excessive to parse those
fragments all over again.
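
One way to avoid both the s/// step and the second parse might be to select the CDATA node by type instead of by position, so that surrounding whitespace nodes no longer matter. The following is a sketch under assumptions, not a confirmed answer from this thread: it assumes a recent Mojolicious release (child_nodes, type, and first with a callback), wraps the sample entries in a hypothetical <lexicon> root element to match the 'lexicon entry' selector, and forces XML mode so that <base> is not treated as an HTML void element.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Sample data from the question, wrapped in an assumed <lexicon> root.
my $xml = <<'XML';
<lexicon>
<entry>
<base>ACTIVATE</base>
<trans>Activate</trans>
</entry>
<entry>
<base>UNREG_DESC</base>
<trans>
<![CDATA[
To obtain more information about blah, please visit our website
<A HREF="http://www.domain.com/" TARGET=_top>
http://www.domain.com/</A>
]]>
</trans>
</entry>
</lexicon>
XML

# XML mode keeps tag names case-sensitive and gives <base> no special meaning.
my $dom = Mojo::DOM->new->xml(1)->parse($xml);

my %lexicon = map {
  my $entry = $_;
  my $trans = $entry->at('trans');

  # Take the first CDATA child wherever it sits among whitespace nodes;
  # fall back to the element's own text when there is no CDATA section.
  my $cdata = $trans->child_nodes->first(sub { $_->type eq 'cdata' });
  my $value = $cdata ? $cdata->content : $trans->text;
  $value =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace

  ($entry->at('base')->text => $value);
} $dom->find('lexicon entry')->each;

print "$lexicon{UNREG_DESC}\n";
```

The markup inside the CDATA section is returned as a plain string and is not parsed again, which matches the goal of using it as a translation value.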

sri

Jul 21, 2014, 4:57:27 PM
to mojol...@googlegroups.com, charli...@budge.apana.org.au
It's nice that you want to share your solutions, but please don't spam the list.

--
sebastian

Charlie Brady

Jul 21, 2014, 8:16:54 PM
to sri, mojol...@googlegroups.com

On Mon, 21 Jul 2014, sri wrote:

> It's nice that you want to share your solutions, but please don't spam the
> list.

Sorry if you thought I did that. I thought I was asking for help. I'd
still like some help if someone is able to provide it.

sri

Jul 22, 2014, 2:51:22 AM
to mojol...@googlegroups.com, kra...@googlemail.com, charli...@budge.apana.org.au
> > It's nice that you want to share your solutions, but please don't spam the
> > list.
>
> Sorry if you thought I did that.

I tried to be nice about the warning, but your passive-aggressive response makes me believe that you're just not taking it seriously. Therefore I'm afraid I'm forced to temporarily ban you from this list. In the future, please be nice when participating here.

--
sebastian