$filter = qr/one|two|four/;
$simpletable =~ s{(<tr.+?</tr>)}{
$1 =~ /$filter/ ? '' : $1;
}iges;
Is there a more efficient way to do the same thing using Perl only? (I don't
like the idea of capturing $1 and substituting the same content back.)
--
Matija
Matija Papec wrote:
Would
$simpletable=~s[<tr.+?$filter.+?</tr>][]iges;
work?
There are modules on CPAN for parsing HTML, and I've often seen the
advice here to use those modules rather than roll your own.
Andras Malatinszky <nob...@dev.null> wrote:
>> $1 =~ /$filter/ ? '' : $1;
>> }iges;
>>
>> Is there a more efficient way to do the same thing using Perl only? (I don't
>> like the idea of capturing $1 and substituting the same content back.)
>
>
>Would
>
>$simpletable=~s[<tr.+?$filter.+?</tr>][]iges;
>
>work?
Not quite; it would always start matching from the first '<tr' in $simpletable,
so a single match can run across several rows before it reaches a filtered word.
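
A small sketch of that failure, and of one possible core-Perl workaround, may help here. The sample table and the helper name $row_char are made up for illustration; the idea is a "tempered" dot that refuses to step over a </tr>, so each match stays inside one row and no $1 capture is needed. This is only one way to do it, not necessarily the fastest.

```perl
use strict;
use warnings;

my $filter = qr/one|two|four/;

# Hypothetical sample data: three flat rows, only the middle one is filtered.
my $table = '<tr><td>three</td></tr><tr><td>one</td></tr><tr><td>five</td></tr>';

# Suggested version: .+? is free to cross </tr>, so the match begins at the
# first <tr and swallows the unfiltered first row as well.
(my $naive = $table) =~ s[<tr.+?$filter.+?</tr>][]igs;

# Tempered version: every character consumed must not begin a </tr>,
# so the filtered word and the closing tag belong to the same row.
my $row_char = qr{(?:(?!</tr>).)}is;
(my $tempered = $table) =~ s{<tr\b$row_char*?$filter$row_char*</tr>}{}gi;

print "naive:    $naive\n";      # naive:    <tr><td>five</td></tr>
print "tempered: $tempered\n";   # tempered: <tr><td>three</td></tr><tr><td>five</td></tr>
```

Like the original, this treats the HTML as plain text, so nested tables or a filtered word inside an attribute would still confuse it.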
>There are modules on CPAN for parsing HTML, and I've often seen the
>advice here to use those modules rather than roll your own.
I'm not in a position to use additional modules, but I'll take a look at CPAN.
Do you have a favorite module?
--
Matija
See below.
"Matija Papec" <mpa...@yahoo.com> wrote in message
news:4ti1dvs7nor9ktcsr...@4ax.com...
HTML::TreeBuilder is a fine module.
Tested data and code:
-----><8-----
<html>
<head>
<title>Tutorial for HTML::TreeBuilder</title>
</head>
<body>
<h1 align = 'center'>Tutorial for HTML::TreeBuilder</h1>
<table align = 'center' border = '1'>
<tr>
<th>Outer table has 1 row with 2 columns</th>
<td>
<table border = '1'>
<tr>
<th colspan = '2'>Inner table has 3 rows with 2 columns</th>
</tr>
<tr>
<th>Row Two/Column One</th><td>Row Two/Column Two</td>
</tr>
<tr>
<th>Row Three/Column One</th><td>Row Three/Column Two</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>
-----><8-----
-----><8-----
#!/usr/bin/perl
#
# Name:
#   test-html-treebuilder.pl.
#
# Author:
#   Ron Savage
#   http://savage.net.au/index.html.

use strict;
use warnings;

use HTML::TreeBuilder;

# -----------------------------------------------

sub find_nested_content
{
    my($root) = @_;

    print "Looking for nested content. \n";
    print "\n";

    my($first_tr, $last_tr);

    for ($root -> look_down(
        sub
        {
            # Find the 1st & last <tr>s, so we can report them.

            return 0 if ($_[0] -> tag() ne 'tr');

            (! $first_tr) && ($first_tr = $_[0]);
            $last_tr = $_[0];

            return 0;
        }))
    {
    }

    print "Text of 1st tr in nested table: ", $first_tr -> as_text(), ". \n" if ($first_tr);
    print "\n";
    print "Text of last tr in nested table: ", $last_tr -> as_text(), ". \n" if ($last_tr);
    print "\n";

} # End of find_nested_content.

# -----------------------------------------------

sub find_nested_table
{
    my($root) = @_;

    print "Looking for nested table. \n";
    print "\n";

    # Find the 1st <table>, because I know it contains a <td>.

    my($nested_table);

    $root -> look_down(_tag => 'table',
        sub
        {
            # Find the 1st <td>, because I know it contains the nested <table>.

            $nested_table = $_[0] -> look_down(_tag => 'td',
                sub
                {
                    # Find the nested <table>.

                    return $_[0] -> look_down(_tag => 'table');
                });

            return $nested_table;
        });

    if ($nested_table)
    {
        print "Found nested table. \n";
    }
    else
    {
        print "Did not find nested table. \n";
    }

    print "\n";

    $nested_table;

} # End of find_nested_table.

# -----------------------------------------------

my($file_name) = '/temp/test-html-treebuilder.html';
my($root)      = HTML::TreeBuilder -> new();

$root -> parse_file($file_name) || die("Can't parse $file_name");

my($nested_table) = find_nested_table($root);

find_nested_content($nested_table) if ($nested_table);

$root -> delete();
-----><8-----
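
For the row-filtering question that started the thread, the same module could probably be used along these lines. This is a rough sketch, assuming HTML::TreeBuilder is installed and the table is flat (no nested tables, as a nested row's text would also count towards its outer row). The sample HTML and the @kept array are invented for the example.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

my $filter = qr/one|two|four/;

# Hypothetical flat table: only the middle row matches the filter.
my $html = '<table><tr><td>three</td></tr><tr><td>one</td></tr><tr><td>five</td></tr></table>';

my $root = HTML::TreeBuilder->new_from_content($html);

# In list context look_down returns every <tr>; delete the ones whose
# text matches the filter.
for my $tr ($root->look_down(_tag => 'tr')) {
    $tr->delete() if $tr->as_text() =~ $filter;
}

# Collect the text of the surviving rows before freeing the tree.
my @kept = map { $_->as_text() } $root->look_down(_tag => 'tr');
print "Kept rows: @kept\n";

$root->delete();
```

Calling as_HTML on the tree before the final delete would give back the filtered markup, though the parser may normalise it rather than reproduce the input byte for byte.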
"Ron Savage" <r...@savage.net.au> wrote:
>> I'm not in a position to use additional modules, but I'll take a look at CPAN.
>> Do you have some favorite module?
>
>HTML::TreeBuilder is a fine module.
Looks interesting. Why do you use an empty foreach loop in find_nested_content,
and why does the nearby sub always return zero?
--
Matija
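
On that question: as far as I can tell, a callback that always returns 0 never accepts a node, so look_down walks the whole subtree and the sub does its work purely through side effects. The empty loop then has nothing to iterate over. A more direct sketch, assuming look_down in list context returns all matches in document order, might look like this. The sample table and the @rows, $first_text and $last_text names are illustrative only.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical three-row table standing in for the nested table.
my $html = <<'HTML';
<table>
<tr><th>Row One</th></tr>
<tr><td>Row Two</td></tr>
<tr><td>Row Three</td></tr>
</table>
HTML

my $root = HTML::TreeBuilder->new_from_content($html);

# Ask for the <tr> elements directly instead of collecting them in a callback.
my @rows = $root->look_down(_tag => 'tr');

# First and last rows; both are undef if no row was found.
my ($first_tr, $last_tr) = @rows[0, -1];

my $first_text = $first_tr ? $first_tr->as_text() : '';
my $last_text  = $last_tr  ? $last_tr->as_text()  : '';

print "Text of 1st tr: $first_text\n";
print "Text of last tr: $last_text\n";

$root->delete();
```

This should give the same first and last rows as the callback version, without the empty loop.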