Following is a snippet of code which uses HTML::TableContentParser to
pull table data out of an HTML page. I've gotten it to work, but, I'd
like to better understand why/how (particularly the long array/hash
references). I've read the Perllol and Perlref documents, but would
really appreciate a plain language walk-through. Thanks.

    my $cServerInventoryURL = '...';
    my $cServerInventoryHTML = get($cServerInventoryURL)
        || die "Couldn't get $cServerInventoryURL";

    # This seems to assign a reference handle for the parser object
    $p = HTML::TableContentParser->new();
    # This assigns a reference to an array containing all the tables in the HTML
    $tables = $p->parse($cServerInventoryHTML);
    # This appears to assign a reference to the 6th table object in the HTML page
    $t = $tables->[5];

    ##### HERE ARE THE REFERENCES IN QUESTION. THEY WORK, I JUST AM NOT CLEAR WHY #####
    for $r (@{$t->{rows}}) {
        $cell2 = $r->{cells}[1]{data};   # Cell 2 contains server name
        $cell9 = $r->{cells}[8]{data};   # Cell 9 contains environment list
        print "-------------------------\n";
        print " cell2 - $cell2\n";
        print " cell9 - $cell9\n";
    }

    use Data::Dumper;
    print Dumper $t;

This will be your greatest aid in understanding the structure.
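
As a rough illustration of what such a dump shows, here is a sketch on a hand-built structure. The keys rows, cells and data come from the code in the question; the sample values are made up, and the real parser output very likely carries more keys than this.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for one parsed table; the values are invented.
my $t = {
    rows => [
        { cells => [ { data => 'id-1' }, { data => 'server-a' } ] },
        { cells => [ { data => 'id-2' }, { data => 'server-b' } ] },
    ],
};

$Data::Dumper::Indent   = 1;    # compact, fixed indentation
$Data::Dumper::Sortkeys = 1;    # stable key order between runs

my $dump = Dumper($t);
print $dump;
```

Each [ ... ] in the dump is an arrayref and each { ... } is a hashref, so the nesting can be read straight off the indentation.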
> appears to assign a reference to the 6th table object in the HTML page
>
> ##### HERE ARE THE REFERENCES IN QUESTION. THEY WORK, I JUST AM NOT
> CLEAR WHY #####
> for $r (@{$t->{rows}}) {
The hashref pointed to by $t contains an element "rows", which is itself
an arrayref. We need to dereference that reference with @{...} to use most
array operators on it. "for" in this context (also "foreach") sets $r to
each element of the array in turn, and each of those elements is itself a
hashref describing one row.
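
A minimal sketch of that loop on made-up data. The "id" key is hypothetical and only there to give each row something to print.

```perl
use strict;
use warnings;

# Hypothetical table: a hashref whose "rows" element is an arrayref.
my $t = { rows => [ { id => 1 }, { id => 2 }, { id => 3 } ] };

# @{ ... } dereferences the arrayref so array operations apply to it.
my $row_count = scalar @{ $t->{rows} };

my @ids;
for my $r ( @{ $t->{rows} } ) {
    # $r is each element in turn; here every element is a hashref.
    push @ids, $r->{id};
}

print "rows: $row_count, ids: @ids\n";
```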
> $cell2=$r->{cells}[1]{data}; #Cell 2 contains server name
> $cell9=$r->{cells}[8]{data}; #Cell 9 contains environment list
The arrow is optional between adjacent subscripts, so this is shorthand.
Written out longhand it would read
    $cell2 = $r->{cells}->[1]->{data};
So: $r is a hashref containing an element "cells", which is itself an
arrayref. ->[1] selects the second element of that array, which is
a hashref with an element "data", and that element holds a scalar value.
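
To check that the spellings really are equivalent, here is a small sketch on invented data. It also shows the fully explicit block-dereference form, which is not in the original code.

```perl
use strict;
use warnings;

# Hypothetical row with two cells.
my $r = { cells => [ { data => 'first' }, { data => 'server-a' } ] };

my $short    = $r->{cells}[1]{data};             # arrows between subscripts omitted
my $longhand = $r->{cells}->[1]->{data};         # every arrow written out
my $explicit = ${ ${ $r->{cells} }[1] }{data};   # block-dereference form

print "$short $longhand $explicit\n";
```

All three expressions reach the same scalar; the short form is simply the easiest to read.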
You need to be running under

    use strict;
    use warnings;

Check out the docs - for example perlstyle or perlfaq3 - for why.
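
One reason, sketched below: under strict a misspelled variable name is a compile-time error instead of a silently empty value. The snippet is compiled inside a string eval only so the failure can be shown without killing the script; the variable names are invented.

```perl
use strict;
use warnings;

# Compile a tiny snippet under strict; the misspelled variable is rejected.
my $ok = eval q{
    use strict;
    my $cell2 = 'server-a';
    my $copy  = $cel2;    # typo: $cel2 was never declared
    1;
};
my $error = $@;

print $ok ? "compiled\n" : "strict refused: $error";
```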
Mark
> Hi and TIA.
>
> Following is a snippet of code which uses HTML::TableContentParser to
> pull table data out of an HTML page. I've gotten it to work, but, I'd
> like to better understand why/how (particularly the long array/hash
> references). I've read the Perllol and Perlref documents, but would
> really appreciate a plain language walk-through. Thanks.
>
> my $cServerInventoryURL = '...';
> my $cServerInventoryHTML = get $cServerInventoryURL || die "Couldn't
> get $cServerInventoryURL";
>
> $p = HTML::TableContentParser->new(); # This seems to assign
> a reference handle for the parser object
> $tables = $p->parse($cServerInventoryHTML); # This assigns a
> reference to an array containing all the tables in the HTML
> $t=$tables->[5]; # This
> appears to assign a reference to the 6th table object in the HTML page
HTML::TableContentParser->parse() returns a reference to an array, with
each element of that array corresponding to one of the tables in the
HTML source.
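
A sketch of that shape with a hand-built stand-in for the parse result. The "label" key is hypothetical, added only so the chosen table can be identified.

```perl
use strict;
use warnings;

# Hypothetical parse result: six tables, each a hashref with a "rows" arrayref.
my $tables = [ map { +{ rows => [], label => "table $_" } } 1 .. 6 ];

my $count = scalar @{$tables};    # number of tables found
my $t     = $tables->[5];         # indexes start at 0, so this is the 6th table

print "tables: $count, picked: $t->{label}\n";
```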
>
> ##### HERE ARE THE REFERENCES IN QUESTION. THEY WORK, I JUST AM NOT
> CLEAR WHY #####
> for $r (@{$t->{rows}}) {
$t contains a reference to a hash.
$t->{rows} is the value of the element of that hash with key 'rows',
which contains a reference to an array -- one element per row of the
table. $r will iterate over the rows in the table.
> $cell2=$r->{cells}[1]{data}; #Cell 2 contains server name
$r is a reference to a hash.
$r->{cells} is an element of that hash, which is a reference to an
array.
$r->{cells}[1] is the second element of that array, which is a
reference to a hash.
$r->{cells}[1]{data} is the value of that hash with key 'data', and
contains the non-table-tag content of the cell of that table, the
server name.
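
The same walk can be checked with ref(), which names the kind of thing a reference points to and returns an empty string for a plain scalar. The data here is invented.

```perl
use strict;
use warnings;

# Hypothetical row with two cells.
my $r = { cells => [ { data => 'id-1' }, { data => 'server-a' } ] };

my @kinds = (
    ref($r),                      # HASH
    ref($r->{cells}),             # ARRAY
    ref($r->{cells}[1]),          # HASH
    ref($r->{cells}[1]{data}),    # empty string: a plain scalar, not a reference
);

print join( ' -> ', map { $_ || 'plain scalar' } @kinds ), "\n";
```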
> $cell9=$r->{cells}[8]{data}; #Cell 9 contains environment list
Likewise
HTH.
--
Jim Gibson
So...
for $r (@{$t->{rows}}) {
{rows} is a hash key within a hash referenced by $t
$t->{rows} contains an array reference
@{$t->{rows}} de-references this array reference, returning an actual
array
$r is an element of this array and contains a reference to a hash
thus...
$r->{cells}[1]{data}
{cells} is a hash key within the hash referenced by $r
$r->{cells} contains a reference to an array
[1] is the 2nd array element referenced by $r->{cells} and contains a
hash reference
{data} is a hash key within the hash referenced by $r->{cells}[1]
{data} is apparently not yet another reference, but an actual data
element
Almost.
@{$t->{rows}} does not return an actual array there. It returns a list.
See:
perldoc -q difference
What is the difference between a list and an array?
> $r is an element of this array
$r is an element of this list.
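
A rough sketch of the distinction, on made-up values. The push onto a bare list is wrapped in a string eval because it is expected to be rejected at compile time.

```perl
use strict;
use warnings;

my @array = qw(a b c);

# An array is a variable: it has a length that can change.
push @array, 'd';
my $length = @array;             # scalar context yields the element count

# A list is just a sequence of values; it can be sliced but not pushed onto.
my $last = ( qw(a b c) )[-1];    # list slice

# push on a bare list does not compile, so it is tried inside eval.
my $pushed = eval q{ push( (1, 2, 3), 4 ); 1 };

print "length=$length last=$last push-on-list=",
    ( $pushed ? 'ok' : 'rejected' ), "\n";
```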
> thus...
>
> $r->{cells}[1]{data}
>
> {cells} is a hash key within the hash referenced by $r
> $r->{cells} contains a reference to an array
> [1] is the 2nd array element referenced by $r->{cells} and contains a
> hash reference
> {data} is a hash key within the hash referenced by $r->{cells}[1]
You've gotten all of that right though.
You are on your way!
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
"
An array has a changeable length. A list does not. An array is
something you can push or pop, while a list is a set of values.
"
But you *can* pop etc a dereferenced arrayref.
mark@hermes:~$ perl -Mstrict -Mwarnings -le 'my $c=[qw(a b c)];print pop
@{$c}; print "@{$c}"'
c
a b
mark@hermes:~$
So - the snippet of code above initializes an arrayref, and then pops
its dereferenced value. According to the faq this would make the
dereferenced value an array since I can pop it.
Am I missing something?
thanks,
Mark
In the example @{$t->{rows}} *is* an array, which in list context
*returns* a list.
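
A short sketch of that point, reusing the arrayref from the one-liner above: the same dereferenced array gives a list of elements in list context and a count in scalar context, while pop changes the array itself.

```perl
use strict;
use warnings;

my $c = [qw(a b c)];

my @copy  = @{$c};    # list context: the array yields a list of its elements
my $count = @{$c};    # scalar context: the same array yields its length

pop @{$c};            # pop acts on the array itself, through the reference

print "copy=@copy count=$count remaining=@{$c}\n";
```

The copy keeps all three elements because it was filled from the list, not tied to the array.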
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
OK - got it. That's pretty subtle :)
Thanks,
Mark
But you cannot, in general, pop etc the list that a foreach iterates over.
>>> mark@hermes:~$ perl -Mstrict -Mwarnings -le 'my $c=[qw(a b c)];print
>>> pop @{$c}; print "@{$c}"'
>>> c
>>> a b
>>> mark@hermes:~$
>>>
>>> So - the snippet of code above initializes an arrayref, and then pops
>>> its dereferenced value. According to the faq this would make the
>>> dereferenced value an array since I can pop it.
>>>
>>> Am I missing something?
>>
>> In the example @{$t->{rows}} *is* an array, which in list context
>> *returns* a list.
> OK - got it. That's pretty subtle :)
Which is why I thought it deserved a mention.
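
One way to picture the foreach case, as a sketch with invented arrays: the loop below walks a single flattened list assembled from two arrays and a literal, so there is no one array behind it that a pop could act on.

```perl
use strict;
use warnings;

my @x = ( 1, 2 );
my @y = ( 3, 4 );

# The loop walks one flattened list built from two arrays plus a literal.
# There is no single array behind it that could be popped.
my @seen;
for my $n ( @x, @y, 5 ) {
    push @seen, $n;
}

print "seen=@seen x=@x y=@y\n";
```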