Complex (for me) reference deconstruction

sjca...@gmail.com

Sep 5, 2008, 12:52:55 PM
Hi and TIA.

Following is a snippet of code which uses HTML::TableContentParser to
pull table data out of an HTML page. I've gotten it to work, but, I'd
like to better understand why/how (particularly the long array/hash
references). I've read the Perllol and Perlref documents, but would
really appreciate a plain language walk-through. Thanks.

my $cServerInventoryURL = '...';
my $cServerInventoryHTML = get($cServerInventoryURL)
    or die "Couldn't get $cServerInventoryURL";

# This seems to assign a reference handle for the parser object
$p = HTML::TableContentParser->new();
# This assigns a reference to an array containing all the tables in the HTML
$tables = $p->parse($cServerInventoryHTML);
# This appears to assign a reference to the 6th table object in the HTML page
$t = $tables->[5];

##### HERE ARE THE REFERENCES IN QUESTION. THEY WORK, I JUST AM NOT CLEAR WHY #####
for $r (@{$t->{rows}}) {
$cell2=$r->{cells}[1]{data}; #Cell 2 contains server name
$cell9=$r->{cells}[8]{data}; #Cell 9 contains environment list

print "-------------------------\n";
print " cell2 - $cell2\n";
print " cell9 - $cell9\n";
}

Mark Clements

Sep 5, 2008, 2:43:43 PM
sjca...@gmail.com wrote:
> Hi and TIA.
>
> Following is a snippet of code which uses HTML::TableContentParser to
> pull table data out of an HTML page. I've gotten it to work, but, I'd
> like to better understand why/how (particularly the long array/hash
> references). I've read the Perllol and Perlref documents, but would
> really appreciate a plain language walk-through. Thanks.
>
> my $cServerInventoryURL = '...';
> my $cServerInventoryHTML = get $cServerInventoryURL || die "Couldn't
> get $cServerInventoryURL";
>
> $p = HTML::TableContentParser->new(); # This seems to assign
> a reference handle for the parser object
> $tables = $p->parse($cServerInventoryHTML); # This assigns a
> reference to an array containing all the tables in the HTML
> $t=$tables->[5]; # This
OK - so $t now contains a reference to a hash, and $tables contains an
arrayref (a reference to an array). Try using Data::Dumper to print it out.

use Data::Dumper;
print Dumper $t;

This will be your greatest aid in understanding the structure.
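
For anyone reading along without the page at hand, here is a rough sketch of what such a dump looks like. The $tables structure below is hand-built mock data (the cell contents are made up), shaped the way this thread describes the parser's output; the real structure will likely carry more keys than shown.

```perl
use strict;
use warnings;
use Data::Dumper;

# Mock data (an assumption): shaped like the structure discussed in this
# thread, not actual HTML::TableContentParser output.
my $tables = [
    {
        rows => [
            { cells => [ { data => 'row0-col0' }, { data => 'web01' } ] },
            { cells => [ { data => 'row1-col0' }, { data => 'db01'  } ] },
        ],
    },
];

my $t = $tables->[0];            # first (and only) table in the mock data

$Data::Dumper::Indent   = 1;     # compact indentation
$Data::Dumper::Sortkeys = 1;     # stable key order between runs
my $dump = Dumper($t);           # string showing the nesting: hash -> array -> hash
print $dump;
```

In the output, every `{` marks a hash and every `[` marks an array, so the nesting can be read straight off the indentation.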

> appears to assign a reference to the 6th table object in the HTML page
>
> ##### HERE ARE THE REFERENCES IN QUESTION. THEY WORK, I JUST AM NOT
> CLEAR WHY #####
> for $r (@{$t->{rows}}) {

The hashref pointed to by $t contains an element "rows", which is itself
an arrayref. We need to dereference the reference to be able to use most
array operators on it. "for" in this context (also "foreach") sets $r to
each element of the array in turn; here each element is a hashref.

> $cell2=$r->{cells}[1]{data}; #Cell 2 contains server name
> $cell9=$r->{cells}[8]{data}; #Cell 9 contains environment list

This syntax is syntactic sugar: written out longhand it would read
$cell2 = $r->{cells}->[1]->{data};
So: $r is a hashref containing an element "cells", which is itself an
arrayref. ->[1] selects the second element of the array it refers to, giving
a hashref with an element "data", whose value is a plain scalar.
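
A small sketch to check that the spellings really are equivalent. The row in $r is mock data (an assumption), shaped like one element of @{$t->{rows}}; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Mock row (an assumption), shaped like one element of @{ $t->{rows} }.
my $r = { cells => [ { data => 'id-1' }, { data => 'web01' } ] };

my $short    = $r->{cells}[1]{data};            # arrows between subscripts omitted
my $longhand = $r->{cells}->[1]->{data};        # every arrow written out
my $explicit = ${ ${ $$r{cells} }[1] }{data};   # block-style dereferencing, no arrows

print "$short $longhand $explicit\n";
```

The arrow is optional only between adjacent subscripts; the first one, after $r, is always required.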

You need to be running under

use strict;
use warnings;

Check out the docs - for example perlstyle or perlfaq3 - for why.
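
As a rough sketch of what the loop might look like under strict and warnings: every variable is declared with "my", including the loop variable. The table in $t is mock data (an assumption) standing in for $tables->[5], and @names is a hypothetical helper array added only so the result can be inspected.

```perl
use strict;
use warnings;

# Mock table (an assumption) standing in for $tables->[5].
my $t = {
    rows => [
        { cells => [ { data => 'id-1' }, { data => 'web01' } ] },
        { cells => [ { data => 'id-2' }, { data => 'db01'  } ] },
    ],
};

my @names;
for my $r ( @{ $t->{rows} } ) {          # "my" scopes $r to the loop body
    my $name = $r->{cells}[1]{data};     # second cell of this row
    push @names, $name;
    print "cell2 - $name\n";
}
```

With strict in force, a misspelled variable name becomes a compile-time error instead of a silently empty value.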

Mark

Jim Gibson

Sep 5, 2008, 3:22:45 PM
In article
<08d0b85d-977c-4536...@w24g2000prd.googlegroups.com>,
<sjca...@gmail.com> wrote:

> Hi and TIA.
>
> Following is a snippet of code which uses HTML::TableContentParser to
> pull table data out of an HTML page. I've gotten it to work, but, I'd
> like to better understand why/how (particularly the long array/hash
> references). I've read the Perllol and Perlref documents, but would
> really appreciate a plain language walk-through. Thanks.
>
> my $cServerInventoryURL = '...';
> my $cServerInventoryHTML = get $cServerInventoryURL || die "Couldn't
> get $cServerInventoryURL";
>
> $p = HTML::TableContentParser->new(); # This seems to assign
> a reference handle for the parser object
> $tables = $p->parse($cServerInventoryHTML); # This assigns a
> reference to an array containing all the tables in the HTML
> $t=$tables->[5]; # This
> appears to assign a reference to the 6th table object in the HTML page


HTML::TableContentParser->parse() returns a reference to an array, with
each element of that array corresponding to one of the tables in the
HTML source.
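
One way to find out which index holds the table you want is to walk the array and report something about each table. The sketch below does that with mock data (an assumption) in place of the real return value of parse(); @row_counts is a hypothetical helper added for illustration.

```perl
use strict;
use warnings;

# Mock result (an assumption) standing in for $p->parse($html).
my $tables = [
    { rows => [ { cells => [] } ] },
    { rows => [ { cells => [] }, { cells => [] }, { cells => [] } ] },
];

my @row_counts;
for my $i ( 0 .. $#{$tables} ) {                 # $#{$tables} is the last index
    my $n = scalar @{ $tables->[$i]{rows} };     # number of rows in table $i
    push @row_counts, $n;
    print "table $i has $n row(s)\n";
}
```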

>
> ##### HERE ARE THE REFERENCES IN QUESTION. THEY WORK, I JUST AM NOT
> CLEAR WHY #####
> for $r (@{$t->{rows}}) {

$t contains a reference to a hash.
$t->{rows} is the value of the element of that hash with key 'rows',
which contains a reference to an array -- one element per row of the
table. $r will iterate over the rows in the table.

> $cell2=$r->{cells}[1]{data}; #Cell 2 contains server name

$r is a reference to a hash.
$r->{cells} is an element of that hash, which is a reference to an
array.
$r->{cells}[1] is the second element of that array, which is a
reference to a hash.
$r->{cells}[1]{data} is the value of that hash with key 'data', and
contains the non-table-tag content of the cell of that table, the
server name.


> $cell9=$r->{cells}[8]{data}; #Cell 9 contains environment list

Likewise
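
The same walk-through can be checked mechanically with the built-in ref(), which names the kind of thing a reference points to and returns an empty string for a plain scalar. The row below is mock data (an assumption); @steps is only a helper for printing.

```perl
use strict;
use warnings;

# Mock row (an assumption), shaped like one element of @{ $t->{rows} }.
my $r = { cells => [ { data => 'id-1' }, { data => 'web01' } ] };

# Each step pairs the expression (as text) with what ref() says about it.
my @steps = (
    [ '$r',                   ref($r) ],
    [ '$r->{cells}',          ref( $r->{cells} ) ],
    [ '$r->{cells}[1]',       ref( $r->{cells}[1] ) ],
    [ '$r->{cells}[1]{data}', ref( $r->{cells}[1]{data} ) ],
);

for my $step (@steps) {
    my ( $expr, $kind ) = @$step;
    printf "%-22s %s\n", $expr, $kind || 'plain scalar (not a reference)';
}
```

The kinds come out as HASH, ARRAY, HASH and then a plain scalar, matching the four steps above.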

HTH.

--
Jim Gibson

sjca...@gmail.com

Sep 5, 2008, 6:29:09 PM
Ok, I think I understand. Thanks for the help.

So...

for $r (@{$t->{rows}}) {

{rows} is a hash key within a hash referenced by $t
$t->{rows} contains an array reference
@{$t->{rows}} de-references this array reference, returning an actual
array
$r is an element of this array and contains a reference to a hash

thus...

$r->{cells}[1]{data}

{cells} is a hash key within the hash referenced by $r
$r->{cells} contains a reference to an array
[1] is the 2nd array element referenced by $r->{cells} and contains a
hash reference
{data} is a hash key within the hash referenced by $r->{cells}[1]

{data} is apparently not yet another reference, but an actual data
element


Tad J McClellan

Sep 5, 2008, 7:15:30 PM
sjca...@gmail.com <sjca...@gmail.com> wrote:
> Ok, I think I understand. Thanks for the help.
>
> So...
>
> for $r (@{$t->{rows}}) {
>
> {rows} is a hash key within a hash referenced by $t
> $t->{rows} contains an array reference
> @{$t->{rows}} de-references this array reference, returning an actual
> array


Almost.

It does not return an actual array. It returns a list.

See:

perldoc -q difference

What is the difference between a list and an array?

> $r is an element of this array


$r is an element of this list.


> thus...
>
> $r->{cells}[1]{data}
>
> {cells} is a hash key within the hash referenced by $r
> $r->{cells} contains a reference to an array
> [1] is the 2nd array element referenced by $r->{cells} and contains a
> hash reference
> {data} is a hash key referenced by the hash referenced by $r->{cells}
> [1]


You've gotten all of that right though.

You are on your way!


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"

Mark Clements

Sep 6, 2008, 1:56:38 AM
Tad J McClellan wrote:
> sjca...@gmail.com<sjca...@gmail.com> wrote:
>> Ok, I think I understand. Thanks for the help.
>>
>> So...
>>
>> for $r (@{$t->{rows}}) {
>>
>> {rows} is a hash key within a hash referenced by $t
>> $t->{rows} contains an array reference
>> @{$t->{rows}} de-references this array reference, returning an actual
>> array
>
>
> Almost.
>
> It does not return an actual array. It returns a list.
>
> See:
>
> perldoc -q difference
>
> What is the difference between a list and an array?
>
Hmmm... OK - I've read the faq above and it says:

"
An array has a changeable length. A list does not. An array is
something you can push or pop, while a list is a set of values.
"

But you *can* pop etc a dereferenced arrayref.

mark@hermes:~$ perl -Mstrict -Mwarnings -le 'my $c=[qw(a b c)];print pop
@{$c}; print "@{$c}"'
c
a b
mark@hermes:~$

So - the snippet of code above initializes an arrayref, and then pops
its dereferenced value. According to the faq this would make the
dereferenced value an array since I can pop it.

Am I missing something?

thanks,

Mark

John W. Krahn

Sep 6, 2008, 7:00:28 AM

In the example @{$t->{rows}} *is* an array, which in list context
*returns* a list.
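
A minimal sketch of that distinction, using a mock $t (an assumption) with plain strings in place of row hashes: the same expression @{$t->{rows}} gives a count in scalar context, a list of elements in list context, and can be handed to pop because it is an array.

```perl
use strict;
use warnings;

my $t = { rows => [ 'a', 'b', 'c' ] };   # mock data (an assumption)

my $count  = @{ $t->{rows} };            # scalar context: number of elements
my @copy   = @{ $t->{rows} };            # list context: the elements themselves
my $popped = pop @{ $t->{rows} };        # pop operates on the array itself

print "count=$count copy=@copy popped=$popped left=@{ $t->{rows} }\n";
# prints: count=3 copy=a b c popped=c left=a b
```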

John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall

Mark Clements

Sep 7, 2008, 3:05:19 AM
OK - got it. That's pretty subtle :)

Thanks,

Mark

Tad J McClellan

Sep 8, 2008, 11:23:00 PM
Mark Clements <mark.clemen...@wanadoo.fr> wrote:
> John W. Krahn wrote:
>> Mark Clements wrote:
>>> Tad J McClellan wrote:
>>>> sjca...@gmail.com<sjca...@gmail.com> wrote:
>>>>> Ok, I think I understand. Thanks for the help.
>>>>>
>>>>> So...
>>>>>
>>>>> for $r (@{$t->{rows}}) {
>>>>>
>>>>> {rows} is a hash key within a hash referenced by $t
>>>>> $t->{rows} contains an array reference
>>>>> @{$t->{rows}} de-references this array reference, returning an actual
>>>>> array
>>>>
>>>> Almost.
>>>>
>>>> It does not return an actual array. It returns a list.
>>>>
>>>> See:
>>>>
>>>> perldoc -q difference
>>>>
>>>> What is the difference between a list and an array?
>>>>
>>> Hmmm... OK - I've read the faq above and it says:
>>>
>>> "
>>> An array has a changeable length. A list does not. An array is
>>> something you can push or pop, while a list is a set of values.
>>> "
>>>
>>> But you *can* pop etc a dereferenced arrayref.


But you cannot, in general, pop etc the list that a foreach iterates over.
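
To see the difference in practice, here is a small sketch: pop on a named array works, while pop on a literal list is rejected by perl. The string eval is there only so the failure can be caught and reported instead of stopping the script; the exact error text varies between perl versions, so it is printed rather than assumed.

```perl
use strict;
use warnings;

my @array = qw(a b c);
my $from_array = pop @array;             # fine: @array is an array

# Attempt the same on a list; eval traps the resulting error in $@.
my $ok    = eval q{ my $x = pop( ('a', 'b', 'c') ); 1 };
my $error = $@;

print "popped from array: $from_array\n";
print "pop on a list ", ( $ok ? "worked" : "failed: $error" ), "\n";
```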


>>> mark@hermes:~$ perl -Mstrict -Mwarnings -le 'my $c=[qw(a b c)];print
>>> pop @{$c}; print "@{$c}"'
>>> c
>>> a b
>>> mark@hermes:~$
>>>
>>> So - the snippet of code above initializes an arrayref, and then pops
>>> its dereferenced value. According to the faq this would make the
>>> dereferenced value an array since I can pop it.
>>>
>>> Am I missing something?
>>
>> In the example @{$t->{rows}} *is* an array, which in list context
>> *returns* a list.
> OK - got it. That's pretty subtle :)


Which is why I thought it deserved a mention.
