Moin,
if you are not interested in reading about my work, please stop now. Thank
you.
Contents:
0. Introduction to Problem
1. Possible Solutions
0. Introduction
===============
After finishing[0] BigInt so far, I am slowly walking myself through the task
of updating my other modules. However, fairly recently I noticed the Perl
Advent Calendar[4] for the first time, and lo and behold, Mark Fowler has
BigInt on it! Wow! :)
Most interesting to mention is that he showed Devel::Size output for
Math::BigInt - something I always wondered: How to get the size of an object
in bytes? Devel::Size had certainly slipped under my radar.
Now, it tells us that BigInts use much more memory than a simple scalar or
integer, and that BigFloats use even _more_ memory. BigRats eat less if they
are integers, and more if they are rational numbers:
# perl -MDevel::Size=total_size -le 'print total_size(1)'
16
# perl -Mbigrat -MDevel::Size=total_size -le 'print
total_size(Math::BigInt->new("13"))'
259
# perl -Ilib -Mbigrat -MDevel::Size=total_size -le 'print
total_size(Math::BigRat->new("13"))'
589
# perl -Ilib -Mbigrat -MDevel::Size=total_size -le 'print
total_size(Math::BigRat->new("3/4"))'
887
# perl -Mbigrat -MDevel::Size=total_size -le 'print
total_size(Math::BigFloat->new("3.4"))'
767
Interestingly, a BigFloat as integer eats slightly more:
# perl -Mbigrat -MDevel::Size=total_size -le 'print
total_size(Math::BigFloat->new("34"))'
782
While I expected BigFloats to use more memory, I didn't expect them to use so
much more. The same for BigRats. And Devel::Size doesn't tell me why they eat
so much memory.
So I wrote Devel::Size::Report (to be found on CPAN or
http://bloodgate.com/perl/packages/). It produces nifty reports like this[3]:
Size report for '1' (Math::BigInt):
Hash 259 bytes (161 bytes overhead)
Key 'value' => Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Total: 259 bytes
The distribution contains a small perl script called psize, which you can use
like this:
# psize "Math::BigInt->new(12)"
Size report for 'Math::BigInt->new(12)' => '12' (Math::BigInt):
Hash 259 bytes (161 bytes overhead)
Key 'value' => Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Total: 259 bytes
Using this on BigFloat and BigRat shows us the reason for the memory wastage:
# psize "Math::BigFloat->new(12.3)"
Size report for 'Math::BigFloat->new(12.3)' => '12.3' (Math::BigFloat):
Hash 767 bytes (143 bytes overhead)
Key '_m' => Hash 299 bytes (185 bytes overhead)
Key 'value' => Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Key '_f' => Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Key '_e' => Hash 299 bytes (185 bytes overhead)
Key 'value' => Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Key '_f' => Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Key 'sign' => Scalar 26 bytes
Total: 767 bytes
# psize "Math::BigRat->new(3/4)"
Size report for 'Math::BigRat->new(3/4)' => '3/4' (Math::BigRat):
Hash 539 bytes (unknown bytes overhead)
Key '_d' => Hash 333 bytes (185 bytes overhead)
Key 'value' => Array 106 bytes (70 bytes overhead)
Scalar 36 bytes
Key '_f' => Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Key '_n' => Hash 319 bytes (185 bytes overhead)
Key 'value' => Array 92 bytes (56 bytes overhead)
Scalar 36 bytes
Key '_f' => Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Key 'sign' => Scalar 28 bytes
Total: 539 bytes
(Please ignore the "unknown..." bug in Devel::Size::Report for now)
Woa! But now I know where the memory goes! In all the array/hash overhead!
On 64-bit machines the counts would vary. I would like to see a report, since
I don't have easy access to such a machine.
Question: Why is the overhead per hash/per array _so_ big?
1. Possible Solutions
=====================
After a lot of thinking and toiling I think I came up with a solution. Both
BigRats and BigFloats use BigInts for their "private parts" because that was
easiest. However, BigInts are complicated beasts that have a sign and all
the sign-handling etc, whereas in practice we need unsigned big integers for
the parts[1].
One solution could be to create a special package that can only handle
unsigned integers, and therefore doesn't need a sign. However, an idea
particle struck me - we already have such a package. It's called: Calc! :)
There are some problems like:
* BigFloat/BigRat need to know what $CALC is, aka the library BigInt uses
today for low-level math. And that shouldn't change later on (but right now
you couldn't change the library at run-time anyway, because you would end up
with objects that were created with let's say Matter.pm and later on you get
objects created with Antimatter.pm and as soon as you bring them together in
one operation it all goes kaboOM! with a bright flash...)
* It needs a total overhaul of a lot of code - lots of work.
* It changes the internal structure. Well, except for the "sign", the internal
parts shouldn't be used anyway (don't look under the hood, I always warned
about this). However, the sign would stay, only _m, _e, _d and _n would
change and no longer be BigInts but $CALC objects. I am also not so worried
about the change, because, after all, the original BigInt used a completely
different inner structure altogether. And most subclasses that might poke in
the innards (by accident or bug) are written by me, anyway, so it is my task
to fix them.
However, there are a lot of benefits:
* memory requirements per object drop significantly
Here is a simulation of a would-be Math::BigFloat object with the value 123e-2
using Calc (which basically is [ 123 ] to represent 123):
# psize "{ sign => '+', _m => [ 123 ], _e => [2], _es => '-' }"
Size report for 'HASH(0x815e604)' (HASH):
Hash 419 bytes (223 bytes overhead)
Key '_es' => Scalar 26 bytes
Key '_m' => Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Key '_e' => Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Key 'sign' => Scalar 26 bytes
Total: 419 bytes
That sounds good, right? That's about 54% of the memory it takes now (419
instead of 767 bytes).
One could save further memory by doing away with the hash and using an array,
something Ilya proposed quite a while ago:
# psize "[ '+', [ 123 ], [2], '-' ]"
Size report for '[ '+', [ 123 ], [2], '-' ]' => 'ARRAY(0x815d694)':
Array 296 bytes (100 bytes overhead)
Scalar 26 bytes
Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Scalar 26 bytes
Total: 296 bytes
However, saving a further 30% is (IMHO) not worth the unmaintainable code-mess
that ensues from this. (Mind you, the savings are only that big for small
numbers. Bigger numbers themselves need more storage, and thus the relative
overhead is smaller.)
* Since we no longer use BigInt for the math, we no longer need the flag to
mark the private parts as "don't fondle them". This means that we can
simplify BigInt a bit. We also save the memory for the two hash keys named
"_f" altogether (the savings above already include this).
* Most interesting is that the rewrite would also make it faster. Currently
each math operation incurs the penalty of going through BigInt, which does sign
fiddling etc, just to then hand the actual work to $CALC. By storing the
$CALC object directly in BigFloat, we can call
return ... if $CALC->is_zero($x->{_m});
instead of:
return ... if $x->{_m}->is_zero();
That saves us _a lot_ of overhead, especially if $CALC is Math::BigInt::GMP
or something similar. This in turn would speed up BigFloat and BigRat quite a
bit.
In other words, BigFloat and BigRat would no longer be so much slower than
BigInt and gain a bit back on their speed.
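To make the difference concrete, here is a minimal sketch of the proposed layout. It assumes the library is Math::BigInt::Calc and uses the underscore-prefixed class methods (_new, _is_zero, _str) from the Calc that ships with Perl, rather than the is_zero spelling used in the text above. The hash layout and the helper mantissa_is_zero are made up for illustration; this is not the actual BigFloat code.

```perl
use strict;
use warnings;
use Math::BigInt::Calc;

# Assumption: the low-level library is chosen once and never changes.
my $CALC = 'Math::BigInt::Calc';

# Hypothetical layout for 123e-2: mantissa and exponent are raw (unsigned)
# $CALC objects, the two signs are stored separately.
my $x = {
    sign => '+',
    _m   => $CALC->_new('123'),
    _e   => $CALC->_new('2'),
    _es  => '-',
};

# Hypothetical helper: ask the library directly, with no BigInt object
# and no sign handling in between.
sub mantissa_is_zero {
    my ($obj) = @_;
    return $CALC->_is_zero($obj->{_m}) ? 1 : 0;
}

print 'mantissa: ', $CALC->_str($x->{_m}), "\n";
print 'is zero:  ', mantissa_is_zero($x), "\n";
```

The point of the design is that every such check is one class-method call into the library instead of a method call on a full BigInt that then delegates to the same library.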
For practical reasons[2] I will try to rewrite BigRat first and then see where
this leads me.
Any comments, suggestions, hints, etc are of course welcome.
Thanx for giving me room to present my ideas,
Tels
[0] Well, it is never finished. I already received a few bug/coredump reports
and v1.69 will be required to mop up...
[1] Ignoring the sign for _e (1e-3) in BigFloat for now. But at least the sign
there is only '+' or '-', while BigInt also has inf, -inf, and NaN.
[2] The codebase of BigRat is much smaller than the one from BigFloat :)
[3] The calculation for hashes might be wrong due to not taking into account
the key storage, but I am not sure about this. The testsuite of the package
also shows some interesting "side-effects" of Devel::Size and I am still not
sure if this is a feature, or a bug in Devel::Size or in Perl.
[4] http://perladvent.org/2003/16th
- --
Signed on Fri Jan 9 19:39:55 2004 with key 0x93B84C15.
Visit my photo gallery at http://bloodgate.com/photos/
PGP key on http://bloodgate.com/tels.asc or per email.
"We have problems like this all of the time," Kirk said, trying to
reassure me. "Sometimes its really hard to get things burning." --
http://tinyurl.com/qmg5
> So I wrote Devel::Size::Report (to be found on CPAN or
> http://bloodgate.com/perl/packages/). It produces nifty reports like this[3]:
>
> Size report for '1' (Math::BigInt):
> Hash 259 bytes (161 bytes overhead)
> Key 'value' => Array 72 bytes (56 bytes overhead)
> Scalar 16 bytes
> Key 'sign' => Scalar 26 bytes
> Total: 259 bytes
Oooh. Nice
> One could save further memory by doing away with the hash and using an array,
> something Ilya proposed quite a while ago:
>
> # psize "[ '+', [ 123 ], [2], '-' ]"
> Size report for '[ '+', [ 123 ], [2], '-' ]' => 'ARRAY(0x815d694)':
> Array 296 bytes (100 bytes overhead)
> Scalar 26 bytes
> Array 72 bytes (56 bytes overhead)
> Scalar 16 bytes
> Array 72 bytes (56 bytes overhead)
> Scalar 16 bytes
> Scalar 26 bytes
> Total: 296 bytes
>
> However, saving further 30% is (IMHO) not worth the unmaintainable code-mess
> that ensures from this. (Mind you, the savings are only that big for small
> numbers. Bigger numbers need themselves more storage and thus the overhead is
> smaller)
How unmaintainable would the code be if you used the constant pragma
(use constant) to set up named constants as your array indexes? I can't think
that it would be any less maintainable than names as hash keys, but IIRC arrays
are smaller and array lookups slightly faster than hashes and hash lookups.
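A minimal sketch of that idea, for illustration only: the slot names (SIGN, MANT, EXP, SIGN_E) are hypothetical, and the inner arrays merely stand in for whatever the low-level library would store.

```perl
use strict;
use warnings;

# Hypothetical slot names; a real module might choose others.
use constant {
    SIGN   => 0,
    MANT   => 1,
    EXP    => 2,
    SIGN_E => 3,
};

# 123e-2 stored as an array instead of a hash.
my $x = [ '+', [123], [2], '-' ];

# Reads almost like $x->{sign}, but is a plain array lookup.
my $sign   = $x->[SIGN];
my $sign_e = $x->[SIGN_E];
my $mant   = $x->[MANT][0];

print join(' ', "sign=$sign", "exponent sign=$sign_e", "mantissa=$mant"), "\n";
```

Because the constants are inlined at compile time, the named lookups should cost no more than writing the numeric indexes by hand.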
Removing the extra indirection through BigInt certainly seems like a good
way to go.
> [3] The calculation for hashes might be wrong due to not taking into account
> the key storage, but I am not sure about this. The testsuite of the package
> also shows some interesting "side-effects" of Devel::Size and I am still not
> sure if this is a feature, bug in Devel::Size or in Perl)
Have you talked with Dan Sugalski, the author of Devel::Size?
Nicholas Clark
Moin,
On Saturday 10 January 2004 16:53, Nicholas Clark wrote:
> On Fri, Jan 09, 2004 at 08:36:42PM +0100, Tels wrote:
> > So I wrote Devel::Size::Report (to be found on CPAN or
> > http://bloodgate.com/perl/packages/). It produces nifty reports like
> > this[3]:
> >
> > Size report for '1' (Math::BigInt):
> > Hash 259 bytes (161 bytes overhead)
> > Key 'value' => Array 72 bytes (56 bytes overhead)
> > Scalar 16 bytes
> > Key 'sign' => Scalar 26 bytes
> > Total: 259 bytes
>
> Oooh. Nice
:-)
> > One could save further memory by doing away with the hash and using an
> > array, something Ilya proposed quite a while ago:
> >
> > # psize "[ '+', [ 123 ], [2], '-' ]"
> > Size report for '[ '+', [ 123 ], [2], '-' ]' => 'ARRAY(0x815d694)':
> > Array 296 bytes (100 bytes overhead)
> > Scalar 26 bytes
> > Array 72 bytes (56 bytes overhead)
> > Scalar 16 bytes
> > Array 72 bytes (56 bytes overhead)
> > Scalar 16 bytes
> > Scalar 26 bytes
> > Total: 296 bytes
> >
> > However, saving further 30% is (IMHO) not worth the unmaintainable
> > code-mess that ensures from this. (Mind you, the savings are only that
> > big for small numbers. Bigger numbers need themselves more storage and
> > thus the overhead is smaller)
>
> How unmaintainable would the code be if you used constant; to set up
> named constants to use as your array indexes?
That is a very clever idea that never occurred to me....
And while I am at it, I would store the sign of "e" as a flag that is true
(-1) for '-' and false (0 or undef) for '+':
# psize "[ '+', [ 123 ], [2], -1 ]"
Size report for '[ '+', [ 123 ], [2], -1 ]' => 'ARRAY(0x815d688)':
Array 286 bytes (88 bytes overhead)
Scalar 26 bytes
Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Scalar 28 bytes
Total: 286 bytes
# psize "[ '+', [ 123 ], [2], undef ]"
Size report for '[ '+', [ 123 ], [2], undef ]' => 'ARRAY(0x815d678)':
Array 282 bytes (88 bytes overhead)
Scalar 26 bytes
Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Array 72 bytes (56 bytes overhead)
Scalar 16 bytes
Scalar 24 bytes
Total: 282 bytes
Access like
my $s = $x->[ SIGN_E ];
would thus become:
my $s = ''; $s = '-' if $x->[ SIGN_E ];
and tests:
... if $x->[ SIGN_E ] eq '+';
... if $x->[ SIGN_E ] eq '-';
become
... unless $x->[ SIGN_E ]; # is '+'?
... if $x->[ SIGN_E ]; # is '-'?
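Put together, the flag scheme could look like the sketch below. The index value 3 for SIGN_E and the helper exponent_sign are assumptions made for the example. Note that the helper maps a false flag back to '+', which differs from the '' used in the access line above.

```perl
use strict;
use warnings;

use constant SIGN_E => 3;   # hypothetical slot index for the exponent sign

# Hypothetical helper: turn the stored flag back into a sign string.
sub exponent_sign {
    my ($x) = @_;
    return $x->[SIGN_E] ? '-' : '+';
}

my $neg = [ '+', [123], [2], -1 ];      # 123e-2: flag is true
my $pos = [ '+', [123], [2], undef ];   # 123e2:  flag is false

print exponent_sign($neg), ' ', exponent_sign($pos), "\n";   # prints "- +"
```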
> I can't think that it would
> be any less maintainable than names as hash keys, but IIRC arrays are
> smaller and array lookups slightly faster than hashes and hash lookups.
You are right. I am glad that I didn't yet start the rewrite. Maybe later on I
could also rewrite BigInt and save some bytes there. In any event, it is
possible to fake a hash-read access (like BigInt::Lite does for the $x->{sign}
key), although that does slow these accesses down (but probably not more
than calling an accessor method).
> Removing the extra indirection through BigInt certainly seems like a good
> way to go.
>
> > [3] The calculation for hashes might be wrong due to not taking into
> > account the key storage, but I am not sure about this. The testsuite of
> > the package also shows some interesting "side-effects" of Devel::Size and
> > I am still not sure if this is a feature, bug in Devel::Size or in Perl)
>
> Have you talked with Dan Sugalski, the author of Devel:Size?
I sent him an email a few days ago, but haven't seen an answer. Maybe I used
the wrong email, or he is busy, or on vacation. I wanted to dig a bit myself
through Devel::Size to see if I could find out where the bug is, but was stumped.
Basically, after looking at a scalar (see, it is 33 bytes) I look at other
scalars (see, they are 40 bytes) and then suddenly the first scalar is, upon
looking again, 44 bytes. And every other newly created scalar is also 44 bytes.
I am unsure why this happens and my knowledge about Perl internals is very
sketchy.
Thanx for the feedback,
Tels
- --
Signed on Sat Jan 10 17:13:15 2004 with key 0x93B84C15.
"Not King yet."
> I send him an email a few days ago, but haven't seen an answer. Maybe I used
> the wrong email, or he is busy, or on vacation. I wanted to dig a bit myself
> through Devel::Size to see if I find out where the bug is, but was stumped.
> Basically, after looking at a scalar (see, it is 33 bytes) I look at other
> scalars (see, they are 40 bytes) and then suddenly the first scalar is, upon
> looking again, 44 bytes. And every other new created scalar is also 44 bytes.
> I am unsure why this happens and my knowledge about Perl internals is very
> sketchy.
Hello, Tels.
IMO, this inconsistency in size reporting is due to Devel::Size::Report.
I wrote a small script to examine it; see below.
The script shows that Devel::Size::total_size() and report_size2()
do not have the problem, but report_size() does.
What it shows is that total_size($ref), called in report_size(),
returns the size of $ref, not that of $u or $y.
In the report below, the <SV = PV...> lines and the <LEN = ...>
lines are particularly worth considering.
(1) The <SV = PV...> lines show which scalar is being measured.
Apparently,
<SV = PV(0x1555404) at 0x155c454> is for $u,
<SV = PV(0x1555428) at 0x155c424> is for $y,
<SV = PV(0x1555440) at 0x1586388> is for $ref.
(2) On the third call of report_size, Dump($ref) reports
CUR = 8 and LEN = 16. (CUR is the length of the string,
and LEN is the size of the string buffer.)
Once $ref has been lengthened by the assignment of "A longer string",
the string buffer is not shortened again, even if a shorter string
is stored afterwards.
(3) A size of 44 bytes should be that of a PVIV scalar (containing a string
and an integer). The difference (44 - 40) should be the size of the integer.
Assigning an integer to a PV scalar (containing a string) upgrades it
to PVIV automatically. This kind of "magic" lets a Perl scalar
represent both a string and an integer at the same time, on request.
#!perl
use Devel::Peek;
use Devel::Size qw(total_size);
my $u = "A string";
my $y = "A longer string";
print STDERR "** Dump(\$u)\n";
Dump($u);
print STDERR "** Dump(\$y)\n";
Dump($y);
print "Devel::Size::total_size\n";
printf "\$u = %s\n", total_size( $u );
printf "\$y = %s\n", total_size( $y );
printf "\$u = %s\n", total_size( $u );
printf "\$y = %s\n", total_size( $y );
print STDOUT "report_size (mimic Devel::Size::Report::report_size)\n";
print STDERR "** report_size (mimic Devel::Size::Report::report_size)\n";
sub report_size {
my $ref = shift;
Dump($ref);
return total_size( $ref );
}
printf "\$u = %s\n", report_size( $u );
printf "\$y = %s\n", report_size( $y );
printf "\$u = %s\n", report_size( $u );
printf "\$y = %s\n", report_size( $y );
print STDOUT "report_size2 (revision of report_size)\n";
print STDERR "** report_size2 (revision of report_size)\n";
sub report_size2 {
Dump($_[0]);
return total_size($_[0]);
}
printf "\$u = %s\n", report_size2( $u );
printf "\$y = %s\n", report_size2( $y );
printf "\$u = %s\n", report_size2( $u );
printf "\$y = %s\n", report_size2( $y );
__END__
[STDOUT output]
Devel::Size::total_size
$u = 33
$y = 40
$u = 33
$y = 40
report_size (mimic Devel::Size::Report::report_size)
$u = 33
$y = 40
$u = 40
$y = 40
report_size2 (revision of report_size)
$u = 33
$y = 40
$u = 33
$y = 40
[STDERR output]
** Dump($u)
SV = PV(0x1555404) at 0x155c454
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a87cc "A string"\0
CUR = 8
LEN = 9
** Dump($y)
SV = PV(0x1555428) at 0x155c424
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8fdc "A longer string"\0
CUR = 15
LEN = 16
** report_size (mimic Devel::Size::Report::report_size)
SV = PV(0x1555440) at 0x1586388
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8fbc "A string"\0
CUR = 8
LEN = 9
SV = PV(0x1555440) at 0x1586388
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8f0c "A longer string"\0
CUR = 15
LEN = 16
SV = PV(0x1555440) at 0x1586388
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8f0c "A string"\0
CUR = 8
LEN = 16
SV = PV(0x1555440) at 0x1586388
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8f0c "A longer string"\0
CUR = 15
LEN = 16
** report_size2 (revision of report_size)
SV = PV(0x1555404) at 0x155c454
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a87cc "A string"\0
CUR = 8
LEN = 9
SV = PV(0x1555428) at 0x155c424
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8fdc "A longer string"\0
CUR = 15
LEN = 16
SV = PV(0x1555404) at 0x155c454
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a87cc "A string"\0
CUR = 8
LEN = 9
SV = PV(0x1555428) at 0x155c424
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x16a8fdc "A longer string"\0
CUR = 15
LEN = 16
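Points (2) and (3) can also be checked without reading Dump output, using the core B module. This is a rough sketch: the helper pv_info is made up for the example, and it triggers the upgrade by using a numeric string in numeric context rather than by assigning an integer. On newer perls with copy-on-write strings the LEN values, and possibly whether the buffer is kept, will likely differ from the report above, so the script only prints what it finds.

```perl
use strict;
use warnings;
use B qw(svref_2object);

# Hypothetical helper: internal type, string length (CUR) and
# allocated buffer size (LEN) of the scalar behind a reference.
sub pv_info {
    my ($ref) = @_;
    my $sv = svref_2object($ref);
    return (ref($sv), $sv->CUR, $sv->LEN);
}

my $s = 'A longer string';
$s = 'A string';                      # shorter value, same scalar
my ($type, $cur, $len) = pv_info(\$s);
print "type=$type CUR=$cur LEN=$len\n";

my $n = '42';
my ($before) = pv_info(\$n);
my $sum = $n + 0;                     # numeric use caches the integer value
my ($after) = pv_info(\$n);
print "before=$before after=$after\n";
```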
Regards,
SADAHIRO Tomoyuki
Helo,
On Sunday 11 January 2004 07:38, SADAHIRO Tomoyuki wrote:
> On Sat, 10 Jan 2004 17:23:28 +0100
>
> Tels <perl_...@bloodgate.com> wrote:
> > I send him an email a few days ago, but haven't seen an answer. Maybe I
> > used the wrong email, or he is busy, or on vacation. I wanted to dig a
> > bit myself through Devel::Size to see if I find out where the bug is, but
> > was stumped. Basically, after looking at a scalar (see, it is 33 bytes) I
> > look at other scalars (see, they are 40 bytes) and then suddenly the
> > first scalar is, upon looking again, 44 bytes. And every other new
> > created scalar is also 44 bytes. I am unsure why this happens and my
> > knowledge about Perl internals is very sketchy.
>
> Hello, Tels.
>
> IMO, this inconsistency on size reporting is due to Devel::Size::Report.
>
> I write a codelet for examination as below.
> Following codelet shows Devel::Size::total_size() and report_size2()
> does not have a problem but report_size() has.
>
> What is shown below is that total_size($ref) called in report_size()
> returns <the size of $ref>, but that of neither $u nor $y.
>
> In the report below, particularly <SV = PV...> lines and <LEN = ...>
> lines would be worth considering.
I have to read this again after I had my coffee. In short, do I need to change
something in Devel::Size::Report, or is this a Devel::Size bug?
Why is:
$ref= shift;
total_size($ref);
different from:
total_size($_[0]);
? If Devel::Size really looks at the wrong SV or something like that, the
inconsistencies make sense.
In addition, Devel::Size does have a problem with a reference to a scalar as
shown below:
# perl -MDevel::Size=total_size -le 'print total_size("1")'
26
#perl -MDevel::Size=total_size -le 'print total_size(\"1")'
26
Shouldn't the size of a reference to a scalar (plus the scalar) differ
from that of the scalar alone?
> (1) <SV = PV...> lines show which scalar is subjected to report size.
> Apparently,
> <SV = PV(0x1555404) at 0x155c454> is for $u,
> <SV = PV(0x1555428) at 0x155c424> is for $y,
> <SV = PV(0x1555440) at 0x1586388> is for $ref.
>
> (2) On the 3rd calling of report_size, Dump($ref) reports
> CUR = 8 and LEN = 16. (CUR stands for the length of string,
> and LEN stands for the size of the string buffer.)
> Once $ref is lengthened by assignment of "A longer string",
> the string buffer will not shortened, even if a shorter string
> is stored after that.
That makes sense, however, I am not sure why without an assignment some
scalars suddenly look bigger.
See, I told you! :-) Why does this happen?
> report_size2 (revision of report_size)
> $u = 33
> $y = 40
> $u = 33
> $y = 40
[snip]
> LEN = 16
> ** report_size (mimic Devel::Size::Report::report_size)
> SV = PV(0x1555440) at 0x1586388
> REFCNT = 1
> FLAGS = (PADBUSY,PADMY,POK,pPOK)
> PV = 0x16a8fbc "A string"\0
> CUR = 8
> LEN = 9
> SV = PV(0x1555440) at 0x1586388
> REFCNT = 1
> FLAGS = (PADBUSY,PADMY,POK,pPOK)
> PV = 0x16a8f0c "A longer string"\0
> CUR = 15
> LEN = 16
> SV = PV(0x1555440) at 0x1586388
> REFCNT = 1
> FLAGS = (PADBUSY,PADMY,POK,pPOK)
> PV = 0x16a8f0c "A string"\0
> CUR = 8
> LEN = 16
Here we see the space is bigger than it should be. Why?
Best wishes,
Tels
- --
Signed on Sun Jan 11 09:37:20 2004 with key 0x93B84C15.
"My glasses, my glasses. I cannot see without my glasses." - "My glasses,
my glasses. I cannot be seen without my glasses."
Moin,
On Sunday 11 January 2004 07:38, SADAHIRO Tomoyuki wrote:
> On Sat, 10 Jan 2004 17:23:28 +0100
> sub report_size {
> my $ref = shift;
> Dump($ref);
> return total_size( $ref );
> }
> sub report_size2 {
> Dump($_[0]);
> return total_size($_[0]);
> }
AAhhhhhhhhhhhhh I get it!
$ref = shift;
makes a copy of the argument, and reuses some random scalar storage space, and
the total_size() afterwards just tells me the size of that storage space, but
not that of the original. Clever, Sadahiro San! :) Thank you!
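The copy-versus-alias difference can be shown with nothing but scalar addresses. A small sketch, with made-up sub names, using Scalar::Util::refaddr:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Address of the scalar each sub actually looks at.
sub addr_of_copy  { my $ref = shift; return refaddr(\$ref); }   # a new scalar
sub addr_of_alias { return refaddr(\$_[0]); }                   # the caller's scalar

my $u = 'A string';

my $same_alias = addr_of_alias($u) == refaddr(\$u) ? 1 : 0;
my $same_copy  = addr_of_copy($u)  == refaddr(\$u) ? 1 : 0;

print "alias is the original: $same_alias\n";   # prints 1
print "copy is the original:  $same_copy\n";    # prints 0
```

Anything that measures the copy therefore measures the copy's own storage, which is what report_size() did.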
Look forward to v0.04 then...
Best wishes,
Tels
- --
Signed on Sun Jan 11 09:52:39 2004 with key 0x93B84C15.
"Now, admittedly, it's critical software. This is the 'let's go kill
people' software." -- Mark A. Welsh III
> On Saturday 10 January 2004 16:53, Nicholas Clark wrote:
> > How unmaintainable would the code be if you used constant; to set up
> > named constants to use as your array indexes?
>
> That is a very clever idea that never occured to me....
I can't remember whose it is, but it's not mine. It's sort of do-it-yourself
pseudohashes.
> I send him an email a few days ago, but haven't seen an answer. Maybe I used
> the wrong email, or he is busy, or on vacation. I wanted to dig a bit myself
I know that he had an important work deadline this Friday, which made him
somewhat busy.
Nicholas Clark
> In addition, Devel::Size does have a problem with a reference to a scalar as
> shown below:
>
> # perl -MDevel::Size=total_size -le 'print total_size("1")'
> 26
> #perl -MDevel::Size=total_size -le 'print total_size(\"1")'
> 26
>
> Shouldn't the size of a reference to a scalar (plus the scalar) be different
> than from a scalar alone?
Hello.
I read the documentation and code of Devel::Size for a while.
It seems to be a feature: if the argument is a reference,
it dereferences the argument and
returns the size of what the reference points to.
If the argument is a reference to an array or a hash,
size/total_size returns the size of the array or hash,
which does not include the size of the reference.
So, if the argument is a reference to a scalar,
size/total_size should return the size of the scalar,
which does not include the size of the reference.
This would make sense.
If we pass <a reference (a) to a reference (b)>, we
get the size of reference (b).
On my machine:
C:\>perl -MDevel::Size -le "print Devel::Size::total_size('1')"
26
C:\>perl -MDevel::Size -le "print Devel::Size::total_size(\'1')"
26
C:\>perl -MDevel::Size -le "print Devel::Size::total_size(\\'1')"
42
C:\>perl -MDevel::Size -le "print Devel::Size::size('1')"
26
C:\>perl -MDevel::Size -le "print Devel::Size::size(\'1')"
26
C:\>perl -MDevel::Size -le "print Devel::Size::size(\\'1')"
16
26 is the size of a scalar with a string of 2 bytes ('1' plus \0).
16 is the size of a scalar with a simple reference (SVt_RV type).
42 is the sum of 16 and 26. They balance well :-)
Regards,
SADAHIRO Tomoyuki
Moin,
On Sunday 11 January 2004 13:41, SADAHIRO Tomoyuki wrote:
> On Sun, 11 Jan 2004 09:45:48 +0100
>
> Tels <perl_...@bloodgate.com> wrote:
> > In addition, Devel::Size does have a problem with a reference to a scalar
> > as shown below:
> >
> > # perl -MDevel::Size=total_size -le 'print total_size("1")'
> > 26
> > #perl -MDevel::Size=total_size -le 'print total_size(\"1")'
> > 26
> >
> > Shouldn't the size of a reference to a scalar (plus the scalar) be
> > different than from a scalar alone?
>
> Hello.
>
> I read the doc and code of Devel::Size for a time,
> it seems a feature that if the argument is a reference,
> it derefenences the argument and
> returns the size of what the reference is pointing to.
>
> If the argument is a reference to an array or a hash,
> size/total_size returns the size of an array or a hash,
> which does not include the size of a reference.
Ugh! I released v0.04 half an hour ago, and this means the total_size( [123] )
it reports is wrong: the size of one reference must be added.
> Then, if the argument is a reference to a scalar,
> size/total_size should return the size of a scalar,
> which does not include the size of a reference.
> This would make sence.
>
> We pass <a reference (a) to a reference (b)> and
> get the size of a reference (b).
>
> On my machine:
> C:\>perl -MDevel::Size -le "print Devel::Size::total_size('1')"
> 26
> C:\>perl -MDevel::Size -le "print Devel::Size::total_size(\'1')"
> 26
> C:\>perl -MDevel::Size -le "print Devel::Size::total_size(\\'1')"
> 42
So, for \"1" my code should print
Scalar reference 42 bytes (overhead: 16 bytes)
(where 16 bytes are used by the reference), as opposed to "1":
Scalar 26 bytes
(overhead is none).
> C:\>perl -MDevel::Size -le "print Devel::Size::size('1')"
> 26
> C:\>perl -MDevel::Size -le "print Devel::Size::size(\'1')"
> 26
> C:\>perl -MDevel::Size -le "print Devel::Size::size(\\'1')"
> 16
> 26 is the size of a scalar with a string of 2 bytes ('1' plus \0).
> 16 is the size of a scalar with a simple reference (SVt_RV type).
> 42 is the sum of 16 and 26. They balance well :-)
Very useful, thank you!
Best wishes,
Tels
- --
Signed on Sun Jan 11 13:54:19 2004 with key 0x93B84C15.
"Moreover, not only perpetrators but also victims could now be wiretapped,
in order to protect them better." Jörg Bode, FDP -
http://heise.de/newsticker/data/anw-11.12.03-003/