This code:
new P0, .PerlNum
set P0, -1.2
new P1, .PerlString
set P1, "-1.2"
eq_num P0, P1, OK
print "not "
OK: print "ok\n"
end
prints "not ok", when it should print "ok". A similar problem exists with
the ne_num op, suggesting the underlying bug is somewhere in the
implementation of the cmp_num vtable function.
[And yes, I'm well aware of the problems inherent in doing floating point
comparisons. However, in this case, the floating-point respresentation
of the string '1.2' should be exactly the same as the representation of
the number 1.2, since the former is converted into the latter
internally]
Simon
> This code:
> new P0, .PerlNum
> set P0, -1.2
> new P1, .PerlString
> set P1, "-1.2"
> eq_num P0, P1, OK
> print "not "
> OK: print "ok\n"
> end
> [And yes, I'm well aware of the problems inherent in doing floating point
> comparisons.
Breakpoint 1, Parrot_PerlNum_cmp_num (interpreter=0x82654f0, pmc=0x40305850,
value=0x40305838) at perlnum.c:301
301 diff = PMC_num_val(pmc) - VTABLE_get_number(interpreter, value);
(gdb) n
302 return diff > 0 ? 1 : diff < 0 ? -1 : 0;
(gdb) p diff
$1 = 2.2204460492503131e-16
(gdb)
> Simon
leo
OK, that suggests that there's a bug somewhere in our string->number
conversion. Either that, or we're going to have to live with the fact
that 1.2 and '1.2' are not the same number (which would suck).
Simon
The basic bug has to be that it's using two different routines to
do the conversion (or possibly there's one routine that's paying
attention to some context it shouldn't be paying attention to--but
that seems less likely).
I hope this doesn't point to a more general problem. Using different
routines to do the same thing for compilation and execution was a bad
mistake I made in early Perls, and had to work hard to fix in Perl 5.
For instance, all the constant folding in Perl 5 happens by actually
running the code in question, not by trying to guess what the run-time
system will do, as Perl 4 did.
I recommend not remaking my mistakes. Please make different mistakes. :-)
Larry
In string_to_num() in src/string.c,
f = f * sign * pow(10.0, exponent); /* ugly, oh yeah */
Which in this case translates to 12.0*-1*0.1 which is -1.2000...2 != -1.2.
I replaced this bit of code from a block I found in mysql. I only tested
this
on linux, but it seems to do the trick. See attached.
This still doesn't explain why the compiler would be using a different
routine to turn the string 1.2 into a number than the run time does.
This is not code that should be duplicated. Or am I misunderstanding
the problem?
Larry
I'm not very familiar with the garbage collector, would the call to
string_make() be a memory leak? Or will it be collected?
diff -r1.68 pbc.c
593c593,594
< int k;
---
> int k, l;
> STRING *str;
594a596
> str = const_string(interpreter, buf);
596d597
<
600c601,602
< (FLOATVAL)atof(buf);
---
> string_to_num(str);
> The basic bug has to be that it's using two different routines to
> do the conversion (or possibly there's one routine that's paying
> attention to some context it shouldn't be paying attention to--but
> that seems less likely).
Well, that's a very good point. The compiler just does atof() while
string_to_num() is a long and possibly buggy conversion routine that
tries to deal with unicode codepoints.
> I recommend not remaking my mistakes. Please make different mistakes. :-)
We'll try hard ...
> Larry
leo
Well imcc used to be a standalone language compiler that converted PIR
to PASM, which got then assembled to bytecode by assembler.pl. Now
imcc is fully integrated, parrot is the evolution of the imcc
executable and parrot has two assemblers (or compiler) inside.
*But* there is still a lot of code inside imcc, which needs
adaption. Imcc didn't use any of Parrot's internals except the necessary
things to generate the bytecode.
> ...(See add_const_num() in imcc/pbc.c). Packing the cstring up
> in a STRING then calling string_to_num() *fixes* the problem,
Done that now too, similar to your patch.
> ... but it
> doesn't address the issue Larry brought up. The interpreter makes
> heavy use of the encoding sensitive STRING type, the compiler does
> not.
Imcc or the lexer has no syntax yet to even read an Unicode string. It's
just reading bytes and converts everything to C-strings. That needs of
course fixing.
> I'm not very familiar with the garbage collector, would the call to
> string_make() be a memory leak? Or will it be collected?
I used string_from_cstring(), the memory gets collected automatically.
leo
> Oops, use this one instead... "re"-fixes the exponent support...
Thanks, applied.
leo
--
Brent "Dax" Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker
Oceania has always been at war with Eastasia.
Well, string_to_num() is the opposite of it. And string.c has
string_from_num() which is using Parrot_sprintf() ;)
BTW sscanf() is missing too.
leo
LT> BTW sscanf() is missing too.
perl5 mever had a scanf (or variation) and for good reason IIRC. it was
never needed and it has a very nasty api. i hated using it in c because
you could never control how much to read and parse and also find out
where in the input stream it fails. <> and a regex were always better in
p5. so i don't think we need to do it in parrot. in fact with
rules/grammars being able to read from a stream, you could code one in
that and have it work properly. do any other langs we want to support
(other than c) have a scanf flavor in them?
uri
--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
I have never used scanf. I have no idea what it does or how.
Therefore, I'm hardly qualified to write an implementation of it.
--
Mark Biggar
mark.a...@comcast.net
> *** tmp/parrot/src/string.c Sat Mar 6 03:00:06 2004
> --- parrot/src/string.c Wed Mar 17 12:24:02 2004
> ***************
> *** 1836,1844 ****
> int exp_sign = 0;
> INTVAL in_exp = 0;
> INTVAL in_number = 0;
> ! FLOATVAL exponent = 0;
> INTVAL fake_exponent = 0;
> INTVAL digit_family = 0;
>
> while (start < end) {
> UINTVAL c = s->encoding->decode(start);
> --- 1836,1845 ----
> int exp_sign = 0;
> INTVAL in_exp = 0;
> INTVAL in_number = 0;
> ! INTVAL exponent = 0;
> INTVAL fake_exponent = 0;
> INTVAL digit_family = 0;
> + FLOATVAL exp_log=10.0, exp_val=1.0;
>
> while (start < end) {
> UINTVAL c = s->encoding->decode(start);
> ***************
> *** 1849,1855 ****
>
> if (df && df == digit_family) {
> if (in_exp) {
> ! exponent = exponent * 10 + s->type->get_digit(s->type,c);
> if (!exp_sign) {
> exp_sign = 1;
> }
> --- 1850,1856 ----
>
> if (df && df == digit_family) {
> if (in_exp) {
> ! exponent = exponent + s->type->get_digit(s->type,c);
> if (!exp_sign) {
> exp_sign = 1;
> }
> ***************
> *** 1906,1912 ****
>
> exponent = fake_exponent + exponent * exp_sign;
>
> ! f = f * sign * pow(10.0, exponent); /* ugly, oh yeah */
> }
>
> return f;
> --- 1907,1936 ----
>
> exponent = fake_exponent + exponent * exp_sign;
>
> ! if(exponent < 0) {
> ! exponent = -exponent;
> ! exp_sign=-1;
> ! }
> !
> ! for (;;) {
> ! if (exponent & 1) {
> ! exp_val *= exp_log;
> ! exponent--;
> ! }
> ! if (!exponent)
> ! break;
> ! exp_log *= exp_log;
> ! exponent >>= 1;
> ! }
> !
> ! if(exp_sign < 0)
> ! f /= exp_val;
> ! else
> ! f *= exp_val;
> !
> !
> ! if(sign < 0)
> ! f = -f;
> }
>
> return f;
Yep, was outlined by Larry too. That's already changed.
[ Please don't quote the whole thread, thanks ]
leo