Right now, I'm trying to think about integer ops.
I want to add two integers, e.g., and have the result be an integer. The
naive way:
foo = bar + baz
Doesn't work because of automatic promotion to bignum.
foo = int(bar + baz)
seems not to work because of OverflowError, and catching it seems to do
little, as the error has already occured. I'm sure there are hacks I
could do, but if I want an implementation that will act the way c int's
(or long's, or some appropriately sized type) do, is there an elegant way
of creating such a function?
Extra points for hit about keeping signed v. unsigned information correct.
Of course, I could always make a hacked-up version, but it seems like this
should be something I can pass-through.
Thanks in advance,
D"newbie"an
> I am currently embarking on the utterly futile and mildly recursive task
> of implementing a C interpreter in python (Jython specifically).
>
> Right now, I'm trying to think about integer ops.
>
> I want to add two integers, e.g., and have the result be an integer. The
And what do you want to happen when the two integers you're adding yield
a result too large to be represented as an integer? You do no specify
what behavior you want for that (and I believe the C standard doesn't,
either, though I'm not familiar with the 1999 version, only the older one).
Basically what you need is a function makeint that will return the int
of its single argument -- if that overflows, then you may want to e.g.
truncate that to 32 bits, or whatever. Since most cases won't overflow
you're best off with an "easier to ask forgiveness than permission"
structure, e.g.:
def mask(x, themask = 0x7FFFFFFF):
return int(x & themask)
def makeint(x):
try: return int(x)
except OverflowError:
if x < 0:
return -mask(-x)
else:
return mask(x)
> naive way:
>
> foo = bar + baz
>
> Doesn't work because of automatic promotion to bignum.
so you code
foo = makeint(bar + baz)
with some makeint function such as the above one.
Alex
Alex:
> so you code
>
> foo = makeint(bar + baz)
>
> with some makeint function such as the above one.
I had a similar problem for the PyPy project. In order
to overcome the problem of always having to call
conversion functions, I wrote a class that tries
to promote its restricted type through expressions.
So if you start a variable by
x = r_int(42)
and perform any integer operations on it, the result
will stay in the same domain.
Feel free to use the attached code as you like.
ciao - chris
--
Christian Tismer :^) <mailto:tis...@tismer.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/
Unless there's a function I'm missing that returns INT_MAX (as C
pronounces it) and I'm just supposed to mod it.
-Dan
> Am I so far off in thinking maybe this should be provided by
> the language? Some way to create an int, using some (perhaps
> implementation-defined) method of changing a long to an int?
> Some way to ensure that I don't misguess on what themask is?
> What if I'm programming on a machine with 36 bit longs? Or
> where 64 bits is the atomic datatype? It seems like there
> should be an int constructor that won't overflow in this case.
For some of what I do, it sure would be nice to have 8, 16, and
32 bit base-2 datatypes available. One of these days I should
write a C extension providing fixed-width data types...
--
Grant Edwards grante Yow! ... the MYSTERIANS
at are in here with my
visi.com CORDUROY SOAP DISH!!
>>> import sys
>>> hex(sys.maxint)
'0x7fffffff'
HTH
/Anders
--
-- Of course I'm crazy, but that doesn't mean I'm wrong.
Anders Hammarquist | i...@cd.chalmers.se
Physics student, Chalmers University of Technology, | Hem: +46 31 88 48 50
G|teborg, Sweden. RADIO: SM6XMM and N2JGL | Mob: +46 707 27 86 87
> Unless there's a function I'm missing that returns INT_MAX (as C
> pronounces it) and I'm just supposed to mod it.
sys.maxint
Chad
>Am I so far off in thinking maybe this should be provided by the language?
>Some way to create an int, using some (perhaps implementation-defined)
>method of changing a long to an int? Some way to ensure that I don't
>misguess on what themask is?
Do you realize that Python only recently ELIMINATED most of this? Prior to
2.0, ints were ints and longs were longs and never the twain shall meet,
and most people are much happier with the quiet transition we now have.
>What if I'm programming on a machine with 36
>bit longs? Or where 64 bits is the atomic datatype?
These kinds of concerns should, for the most part, be entirely irrelevant
to you in a language like Python. There is so much overhead in the data
structures as it is that the bitwise representation should never enter your
head, unless you're using something like the 'struct' or 'array' modules.
--
- Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
> Am I so far off in thinking maybe this should be provided by the language?
IMHO, you are. "integral types with N bits ignoring overflows" is
a very marginal "niche" requirement; the language would grow far too
big if it was to provide all such marginalia! It's easy, when one is
developing in a focused way in a specific application area, to start
wishing for one's language to provide just the primitives that would
be most convenient for that specific need. But unless a primitive
has WIDE use over a large variety of application areas, the language
IS better off without it -- that's what LIBRARIES are for!
> Some way to create an int, using some (perhaps implementation-defined)
> method of changing a long to an int?
Why? What's wrong with int(thelong & amask)?
> Some way to ensure that I don't
> misguess on what themask is? What if I'm programming on a machine with 36
> bit longs? Or where 64 bits is the atomic datatype?
I see you've already been advised to use sys.maxint, which is fine if
what you want is to use "integers of the exact number of bits which
happen to be provided by this specific piece of hardware". I did not
think you wanted that variability in your "C interpreter", so I chose
to provide, instead, "32-bit integers" anyway, ignoring the underlying
platform.
Isn't it WONDERFUL that the language does NOT provide a hard-coded
"way to create an int from a long", so you can easily CHOOSE depending
on your application's need how and whether to truncate longs that
are too large?
For example, if you're interested in simulating a decimal machine,
where integers have N digits rather than N bits, then instead of
bitmasking you just use int(thelong % biggie) where biggie=1000000
or whatever. Isn't that the cat's pajamas?
Alex
If
int(foo % sys.maxint)
is such a call, great. Is this guaranteed to work? I don't know.
Basically, if you're going to say ints and longs are separate types, I
think there should be a way to makes ints from longs fairly reliably. I
don't think that's a niche, I think it's something a consistent language
should provice.
Maybe python is moving towards a day where ints and longs are completely
indistinguishable. This probably wouldn't be a bad thing, from many
standpoints. But until that day comes, I think that there should be a way
to construct an object without getting an exception. I guess it's more a
matter of taste whether the implentation or the interface-user should do
it, but since one still allows the other, I'd say the implementation
probably should.
I like the flexibility python provides, but it seems that there might be
cracks where orthogonality or consistency of interface have cracks; which
makes sense in an evovling language. If I managed to arrive at python at
a point where this "problem" was just begotten and there's a good chance
it will go away soon, perfect. But I stil think there's an elegant
solution to be had, that works across implementations. Is it sys.maxint?
Maybe. Silent exceptions, like Java will do for math stuff? Maybe. I'm
getting the impression the solution is maxint, so great. Thanks.
-Dan
> It seems like so long as there's a distinction in the language between
> ints and longs, there should be a way to create ints reliably.
>
> If
>
> int(foo % sys.maxint)
>
> is such a call, great. Is this guaranteed to work? I don't know.
It's guaranteed to do funny things indeed!
>>> int(-23 % sys.maxint)
2147483624
>>>
a % b, for b>0, always return an x such that 0 <= x < b. That is
most definitely NOT what you want in this case for a<0... and I
suspect that this would also not be what you want:
>>> int(sys.maxint % sys.maxint)
0
>>>
yet it IS totally guaranteed by the semantics of %.
> Basically, if you're going to say ints and longs are separate types, I
> think there should be a way to makes ints from longs fairly reliably. I
> don't think that's a niche, I think it's something a consistent language
> should provice.
Well, your proposed expression DOES "make ints from longs fairly reliably",
it's just that I think the ints it makes are probably not the ones you'd
like to get (except for 0<=foo<sys.maxint).
I think you want something like: int(cmp(foo,0)*(abs(foo) & sys.maxint)).
Me, I think it's better to incapsulate the "conversion to int ignoring
overflow" in a function, using an if/else:
def toint(foo):
if foo>=0: return int(foo&sys.maxint)
else: return -int(-foo&sys.maxint)
and I'd further factor out the "int(X&sys.maxint)" part because I loathe
duplicated code. But that brings us right back to the same suggestion
I already gave you, with a tiny perhaps-improvement:
def mask(x, themask=sys.maxint): return int(x & themask)
def toint(x):
if x>=0: return mask(x)
else: return -mask(-x)
I originally proposed using a literal 2**32-1 as the default value
of themask (because I think that is likely to be more useful for
the purpose of simulating an abstract machine's arithmetic), now
we're using sys.maxint instead (thus putting ourselves at the mercy
of the specific machine we're running on), big deal.
> Maybe python is moving towards a day where ints and longs are completely
> indistinguishable. This probably wouldn't be a bad thing, from many
That's what PEP 237 states, yes.
> standpoints. But until that day comes, I think that there should be a way
> to construct an object without getting an exception. I guess it's more a
There are many, depending on what object you want...;-).
You seem adamantly convinced that "mask off however many least-significant
bits the current machine allows into an int" is the one and obvious way
of constructing an int from a long, when the long has more significant bits
that will fit into an int, but I think this is unproven. Why shouldn't
some other application prefer, e.g.:
def toint(x):
try: return int(x)
except OverflowError: return [sys.maxint,-sys.maxint][x<0]
i.e. a "saturating" conversion (warning: doesn't work in 2.3 since
int(alongthatstoobig) just returns the long rather than an int, NOT
raising any exception whatsoever...).
> matter of taste whether the implentation or the interface-user should do
> it, but since one still allows the other, I'd say the implementation
> probably should.
"In the face of ambiguity, refuse the temptation to guess", and I
think it's NOT obvious how N>32 bits should be squeezed into 32 (the
idea of choosing the LEAST significant ones may look obvious to you,
but I'd suggest thinking again about it, in terms of application
needs rather than traditional behavior of hardware components...:-).
> I like the flexibility python provides, but it seems that there might be
> cracks where orthogonality or consistency of interface have cracks; which
> makes sense in an evovling language. If I managed to arrive at python at
> a point where this "problem" was just begotten and there's a good chance
> it will go away soon, perfect. But I stil think there's an elegant
> solution to be had, that works across implementations. Is it sys.maxint?
> Maybe. Silent exceptions, like Java will do for math stuff? Maybe. I'm
> getting the impression the solution is maxint, so great. Thanks.
If taking just as many LSB's as sys.maxint happens to have is what your
application needs, then yes, sys.maxint (plus some care about sign!)
is indeed "the solution". As to what needs your application actually
happens to have, well, that's something that only you, as the designer
thereof, can determine -- the language sure can't do it for you!
Alex
So wait, all you want to do is algebra in the ring Z_{2^32}?
Assuming this is right, why not just make a RingElement class that has
all the operators overloaded to perform the appropriate operation and
then perform the appropriate modulus. With appropriate optimizations if
needs be.
Sorry, I came in on the end of this, but perhaps this will be helpful.
James.
>>> Am I so far off in thinking maybe this should be provided by
>>> the language?
>>
>> IMHO, you are. "integral types with N bits ignoring overflows"
>
> So wait, all you want to do is algebra in the ring Z_{2^32}?
Don't know about him, but what I want to do is 2's compliment
add and subtract, bitwise and/or/not/xor and left/right shift
operations on 8, 16, and 32 bit values. If that's "algebra in
the ring Z_{2^8|16|32}, then yes.
> Assuming this is right, why not just make a RingElement class
> that has all the operators overloaded to perform the
> appropriate operation and then perform the appropriate modulus.
> With appropriate optimizations if needs be.
--
Grant Edwards grante Yow! I think I am an
at overnight sensation right
visi.com now!!
I was always better at number theory than coding theory. I think you
want to work in {Z_2}^32 - If you really wanna do it then I still hold
that just wrapping it all up into a class is the way to go.
It's not as crazy as it sounds, essentially what you're doing is
treating a 32 bit integer as a polynomial rather than an integral value.
In a language like C, it's a good way to do it. << gives you
multiplication by x, and if you want cyclic polynomials you can
construct a Galois field out of it IIRC. In python you probably actually
want to use something that *is* a polynomial. Possibly numeric python or
something could help out.
Grab a book on coding theory. Hamming is The Man IIRC.
James.
I think the intention is to eventually make the distinction
go away altogether at the language level. There might still
be int and long objects under the covers, but that would
just be an implementation detail, and you'd be able to
use them interchangeably everywhere.
In your case, it seems to me you don't actually care whether
you've got an int object or a long object, as long as the
arithmetic works right. I second the suggestion to mask
the results, but I'd add that you don't really need to
worry about keeping the sign right. Just keep all your
values internally as unsigned ints until you need to
do something which actually cares about the sign.
This is a better reflection of the way the real hardware
works, anyway -- the bit patterns in registers and memory
aren't inherently signed or unsigned, it all depends on
what you're doing with them.
So, e.g. an addition just becomes
c = (a + b) & 0xFFFFFFFFL
which will be considerably faster than using an if-statement
to sort out the sign every time.
--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
>>> import sys
>>> def toint(foo):
... if foo>=0: return int(foo&sys.maxint)
... else: return -int(-foo&sys.maxint)
...
>>> hex(-sys.maxint-1)
'0x80000000'
>>> -sys.maxint-1
-2147483648
>>> type(-sys.maxint-1)
<type 'int'>
>>> toint(-sys.maxint-1)
0
Bug, ISTM.
[...similarly...]
Because two's complement "minint" is not -sys.maxint,
(it is -sys.maxint-1)
Regards,
Bengt Richter
IMO,it would be simpler to just write an extenstion in C and use C's
operators on 8, 16, and 32 bit integer data variables.
--
Grant Edwards grante Yow! Remember, in 2039,
at MOUSSE & PASTA will
visi.com be available ONLY by
prescription!!
Well, first of all, I don't have enough clue what you people are talking
about to ask specific questions...
But how can you make a pointer in Python? Is it a real pointer to
memory, or just an item in, say, a dictionary or an array?
--
Posted via http://dbforums.com
> On Wed, 26 Mar 2003 12:38:25 GMT, Alex Martelli <al...@aleax.it> wrote:
>>Daniel Timothy Bentley wrote:
> [...]
>>
>>> Basically, if you're going to say ints and longs are separate types, I
>>> think there should be a way to makes ints from longs fairly reliably. I
>>> don't think that's a niche, I think it's something a consistent language
>>> should provice.
>>
>>Well, your proposed expression DOES "make ints from longs fairly
>>reliably", it's just that I think the ints it makes are probably not the
>>ones you'd like to get (except for 0<=foo<sys.maxint).
>>
>>I think you want something like: int(cmp(foo,0)*(abs(foo) & sys.maxint)).
>>
>>Me, I think it's better to incapsulate the "conversion to int ignoring
>>overflow" in a function, using an if/else:
>>
>>def toint(foo):
>> if foo>=0: return int(foo&sys.maxint)
>> else: return -int(-foo&sys.maxint)
>>
> The above doesn't match typical hardware and C-style int behavior:
Right, it doesn't replicate the asymmetry thus typically found. You
have to specialcase that. In 2.2 it's easiest and fastest to:
def toint(foo):
try: return int(foo)
except OverflowError: pass
if foo>=0: return int(foo&sys.maxint)
else: return -int(-foo&sys.maxint)
but in 2.3 int(foo) won't raise OverflowError, so if you need to
reproduce the asymmetry you can do something like:
def toint(foo):
if foo==-sys.maxint-1: return int(foo)
elif foo>=0: return int(foo&sys.maxint)
else: return -int(-foo&sys.maxint)
or you can swap the order of the first two guarded returns, if
you like, since their guards are mutually exclusive of course:
def toint(foo):
if foo>=0: return int(foo&sys.maxint)
elif foo==-sys.maxint-1: return int(foo)
else: return -int(-foo&sys.maxint)
the second one is probably going to be marginally faster in
practical use (fewer calls with -sys.maxint-1 than with >=0
arguments), and another tiny performance enhancement is:
def toint(foo, themask=sys.maxint, anomaly=-sys.maxint-1):
if foo>=0: return int(foo&themask)
elif foo==anomaly: return int(foo)
else: return -int(-foo&themask)
(probably too tiny to measure in most applications).
Alex
Please explain the purpose behind your desire for this. Python
certainly uses pointers internally, in the C implementation at
least, but they are not exactly exposed to the programmer.
Perhaps if you explain what you want, the answer will be easier.
-Peter
Peter Hansen wrote:
> Please explain the purpose behind your desire for this. Python
> certainly uses pointers internally, in the C implementation at
> least, but they are not exactly exposed to the programmer.
> Perhaps if you explain what you want, the answer will be easier.
The original poster claimed to be writing a c compiler in Python.
I presume he meant a C interpreter (the exact meaning of these
terms is a bit hazy anyhow), in which case Tetsuo's question is a
valid one. I certainly hope the answer is the dictionary or array.
It's a peculiar task to have set oneself, but I suppose it'd be
a great way to really get to know the C language.
-- Michael Chermside
Isn't he making a C interpreter? So it would need to make pointers, or
it wouldn't be C.
That's true, but they wouldn't be real pointers
to real memory. They'd be simulated pointers to whatever
representation is used for the simulated memory.
For instance, if the memory is being represented by an
array.array of bytes, a pointer would just be an
integer representing an offset into this array.
UIAM there's still a bug. I would model C behavior with
def toint(foo, themask=sys.maxint, signbit=long(sys.maxint)+1):
return int(-(foo&signbit)) + int(foo&themask)
I.e., just mask off sufficient least significant bits of
the wider to fit in the narrower, including the sign bit,
and let the resulting sign bit be interpreted as a normal
sign bit irrespective of the further-up sign bit of the wider.
Note the difference for any positive long with a bit
in the int signbit position (ordinarily bit 31). Indeed,
just the positive long value of 1L<<31 itself will do it.
E.g., using
====< toint.py >================================
import sys
def toint_a(foo, themask=sys.maxint, anomaly=-sys.maxint-1):
if foo>=0: return int(foo&themask)
elif foo==anomaly: return int(foo)
else: return -int(-foo&themask)
def toint_b(foo, themask=sys.maxint, signbit=long(sys.maxint)+1):
return int(-(foo&signbit)) + int(foo&themask)
================================================
>>> import sys
>>> import toint
>>> signbit = long(sys.maxint)+1
>>> signbit
2147483648L
>>> hex(signbit)
'0x80000000L'
>>> toint.toint_a(signbit)
0
>>> toint.toint_b(signbit)
-2147483648
>>> toint.toint_a(+1+signbit)
1
>>> toint.toint_b(+1+signbit)
-2147483647
Or as first mentioned,
>>> toint.toint_a(1L<<31)
0
>>> toint.toint_b(1L<<31)
-2147483648
Regards,
Bengt Richter