
Python String Immutability Broken!


Hendrik van Rooyen

Aug 24, 2008, 9:49:41 PM
to pytho...@python.org

South Africa. Sunday 24th August 2008.

Our South African correspondent, Waffling Swiftly, reports the
discovery of a corpse in the local cyberspace.

It is reputed to belong to a programmer who was flayed alive
by the C.L.P. group, because he had violated the immutability
of a python string.

Rumour has it that the attack was led, and the killing blow struck,
by the "KILL GIL" girl who left her lair on Irmen "Mr. Pyro"'s blog
at http://www.razorvine.net/frog/user/irmen/article/2005-02-13/45 in
order to perform the hit, using her katana.

When asked to comment, the BDFL shrugged and said:

The guy had it coming.
He was the architect of his own misfortune.
Python strings _are_ immutable.


Here is the evidence:

root@dmp:/tmp/disk/ebox/iotest/lib# python
Python 2.5.2 (r252:60911, Mar 1 2008, 13:52:45)
[GCC 4.2.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = 'The quick brown fox jumps over the lazy dog and now is the time!'
>>> s
'The quick brown fox jumps over the lazy dog and now is the time!'
>>> import ctypes as c
>>> io = c.cdll.LoadLibrary('./lib_gpio.a')
>>> io.io_en()
0
>>> io.read_write(s,len(s))
255
>>> s
'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff
\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff
\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff
\xff\xff\xff\xff\xff\xff\xff'
>>> g = 'boo'
>>> io.read_write(g,len(g))
255
>>> g
'\xff\xff\xff'
>>>

And here is the incriminating code:

WARNING: Don't do this at home. It does low level
i/o and unless you run it on a DMP Vortex processor,
it might turn your printer into a pumpkin, or something.

YOU HAVE BEEN WARNED!


</code>

#include <stdlib.h>
#include <stdio.h>
#include <sys/io.h>
#include <unistd.h>

/* We define our read and write routines */

#define write_port(a,b) outb(b,a)
#define read_port(a) inb(a)

/* Then we hard code the port numbers we need */

#define data_0 (0x78)
#define data_1 (0x79)
#define data_2 (0x7a)

#define dir_0 (0x98)
#define dir_1 (0x99)
#define dir_2 (0x9a)


/* Here we organise permission to use the ports*/

int io_en()
{
    int rv = 0;
    rv = iopl(3);
    return rv;
}

/* Then we make some dumb i/o routines for testing */

/* first we make outputs */

unsigned char put_0 (char x)
{
    write_port (dir_0, 0xff);
    write_port (data_0, x);
    return 0x00;
}

unsigned char put_1 (char x)
{
    write_port (dir_1, 0xff);
    write_port (data_1, x);
    return 0x00;
}

unsigned char put_2 (char x)
{
    write_port (dir_2, 0xff);
    write_port (data_2, x);
    return 0x00;
}

/* Then we make inputs */

unsigned char rval = 0x00;

unsigned char get_0 ()
{
    write_port (dir_0, 0x00);
    rval = read_port(data_0) & 0xff;
    return rval;
}

unsigned char get_1 ()
{
    write_port (dir_1, 0x00);
    rval = read_port(data_1) & 0xff;
    return rval;
}

unsigned char get_2 ()
{
    write_port (dir_2, 0x00);
    rval = read_port(data_2) & 0xff;
    return rval;
}

/* This routine outputs and inputs a symmetric block of bytes, writing
the outputs out and replacing them by the corresponding inputs */

unsigned char read_write (unsigned char *outputs, int len)
{
    int i = 0;
    char rv;
    while (i < len)
    {
        rv = put_1(i);           /* put out the addy */
        rv = put_0(outputs[i]);  /* put out the char */
        rv = put_2('\x03');      /* make a write strobe */
        rv = put_2('\x02');
        rv = put_2('\x03');
        rv = get_0();            /* turn the bus around */
        rv = put_2('\x05');      /* make a read strobe */
        rv = put_2('\x04');
        outputs[i] = get_0();    /* read the char */
        rv = put_2('\x05');      /* 3-state bus again */
        i++;
    }
    return *outputs;
}
</end code>

The man's dying words were supposed to be:

You often see this pattern of inputs replacing
outputs in DSP serial port code.
I suppose I should have used array.array...
aaahhrgh...!!

Flowers, comments, pitfalls, advice and money are welcome!

- Hendrik

Roy Smith

Aug 24, 2008, 5:40:33 PM
In article <mailman.2119.121961...@python.org>,

"Hendrik van Rooyen" <ma...@microcorp.co.za> wrote:

> It is reputed to belong to a programmer who was flayed alive

Reminds me of that great old song from "Saturday Night Hacker":

Oh, oh, oh, oh.
Flaying alive, flaying alive.
Oh, oh, oh, oh.
Flaying ali-i-i-i-i-ive!

Patrick Maupin

Aug 24, 2008, 10:41:50 PM
On Aug 24, 8:49 pm, "Hendrik van Rooyen" <m...@microcorp.co.za> wrote:
> (a lot of stuff related to using a string with a C library via ctypes)

Very entertaining.

But let me get this straight: Are you just complaining that if you
pass a string to an arbitrary C function using ctypes, that that
arbitrary function can modify the string?

Because if you are, then I think you share a great deal of
responsibility for the death of that string -- sending the poor thing
to its grave through some unknown C function.

What would you have ctypes do instead?

Hendrik van Rooyen

Aug 25, 2008, 4:31:22 PM
to pytho...@python.org

Patrick Maupin <pmau....ail.com> wrote:

>Very entertaining.
>

Thanks. Nice to see that there is still some sense of humour
left somewhere - it's all been so serious here lately - people
seem to forget that hacking is fun!

>But let me get this straight: Are you just complaining that if you
>pass a string to an arbitrary C function using ctypes, that that
>arbitrary function can modify the string?
>

Actually, I am not complaining - I am asking for advice on the side
effects of what I am doing, which is replacing a bunch of bits
in what is essentially an output bit field with the corresponding
input bits at the same addresses read back from a simulated i/o
bus structure. And I would also like to know if there is a better
way of doing this.

The C code actually works, doing what was intended - the \xff that
one sees appearing back comes from the pullup resistors on the
eBox's i/o. I can show that it is working by adding some resistance
and capacitance (by holding the connector against my tongue) in which
case I get a munged version of the fox back. (- evidently my tongue
is not such a perfect communications medium as I would like to believe.)

Passing the fox is actually deceptive and misleading, as in real
use there would be no such correlation sideways across bits, as
they are just representations of output lines.
(Think "coils" in PLC jargon)

>Because if you are, then I think you share a great deal of
>responsibility for the death of that string -- sending the poor thing
>to its grave through some unknown C function.

This string is NOT dead - it is alive, and not even stunned -
it just looks as if it is sleeping because of the \xff - which
comes from the fact that there is no real hardware out there yet.

The C functions are very simple ones actually - they just do
what are essentially Linux I/O system calls - setting direction
bits for a port (in or out) and then reading or writing the data.

- Hendrik

Simon Brunning

Aug 25, 2008, 6:15:56 AM
to Python List
2008/8/25 Hendrik van Rooyen <ma...@microcorp.co.za>:

> It is reputed to belong to a programmer who was flayed alive
> by the C.L.P. group, because he had violated the immutability
> of a python string.

You can indeed use ctypes to modify the value of a string - see
<http://tinyurl.com/5hcnwl>. You can use it to crash the OS, too.

My advice - don't.

--
Cheers,
Simon B.
si...@brunningonline.net
http://www.brunningonline.net/simon/blog/
GTalk: simon.brunning | MSN: small_values | Yahoo: smallvalues | Twitter: brunns

Ken Seehart

Aug 25, 2008, 6:43:01 AM
to pytho...@python.org
You can also use ctypes to globally change the value of integers less
than 101. Personally, I don't particularly like the number 14. I
changed it to 9 and I am much happier now.

I love ctypes. So cool. It's not supposed to be safe.

Life is either a daring adventure or nothing. Security does not
exist in nature, nor do the children of men as a whole experience
it. Avoiding danger is no safer in the long run than exposure.
*Helen Keller <http://www.quotationspage.com/quotes/Helen_Keller/>*
/US blind & deaf educator (1880 - 1968)/

Of course I would not hire anyone who believes this quote, other than
Helen Keller, if she were still with us.

It is quite possible to write a small program that works using abused
strings. But my life better not depend on it. Among other things, if
you use the abused string as a key anywhere, you will not get correct
results. Trying to change the length of the string will cause
disasters. Lengthening a string will corrupt memory, and shortening the
string will not shorten it but rather embed '\0' in it.
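
A minimal sketch of the dictionary-key problem described above, using a toy class rather than a genuinely clobbered string (which would mean corrupting interpreter memory). It assumes, as CPython does for strings, that the hash is computed once and cached on the object. `CachedHashBytes` and `clobber` are hypothetical names invented for this illustration, and the syntax is Python 3 rather than the 2.5 used in the thread.

```python
class CachedHashBytes:
    """Toy stand-in for a string object whose hash is cached at creation."""

    def __init__(self, data):
        self._data = bytearray(data)
        # Computed once, like the cached hash of a real string object.
        self._hash = hash(bytes(self._data))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, CachedHashBytes) and self._data == other._data

    def clobber(self, fill=0xFF):
        # Overwrite the contents in place, as the C routine did,
        # without updating the cached hash.
        for i in range(len(self._data)):
            self._data[i] = fill


key = CachedHashBytes(b"boo")
table = {key: "original"}

key.clobber()

# A fresh object with the old contents hashes to the right slot,
# but no longer compares equal to the stored (clobbered) key.
found_old = CachedHashBytes(b"boo") in table

# A fresh object with the new contents compares equal to the stored key,
# but almost certainly hashes elsewhere, so it is not found either.
found_new = CachedHashBytes(b"\xff\xff\xff") in table

# Only the very same object is still found, via the identity shortcut.
found_same = key in table
```

The entry is effectively orphaned: neither the old value nor the new value retrieves it, which is the "you will not get correct results" case.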

Ken


Ken Seehart

Aug 25, 2008, 6:54:50 AM
to pytho...@python.org
Hendrik van Rooyen wrote:
>
> ...

>
> Actually, I am not complaining - I am asking for advice on the side
> effects of what I am doing, which is replacing a bunch of bits
> in what is essentially an output bit field with the corresponding
> input bits at the same addresses read back from a simulated i/o
> bus structure. And I would also like to know if there is a better
> way of doing this.
>
Yes, there is a better way. Use a character array instead of a string.

http://python.net/crew/theller/ctypes/tutorial.html#arrays
> ...
- Ken

Steven D'Aprano

Aug 25, 2008, 9:09:17 AM
On Mon, 25 Aug 2008 03:43:01 -0700, Ken Seehart wrote:

> You can also use ctypes to globally change the value of integers less
> than 101. Personally, I don't particularly like the number 14. I
> changed it to 9 and I am much happier now.

Okay, you've got me curious. How do you do that, and why only up to 101?

--
Steven

Peter Otten

Aug 25, 2008, 9:45:49 AM
Steven D'Aprano wrote:

Up to 256 in current Python. Small integers are shared to save memory.
After a quick look at the ctypes tutorial and the Python source I came up
with:

# 64-bit linux

>>> from ctypes import *
>>> libc = cdll.LoadLibrary("libc.so.6")
>>> libc.memset(id(14)+16, 0, 8)
7742336
>>> 14
0
>>> 14 == 0
True
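
The sharing can be observed safely, without writing to memory. The following is a small sketch in Python 3 syntax; it relies on a CPython implementation detail (a cache of small integer objects), not on anything the language guarantees. The values are built with `int()` at run time so that the compiler's constant handling does not influence the result.

```python
# Two independently created small integers turn out to be the same object,
# which is why overwriting the object's payload changes "every" 14 at once.
a = int("14")
b = int("14")
same_small = a is b

# Outside the cached range each call typically builds a new object.
big_a = int("100000")
big_b = int("100000")
same_big = big_a is big_b   # usually False on CPython; not guaranteed either way
```

The `+16` in the session above presumably skips the object header to reach the stored value on that particular 64-bit build; the offset is specific to that interpreter layout and should not be assumed elsewhere.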

Peter

Gabriel Genellina

Aug 25, 2008, 11:42:40 AM
to pytho...@python.org
En Mon, 25 Aug 2008 17:31:22 -0300, Hendrik van Rooyen <ma...@microcorp.co.za> escribió:

> Patrick Maupin <pmau....ail.com> wrote:
>
>>But let me get this straight: Are you just complaining that if you
>>pass a string to an arbitrary C function using ctypes, that that
>>arbitrary function can modify the string?
>
> Actually, I am not complaining - I am asking for advice on the side
> effects of what I am doing, which is replacing a bunch of bits
> in what is essentially an output bit field with the corresponding
> input bits at the same addresses read back from a simulated i/o
> bus structure. And I would also like to know if there is a better
> way of doing this.

To avoid altering the equilibrium of the whole universe, use ctypes.create_string_buffer:
http://python.net/crew/theller/ctypes/tutorial.html#fundamental-data-types
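
A minimal sketch of that approach, in Python 3 syntax. Since `lib_gpio` and the hardware are not available here, `ctypes.memset` stands in for `io.read_write`: like the C routine in the first post, it overwrites the caller's buffer in place. That substitution is an assumption made purely for illustration.

```python
import ctypes

message = b"The quick brown fox"

# A mutable buffer of exactly len(message) bytes, initialised from message.
# Passing the size explicitly avoids the extra trailing NUL byte.
buf = ctypes.create_string_buffer(message, len(message))

# Stand-in for io.read_write(buf, len(buf)): fill the buffer with 0xff,
# which is what the pulled-up bus returned in the original session.
ctypes.memset(buf, 0xFF, len(buf))

readback = buf.raw   # every byte of the buffer; message itself is untouched

# The buffer has a fixed length. Storing something shorter does not
# shrink it; the remaining bytes are NULs, visible through .raw.
fixed = ctypes.create_string_buffer(8)
fixed.value = b"boo"
visible = fixed.value   # stops at the first NUL
everything = fixed.raw  # all eight bytes
```

With the real library the call would presumably be `io.read_write(buf, len(buf))`, and the inputs would then be read from `buf.raw`.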

--
Gabriel Genellina

Patrick Maupin

Aug 25, 2008, 11:50:46 AM
On Aug 25, 3:31 pm, "Hendrik van Rooyen" <m...@microcorp.co.za> wrote:
> Actually, I am not complaining - I am asking for advice on the side
> effects of what I am doing, which is replacing a bunch of bits
> in what is essentially an output bit field with the corresponding
> input bits at the same addresses read back from a simulated i/o
> bus structure. And I would also like to know if there is a better
> way of doing this.

Whenever I do low-level stuff like this, I'm in one of two modes:

Mode #1: I'm using somebody else's C library and the overhead of
doing so is small.

Mode #2: I need to code my own low-level stuff (for speed, IO access,
whatever).

In mode 1, I try not to break out a compiler. ctypes is great for
this, and the results are "pure python" to the extent that you can
give pure python to someone else with the same C library, and it will
work. No muss, no fuss, no makefile, no question that ctypes is
awesome stuff.

In mode 2, I have to break out a compiler. I almost never do this
without ALSO breaking out Pyrex. Pyrex is also awesome stuff, and in
Pyrex, you can easily create a (new) Python string for your results
without having to worry about reference counting or any other really
nasty low level interpreter details. You can code a lot of stuff in
pure Pyrex, and you can easily mix and match Pyrex and C.

Pyrex and ctypes are both tools which let me connect to non-Python
code without having to remember to handle Python interpreter internals
correctly. If I can get by with ctypes, I do, but if I actually have
to code in something other than Python to get the job done, I bypass
ctypes and go straight for Pyrex.

Terry Reedy

Aug 25, 2008, 12:36:09 PM
to pytho...@python.org

Ken Seehart wrote:
> Hendrik van Rooyen wrote:
>>
>> ...
>>

>> Actually, I am not complaining - I am asking for advice on the side
>> effects of what I am doing, which is replacing a bunch of bits
>> in what is essentially an output bit field with the corresponding
>> input bits at the same addresses read back from a simulated i/o
>> bus structure. And I would also like to know if there is a better
>> way of doing this.
>>

> Yes, there is a better way. Use a character array instead of a string.
>
> http://python.net/crew/theller/ctypes/tutorial.html#arrays

Which essentially is the bytearray type of 3.0.
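
A sketch of how a `bytearray` could serve as the in-place I/O block, in Python 3 syntax. `from_buffer` maps a ctypes array onto the bytearray's own memory, so a C routine writes directly into it with no copy. As before, `ctypes.memset` is only a stand-in for the real `read_write` routine.

```python
import ctypes

outputs = bytearray(b"boo")

# A ctypes array type of matching size, sharing memory with the bytearray.
ArrayType = ctypes.c_ubyte * len(outputs)
view = ArrayType.from_buffer(outputs)

# Stand-in for the C routine that replaces outputs with inputs in place.
ctypes.memset(view, 0xFF, len(outputs))

result = bytes(outputs)   # the bytearray itself now holds the new bytes

# While 'view' exists the bytearray cannot be resized; drop it when done.
del view
```

Unlike the original experiment, nothing immutable is modified here: `bytearray` is designed to be changed in place.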

Hendrik van Rooyen

Aug 26, 2008, 4:32:48 PM
to pytho...@python.org

"Simon Brunning":

>You can indeed use ctypes to modify the value of a string - see
><http://tinyurl.com/5hcnwl>. You can use it to crash the OS, too.
>
>My advice - don't.

Thanks for the link.

Any advice on what to do or use as an I/O structure for dissemination?


Ken Seehart:

8<----------- using ctypes to make 1+14 = 10 ------------------

>I love ctypes. So cool. It's not supposed to be safe.

And here I thought I was weird…

> Life is either a daring adventure or nothing. Security does not
> exist in nature, nor do the children of men as a whole experience
> it. Avoiding danger is no safer in the long run than exposure.
> *Helen Keller <http://www.quotationspage.com/quotes/Helen_Keller/>*
> /US blind & deaf educator (1880 - 1968)/
>
>Of course I would not hire anyone who believes this quote, other than
>Helen Keller, if she were still with us.

Why not? – as I see it, the Keller quote states the literal truth of
the matter – we all live under an illusion of security – but then
that might just be because I am South African, and the country is
run by cattle thieves.

>It is quite possible to write a small program that works using abused
>strings. But my life better not depend on it. Among other things, if
>you use the abused string as a key anywhere, you will not get correct
>results. Trying to change the length of the string will cause
>disasters. Lengthening a string will corrupt memory, and shortening the
>string will not shorten it but rather embed '\0' in it.

Understood. – remember I am using it as a kind of array of “pseudoports”
- memory representations of what goes on on wires on the outside.

So only a real madman would try to impute the kind of cross bit
correlation needed to use the bunch of bits as a key. The length would
be fixed, governed by the existence of real hardware on the outside.

Ken Seehart again:

>Yes, there is a better way. Use a character array instead of a string.

The original reason I used a string directly instead of array.array was
to try to skip the serialisation step when disseminating the information
via sockets.
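
For what it is worth, a mutable buffer need not cost a serialisation step. The sketch below, in Python 3 syntax, sends a `bytearray` over a socket as-is and receives straight into another one. `socket.socketpair()` stands in for the real network connection, and the four-byte payload is made up for the example.

```python
import socket

sender, receiver = socket.socketpair()

outputs = bytearray(b"\x01\x02\x03\x04")   # hypothetical port states

# sendall accepts any bytes-like object, so no conversion is needed.
sender.sendall(outputs)

# Receive directly into a preallocated buffer of the same size.
inputs = bytearray(len(outputs))
window = memoryview(inputs)
received = 0
while received < len(inputs):
    count = receiver.recv_into(window[received:])
    if count == 0:          # peer closed the connection early
        break
    received += count

sender.close()
receiver.close()
```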

As you can appreciate, this is all “hot stuff” as it represents said
wire states, and is of interest system wide.

So let's explore this further – let's say I use two arrays – one to
represent the stuff that must be output, and one to represent the
latest inputs read.

Then, I think, first prize would be the ability to “publish” that
information as a shared memory block, that can be accessed by other
python processes. Then it will be possible to a priori “chop up
the ownership” of the various bits, so that a process can simply
monitor the bits of interest to it, setting or clearing the bits
of the outputs it is responsible for. In this way the work could
be divided amongst many processes.

Then, on a periodic basis, the I/O would be done, much like one
would do it in an embedded system using an interrupt driven ticker
routine.

That would be really cool.

Does anybody know how to get such memory sharing done in Python?
(Linux only)
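
One possible shape for this, sketched with `multiprocessing.shared_memory`, which only exists in Python 3.8 and later and so was not an option when the question was asked. `N_PORTS`, `MY_PORT` and `MY_BITS` are hypothetical names for the block size and for the bits one process owns. Both handles live in a single process here to keep the example self-contained; a second process would attach using the same block name.

```python
from multiprocessing import shared_memory

N_PORTS = 8       # hypothetical number of pseudoport bytes
MY_PORT = 3       # hypothetical port this process is responsible for
MY_BITS = 0x01    # hypothetical bit mask it owns within that port

# The publisher creates a named block and clears it. The block may be
# rounded up in size, so only the first N_PORTS bytes are used.
owner = shared_memory.SharedMemory(create=True, size=N_PORTS)
try:
    owner.buf[:N_PORTS] = bytes(N_PORTS)

    # Another process would attach by name; only owner.name must be shared.
    peer = shared_memory.SharedMemory(name=owner.name)
    try:
        # Set only the bits this process owns, leaving the rest alone.
        peer.buf[MY_PORT] |= MY_BITS
        seen_by_owner = owner.buf[MY_PORT]
    finally:
        peer.close()
finally:
    owner.close()
    owner.unlink()   # remove the block once nobody needs it
```

The read-modify-write on a byte is not atomic across processes, so if two processes own different bits of the same byte, some form of locking would still be needed around the update.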

- Hendrik
