Is python buffer overflow proof?

Jizzai

Aug 2, 2009, 9:50:14 AM

Is a _pure_ python program buffer overflow proof?

For example in C++ you can declare a char[9] to hold user input.
If the user inputs 10+ chars a buffer overflow occurs.

In python, I cannot seem to find a way to define/restrict a string length.
This is probably by design and raises the topic in question.

Am curious to see the opinions of people who know.

TIA.

Marcus Wanner

Aug 2, 2009, 10:32:39 AM

I believe that python is buffer overflow proof. In fact, I think that
even ctypes is overflow proof...

Marcus

Christian Heimes

Aug 2, 2009, 10:43:25 AM

to pytho...@python.org

Marcus Wanner wrote:
> I believe that python is buffer overflow proof. In fact, I think that
> even ctypes is overflow proof...

No, ctypes isn't buffer overflow proof. ctypes can break and crash a
Python interpreter easily.

Christian

Steven D'Aprano

Aug 2, 2009, 11:18:27 AM

On Sun, 02 Aug 2009 13:50:14 +0000, Jizzai wrote:

> Is a _pure_ python program buffer overflow proof?

It's supposed to be.

> For example in C++ you can declare a char[9] to hold user input. If the
> user inputs 10+ chars a buffer overflow occurs.
>
> In python, I cannot seem to find a way to define/restrict a string
> length. This is probably by design and raises the topic in question.

That's a separate issue from being buffer overflow proof. You can't
specify that a string have a maximum of N characters except by slicing
the string after it's formed:

s = "x"*10000 # Make a big string.
s = s[:100] # Limit it to 100 characters.

But Python won't overflow any buffers even if you try to create a truly
huge string:

s = "x"*(1024**4) # Try to create a 1 TB string.

Your PC will run slowly while Python and the OS try to allocate 1 TB of
memory, then Python will safely raise MemoryError. Pure Python should never
dump core.
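
A minimal sketch of both points, assuming Python 3. The helper name
read_limited and the 9-character limit are made up for illustration;
nothing like it is built in.

```python
def read_limited(text, max_chars=9):
    """Hypothetical helper: keep at most max_chars characters of text."""
    return text[:max_chars]


user_input = "x" * 10000          # stands in for oversized user input
name = read_limited(user_input)
print(len(name))

# Reading past the end raises IndexError instead of touching
# memory outside the string.
try:
    name[100]
    out_of_range_caught = False
except IndexError as exc:
    out_of_range_caught = True
    print("IndexError:", exc)
```

The length limit is applied by the program after the string exists; the
bounds check on indexing is applied by the interpreter whether you ask
for it or not.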

--
Steven

Marcus Wanner

Aug 2, 2009, 5:51:11 PM

I see. I thought that it said "invalid array index" when you try to
read/write outside of an array's bounds, though...
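
The bounds check mentioned here can be sketched as follows, assuming
Python 3 and the standard ctypes module. The variable names are
illustrative only, and the exact error messages may differ between
versions.

```python
import ctypes

# A fixed-size ctypes array of four C ints.
arr = (ctypes.c_int * 4)(1, 2, 3, 4)

# Ordinary indexing on a ctypes array is bounds-checked.
try:
    arr[4] = 99
    index_error = None
except IndexError as exc:
    index_error = exc
    print("IndexError:", exc)

# A fixed-size character buffer also refuses data that does not fit.
buf = ctypes.create_string_buffer(4)
try:
    buf.value = b"far too long"
    value_error = None
except ValueError as exc:
    value_error = exc
    print("ValueError:", exc)
```

So plain indexing and assignment on ctypes arrays are checked; the
unsafe parts of ctypes are the functions that work on raw addresses.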

Marcus

Diez B. Roggisch

Aug 3, 2009, 3:45:32 AM

Marcus Wanner wrote:

But you can cast the resulting pointer to an array of larger size, and
there you are.
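
A sketch of how the cast sidesteps the check, assuming Python 3. To keep
it safe to run, the small array here is deliberately laid over a larger
allocation (backing), so the "out of bounds" read still lands in memory
the program owns. With a genuinely 4-element array the same read would
be undefined behaviour.

```python
import ctypes

# Eight ints of real storage, holding 0..7.
backing = (ctypes.c_int * 8)(*range(8))

# A 4-element array that shares the first half of that storage.
small = (ctypes.c_int * 4).from_buffer(backing)

# Normal indexing on the small array is bounds-checked.
try:
    small[5]
    checked = False
except IndexError:
    checked = True

# Casting to a pointer to a larger array type discards that check.
big = ctypes.cast(small, ctypes.POINTER(ctypes.c_int * 8)).contents
beyond = big[5]     # reads past the end of `small`
print(checked, beyond)
```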

Diez

Marcus Wanner

Aug 3, 2009, 8:01:00 AM

Ah, that makes sense. I had forgotten about ctypes.cast().

Marcus

sturlamolden

Aug 3, 2009, 5:04:53 PM

On 2 Aug, 15:50, Jizzai <jiz...@gmail.com> wrote:

> Is a _pure_ python program buffer overflow proof?
>
> For example in C++ you can declare a char[9] to hold user input.
> If the user inputs 10+ chars a buffer overflow occurs.

Short answer: NO

Bounds checking on sequence types is a protection against buffer
overflow, but is certainly not sufficient.

The Python interpreter is written in C. Python extension modules are
written in C (or something similar). If you find an unprotected buffer
in this C code, you can possibly overflow this buffer. This can be
used for nasty things like corrupting the stack and injecting
malicious code. There is a reason why the Python sandbox (rexec and
Bastion modules) was disabled in Python 2.3.

IronPython and Jython provide better protection against buffer
overflow than CPython, as these interpreters are written in safer
languages (C# and Java). You thus get an extra layer of protection
between the Python code and the unsafe C (used in JVM and .NET
runtimes).

Gabriel Genellina

Aug 3, 2009, 9:39:50 PM

to pytho...@python.org

On Mon, 03 Aug 2009 18:04:53 -0300, sturlamolden <sturla...@yahoo.no>
wrote:

> On 2 Aug, 15:50, Jizzai <jiz...@gmail.com> wrote:
>
>> Is a _pure_ python program buffer overflow proof?
>>
>> For example in C++ you can declare a char[9] to hold user input.
>> If the user inputs 10+ chars a buffer overflow occurs.
>
> Short answer: NO
>
> Bounds checking on sequence types is a protection against buffer
> overflow, but is certainly not sufficient.
>
> The Python interpreter is written in C. Python extension modules are
> written in C (or something similar). If you find an unprotected buffer
> in this C code, you can possibly overflow this buffer. This can be
> used for nasty things like corrupting the stack and injecting
> malicious code. There is a reason why the Python sandbox (rexec and
> Bastion modules) was disabled in Python 2.3.

(I think the reason rexec and bastion were disabled has nothing to do with
the possibility of buffer overflows in extension modules)

> IronPython and Jython provide better protection against buffer
> overflow than CPython, as these interpreters are written in safer
> languages (C# and Java). You thus get an extra layer of protection
> between the Python code and the unsafe C (used in JVM and .NET
> runtimes).

I disagree. You've just shifted the responsibility to check for buffer
overflows from the Python VM to the Java VM or the .NET runtime (and all
three have suffered from buffer overruns and other problems in one way or
another). Also, Python extensions written in C are equivalent to using JNI
in Java or unmanaged code in C#: all three are likely to have hidden
problems.
It's always the same story: a *language* may declare that such things are
impossible, but a particular *implementation* may have bugs and fail to
comply with the specification.

--
Gabriel Genellina

Steven D'Aprano

Aug 3, 2009, 11:44:54 PM

On Mon, 03 Aug 2009 14:04:53 -0700, sturlamolden wrote:

> On 2 Aug, 15:50, Jizzai <jiz...@gmail.com> wrote:
>
>> Is a _pure_ python program buffer overflow proof?
>>
>> For example in C++ you can declare a char[9] to hold user input. If the
>> user inputs 10+ chars a buffer overflow occurs.
>
> Short answer: NO
>
> Bounds checking on sequence types is a protection against buffer
> overflow, but is certainly not sufficient.
>
> The Python interpreter is written in C. Python extension modules are
> written in C (or something similar). If you find an unprotected buffer
> in this C code, you can possibly overflow this buffer.

How are C extension modules "_pure_ python"?

--
Steven

Paul Rubin

Aug 4, 2009, 12:34:15 AM

Steven D'Aprano <ste...@REMOVE.THIS.cybersource.com.au> writes:
> > The Python interpreter is written in C. Python extension modules are
> > written in C (or something similar). If you find an unprotected buffer
> > in this C code, you can possibly overflow this buffer.
>
> How are C extension modules "_pure_ python"?

A lot of basic Python constructs (like numbers and dictionaries) are
implemented as C extension modules. It is reasonable to consider
"pure Python" to include the contents of the Python standard library.

John Nagle

Aug 4, 2009, 1:06:06 AM

Gabriel Genellina wrote:
> On Mon, 03 Aug 2009 18:04:53 -0300, sturlamolden <sturla...@yahoo.no>
> wrote:
>
>> On 2 Aug, 15:50, Jizzai <jiz...@gmail.com> wrote:
>>
>>> Is a _pure_ python program buffer overflow proof?
>>>
>>> For example in C++ you can declare a char[9] to hold user input.
>>> If the user inputs 10+ chars a buffer overflow occurs.
>>
>> Short answer: NO

> I disagree. You've just shifted the responsibility to check for
> buffer overflows from the Python VM to the Java VM or the .NET runtime
> (and all three have suffered from buffer overruns and other problems in
> one way or another).

A more useful question is whether the standard libraries are being
run through any of the commercial static checkers for possible buffer
overflows.

John Nagle

Steven D'Aprano

Aug 4, 2009, 2:09:52 AM

Well, yes, but we're not saying that Python is bug-free. There could be
bugs in the Python VM for that matter.

The point is that code you write yourself can rely on "pure Python" to be
free of buffer-overflows (for some definition of "rely") rather than
having to worry about managing memory yourself. If you do this:

buffer = [0]*1024
buffer[:] = [1]*1025

you don't over-write some random piece of memory, the list object resizes
to accommodate, or fails with an exception instead. No special action is
needed to avoid buffer overflows. You can't make that claim about C
extensions.
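
Spelled out as a runnable sketch (assuming Python 3; the bytearray part
is an extra illustration, not from the example above):

```python
buffer = [0] * 1024
buffer[:] = [1] * 1025        # slice assignment: the list grows to fit
print(len(buffer))

# Assigning to a single index that does not exist raises IndexError
# rather than writing past the end of the list.
try:
    buffer[5000] = 1
    overflowed = True
except IndexError as exc:
    overflowed = False
    print("IndexError:", exc)

# A bytearray, the closest thing to a raw byte buffer, behaves the same way.
raw = bytearray(4)
raw[0:4] = b"12345"           # five bytes into a four-byte slice: it resizes
print(len(raw))
```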

It's interesting to contrast that with DoS vulnerabilities in pure Python
code. Python won't stop you from trying to calculate a googolplex:

googol = 10**100
googolplex = 10**googol

and doing so will be a moderately effective denial of service against
your Python application. If you're concerned with that, you need to code
defensively in the Python layer. Protecting against time-consuming
operations is not part of Python's design.
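
One way to code defensively here, as a rough sketch only: estimate the
size of the result before computing it. The function name bounded_pow
and the max_bits limit are invented for this example, and it only
handles non-negative integers.

```python
def bounded_pow(base, exponent, max_bits=1_000_000):
    """Hypothetical guard: compute base**exponent for non-negative ints,
    refusing results that would need more than max_bits bits."""
    if base < 0 or exponent < 0:
        raise ValueError("only non-negative integers are supported")
    # base < 2**bit_length, so base**exponent < 2**(bit_length * exponent).
    estimated_bits = base.bit_length() * exponent
    if estimated_bits > max_bits:
        raise OverflowError("result would need about %d bits" % estimated_bits)
    return base ** exponent


googol = bounded_pow(10, 100)     # fine: a few hundred bits
try:
    bounded_pow(10, googol)       # the googolplex is refused up front
    refused = False
except OverflowError:
    refused = True
print(refused)
```

The estimate is an upper bound, so the guard may reject a few results
that would have fit, but it never lets an oversized one through.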

--
Steven

Paul Rubin

Aug 4, 2009, 3:56:05 AM

Steven D'Aprano <ste...@REMOVE.THIS.cybersource.com.au> writes:
> The point is that code you write yourself can rely on "pure Python" to be
> free of buffer-overflows (for some definition of "rely") rather than
> having to worry about managing memory yourself.

Right. Basically the Python interpreter protects you reasonably well
from silly errors. The interpreter hasn't had anywhere near the level
of hardening required to claim to protect you from diabolically clever
malicious code running in the same interpreter as your sensitive
application. The Rexec/Bastion modules were basically swiss cheese.

Gabriel Genellina

Aug 4, 2009, 4:48:15 AM

to pytho...@python.org

On Tue, 04 Aug 2009 02:06:06 -0300, John Nagle <na...@animats.com>
wrote:

> Gabriel Genellina wrote:
>> On Mon, 03 Aug 2009 18:04:53 -0300, sturlamolden
>> <sturla...@yahoo.no> wrote:
>>
>>> On 2 Aug, 15:50, Jizzai <jiz...@gmail.com> wrote:
>>>
>>>> Is a _pure_ python program buffer overflow proof?
>>>> For example in C++ you can declare a char[9] to hold user input.
>>>> If the user inputs 10+ chars a buffer overflow occurs.

> A more useful question is whether the standard libraries are being
> run through any of the commercial static checkers for possible buffer
> overflows.

In the past the Python source code was checked with Valgrind and some
Coverity tools; I don't know the current status.

--
Gabriel Genellina

Christian Heimes

Aug 4, 2009, 6:58:03 AM

to pytho...@python.org

John Nagle wrote:
> A more useful question is whether the standard libraries are being
> run through any of the commercial static checkers for possible buffer
> overflows.

The CPython interpreter is constantly checked with
http://www.coverity.com/. Although Python is used for critical stuff at
large companies like Apple, Google and NASA, only a few critical bugs in
the C code have been found in the last couple of years.

Thorsten Kampe

Aug 4, 2009, 7:23:12 AM

* Jizzai (Sun, 02 Aug 2009 13:50:14 GMT)

> Is a _pure_ python program buffer overflow proof?

You cannot create "your own" buffer overflow in Python as you can in C
and C++ but your code could still be vulnerable if the underlying Python
construct is written in C. See [1] for instance.

Thorsten
[1] http://www.gentoo.org/security/en/glsa/glsa-200610-07.xml

Tim Chase

Aug 4, 2009, 8:27:27 AM

to Marcus Wanner, pytho...@python.org

Marcus Wanner wrote:
> On 8/3/2009 3:45 AM, Diez B. Roggisch wrote:
>> But you can cast the resulting pointer to an array of larger size, and
>> there you are.
>
> Ah, that makes sense. I had forgotten about ctypes.cast().

You *can* shoot yourself in the foot with Python, you just have
to aim much more carefully than you do with C/C++.

-tkc

Neil Hodgson

Aug 4, 2009, 9:32:55 AM

Thorsten Kampe:

> You cannot create "your own" buffer overflow in Python as you can in C
> and C++ but your code could still be vulnerable if the underlying Python
> construct is written in C.

Python's standard library does now include unsafe constructs.

import ctypes
x = '1234'
# Munging byte 1 OK
ctypes.memset(x, 1, 1)
print(x)
# Next line writes beyond end of variable and crashes
ctypes.memset(x, 1, 20000)
print(x)

Neil

sturlamolden

Aug 4, 2009, 11:46:30 PM

On Aug 4, 2:27 pm, Tim Chase <python.l...@tim.thechases.com> wrote:

> You *can* shoot yourself in the foot with Python, you just have
> to aim much more carefully than you do with C/C++.

You can e.g. define a class with a __del__ method and make some
circular references. That should give you a nice memory leak.
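
A sketch of that setup, with the caveat that it depends on the Python
version. In the CPython releases current at the time of this thread,
cycles whose objects define __del__ were not freed and ended up in
gc.garbage. Since CPython 3.4 (PEP 442) the collector finalizes and
frees them, which is what this example assumes. The class name Node is
illustrative.

```python
import gc

finalized = []


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        # Record that the finalizer actually ran.
        finalized.append(self.name)


# Build a two-object reference cycle, then drop our own references.
a, b = Node("a"), Node("b")
a.other, b.other = b, a
del a, b

gc.collect()                 # the cycle collector has to find the pair
print(sorted(finalized))     # names of the objects that were finalized
print(gc.garbage)            # uncollectable objects, if any
```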

Thorsten Kampe

Aug 7, 2009, 9:10:29 AM

* Neil Hodgson (Tue, 04 Aug 2009 13:32:55 GMT)

> Thorsten Kampe:
> > You cannot create "your own" buffer overflow in Python as you can in C
> > and C++ but your code could still be vulnerable if the underlying Python
> > construct is written in C.
>
> Python's standard library does now include unsafe constructs.

I don't doubt that. If Python contains a buffer overflow vulnerability
your code will also be susceptible to that. Please read the link I
provided as an example.

Thorsten

Fuzzyman

Aug 7, 2009, 4:52:03 PM

Well, Java and .NET both have their own FFIs that let you do
whatever you want (more or less).

Michael Foord
--
http://www.ironpythoninaction.com/

Fuzzyman

Aug 7, 2009, 4:54:05 PM

On Aug 4, 6:06 am, John Nagle <na...@animats.com> wrote:
> Gabriel Genellina wrote:
> > On Mon, 03 Aug 2009 18:04:53 -0300, sturlamolden <sturlamol...@yahoo.no>
> > wrote:
>
> >> On 2 Aug, 15:50, Jizzai <jiz...@gmail.com> wrote:
>
> >>> Is a _pure_ python program buffer overflow proof?
>
> >>> For example in C++ you can declare a char[9] to hold user input.
> >>> If the user inputs 10+ chars a buffer overflow occurs.
>
> >> Short answer: NO
> > I disagree. You've just shifted the responsibility to check for
> > buffer overflows from the Python VM to the Java VM or the .NET runtime
> > (and all three have suffered from buffer overruns and other problems in
> > one way or another).
>
> A more useful question is whether the standard libraries are being
> run through any of the commercial static checkers for possible buffer
> overflows.
>
> John Nagle

Python has been run through Valgrind, which exposed (and led to fixes
for) several theoretical problems.

Pure Python can be crashed (cause segfaults) in various ways - there
is even a directory of tests that do this in the test suite. I don't
think any are due to buffer overflows.