Security problem with unitialized memory

811 views
Skip to first unread message

Ronald L. Rivest

unread,
Nov 24, 2014, 3:28:50 AM11/24/14
to julia...@googlegroups.com
I am just learning Julia...

I was quite shocked today to learn that Julia does *not*
initialize allocated storage (e.g. to 0 or some default value).
E.g. the code
     A = Array(Int64,5)
     println(A[1])
has unpredictable behavior, may disclose information from
other modules, etc.

This is really quite unacceptable in a modern programming
language; it is as bad as not checking array reads for out-of-bounds
indices.  

Google for "uninitialized security" to find numerous instances
of security violations and unreliability problems caused by the
use of uninitialized variables, and numerous security advisories
warning of problems caused by the (perhaps inadvertent) use
of uninitialized variables.

You can't design a programming language today under the naive
assumption that code in that language won't be used in highly
critical applications or won't be under adversarial attack.

You can't reasonably ask all programmers to properly initialize 
their allocated storage manually any more than you can ask them 
to test all indices before accessing an array manually; these are
things that a high-level language should do for you. 

The default non-initialization of allocated storage is a 
mis-feature that should absolutely be fixed.

There is no efficiency argument here in favor of uninitialized storage
that can outweigh the security and reliability disadvantages...

Cheers,
Ron Rivest

Mauro

unread,
Nov 24, 2014, 3:48:05 AM11/24/14
to julia...@googlegroups.com
Pointer types will initialise to undef and any operation on them fails:
julia> a = Array(ASCIIString, 5);

julia> a[1]
ERROR: access to undefined reference
in getindex at array.jl:246

But you're right, for bits-types this is not an error an will just
return whatever was there before. I think the reason this will stay
that way is that Julia is a numerics oriented language. Thus you many
wanna create a 1GB array of Float64 and then fill it with something as
opposed to first fill it with zeros and then fill it with something.
See:

julia> @time b = Array(Float64, 10^9);
elapsed time: 0.029523638 seconds (8000000144 bytes allocated)

julia> @time c = zeros(Float64, 10^9);
elapsed time: 0.835062841 seconds (8000000168 bytes allocated)

You can argue that the time gain isn't worth the risk but I suspect that
others may feel different.

Rafael Fourquet

unread,
Nov 24, 2014, 4:17:27 AM11/24/14
to julia...@googlegroups.com
Regarding Mauro's example: it is my understanding that your first
example is that much faster mainly for cache reasons; indeed, the
speed difference is less visible with the following two functions:

f1(n) = (b = Array(Float64, n); fill!(b, 0.0))
f2(n) = (b = zeros(Float64, n); fill!(b, 0.0))

Then f2 will be at most twice as slow, but often faster than that.

Rafael

Tamas Papp

unread,
Nov 24, 2014, 5:30:08 AM11/24/14
to julia...@googlegroups.com
On Mon, Nov 24 2014, Ronald L. Rivest <rives...@gmail.com> wrote:

> Google for "uninitialized security" to find numerous instances
> of security violations and unreliability problems caused by the
> use of uninitialized variables, and numerous security advisories
> warning of problems caused by the (perhaps inadvertent) use
> of uninitialized variables.

AFAIK Julia was not intended for applications that need to be
secure. Consequently, it may not be the best language for implementing
the next kernel, SSH server, etc. Besides the issue that you brought up,
it may have other vulnerabilities, for example, I suspect that if you
implement crypto naively in Julia it may be vulnerable to timing attacks
because of the dynamic nature of the language.

Given that initialization has a cost (however minor), I think is
reasonable not to initialize when in 99% of the cases the array will be
filled anyway with values that are actually computed using other data.

> You can't reasonably ask all programmers to properly initialize
> their allocated storage manually any more than you can ask them
> to test all indices before accessing an array manually; these are
> things that a high-level language should do for you.

Sorry, I may be missing something, but if you are not filling an array
with values, why did you create it in the first place? If you just need
a bunch of zeroes, you should probably use sparse data structures.

> There is no efficiency argument here in favor of uninitialized storage
> that can outweigh the security and reliability disadvantages...

All these considerations are subjective and application-dependent, I
doubt that there is general rule. Some high-level languages use
uninitialized arrays, Common Lisp is an example.

Best,

Tamas

Alan Edelman

unread,
Nov 24, 2014, 10:56:47 AM11/24/14
to julia...@googlegroups.com
I would have gone
 A = zeros(Int64,5)
Alan

David Smith

unread,
Nov 24, 2014, 11:43:07 AM11/24/14
to julia...@googlegroups.com
Some ideas:

Is there a way to return an error for accesses before at least one assignment in bits types?  I.e. when the object is created uninitialized it is marked "dirty" and only after assignment of some user values can it be "cleanly" accessed?

Can Julia provide a thin memory management layer that grabs memory from the OS first, zeroes it, and then gives it to the user upon initial allocation?  After gc+reallocation it doesn't need to be zeroed again, unless the next allocation is larger than anything previous, at which time Julia grabs more memory, sanitizes it, and hands it off. 

Stefan Karpinski

unread,
Nov 24, 2014, 12:57:23 PM11/24/14
to julia...@googlegroups.com
There are two rather different issues to consider:

1. Preventing problems due to inadvertent programmer errors.
2. Preventing malicious security attacks.

When we initially made this choice, it wasn't clear if 1 would be a big issue but we decided to see how it played out. It hasn't been a problem in practice: once people grok that the Array(T, dims...) constructor gives uninitialized memory and that the standard usage pattern is to call it and then immediately initialize the memory, everything is ok. I can't recall a single situation where someone has had some terrible bug due to uninitialized int/float arrays.

Regarding 2, Julia is not intended to be a hardened language for writing highly secure software. It allows all sorts of unsafe actions: pointer arithmetic, direct memory access, calling arbitrary C functions, etc. The future of really secure software seems to be small formally verified kernels written in statically typed languages that communicate with larger unverified systems over restricted channels. Julia might be appropriate for the larger unverified system but certainly not for the trusted kernel. Adding enough verification to Julia to write secure kernels is not inconceivable, but would be a major research effort. The implementation would have to check lots of things, including, of course, ensuring that all arrays are initialized.

A couple of other points:

Modern OSes protect against data leaking between processes by zeroing pages before a process first accesses them. Thus any data exposed by Array(T, dims...) comes from the same process and is not a security leak.

An uninitialized array of, say, integers is not in itself a security concern – the issue is what you do with those integers. The classic security hole is to use a "random" value from uninitialized memory to access other memory by using it to index into an array or otherwise convert it to a pointer. In the presence of bounds checking, however, this isn't actually a big concern since you will still either get a bounds error or a valid array value – not a meaningful one, of course, but still just a value.

Writing programs that are secure against malicious attacks is a hard, unsolved problem. So is doing efficient, productive high-level numerical programming. Trying to solve both problems at the same time seems like a recipe for failing at both.

Jameson Nash

unread,
Nov 24, 2014, 1:36:49 PM11/24/14
to julia...@googlegroups.com

I think that Rivest’s question may be a good reason to rethink the initialization of structs and offer the explicit guarantee that all unassigned elements will be initialized to 0 (and not just the jl_value_t pointers). I would argue that the current behavior resulted more from a desire to avoid clearing the array twice (if the user is about to call fill, zeros, ones, +, etc.) than an intentional, casual exposure of uninitialized memory.

A random array of integers is also a security concern if an attacker can extract some other information (with some probability) about the state of the program. Julia is not hardened by design, so you can’t safely run an unknown code fragment, but you still might have an unintended memory exposure in a client-facing app. While zero’ing memory doesn’t prevent the user from simply reusing a memory buffer in a security-unaware fashion (rather than consistently allocating a new one for each use), it’s not clear to me that the performance penalty would be all that noticeable for map Array(X) to zero(X), and only providing an internal constructor for grabbing uninitialized memory (perhaps Base.Unchecked.Array(X) from #8227)

Stefan Karpinski

unread,
Nov 24, 2014, 1:57:14 PM11/24/14
to julia...@googlegroups.com
If we can make allocating zeroed arrays faster that's great, but unless we can close the performance gap all the way and eliminate the need to allocated uninitialized arrays altogether, this proposal is just a rename – Unchecked.Array plays the exact same role as the current Array constructor. It's unclear that this would even address the original concern since it still *allows* uninitialized allocation of arrays. This rename would just force people who have used Array correctly in code that cares about being as efficient as possible even for very large arrays to change their code and use Unchecked.Array instead.

Ronald L. Rivest

unread,
Nov 24, 2014, 2:30:10 PM11/24/14
to julia...@googlegroups.com
Regarding initialization:

   -- I'm toying with the idea of recommending Julia for an introductory programming
      class (rather than Python).  

   -- For this purpose, the language should not have hazards that catch the unwary.

   -- Not initializing storage is definitely a hazard.  With uninitialized storage, a 
      program may run fine one day, and fail mysteriously the next, depending on 
      the contents of memory.  This is about predictability, reliability, dependability,
      and correctness.

   -- I would favor a solution like
             A = Array(Int64,n)                   -- fills with zeros
             A = Array(Int64,n,fill=1)          -- to fill with ones
             A = Array(Int64,n,fill=None)    -- for an uninitialized array
       so that the *default* is an initialized array, but the speed geeks
       can get what they want.

Cheers,
Ron

Tomas Lycken

unread,
Nov 24, 2014, 2:39:38 PM11/24/14
to julia...@googlegroups.com
That *is* the default usage in most introductory settings - just don't show them the Array(T,n) constructor, but give them zeros and ones functions instead. (It's perfectly fine to do e.g. A = zeros(10); fill!(A, 5) if you don't care about the extra write...)

If there is a specific setting where the students actually *need* to allocate uninitialized memory (e.g. for speed), they are probably ready to learn that the Array constructor gives them that.

Julia's approach has so far seemed to be that the users are consenting adults. I like that approach.

// T

Stefan Karpinski

unread,
Nov 24, 2014, 2:56:29 PM11/24/14
to julia...@googlegroups.com
I guess part of the problem is that calling the `zeros` function may be less obvious as a way of constructing an array to many new programmers than calling the `Array` constructor. Having a Boolean fill keyword argument approach might be reasonable, although calling it `zeroed` might be more accurate since we won't be filling the array with a specific value but rather zeroing the memory first. Alternatively, we could just fill the array with random garbage intentionally so that programmers are made painfully aware that they didn't initialize the array ;-)

David Smith

unread,
Nov 24, 2014, 3:01:49 PM11/24/14
to julia...@googlegroups.com
To add some data to this conversation, I just timed allocating a billion Int64s on my macbook, and I got this (I ran these multiple times before this and got similar timings):

julia> N=1_000_000_000
1000000000 

julia> @time x = Array(Int64,N);
elapsed time: 0.022577671 seconds (8000000128 bytes allocated)

julia> @time x = zeros(Int64,N);
elapsed time: 3.95432248 seconds (8000000152 bytes allocated)

So we are talking adding possibly seconds to a program per large array allocation.


Stefan Karpinski

unread,
Nov 24, 2014, 3:05:59 PM11/24/14
to julia...@googlegroups.com
Using calloc could reduce this significantly since the kernel can lazily fill zero pages only when you access them. 

Erik Schnetter

unread,
Nov 24, 2014, 3:18:41 PM11/24/14
to julia...@googlegroups.com
This is not quite right -- the first does not actually map the pages
into memory; this is only done lazily when they are accessed the first
time. You need to compare "alloc uninitialized; then initialize once"
with "alloc zero-initialized; then initialize again".

Current high-end system architectures have memory write speeds of ten
or twenty GByte per second; this is what you should see for very large
arrays -- this would be about 0.4 seconds for your case. For smaller
arrays, the data would reside in the cache, so that the allocation
overhead should be significantly smaller even.

-erik

--
Erik Schnetter <schn...@cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/

Jameson Nash

unread,
Nov 24, 2014, 4:09:38 PM11/24/14
to julia...@googlegroups.com
It appears the fill operation accounts for about 0.15 seconds of the 6.15 seconds that my OS X laptop takes to create this array:

$ ./julia -q

julia> N=10^9

1000000000


julia> @time begin x=zeros(Int64,N); fill(x,0) end

elapsed time: 6.325660691 seconds (8000136616 bytes allocated, 1.71% gc time)

0-element Array{Array{Int64,1},1}


$ ./julia -q

julia> N=10^9

1000000000


julia> @time x=zeros(Int64,N)

elapsed time: 6.160623835 seconds (8000014320 bytes allocated, 0.22% gc time)

David Smith

unread,
Nov 24, 2014, 5:57:26 PM11/24/14
to julia...@googlegroups.com
Did you mean to call zeros() in both cases?

Jameson Nash

unread,
Nov 24, 2014, 6:12:56 PM11/24/14
to julia...@googlegroups.com
yes. the point is to compare the cost of implicitly calling `zero` (resulting in the equivalent of calling zero twice) to the cost of not initializing the memory before writing to it. I could alternatively have done: `@time x=zeros(); @time fill(x, 0)` to measure the same information.

David Smith

unread,
Nov 24, 2014, 6:59:48 PM11/24/14
to julia...@googlegroups.com
But you initialized it in both cases.  Is there a compiler optimization going on here that combines the zeros() and fill()?

Jameson Nash

unread,
Nov 24, 2014, 7:09:13 PM11/24/14
to julia...@googlegroups.com
But you initialized it in both cases. 

Yes.

> Is there a compiler optimization going on here that combines the zeros() and fill()?

No.

But there is a kernel optimization going on that complicates this measurement. Approximately, the memory requested by `malloc` (& friends) is not actually allocated until you try to read or write to it. So there are in fact 3 effects here (roughly speaking, they are malloc, A[1:4096:end], and fill()), where that second operation is unavoidable, and orders of magnitude slower than the other two. You measured the speed of 1 vs. 1+2+3. Whereas I measured the speed of 1+2+3 vs 1+2+3+3.

Stefan Karpinski

unread,
Nov 24, 2014, 7:20:21 PM11/24/14
to Julia Users
Should the comparison actually be more like this:

julia> @time begin
           x = Array(Int,N)
           fill!(x,1)
       end;
elapsed time: 6.782572096 seconds (8000000128 bytes allocated)

julia> @time begin
           x = zeros(Int,N)
           fill!(x,1)
       end;
elapsed time: 14.166256835 seconds (8000000176 bytes allocated)

At least that's the comparison that makes sense for code that allocates and then initializes an array. I consistently see a 2x slowdown or more.

David Smith

unread,
Nov 24, 2014, 7:53:50 PM11/24/14
to julia...@googlegroups.com
This is what I was thinking. I just assumed that the fill() time would be constant for both and factored that out, not knowing that malloc() was lazy.

I get similar results for Stefan's bench, although the variance is large.

Jacob Quinn

unread,
Nov 24, 2014, 8:03:08 PM11/24/14
to julia...@googlegroups.com
The variance you're seeing is most likely due to the garbage collector kicking with that much memory being allocated and then abandoned.

Steven G. Johnson

unread,
Nov 24, 2014, 9:30:44 PM11/24/14
to julia...@googlegroups.com


On Monday, November 24, 2014 3:05:59 PM UTC-5, Stefan Karpinski wrote:
Using calloc could reduce this significantly since the kernel can lazily fill zero pages only when you access them.

Even better, on some systems (e.g. Linux) calloc is implemented via copy-on-write of a single zero page, so it only needs to fill new zero pages when you write.

Unfortunately, Julia allocates 16-byte aligned data by default (to help SIMD code), and there is no calloc version of posix_memalign as far as I know.

Jameson Nash

unread,
Nov 24, 2014, 9:49:06 PM11/24/14
to julia...@googlegroups.com
we could also turn on USE_MMAP by default. all systems should be returning zero pages for new memory. then we only need to worry about manually zero'ing things that are smaller than a page

Erik Schnetter

unread,
Nov 24, 2014, 10:00:00 PM11/24/14
to julia...@googlegroups.com
On Mon, Nov 24, 2014 at 7:19 PM, Stefan Karpinski <ste...@karpinski.org> wrote:
> Should the comparison actually be more like this:
>
> julia> @time begin
> x = Array(Int,N)
> fill!(x,1)
> end;
> elapsed time: 6.782572096 seconds (8000000128 bytes allocated)
>
> julia> @time begin
> x = zeros(Int,N)
> fill!(x,1)
> end;
> elapsed time: 14.166256835 seconds (8000000176 bytes allocated)
>
>
> At least that's the comparison that makes sense for code that allocates and
> then initializes an array. I consistently see a 2x slowdown or more.

My laptop can zero memory at 6+ GByte/sec. The overhead in your case
should be about ten times less than what you report. I suspect that
what you are measuring is not just the overhead of `zeros` over
`Array`, but something else as well.

Or Julia's `zeros` is not implemented efficiently. Zeroing memory is
surprisingly CPU intensive, and one has to call `memset`, or has to
ensure that the vectorizer kicks in, and may have to manually unroll
the loops. I'm not sure LLVM gets this right by default...

... yes, Julia's `zeros` is not as good as it should be. It calls
`fill!`, but the generated machine code leaves to be desired. Time to
add a special case to `fill!`, delegating things to `memset` in
certain cases?

Erik Schnetter

unread,
Nov 24, 2014, 10:01:45 PM11/24/14
to julia...@googlegroups.com
On Mon, Nov 24, 2014 at 9:30 PM, Steven G. Johnson
<steve...@gmail.com> wrote:
> Unfortunately, Julia allocates 16-byte aligned data by default (to help SIMD
> code), and there is no calloc version of posix_memalign as far as I know.

The generated machine code I've seen does not make use of this. All
the load/store instructions in vectorized or unrolled loops assume
unaligned pointers. (Plus, with AVX one should align to 32 bytes
instead.)

Viral Shah

unread,
Nov 24, 2014, 10:48:30 PM11/24/14
to julia...@googlegroups.com
How much faster is zeros with your PR? IIRC, we used to have something like this in the early days.

Viral Shah

unread,
Nov 24, 2014, 10:49:41 PM11/24/14
to julia...@googlegroups.com
Filling random garbage is even more time consuming than filling zeros! 

-viral

Simon Kornblith

unread,
Nov 24, 2014, 10:54:36 PM11/24/14
to julia...@googlegroups.com
In general, arrays cannot be assumed to be 16-byte aligned because it's always possible to create one that isn't using pointer_to_array. However, from Intel's AVX introduction:

Intel® AVX has relaxed some memory alignment requirements, so now Intel AVX by default allows unaligned access; however, this access may come at a performance slowdown, so the old rule of designing your data to be memory aligned is still good practice (16-byte aligned for 128-bit access and 32-byte aligned for 256-bit access).

Viral Shah

unread,
Nov 24, 2014, 11:02:53 PM11/24/14
to julia...@googlegroups.com
To add to the point, you can also get non-aligned stuff with subarrays or results from a ccall.

-viral

Stefan Karpinski

unread,
Nov 24, 2014, 11:16:21 PM11/24/14
to Julia Users
That's not the point – if you already have memory and have to fill it, then you're not in any position for the kernel to lazily zero it, so the alignment of arbitrary arrays is irrelevant. The point SGJ was making is that we want to allocate the memory using something calloc-like so that the kernel can do lazy zeroing for us, but we also need that memory to be 16-byte aligned, but there is not portable way to get 16-byte-aligned memory that the kernel will lazily zero for you. We can have lazy zeroing or 16-byte alignment but not both. This makes me wonder if we couldn't just allocate 15 bytes more than necessary and return the first address that on a 16-byte boundary.

Viral Shah

unread,
Nov 24, 2014, 11:44:35 PM11/24/14
to julia...@googlegroups.com
Right - we can always use calloc and then provide aligned memory.

Perhaps this is worth benchmarking. It is still likely to be much faster than malloc + memset, because it should have significantly better cache behaviour, even though the zeroing is not free. The question is, whether this cost is small enough.

-viral

Erik Schnetter

unread,
Nov 25, 2014, 12:18:07 AM11/25/14
to julia...@googlegroups.com
On Mon, Nov 24, 2014 at 11:15 PM, Stefan Karpinski <ste...@karpinski.org> wrote:
> That's not the point - if you already have memory and have to fill it, then
> you're not in any position for the kernel to lazily zero it, so the
> alignment of arbitrary arrays is irrelevant. The point SGJ was making is
> that we want to allocate the memory using something calloc-like so that the
> kernel can do lazy zeroing for us, but we also need that memory to be
> 16-byte aligned, but there is not portable way to get 16-byte-aligned memory
> that the kernel will lazily zero for you. We can have lazy zeroing or
> 16-byte alignment but not both. This makes me wonder if we couldn't just
> allocate 15 bytes more than necessary and return the first address that on a
> 16-byte boundary.

On each system, malloc makes certain guarantees about the alignment of
the pointer it returns. On a 64-bit system, malloc will likely return
memory that is at least aligned to 8 bytes, maybe more. Thus one would
need to allocate 8 additional bytes, not 15.

Viral Shah

unread,
Nov 25, 2014, 12:29:07 AM11/25/14
to julia...@googlegroups.com
Much has been already said on this topic. 

The Array(...) interface was kind of meant to be low-level for the user of scientific computing, only to be used when they know what they are doing. You get the raw uninitialized memory as fast as possible.

The user-facing interface was always an array constructor - zeros(), ones(), rand(), etc. Some of this is because of our past experience coming from a matlab/R-like world. 

As Julia has become more popular, we have realized that those not coming from matlab/R end up using all the possible constructors. While this has raised a variety of issues, I'd like to say that this will not get sorted out satisfactorily before the 0.4 release. For a class that may be taught soon, the thing to do would be to use the zeros/ones/rand constructors to construct arrays, instead of Array(), which currently is more for a package developer. I understand that Array() is a much better name as Stefan points out, but zeros() is not too terrible - it at least clearly tells the user that they get zeroed out arrays.

While we have other "features" that can lead to unsafe code (ccall, @inbounds), none of these are things one is likely to run into while learning the language.

-viral

Ronald L. Rivest

unread,
Nov 25, 2014, 1:05:40 AM11/25/14
to julia...@googlegroups.com
The problem also exists for new() (e.g. when initializing a record/object).  zeros() can
apparently be used here instead.

Cheers,
Ron

Ronald L. Rivest

unread,
Nov 25, 2014, 1:13:34 AM11/25/14
to julia...@googlegroups.com
Sorry; zeros() does not work here instead of new().  My mistake.
Is there a safe alternative to new() that guarantees that all fields
will have a definite fixed value?

Cheers,
Ron

Stefan Karpinski

unread,
Nov 25, 2014, 10:17:31 AM11/25/14
to Julia Users
It seems more reasonable to me to always zero uninitialized fields of composite values. This is basically free since objects larger than a memory page are not common.

Stefan Karpinski

unread,
Nov 25, 2014, 10:35:38 AM11/25/14
to Julia Users

Jeff Waller

unread,
Nov 25, 2014, 11:10:23 AM11/25/14
to julia...@googlegroups.com


On Monday, November 24, 2014 10:54:36 PM UTC-5, Simon Kornblith wrote:
In general, arrays cannot be assumed to be 16-byte aligned because it's always possible to create one that isn't using pointer_to_array. However, from Intel's AVX introduction:

Intel® AVX has relaxed some memory alignment requirements, so now Intel AVX by default allows unaligned access; however, this access may come at a performance slowdown, so the old rule of designing your data to be memory aligned is still good practice (16-byte aligned for 128-bit access and 32-byte aligned for 256-bit access).

And BTW 512 bit AVX registers are coming next year.  http://en.wikipedia.org/wiki/AVX-512
 

Erik Schnetter

unread,
Nov 25, 2014, 1:38:00 PM11/25/14
to julia...@googlegroups.com

Viral Shah

unread,
Nov 26, 2014, 12:58:02 AM11/26/14
to julia...@googlegroups.com
Much of this this discussion is now captured in the issue Stefan opened. 

-viral
Reply all
Reply to author
Forward
0 new messages