conservative gc sucks
Messages 26 - 35 of 35

Newsgroups: comp.lang.lisp
From: ana...@earthlink.net (Andy Freeman)
Date: 14 Jan 2003 09:30:31 -0800
Local: Tues, Jan 14 2003 12:30 pm
Subject: Re: conservative gc sucks

Erik Naggum <e...@naggum.no> wrote in message <news:3251520171080849@naggum.no>...
> * Andy Freeman
> | Can this reliably work on Unix?

>   Why would you want to enroll such an open file in this mechanism?

If it's the only mechanism for files, I don't have much choice.

>   Or are you afraid of what might happen if you delete a file under a
>   program that relies on it?  If so, what makes you think that this
>   clever way of dealing with files will make a substantial difference?

It affects observable behavior.

>   In other words, what exactly did you think I proposed?

It looked like a proposal for a scheme that provided the illusion of a
"large" number of open files on a system that didn't actually provide
same.

 
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 14 Jan 2003 19:34:02 +0000
Local: Tues, Jan 14 2003 2:34 pm
Subject: Re: conservative gc sucks
* Andy Freeman
| If it's the only mechanism for files, I don't have much choice.

  Ah, I see you invented the "only mechanism" part on your own and
  attributed it to me.  How manifestly indecent of you to do so.

  Please back up and reattach your argumentation to what I wrote -- it
  is currently free-floating without connection to what I wrote, yet
  you comment on it as if I had said something I had not.  Then make
  your own contribution explicit and see what difference it makes.

* Erik Naggum
| In other words, what exactly did you think I proposed?

* Andy Freeman
| It looked like a proposal for a scheme that provided the illusion
| of a "large" number of open files on a system that didn't actually
| provide same.

  Could you do me the favor of /reading/ what I wrote and /please/ try
  to avoid introducing noise of your own into it if you are going to
  comment on it?  Intellectual honesty demands that you try to keep at
  least somewhat clear of polluting the information you comment on.
  Not that intellectual honesty is in high esteem in this newsgroup,
  but I still get fairly annoyed when people make up things and then
  pretend I said them.

--
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.


 
Newsgroups: comp.lang.lisp
From: ana...@earthlink.net (Andy Freeman)
Date: 14 Jan 2003 13:05:46 -0800
Local: Tues, Jan 14 2003 4:05 pm
Subject: Re: conservative gc sucks

Erik Naggum <e...@naggum.no> wrote in message <news:3251520171080849@naggum.no>...
> * Andy Freeman
> | Can this reliably work on Unix?

>   If you delete an open file, what exactly do you want to rely on?

That data operations using valid handles are unaffected by unlink.

I think of create&unlink as operations that maintain "root" pointers
to data.  Open turns said root pointers into handles that can be used
to manipulate said data; close frees said handles.

With that separation of powers, unlink shouldn't have any effect on
handle operations.

Yes, I know that the unix model isn't quite that straightforward....
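
A minimal sketch of the behaviour relied on above, assuming a local POSIX filesystem (it does not carry over to Windows, and NFS may behave differently). The helper name `read_after_unlink` is made up for illustration.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink its name, then read via the open handle."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                  # drop the "root pointer" (the name)
        assert not os.path.exists(path)  # the name is gone...
        os.lseek(fd, 0, os.SEEK_SET)     # ...but the handle is still usable
        return os.read(fd, len(payload))
    finally:
        os.close(fd)                     # storage is reclaimed after the last close


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

The data stays readable only through handles that were opened before the unlink; nothing can open the file by name afterwards.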

-andy


 
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 14 Jan 2003 21:12:33 +0000
Local: Tues, Jan 14 2003 4:12 pm
Subject: Re: conservative gc sucks
* Andy Freeman
| That data operations using valid handles are unaffected by unlink.

  If you want to rely on this, do you still want the stream to be
  closed when it is garbage collected, or do you want some control
  over when it ceases to exist?

  I guess I am trying to figure out why you brought this up in the
  context of garbage-collected streams with finalization semantics.

  Also, despite what you believe, this is not the only mechanism.
  The standard language semantics prevails.  Someone wanted to be
  relieved of closing streams "manually" and wanted them to be closed
  when they became unreferenced.  Again despite what you believe I
  said, I have offered three different ways to address this problem.
  (One of them automatic reaping of unreferenced file handles.)  How
  you could possibly have invented the premise that one of these
  three would be the only one available is beyond me.

--
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.


 
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@cley.com>
Date: 14 Jan 2003 21:44:46 +0000
Local: Tues, Jan 14 2003 4:44 pm
Subject: Re: conservative gc sucks

* Andy Freeman wrote:
> Yes, I know that the unix model isn't quite that straightforward....

NFS.  And so this comes back to where we started: the moment you start
playing clever games with files or streams and GC you suddenly
discover you're in a world where you have to do distributed GC across
possibly heterogeneous systems with differing underlying semantics.
Or alternatively, you don't discover this, but your programs just
randomly break every once in a while.

--tim


 
Newsgroups: comp.lang.lisp
From: ana...@earthlink.net (Andy Freeman)
Date: 14 Jan 2003 17:30:42 -0800
Local: Tues, Jan 14 2003 8:30 pm
Subject: Re: conservative gc sucks

Erik Naggum <e...@naggum.no> wrote in message <news:3251561642580056@naggum.no>...
> * Andy Freeman
> | If it's the only mechanism for files, I don't have much choice.

>   Ah, I see you invented the "only mechanism" part on your own and
>   attributed it to me.  How manifestly indecent of you to do so.

Except that I didn't attribute anything to anyone.  I asked a question
to help me understand what Naggum wrote.

The thread has discussed both "only" and "special case" mechanisms and
the message in question doesn't specify its category.  Thus, my question.

>   Please back up and reattach your argumentation to what I wrote

No thanks.  Been there, did that, got the t-shirt.

I'd like to learn about general mechanisms because I'd like to
avoid yet another context-specific mechanism.  If that's not on
the table....

> * Andy Freeman
> | It looked like a proposal for a scheme that provided the illusion
> | of a "large" number of open files on a system that didn't actually
> | provide same.

>   Could you do me the favor of /reading/ what I wrote

The mechanism in question kept track of files and closed&reopened them
behind the programmer's back in certain circumstances; the close&reopen
was not explicitly requested by the programmer. (The suggested mechanism
to choose files to close was LRU.)  True, Naggum didn't say why the
mechanism was closing files - I assumed that a system limitation might
be relevant.
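
A rough sketch of such a close-and-reopen scheme, to make the Unix question concrete. This is not the mechanism from the earlier message, only an assumed shape of it: the class name `LRUFilePool`, its methods, and the `max_open` limit are all hypothetical.

```python
from collections import OrderedDict


class LRUFilePool:
    """Hypothetical pool that keeps at most max_open files really open."""

    def __init__(self, max_open: int):
        self.max_open = max_open
        self._open = OrderedDict()  # path -> file object, least recently used first
        self._offsets = {}          # path -> saved position of a file closed by the pool

    def _get(self, path):
        f = self._open.pop(path, None)
        if f is None:
            if len(self._open) >= self.max_open:
                # Evict the least recently used file, remembering where it was.
                old_path, old_f = self._open.popitem(last=False)
                self._offsets[old_path] = old_f.tell()
                old_f.close()
            f = open(path, "rb")    # reopen BY NAME: fails if the name was unlinked
            f.seek(self._offsets.pop(path, 0))
        self._open[path] = f        # mark as most recently used
        return f

    def read(self, path, n):
        return self._get(path).read(n)

    def close_all(self):
        for f in self._open.values():
            f.close()
        self._open.clear()
```

Because the reopen goes through the file name, a file that was unlinked while the pool had it closed cannot be recovered. An ordinary open handle would have kept working, so the pool changes observable behaviour.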

-andy


 
Newsgroups: comp.lang.lisp
From: ozan s yigit <o...@blue.cs.yorku.ca>
Date: 14 Jan 2003 20:57:03 -0500
Local: Tues, Jan 14 2003 8:57 pm
Subject: Re: conservative gc sucks
l...@emf.emf.net (Tom Lord) writes, amongst other things:

> It's not just untrusted code that is the problem -- it's also
> untrusted data.  With suitably crafted malicious _data_, you get some
> control over what's on the stack, both in variables and in spilled
> registers.  Thus, malicious data can also create false roots.

> False roots enable direct exploits that manage to cause big leaks but
> they also enable indirect exploits.  An attacker can combine
> unintended retention of some objects with other exploits, for example,
> tricking the GC into keeping around an object that ultimately keeps a
> file open, then using another exploit to run code that accesses that
> file.

i'm sorry, i find this quite incoherent. can you specify a step-by-step
exploit on a server of your choice running with boehm GC, and explain how
and why you ended up with the GC and false roots as your *only* course of
action for the exploit? if you hacked into server's stack (say), why aren't
there easier exploits available to you? (etc) i'm not saying your point
is invalid, i'm just not seeing it clearly out of all this hand waving.
a serious example would help reinforce our security toolkits.

oz
---
a nought, an ought, a knot, a not easily perceived distinction. -- t. duff


 
Newsgroups: comp.lang.lisp
From: l...@emf.emf.net (Tom Lord)
Date: Wed, 15 Jan 2003 05:25:06 -0000
Local: Wed, Jan 15 2003 12:25 am
Subject: Re: conservative gc sucks
    l...@emf.emf.net (Tom Lord) writes, amongst other things:

    > It's not just untrusted code that is the problem -- it's also
    > untrusted data.  With suitably crafted malicious _data_, you get
    > some control over what's on the stack, both in variables and in
    > spilled registers.  Thus, malicious data can also create false
    > roots.
    >
    > False roots enable direct exploits that manage to cause big
    > leaks but they also enable indirect exploits.  An attacker can
    > combine unintended retention of some objects with other
    > exploits, for example, tricking the GC into keeping around an
    > object that ultimately keeps a file open, then using another
    > exploit to run code that accesses that file.

    i'm sorry, i find this quite incoherent.

That's ok.  I think your message leads someplace interesting.

I'm sorry, too: because this is a long reply.  It's in two parts: one
is just trying to clear up what I think are the misunderstandings that
led you to judge my contribution "incoherent" -- that's boring, but
necessary.  The second part is a lot more interesting, in my view: it's
an actual gosh-darn engineering question: how to spend money on
software development that involves a choice between conservative and
precise collectors.  So, here we go:

* Fixing the Apparent Miscommunication

    i'm sorry, i find this quite incoherent.

  Initially, at least, that appears to be because you misread it.
  You go on to say:

    if you hacked into server's stack (say), why are[n't] there easier
    exploits available to you? (etc)

  The misunderstanding seems to be over my phrase "you get some control
  over what's on the stack".

  There are popularized exploits that involve, for lack of a better
  term, "stack smashing".  For example, a bug permits a buffer overrun
  on a stack-based buffer.  An attacker supplies data that causes the
  overrun.  That data both contains arbitrary code and replaces the
  return address of the stack frame with a pointer to that arbitrary
  code (or, maybe it just points the return address to existing code
  that shouldn't run at that particular time but that, if run, will have
  malicious effect).  When you talk about "easier exploits" -- I think
  that is the kind of thing you are talking about, no?

  That is not the kind of "control over the stack" I'm talking about.
  I'm talking about control over the stack which does _not_ require a
  bug:  control over the values in variables;  control over the values
  in registers when they are spilled.  If the stack is scanned
  conservatively, those values, suitably constructed (which can be
  forced in some cases by choice of attacker-supplied-data) are false
  roots and cause errant retention by the GC.

  The fundamental problem here is that with conservative scanning, the
  stack values take on a new meaning that has nothing to do with the
  program text: the conservative GC sees them as potential roots.  That
  overloading gives attackers a new avenue by which they may program
  your application (with malicious data) to behave in unintended ways.
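
  A toy model of the point above, not of any real collector: the heap is a dict of made-up addresses, the stack is a list of raw integers, and the function name `conservative_mark` is hypothetical. The scanner treats every stack word that matches a heap address as a root, because it cannot tell an integer from a pointer.

```python
def conservative_mark(heap, stack_words):
    """Return the set of heap addresses a conservative stack scan would retain.

    heap: dict mapping address -> list of addresses that object points to.
    stack_words: raw integers found on the stack (pointers and plain data alike).
    """
    live = set()
    # Any word that looks like a heap address is treated as a root.
    pending = [w for w in stack_words if w in heap]
    while pending:
        addr = pending.pop()
        if addr in live:
            continue
        live.add(addr)
        pending.extend(p for p in heap[addr] if p in heap)
    return live


# Toy heap: 0x1000 is genuinely referenced by the program. 0x2000 is garbage
# as far as the program text is concerned, but it keeps 0x3000 (say, an open
# file object) reachable.
HEAP = {0x1000: [], 0x2000: [0x3000], 0x3000: []}

benign_stack = [0x1000, 42, 7]
# Same program, but one integer variable now holds attacker-supplied data.
hostile_stack = [0x1000, 42, 0x2000]

if __name__ == "__main__":
    print(sorted(hex(a) for a in conservative_mark(HEAP, benign_stack)))
    print(sorted(hex(a) for a in conservative_mark(HEAP, hostile_stack)))
```

  No bug in the program is needed: the hostile value is an ordinary integer in an ordinary variable, yet it retains 0x2000 and everything reachable from it.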

    can you specify a step-by-step exploit on a server of your choice
    running with boehm GC, and explain how and why you ended up with
    the GC and false roots as your *only* course of action for the
    exploit?

  No.  Nor would I specify one here, even if I could.  Nor do I believe
  that doing so is a necessary part of pointing out the security risks.

    if you hacked into server's stack (say), why are[n't] there easier
    exploits available to you? (etc) i'm not saying your point is
    invalid, i'm just not seeing it clearly out of all this hand
    waving.  a serious example would help reinforce our security
    toolkits.

  An attacker against a system which uses conservative GC can sometimes
  (and the program text doesn't make clear when) provide malicious data
  that targets particular objects to be retained that, with precise GC
  would not be retained, or that with conservative GC absent malicious
  data would be unlikely to be retained.  How to combine that capability
  with other partial exploits, or how to use it directly as a complete
  exploit, is left as an exercise to the attacker; I've mentioned some
  of my ideas already (resource exhaustion; preserving sensitive
  resources to make them available to a code exploit).

* The Engineering Question

  It all comes down to opportunities and how much they cost and
  probabilities, all of which are impossible to measure precisely.

  What is the probability of a conservative GC bug being used in an
  exploit, or causing a costly failure due to a naturally occurring
  bug?   This is very hard to guess -- sadly, this thread probably
  raises the probability of exploits;  non-costly failures from
  conservative GC (minor storage leaks) are very measurable -- so
  we should not "guess low" on the probability of a costly failure.

  What is the cost of developing against a precise vs. conservative
  GC?  There's the initial cost (Boehm-family collectors are ready off
  the shelf.)  Then there's the continuing cost (compare how much
  I'll spend tuning conservative GC to eliminate retention bugs
  vs. how much I'll spend on precise GC to eliminate bookkeeping
  bugs).  I think that in the current historic state of affairs, it's
  safe to say that the initial cost of precise GC is higher (though
  not by a huge amount) -- but that the ongoing costs can be made
  about equal (portable precise GC needs either code generators or
  gclint).

  What's the lock-in cost of conservative GC?  In other words, if we
  decide today to go with conservative GC, how much do we have to pay
  later if we need to switch to precise?  We should note that code
  written presuming a conservative GC, especially a
  conservative-stack-scanning GC, is not easy to convert to precise GC
  -- one has to add bookkeeping.  We can't count on there being an
  automatable conversion process -- we'd have to restrict the code
  somewhat to guarantee that automatic conversion was possible (and
  then write gclint even though we're using conservative GC).  As
  run-time systems grow, this conversion cost is going to just keep
  getting higher.  We might have a slight out, if we plan to one day
  modify our C compiler to spew type information, but if that day
  comes, we'll both lose portability and raise the cost of modifying
  or replacing our compiler.  Finally, if we need to do this
  conversion quickly, the costs must be suitably multiplied.  I don't
  think these lock-in costs are easy to estimate, other than that they
  aren't trivial, and they will grow over time.

  What's the pay-back of going to precise GC early?  At least
  incrementally better memory performance; the ability to reliably
  regression test code that involves GC semantics (e.g., weak
  reference implementations); freedom from worrying about conservative
  GC lock-in costs, liability costs, and the probability of nasty
  conservative GC bugs or exploits.

  Putting that all together, the only lossage of investing in precise
  GC is the initial cost, and there's plenty of experience in the
  field that tells us that cost isn't very high: wanna upper-bound it
  at 3 man-years (min of 1 calendar-year)?

  It's a no-brainer.  Walk away from conservative GC; invest a bit in
  precise.  If need be, we can put all this in the form of a kind of
  Drake's equation for the bean counters.

-t


 
Newsgroups: comp.lang.lisp
From: ana...@earthlink.net (Andy Freeman)
Date: 15 Jan 2003 19:22:36 -0800
Local: Wed, Jan 15 2003 10:22 pm
Subject: Re: conservative gc sucks

Erik Naggum <e...@naggum.no> wrote in message <news:3251567553621728@naggum.no>...
> * Andy Freeman
> | That data operations using valid handles are unaffected by unlink.

>   If you want to rely on this, do you still want the stream to be
>   closed when it is garbage collected, or do you want some control
>   over when it ceases to exist?

Streams?  My question was about how a described mechanism for
dealing with files interacts with the semantics for (local) Unix
filesystems.

One possible useful answer is "it breaks badly if you do <whatever>".

>   I guess I am trying to figure out why you brought this up in the
>   context of garbage-collected streams with finalization semantics.

I didn't.  I noted Naggum's distinction between streams and files and
asked about files.

>   Also, despite what you believe, this is not the only mechanism.

I don't "believe", I asked a simple question that could have been
answered with "no, this isn't a general mechanism", possibly with
an added "it's good when ..." or even "no general mechanism is
possible because ...".

I note again that general mechanisms had been discussed in the thread
and that the proposed mechanism was not labelled.  Thus, a question.

-andy


 
Newsgroups: comp.lang.lisp
From: Hans_Bo...@hp.com (Hans-J. Boehm)
Date: 20 Jan 2003 13:52:11 -0800
Local: Mon, Jan 20 2003 4:52 pm
Subject: Re: conservative gc sucks
I'm responding only to the few points on which we still disagree:

I still disagree about security exploits (see below).

And it still seems to me that the tradeoffs are substantial enough
that I would consider them.  If the foreign-function interface were
already specified and mandatory, were performant enough that I could
live with it everywhere, and I had enough resources to deal with the
extra implementation effort, I would go for the type-accurate
collector.  In the case of the gcj effort for example, I suspect none
of those are really true.  (The foreign-function interface (JNI) is
specified, but very complex and its performance often left something
to be desired.  The main gcj developers decided early on not to make
it mandatory.)  In the case of Mono, I know at least the last point
was an issue.

It also seems to me that providing the option of scanning some things,
e.g. C frames on the stack or C allocated objects, conservatively is
always good, since it gives you options you wouldn't otherwise have.
The real issues are whether

1) This costs you enough collector performance to negate the
flexibility advantages.  (I think the answer here is mostly unclear
for applications for which generational collection works well, and
mostly "no" for others.  Our collector seems very competitive for the
latter, but less so for the former.  Mostly copying collectors
probably help there.)

2) Once you have the facility, do you want to use it in cases for
which you could generate precise layout information with more
implementation effort and/or other overhead?

>     6) We agree that it is occasionally useful to let the collector
>        manage external resources such as file descriptors.  By using a
>        conservative GC, you may lose any guarantees about how many
>        file descriptors can be simultaneously open, and thus you may
>        run out of descriptors earlier than you expected.  

> Agreed, with the addition that running out of descriptors is not the
> _only_ danger: simply retaining a particular descriptor can open the
> door to exploits.

I disagree.  If you are using finalizers to close files, you shouldn't
be relying on the timing of the file close for security.  It's possible
that a type accurate collector might foil an exploit based on having
the file open.  But if that's the case, you just got lucky.  You have
a bug in the client code.

There are usually other ways for malicious code to delay such a file
close.  For example, it may fail to release a lock that's needed by a
finalizer preceding the file close in a finalization queue.  Or it may
force the heap to grow, thus decreasing GC frequency.  None of these
depend on conservative GC.

>        But for nearly all programming languages, you don't have those
>        guarantees anyway, for several reasons.  Object reachability is
>        usually not precisely defined.  And often the finalization
>        facility isn't quite up to this task.  

> Agreed.  This is an area where language designers and implementors
> need to do a much better job, and where a much better job for our
> purposes here can certainly be done.  There's a hard problem of
> defining reachability so as to minimize lifetimes to the greatest
> extent practical -- but that hard problem isn't the one we're talking
> about here: we only need to provide a useful upper-bound on lifetimes.

I doubt you can practically get that in this case, especially if you
are relying on a general purpose finalization mechanism.

(I think the most serious problem with current specifications of
finalization is that they don't provide a reasonable LOWER bound on
finalization time.  You need some guarantee that the file descriptor
is not still in a register and being accessed when the object holding
it becomes inaccessible and is finalized.  My impression is that
current implementations don't even give you that guarantee.  It's
unclear the specifications do, either.)
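
A small illustration of the lower-bound hazard, under loud assumptions: it relies on CPython's reference counting running `__del__` as soon as the last reference goes away, and the explicit `del` stands in for an optimizer that has already discarded the object while the raw descriptor is still in use. `FileHolder` and `read_with_stale_descriptor` are hypothetical names.

```python
import os


class FileHolder:
    """Hypothetical wrapper that closes its descriptor in a finalizer."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)

    def __del__(self):
        os.close(self.fd)


def read_with_stale_descriptor(path):
    holder = FileHolder(path)
    fd = holder.fd   # the raw descriptor now lives only in a local ("a register")
    del holder       # stand-in for the compiler dropping the last reference early
    try:
        return os.read(fd, 16)
    except OSError as exc:
        return exc   # the finalizer already closed fd, so the read fails
```

The code that holds `fd` never asked for the file to be closed, yet the read fails because finalization ran while the descriptor was still being used.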

>        In my experience, this still seems to work fine in practice,
>        however, conservative GC or not.

> I can't reliably regression test the weak-references implementation in
> my Scheme interpreter that uses a partially-conservative GC, because
> it uses a partially-conservative GC.  That discovery was the
> particular incident that first soured me on conservative GC.

Isn't that at least partially a bug in the regression test?  If you
check that most weak references are cleared, the probability of a
spurious failure should rapidly go to zero with an increasing sample
size.  That's how I tend to write such tests.  Given the state of
"reachability" definitions, that's probably all you can really check
for anyway, at least without knowing a lot about your optimizer.
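
A sketch of that style of test, written against Python's `weakref` module rather than a Scheme interpreter. The sample size and the 0.95 threshold are arbitrary assumptions. Python's own collector is precise, so here the fraction should come out at or very near 1.0; the tolerance only matters when a conservative collector may keep a few objects alive through false roots.

```python
import gc
import weakref


class Node:
    """Throwaway object; instances support weak references."""


def fraction_cleared(sample_size: int) -> float:
    """Allocate sample_size objects, drop them, and report how many weak refs died."""
    objects = [Node() for _ in range(sample_size)]
    refs = [weakref.ref(o) for o in objects]
    del objects
    gc.collect()
    cleared = sum(1 for r in refs if r() is None)
    return cleared / sample_size


def weak_refs_mostly_cleared(sample_size: int = 1000, threshold: float = 0.95) -> bool:
    # Tolerate a few survivors instead of demanding that every reference is cleared.
    return fraction_cleared(sample_size) >= threshold
```

With a larger sample, a handful of spuriously retained objects moves the fraction less and less, which is why the chance of a spurious failure shrinks as the sample grows.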

Hans


 