I have some data that I'm storing in a T_DATA VALUE. Is the data that's stored there part of the GC heap - IOW can it move in memory? If so, is there a way to pin it so that it doesn't move while I'm using it?
> I have some data that I'm storing in a T_DATA VALUE. Is the data
> that's stored there part of the GC heap - IOW can it move in memory?
> If so, is there a way to pin it so that it doesn't move while I'm
> using it?
Not sure if I understand the question. A Data object has a pointer (RDATA(obj)->data) to some block of memory that you've allocated, and no, Ruby's GC process isn't going to assign some new value to that pointer.
If you're asking whether Ruby will move the address of the Data object itself: I'm guessing that that's possible.
On 6/1/06, Lyle Johnson <lyle.john...@gmail.com> wrote:
> If you're asking whether Ruby will move the address of the Data object
> itself: I'm guessing that that's possible.
I was wondering about the latter. I couldn't find any APIs for pinning objects in memory so I was worried that the object might move out from underneath me. But on second thought I'd have the DATA pointer cached in a register / call stack in any event so it probably doesn't matter if the object moves in the future.
Lyle Johnson wrote:
> If you're asking whether Ruby will move the address of the Data object
> itself: I'm guessing that that's possible.
If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING, etc), it would be a disaster. Every VALUE that referred to the object (in other words every reference to it in a variable, array, hash, etc.) would become invalid, since the VALUE type is actually a pointer in these cases. (I may be misunderstanding the question though...)
-- vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
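The point that a VALUE is itself the object's address can be sketched in plain C. This is only an illustration, not Ruby's headers: `value_t`, `object_t`, and the helper names are hypothetical stand-ins, and the only detail borrowed from Ruby is that small integers are tagged by setting the low bit, while every other word is read as a pointer to the object.

```c
#include <stdint.h>

/* Hypothetical stand-in for Ruby's VALUE: one machine word. */
typedef uintptr_t value_t;

/* Hypothetical heap object; its alignment keeps the low address bit at 0. */
typedef struct {
    int   flags;
    void *data;
} object_t;

/* Small integers are stored in the word itself, tagged with low bit 1. */
static value_t int_to_value(long n)
{
    return ((value_t)n << 1) | 1u;
}

static long value_to_int(value_t v)
{
    return (long)((intptr_t)v >> 1);
}

static int is_immediate_int(value_t v)
{
    return (int)(v & 1u);
}

/* Any other value is simply the address of the object. */
static value_t object_to_value(object_t *o)
{
    return (value_t)o;
}

static object_t *value_to_object(value_t v)
{
    return (object_t *)v;
}
```

Because the word stored in every variable, array, or hash entry is the address itself, moving the object would require finding and rewriting every such word.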
On Jun 1, 2006, at 6:47 PM, Joel VanderWerf wrote:
> If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
> etc), it would be a disaster. Every VALUE that referred to the object
> (in other words every reference to it in a variable, array, hash, etc.)
> would become invalid, since the VALUE type is actually a pointer in
> these cases. (I may be misunderstanding the question though...)
No, Ruby does not move objects in memory. As to how horrible it would be if it did: there are GCs that work like this (copying GCs). Believe it or not, copying GCs have a speed advantage: their running time is proportional to the number of reachable objects rather than to the size of the heap, as it is for mark-and-sweep (which is what Ruby uses). Copying collectors also compact memory, reducing fragmentation. A copying GC would be difficult in the current Ruby implementation, since a copying GC cannot really be conservative (it has to change things in the root set), and Ruby uses the C stack, so it is difficult to be sure that something is definitely _not_ a pointer. With mark-and-sweep, false positives are OK, since nothing ever gets moved. A copying GC could mistake an int on the C stack for a pointer, "collect" the "object" it "pointed" to, and then change the value, which of course would cause many odd and subtle bugs in Ruby code.
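A rough sketch of why false positives are harmless for a conservative, non-moving marker. Everything here is hypothetical (`slot_t`, `heap`, `conservative_mark`); it is not Ruby's gc.c, only the idea: scan a range of words, treat anything that looks like a slot address as a reference, and never write to the scanned words.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 8

/* Hypothetical fixed-size object slot; not Ruby's actual RVALUE. */
typedef struct {
    int marked;
    int payload;
} slot_t;

static slot_t heap[SLOT_COUNT];

/* Does this word look like the address of a slot in our heap? */
static slot_t *looks_like_slot(uintptr_t word)
{
    uintptr_t base = (uintptr_t)&heap[0];
    uintptr_t end  = (uintptr_t)&heap[SLOT_COUNT];

    if (word < base || word >= end) return NULL;
    if ((word - base) % sizeof(slot_t) != 0) return NULL;
    return (slot_t *)word;
}

/*
 * Conservative mark phase: any word that could be a slot address keeps
 * that slot alive. The scanned words are only read, never modified, so
 * an integer that merely resembles a pointer is left untouched.
 * Returns the number of slots marked.
 */
static int conservative_mark(const uintptr_t *words, size_t n)
{
    int count = 0;
    size_t i;

    for (i = 0; i < SLOT_COUNT; i++) heap[i].marked = 0;

    for (i = 0; i < n; i++) {
        slot_t *s = looks_like_slot(words[i]);
        if (s != NULL && !s->marked) {
            s->marked = 1;
            count++;
        }
    }
    return count;
}
```

The worst outcome of a false positive here is that a dead slot survives one more cycle. A moving collector would also have to overwrite the matching word with the new address, which corrupts the value if it was really an integer.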
So I would guess that Ruby memory allocation is relatively expensive? Certainly nowhere near as fast as allocating memory off of the "end" of the heap or the stack, right? Does it have to search a free list of blocks itself or does it delegate allocation to the system's malloc() implementation?
It's tricky doing the interop with the CLR since things like boxed value type objects *can* be moved in memory, so I need to create a pinned GCHandle object to keep the GC from moving the object (this is also bad, as you can imagine, since it leads to heap fragmentation). So after spending most of the day thinking about the CLR side of the house, I was a bit surprised to find that Ruby doesn't move objects around.
This makes me a bit happier in a way since I don't have to worry about the issues on both sides of the house, but since I figured out how to do it on the CLR side, I was hoping to reuse that new-found experience on the Ruby side :)
> So I would guess that Ruby memory allocation is relatively expensive?
> Certainly nowhere near as fast as allocating memory off of the "end"
> of the heap or the stack, right? Does it have to search a free list of
> blocks itself or does it delegate allocation to the system's malloc()
> implementation?
Speaking without any knowledge of Ruby's internals, I imagine it's actually just allocating from the end of some pre-allocated buffer until it reaches the end of the buffer. So if you never run out of room in the buffer, the allocation is just incrementing a pointer. When you reach the end, you do the first GC, and subsequent allocations have to search the freelist for a big enough chunk.
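For reference, "allocation is just incrementing a pointer" looks roughly like the sketch below. This illustrates the guess above only; it makes no claim about what Ruby does, and `arena`, `arena_used`, and `bump_alloc` are made-up names.

```c
#include <stddef.h>

#define ARENA_SIZE 1024

/* Pre-allocated buffer; 8-byte aligned so returned blocks are too. */
static _Alignas(8) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/*
 * Bump allocation: round the request up to a multiple of 8, hand out
 * the next unused bytes, and advance the offset. Returns NULL when the
 * arena is exhausted, which is where a collector would have to step in.
 */
static void *bump_alloc(size_t size)
{
    size_t rounded = (size + 7u) & ~(size_t)7u;
    void *p;

    if (rounded < size || rounded > ARENA_SIZE - arena_used) {
        return NULL;
    }
    p = arena + arena_used;
    arena_used += rounded;
    return p;
}
```

Nothing is ever returned to the arena individually; that is what makes this fast, and also why it needs a collector that can reclaim or compact space in bulk.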
On Fri, Jun 02, 2006 at 04:57:54PM +0900, Logan Capaldo wrote:
> On Jun 1, 2006, at 8:23 PM, John Lam wrote:
> >So I would guess that Ruby memory allocation is relatively expensive?
> >Certainly nowhere near as fast as allocating memory off of the "end"
> >of the heap or the stack, right? Does it have to search a free list of
> >blocks itself or does it delegate allocation to the system's malloc()
> >implementation?
>
> Speaking without any knowledge of ruby's internals I imagine it's
> actually is just allocating from the end of some pre-allocated buffer
> until it reaches the end of the buffer. So if you never run out of
> room in the buffer the allocation is just incrementing a pointer.
> When you reach the end you do the first GC and subsequent allocations
> have to search the freelist for a big enough chunk.
Ruby does not use a compacting GC and doesn't manage memory itself (the way a normal memory allocator does) either. There are two parts to allocating an object:

* each non-immediate object takes a sizeof(RVALUE)-sized slot (typically 20
  bytes) from one of the heaps managed by ruby (look for RVALUE and heaps in
  gc.c). It's sizeof(RVALUE) for any object so there's no problem with "chunk
  sizes" and fragmentation (iow. all chunks are ~20 bytes long). A freelist
  is used to find unused slots in said heaps. Additional heaps of increasing
  size will be created when there are no free slots or too few were freed in a
  GC run.
* most objects need additional memory (pointed to by fields in their
  corresponding slots): instance variable tables, char* for Strings, VALUE*
  for Arrays... these are allocated with malloc and will be freed when the
  corresponding object is reclaimed.
ruby relies on malloc(3) for low-level allocation, instead of doing it all with sbrk(2) and friends.
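The two-part scheme described above can be sketched as follows. This is a simplified model under stated assumptions, not the code in gc.c: `slot_t` stands in for RVALUE, there is a single heap of four slots, and `heap_init`, `new_string`, and `reclaim` are hypothetical names. A real collector would run a GC or add another heap where this sketch returns NULL.

```c
#include <stdlib.h>
#include <string.h>

#define HEAP_SLOTS 4

/* Hypothetical stand-in for RVALUE: every object, whatever its type,
 * occupies one slot of the same size. */
typedef struct slot {
    struct slot *next_free; /* link used while the slot is on the freelist */
    char        *buf;       /* extra malloc'ed storage, e.g. string bytes */
    int          in_use;
} slot_t;

static slot_t  slots[HEAP_SLOTS];
static slot_t *freelist = NULL;

/* Thread every slot onto the freelist, lowest index first. */
static void heap_init(void)
{
    int i;

    freelist = NULL;
    for (i = HEAP_SLOTS - 1; i >= 0; i--) {
        slots[i].buf = NULL;
        slots[i].in_use = 0;
        slots[i].next_free = freelist;
        freelist = &slots[i];
    }
}

/* Part 1: pop a fixed-size slot. Part 2: malloc the variable-size data. */
static slot_t *new_string(const char *text)
{
    slot_t *s = freelist;
    size_t  len;
    char   *buf;

    if (s == NULL) return NULL; /* no free slot left */

    len = strlen(text) + 1;
    buf = malloc(len);
    if (buf == NULL) return NULL;
    memcpy(buf, text, len);

    freelist = s->next_free;
    s->next_free = NULL;
    s->buf = buf;
    s->in_use = 1;
    return s;
}

/* Reclaiming frees the malloc'ed part and pushes the slot back. */
static void reclaim(slot_t *s)
{
    free(s->buf);
    s->buf = NULL;
    s->in_use = 0;
    s->next_free = freelist;
    freelist = s;
}
```

Since all slots have the same size, taking one is a constant-time pop and the slot heap itself cannot fragment; any fragmentation is left to malloc for the variable-size buffers.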
> ruby relies on malloc(3) for low-level allocation, instead of doing
> it all with sbrk(2) and friends.
Interesting. (-- takes notes --). Almost seems like cheating :). But in a good way. I'm going to have to read gc.c. Speaking of reading Ruby source, is there an order you would recommend? Every time I look at it I get overwhelmed by a) not knowing where to start and b) K&R C. I can power through the K&R C for the most part, I think, but figuring out what to read when is tougher.
On 6/1/06, Joel VanderWerf <vj...@path.berkeley.edu> wrote:
> If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
> etc), it would be a disaster. Every VALUE that referred to the object
> (in other words every reference to it in a variable, array, hash, etc.)
> would become invalid, since the VALUE type is actually a pointer in
> these cases.
You know, I did know that, but it didn't occur to me at the time. Good point.
On Fri, 02 Jun 2006 20:25:02 +0200, Logan Capaldo <logancapa...@gmail.com> wrote:
> Interesting. (-- takes notes --). Almost seems like cheating :). But in
> a good way. I'm going to have read gc.c. Speaking of reading ruby
> source, is there an order you would recommend? Every time I look at it I
> get overwhelmed by a) not knowing where to start and b) K&R C. I can
> power-through the K&R C for the most part I think, but figuring out what
> to read when is tougher.
Lyle Johnson wrote:
> On 6/1/06, Joel VanderWerf <vj...@path.berkeley.edu> wrote:
>> If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
>> etc), it would be a disaster. Every VALUE that referred to the object
>> (in other words every reference to it in a variable, array, hash, etc.)
>> would become invalid, since the VALUE type is actually a pointer in
>> these cases.
>
> You know, I did know that, but it didn't occur to me at the time. Good
> point.
I had no doubt that you knew it; we would not have FXRuby otherwise :)
-- vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
On Sat, Jun 03, 2006 at 03:25:02AM +0900, Logan Capaldo wrote:
> On Jun 2, 2006, at 4:44 AM, Mauricio Fernandez wrote:
[...]
> >ruby relies on malloc(3) for low-level allocation, instead of doing it all
> >with sbrk(2) and friends.
>
> Interesting. (-- takes notes --). Almost seems like cheating :). But
> in a good way. I'm going to have read gc.c. Speaking of reading ruby
> source, is there an order you would recommend? Every time I look at
> it I get overwhelmed by a) not knowing where to start and b) K&R C. I
> can power-through the K&R C for the most part I think, but figuring
> out what to read when is tougher.
It depends on what you're interested in (/me slaps self). The easiest starting points would be array.c, hash.c (st.c if you really want to see the underlying st_table implementation, but it's just your regular hash table), string.c... that is, the core data structures. They are very easy to read, but maybe not that interesting ultimately due to this very straightforwardness.
As for the more interesting stuff, here are some functions to begin with:

* eval.c:
  * rb_eval: the basic AST walker
  * rb_call, rb_get_method_body: method dispatching (+method cache) at work
  * rb_add_method: managing the method tables (m_tbl)
  * rb_include_module: to see how proxy classes (T_ICLASS) work; bits of
    Ruby's object model
  ....
* parse.y: the grammar + yylex (*tricky*)
This is what I answered to a similar question 3 years ago in [74002]:
Ruby Core

* dln.c: wraps dlopen or the equiv. function of your platform, not very
  interesting
* gc.c: quite easy to follow, of interest only if you want to know how the
  GC works internally, but it's just mark & sweep doing "common sense"
  things so you can safely skip it.
* st.c: a hash table implementation used internally by Ruby, quite
  straightforward
* eval.c: much harder to read as you have to know the node types to follow
  it; several functions are essentially a big switch() statement for a node
* parse.y: this can help you see what different node types correspond to by
  having a look at the grammar.
* regex.c: whatever, don't read it :-)
some other .c files contain only support code
Built-in classes

Take the class you like, scroll down to the Init_xxx() function and locate the C function that implements the method you want to study. No particular order required.