Since this has come up again, and it's apparent that the last time around I wasn't sufficiently clear, it's time to go through this again, and for the final time. (I will beat this thing into the ground by the time we're done)
The way object serialization will be handled from the bytecode level is simple: You call freeze or thaw, like so:
freeze S3, P5 # Freezes the P5 PMC, and children, to the string in S3 thaw P5, S3 # Thaw out the serialized object
From within the PMC vtables, there are two vtable entries that are of importance:
freeze tells the PMC to freeze itself and put the results of the freezing in the string. This is an object method
Thaw tells the class to construct a PMC based on the passed-in string data. It's a class method, though it may be called on an existing PMC. It always produces a new PMC. (We can argue over that one if need be)
The runtime will provide the following functionality
Freezing core singleton data (int, num, string, PMC) freezing named lists freezing named key/value lists thaw core singleton data thaw named lists thaw named key/value lists
chill PMC * warm STRING *
When freezing or thawing, if child PMCs are frozen/thawed using the appropriate calls to the runtime (rather than the freeze/thaw method of a PMC doing it manually for child PMCs) then the runtime will guarantee that multiply-referenced PMCs will be instantiated only once.
The chill and warm runtime methods take a PMC or a frozen representation of a PMC (respectively) and provide a human readable version of that PMC. If the PMC has a __debug_chill or __debug_warm method (which are optional) then that method will be called to provide a human-readable version of the PMC, otherwise the default will be used.
The encoding methods for freezing (and corresponding decoding methods for thawing) may be overridden to provide an alternate serialization format. The only requirement of the serialziation format is that it starts with a minimally valid piece of XML that encodes the format and version of the serialized format. The rest of the serialization format need not be XML. This is done because the format and version of the serialized data are required in the stream, and making it XML incoveniences nobody and makes the XML folks happy. It's good enough, and not up for discussion.
The overriding API is not, as yet, specified. We can do that when we're done fighting out over the semantics here.
The following rules will be in place:
1) Freezing at the destruction level may *not* use any additional memory for object traversal 2) Overlapping freezes are not allowed 3) Freezing while a freeze is in progress is deferred until the current freeze is done 4) Destruction level freezing may not freeze non-dead PMCs
I can be convinced that #4 is not going to fly, but be aware that it will potentially cost an extra pointer per PMC.
Requirement #1 generally mandates that all PMCs must have sufficient information available all the time to perform serialization. It also mandates that there can't be iterators, save hashes, or other whatnots, since the potential for destroy-time serialization means those methods are untenable. I'm open to argument that the freeze and thaw methods can be context sensitive and the non-destroy case can be memory hungry and multithreaded, but that's a dodgy thing. Make a really good case.
Note that I do *not* want to have multiple object traversal systems in parrot! We have one for DOD, and proposals have ranged upwards from there. No. That is *not* happening--the chance for error is significant, the side-effects of the error annoying and tough to track down for complex cases (akin to the trouble with tracking down GC issues), and just not necessary. (Perhaps desirable for speed/space reasons, but desirable isn't necessary) This is something that's hidden under a number of layers of API, so regardless of the outcome it doesn't affect the assembly, PMC, or runtime API.
Clone can be handled by switching in encodings (one that produces a PMC rather than an encoded stream, or one that just does an intermediate string and chews memory, at least for now), and doesn't require a separate API entry point. I'm unconvinced that objects need to clone themselves often enough to bother with a separate API entry for 'em, or if one is provided that VTABLE_clone() should do anything for non-simple objects besides calling VTABLE_thaw(VTABLE_freeze()).
The thread-safety is an issue, and as such interpreters that share data should share a mutex. At the moment I think we should have separate symbolic mutexes that may, potentially, resolve down to a single mutex, but when freezing an interpreter should grab its freeze mutex. If there are multiple interperters in the thread group they will all have the same mutex, and as such will be single-threaded for the freeze.
We can fight about this, too, but the fight is orthogonal to the other issues.
Note that I forgot to account for the possibility of freezing code and environment. As such, the freeze and thaw opcodes have the following variant:
freeze Sx, Py, Iz thaw Py, Sx, Iz
where Z is the freeze/thaw level. O (the default) is the PMC only, 1 includes the global/class data for the PMC, and 2 includes the code for the packages for the PMC, pruned for accessibility.
Also note that if teh PMC being frozen is a closure or continuation, then all lexical and global data to reinstantiate the PMC (including potentially code) will be frozen at level 0.
>The encoding methods for freezing (and corresponding decoding methods for >thawing) may be overridden to provide an alternate serialization format. >The only requirement of the serialziation format is that it starts with a >minimally valid piece of XML that encodes the format and version of the >serialized format. The rest of the serialization format need not be XML. >This is done because the format and version of the serialized data are >required in the stream, and making it XML incoveniences nobody and makes >the XML folks happy. It's good enough, and not up for discussion.
Does that mean all encodings will start with the "standard" markup header:
<xml version="1.0"?>
I'm not sure what minimally valid means, but I think this is considerable overhead.
Also, I don't think I like using freeze/thaw format for the standard PBC format if we are embedding XML everywhere, which is ok, but Leo had mentioned that we would use freeze/thaw for loading classes. With XML in the picture I think it becomes too bulky.
If we are to use recursive serialization for classes then we'd end up with fat bytecodes; every field and method would have the XML header.
>>>>> "MS" == Melvin Smith <mrjoltc...@mindspring.com> writes:
MS> At 04:38 PM 10/20/2003 -0400, Dan Sugalski wrote: >> The encoding methods for freezing (and corresponding decoding methods for >> thawing) may be overridden to provide an alternate serialization format. >> The only requirement of the serialziation format is that it starts with a >> minimally valid piece of XML that encodes the format and version of the >> serialized format. The rest of the serialization format need not be XML. >> This is done because the format and version of the serialized data are >> required in the stream, and making it XML incoveniences nobody and makes >> the XML folks happy. It's good enough, and not up for discussion.
MS> Does that mean all encodings will start with the "standard" markup header:
MS> <xml version="1.0"?>
MS> I'm not sure what minimally valid means, but I think this is considerable MS> overhead.
how so? most serialized things (other than simple scalars) will generate much more data than that.
MS> Also, I don't think I like using freeze/thaw format for the MS> standard PBC format if we are embedding XML everywhere, which is MS> ok, but Leo had mentioned that we would use freeze/thaw for MS> loading classes. With XML in the picture I think it becomes too MS> bulky.
i don't he was referring to the current p5 freeze/thaw formats or code. he was just using freeze/thaw as the methods/calls to be made to do the work. the actual formats are undecided now AFAICT.
MS> If we are to use recursive serialization for classes then we'd end MS> up with fat bytecodes; every field and method would have the XML MS> header.
the xml header is only for the top level thing in the serialized tree. if it is nonstandard you have to mark the serialized string so you can call the matching thaw methods. each object in the serialized tree will have to support that method or some code has to be supplied to handle all the freeze/thaw calls made by the tree traverser code. so the xml header is just a way to mark which external class will be used for the freeze/thaw and it will always be called for each object in the tree. you can't mix/match different freeze/thaw techniques in one operation (yes, you could but then you do have to mark each node with its technique which is a lot of overhead and painful in other ways).
uri
-- Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com --Perl Consulting, Stem Development, Systems Architecture, Design and Coding- Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
> >>>>> "MS" == Melvin Smith <mrjoltc...@mindspring.com> writes:
> MS> At 04:38 PM 10/20/2003 -0400, Dan Sugalski wrote: > >> The encoding methods for freezing (and corresponding decoding > methods for > >> thawing) may be overridden to provide an alternate serialization format. > >> The only requirement of the serialziation format is that it starts > with a > >> minimally valid piece of XML that encodes the format and version of the > >> serialized format. The rest of the serialization format need not be XML. > >> This is done because the format and version of the serialized data are > >> required in the stream, and making it XML incoveniences nobody and makes > >> the XML folks happy. It's good enough, and not up for discussion.
> MS> Does that mean all encodings will start with the "standard" markup > header:
> MS> <xml version="1.0"?>
> MS> I'm not sure what minimally valid means, but I think this is > considerable > MS> overhead.
>how so? most serialized things (other than simple scalars) will generate >much more data than that.
> MS> Also, I don't think I like using freeze/thaw format for the > MS> standard PBC format if we are embedding XML everywhere, which is > MS> ok, but Leo had mentioned that we would use freeze/thaw for > MS> loading classes. With XML in the picture I think it becomes too > MS> bulky.
>i don't he was referring to the current p5 freeze/thaw formats or >code. he was just using freeze/thaw as the methods/calls to be made to >do the work. the actual formats are undecided now AFAICT.
> MS> If we are to use recursive serialization for classes then we'd end > MS> up with fat bytecodes; every field and method would have the XML > MS> header.
>the xml header is only for the top level thing in the serialized >tree. if it is nonstandard you have to mark the serialized string so you >can call the matching thaw methods. each object in the serialized tree >will have to support that method or some code has to be supplied to >handle all the freeze/thaw calls made by the tree traverser code. so the >xml header is just a way to mark which external class will be used for >the freeze/thaw and it will always be called for each object in the >tree. you can't mix/match different freeze/thaw techniques in one >operation (yes, you could but then you do have to mark each node with >its technique which is a lot of overhead and painful in other ways).
That answers my question of overhead with regards to XML headers. If there is a single header for defining the type of "stream" then the actual serialization can be dense enough.
I just needed clarification. :)
I think a grammar should be developed for this, since it should likely be implemented as a recursive parser.
>>>>> "MS" == Melvin Smith <mrjoltc...@mindspring.com> writes:
MS> That answers my question of overhead with regards to XML headers. MS> If there is a single header for defining the type of "stream" then the MS> actual serialization can be dense enough.
MS> I just needed clarification. :)
well, that was my take on what dan said. i could have misinterpreted him.
MS> I think a grammar should be developed for this, since it should likely MS> be implemented as a recursive parser.
there are many ways to encode serialized stuff and not all are conducive to grammars IMO. some may be simple type/length/binary value things which are best expanded in nice fast loops. other formats could just go into pure ugly XML which can be parsed back into data. also dan mentioned the half-frozen human readable types (which both the binary AND XML formats need!) which don't (or may not) get parsed again.
uri
-- Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com --Perl Consulting, Stem Development, Systems Architecture, Design and Coding- Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
> >>>>> "MS" == Melvin Smith <mrjoltc...@mindspring.com> writes:
> MS> That answers my question of overhead with regards to XML headers. > MS> If there is a single header for defining the type of "stream" then the > MS> actual serialization can be dense enough.
> MS> I just needed clarification. :)
>well, that was my take on what dan said. i could have misinterpreted him.
> MS> I think a grammar should be developed for this, since it should likely > MS> be implemented as a recursive parser.
>there are many ways to encode serialized stuff and not all are conducive >to grammars IMO. some may be simple type/length/binary value things >which are best expanded in nice fast loops. other formats could just go >into pure ugly XML which can be parsed back into data. also dan >mentioned the half-frozen human readable types (which both the binary >AND XML formats need!) which don't (or may not) get parsed again.
When I say grammar, I don't mean as input to a parser generator. I'm accustomed to using grammars for any sort of non-trivial packet or data format. It makes dealing with and discussing the format much easier.
> the xml header is only for the top level thing in the serialized > tree. if it is nonstandard you have to mark the serialized string so you > can call the matching thaw methods. each object in the serialized tree > will have to support that method or some code has to be supplied to > handle all the freeze/thaw calls made by the tree traverser code. so the > xml header is just a way to mark which external class will be used for > the freeze/thaw and it will always be called for each object in the > tree. you can't mix/match different freeze/thaw techniques in one > operation (yes, you could but then you do have to mark each node with > its technique which is a lot of overhead and painful in other ways).
I find the notion of an "XML header" a bit confusing, given Dan's statement to the effect that it was a throw to XML folks.
I think anything "XML folks" will be interested in will entail *wrapping* stuff, not *prefixing* it.
Perhaps Dan meant "wrapper" or "envelope" when he said "header", but its not clear to me without an example. So, I'll put some examples out there for folks to throw stuff at:
Here's an example of a Parrot Magic Ice (PMI) header (no love here from XML folks):
<pmi class="foo" .../> # Data of some sort determined by class foo
(This is the way I read Dan's original comment.)
Taken in its entirety, this chunk isn't XML. Sure, you could pull out the first line and pass it to something that understands XML, and it wouldn't choke. But, if there is value here, I'm missing it. It could just as well have been SMTP style:
PMI-Class: foo Some-Other-Header: ... # Data of some sort
Here's an example of a wrapper:
<pmi class="foo" ...> <!-- Data of some sort determined by class foo --> </pmi>
That's a bit better, although bear in mind that if the intent is that you could throw the entire chunk at an XML parser and have it not complain, you are going to have to take some care in generating the guts. Binary data is in general right out (where does it end? What if it contains fragments that look like XML markup?). Sure, you could slap it in a <![CDATA[ ... ]]> film, but you'd still have to watch out for the possibility that the body might want the sequence "]]>" in it somewhere...
.............................................
OK. Now, I'll throw one crazy idea into the mix. Suppose for just a moment that instead of using XML proper, or leaving things completely open-ended, we adopted SAX events as our interchange. it would be roughly equivalent to:
begin-element pmi { class => 'foo' } # events for the guts end-element pmi
Now, anyone who likes XML can hook up a DOM tree builder, or an XML renderer to the stream of events and be happy as a clam. But, for storing stuff on disk, we are free to invent a more compact representation of the events. Thawing entails interpreting the events as object allocations and state changes to the objects.
I can imagine some reasonably compact representations...
On Oct 20, 2003, at 10:09 PM, Gregor N. Purdy wrote:
> Here's an example of a wrapper:
> <pmi class="foo" ...> > <!-- Data of some sort determined by class foo --> > </pmi>
> That's a bit better, although bear in mind that if the intent is that > you could throw the entire chunk at an XML parser and have it not > complain, you are going to have to take some care in generating the > guts. Binary data is in general right out (where does it end? What if > it contains fragments that look like XML markup?). Sure, you could > slap it in a <![CDATA[ ... ]]> film, but you'd still have to watch > out for the possibility that the body might want the sequence "]]>" > in it somewhere...
A slight aside, but just to build on what you said since this is an often-misunderstood facet of XML, there are a bunch of other reasons you can't just throw binary data into an arbitrary XML document inside of a CDATA section:
1) If you declare the encoding of the XML to be UTF-8, then your document won't be well-formed XML if your binary data doesn't look like legitimate UTF-8 data (which it won't in the general case--many bytes and byte sequences can't occur in UTF-8).
2) XML parsers are free to transcode--so a document declared in one encoding may pass data on to its client application in another encoding, which will mangle your binary data. For instance, the expat parser can consume documents in a variety of encodings, but always passes text to its callbacks in UTF-8.
3) XML parsers are required to line-ending normalize, so anything which looks like a CR or CRLF will turn into an LF, even inside of a CDATA section.
That said, you can get around all of this by base-64 encoding your binary data for storage in XML, since that turns binary data into text. On the other hand, it's a waste of space and more CPU cycles to consume than a more obvious binary format.
And that said, I read Dan's email as just meaning an XML header, but I didn't quite understand exactly what he had in mind either.
Dan Sugalski <d...@sidhe.org> wrote: > Since this has come up again, ...
[ FYI: I was starting implementing this, based on a general traverse vtable with callback functions. Two patches got backed out by Dan after some discussion in PM ]
> ... and it's apparent that the last time around > I wasn't sufficiently clear, it's time to go through this again, and for > the final time.
I'd be really happy, if you could go through my concerns mentioned in the summary in the thread: Subject: Re: [RfC] vtable->dump Date: Thu, 4 Sep 2003 12:31:08 +0200
> ... (I will beat this thing into the ground by the time we're > done)
Sorry for the inconvenience and being ignorant ...
> PMC *thaw(interpreter, STRING *)
This should IMHO be able to create constant PMCs out of metadata, e.g. for subroutine objects. So there should be some means to tell thaw() to create PMC(s) in the constant_pmc_pool.
> The chill and warm runtime methods take a PMC or a frozen representation > of a PMC (respectively) and provide a human readable version of that PMC.
I dunno, why chill() is superior to dump() or pretty_print(), but the name doesn't really matter.
> 1) Freezing at the destruction level may *not* use any additional memory > for object traversal
What is "Freezing at the destruction level"? Is this anyhow related to destruction ordering?
> Note that I do *not* want to have multiple object traversal systems in > parrot! We have one for DOD, and proposals have ranged upwards from there. > No. That is *not* happening--the chance for error is significant, the > side-effects of the error annoying and tough to track down for complex > cases (akin to the trouble with tracking down GC issues), and just not > necessary. (Perhaps desirable for speed/space reasons, but desirable > isn't necessary)
DOD's mark() routine has different requirements then a general traverse() for freeze(), chill(), clone(), and destruction ordering. Using just mark() will have these side effects that you want to avoid.
A general traverse() can be depth first of breadth first, mark() isn't required do have any specific ordering as long as it sets live bits everywhere.
mark() is called permanently in a running interpreter, that does non trivial things. There are shortcuts for scalars, DOD is highly optimized not to destroy cache coherency. Using mark() also implies to back out my small PMC patches. All the advantages of smaller scalars are gone then.
While freeze() and friends have to pull in each PMC into the cache, just setting the live bit on a PMC hasn't. Further: Lukes proposal for speeding up timely destruction puts objects either in front or at the end of the next_for_GC chain. This IMHO implies that mark() is unusable as your general and solely iterator.
> ... This is something that's hidden under a number of layers > of API, so regardless of the outcome it doesn't affect the assembly, PMC, > or runtime API.
So when its hidden, I really don't understand, why you are insisting on an (IMHO) suboptimal design.
> The thread-safety is an issue,
While all schemes aren't thread-safe from user level (e.g. manually sorting an array containing shared PMCs, while it gets frozen), your scheme isn't thread-safe at low-level, as the next_for_GC pointer inside the PMC is used as a duplicate marker. But if a user changes shared resources its a user problem. We only guarantee atomic updates per PMC (s. P6E p 86f by Dan).
> Dan
Comments addressing all these issues are highly welcome, leo
Leopold Toetsch <l...@toetsch.at> writes: > > 1) Freezing at the destruction level may *not* use any additional memory > > for object traversal
This is a really hard problem. In some early experiments with destruction ordering (one of the problems wich need iteration) I didn't get around with allocating new memory, or recursing on the stack. It may be that we can get arround with a second pointer, but I'm not sure.
> What is "Freezing at the destruction level"? Is this anyhow related to > destruction ordering?
> > Note that I do *not* want to have multiple object traversal systems in > > parrot! We have one for DOD, and proposals have ranged upwards from there. > > No. That is *not* happening--the chance for error is significant, the > > side-effects of the error annoying and tough to track down for complex > > cases (akin to the trouble with tracking down GC issues), and just not > > necessary. (Perhaps desirable for speed/space reasons, but desirable > > isn't necessary)
I did some benchmarking (to test our hash implementation, but thats a different story). One thing I found out: We are completely dominated by gc. I'm not sure if it was trace_systemareas or the mark method, but don't put any load on mark.
mark should be as fast as possible. The other uses of traverse for freeze, dump, destruction-ordering etc. are all more or less called on user request, so the user needs to know its cost.
One other thing that makes mark different. If we ever want to use a copying collector (Which is not reachable currently because of the conservative stack-walking) The mark routine needs to know about the moving of objects. All other traverse routine never get this problem.
> DOD's mark() routine has different requirements then a general > traverse() for freeze(), chill(), clone(), and destruction ordering. > Using just mark() will have these side effects that you want to avoid.
My words. mark() is not traverse() also they do similar things.
> A general traverse() can be depth first of breadth first, mark() isn't > required do have any specific ordering as long as it sets live bits > everywhere.
> mark() is called permanently in a running interpreter, that does non > trivial things. There are shortcuts for scalars, DOD is highly optimized > not to destroy cache coherency. Using mark() also implies to back out > my small PMC patches. All the advantages of smaller scalars are gone > then.
This ist just on more thing of mark() speed.
> While freeze() and friends have to pull in each PMC into the cache, just > setting the live bit on a PMC hasn't. Further: Lukes proposal for > speeding up timely destruction puts objects either in front or at the > end of the next_for_GC chain. This IMHO implies that mark() is unusable > as your general and solely iterator.
> > ... This is something that's hidden under a number of layers > > of API, so regardless of the outcome it doesn't affect the assembly, PMC, > > or runtime API.
> So when its hidden, I really don't understand, why you are insisting on > an (IMHO) suboptimal design.
We have at the moment 15 (in words fifteen) vtable slots for dividing/remainder, 5 for multiplikation, 24 for bitwise ops. So bloating the vtable is by design, but it is the end of world if we special case the most often called function and have 2 (in words two) walking functions. Sorry, I think there are other places in the vtable which need some cleanup.
> > The thread-safety is an issue,
> While all schemes aren't thread-safe from user level (e.g. > manually sorting an array containing shared PMCs, while it gets > frozen), your scheme isn't thread-safe at low-level, as the next_for_GC > pointer inside the PMC is used as a duplicate marker. But if a user > changes shared resources its a user problem. We only guarantee atomic > updates per PMC (s. P6E p 86f by Dan).
The thread safty is less a problem for marking. It only needs to make sure that other threads don't munge the data they are walking. Write barriers or mutexes might help here. But how to freeze an object of an other thread? This needs to freeze the whole thread.
> > Dan
> Comments addressing all these issues are highly welcome, > leo
I think we should address this issue like experimentalists: Create the general traverse function. (No don't call it mark). Implement freeze, dump, destruction ordering using this function. When this all is running, port the mark function to use this new functionality. Benchmark, and watch the speedup of the brandnew design (or just find out that the slowdown is not bad enough to satisfy two walking functions). When the benchmarking is done lets descide if we need only one walk-function, and only than remove the mark function.
On Mon, 20 Oct 2003, Melvin Smith wrote: > At 04:38 PM 10/20/2003 -0400, Dan Sugalski wrote: > >The encoding methods for freezing (and corresponding decoding methods for > >thawing) may be overridden to provide an alternate serialization format. > >The only requirement of the serialziation format is that it starts with a > >minimally valid piece of XML that encodes the format and version of the > >serialized format. The rest of the serialization format need not be XML. > >This is done because the format and version of the serialized data are > >required in the stream, and making it XML incoveniences nobody and makes > >the XML folks happy. It's good enough, and not up for discussion.
> Does that mean all encodings will start with the "standard" markup header:
> <xml version="1.0"?>
Each serialized data stream, yeah. Just once, and probably a few extra characters in there, to make it legit XML.
Definitely not once per object, just once per stream--if you do:
freeze S3, P5
and P5 happens to have a PMC that points to your top level symbol table, the string in S3 will be darned huge, and have the <XML foo=?> thing in there exactly once. (Well, unless you've chosen an XML encoding, in which case I expect you've just blown memory... :)
> > the xml header is only for the top level thing in the serialized > > tree. if it is nonstandard you have to mark the serialized string so you > > can call the matching thaw methods. each object in the serialized tree > > will have to support that method or some code has to be supplied to > > handle all the freeze/thaw calls made by the tree traverser code. so the > > xml header is just a way to mark which external class will be used for > > the freeze/thaw and it will always be called for each object in the > > tree. you can't mix/match different freeze/thaw techniques in one > > operation (yes, you could but then you do have to mark each node with > > its technique which is a lot of overhead and painful in other ways).
> I find the notion of an "XML header" a bit confusing, given Dan's > statement to the effect that it was a throw to XML folks.
> I think anything "XML folks" will be interested in will entail > *wrapping* stuff, not *prefixing* it.
Nah, I expect what they'll want is for the entire data stream of serialized objects to be in XML format. Which is fine--they can have that. (It's why I mentioned the serialization routines can be overridden)
For an XML stream the header might be <xml parrot format='xml' version=1.0> with the rest of the stream in XML. A YAML stream would start <xml parrot format='yaml' version=1.0> with the rest in YAML, and teh binary format as <xml parrot format='binary' version=1.0>. Or something like that, modulo actual correct XML.
This way we have a single, fixed-format type/version header, which makes the initial identification easier and less error-prone. (Possibly even fit for file and programs of its ilk to note) The binary format won't care, and teh YAML format shouldn't care (as long as the indenting's right) but the XML format would, so it seems to make sense to use the XML stuff for the initial header.
On Tue, 21 Oct 2003, Leopold Toetsch wrote: > Dan Sugalski <d...@sidhe.org> wrote: > > Since this has come up again, ...
> [ FYI: I was starting implementing this, based on a general traverse > vtable with callback functions. Two patches got backed out by > Dan after some discussion in PM ]
Right, because you'd implemented some stuff I'd specifically said we weren't doing, and didn't back them out any of the times I asked...
> > ... and it's apparent that the last time around > > I wasn't sufficiently clear, it's time to go through this again, and for > > the final time.
> I'd be really happy, if you could go through my concerns mentioned in > the summary in the thread:
That's why I did this, in part. It's the plan, until declared otherwise.
> > PMC *thaw(interpreter, STRING *)
> This should IMHO be able to create constant PMCs out of metadata, e.g. > for subroutine objects. So there should be some means to tell thaw() to > create PMC(s) in the constant_pmc_pool.
There should be a way to put PMCs in the constant pool in general. I was thinking a constant op would work--something like
constant Ix, [SP]y
to make the string or PMC Y a constant at slot X in the constant pool. Passing in the PMC header to be filled in also works, though both fail if you want full PMC trees marked as constants since thawing out a PMC stream may involve creating multiple PMCs. (In which case we might be better temporarily switching allocation pools at constant creation time, rather than passing in PMCs)
> > The chill and warm runtime methods take a PMC or a frozen representation > > of a PMC (respectively) and provide a human readable version of that PMC.
> I dunno, why chill() is superior to dump() or pretty_print(), but the > name doesn't really matter.
The important thing is that it's not a vtable method. It's a function that belongs in the freeze/thaw API as it's just an alternate encoding or decoding. (Arguably it ought not be a separate API entry at all and just another encoding scheme, but that requires transcoding serialization forms, and I'd rather not get into that)
> > 1) Freezing at the destruction level may *not* use any additional memory > > for object traversal
> What is "Freezing at the destruction level"? Is this anyhow related to > destruction ordering?
No. There are some valid cases where an object, after having been declared dead by the DOD, wants to serialize itself. Persistent object stores apparently do this, and it makes a certain amount of sense--when the object goes out of scope the current state is flushed to disk.
It puts a number of unpleasant constraints on the core freeze routines. User code can violate them and take the consequences, but we can't.
> > Note that I do *not* want to have multiple object traversal systems in > > parrot! We have one for DOD, and proposals have ranged upwards from there. > > No. That is *not* happening--the chance for error is significant, the > > side-effects of the error annoying and tough to track down for complex > > cases (akin to the trouble with tracking down GC issues), and just not > > necessary. (Perhaps desirable for speed/space reasons, but desirable > > isn't necessary)
> DOD's mark() routine has different requirements then a general > traverse() for freeze(), chill(), clone(), and destruction ordering. > Using just mark() will have these side effects that you want to avoid.
The only thing that mark does that the general traversal doesn't, in the abstract, is flip the object's live flag. Everything else is an optimization of code which we can, if we need, discard.
> A general traverse() can be depth first of breadth first, mark() isn't > required do have any specific ordering as long as it sets live bits > everywhere.
I'm pretty sure that with a singly linked list we can get a generally properly-ordered flattened tree without having to do an insane number of passes across the dead object store. I may be incorrect in this, but I don't think so, and for our purposes the live bit can be safely ignored if we end up setting it, though potentially with another pass over the dead store, which may end up prohibitively expensive. We'll see.
> mark() is called permanently in a running interpreter, that does non > trivial things. There are shortcuts for scalars, DOD is highly optimized > not to destroy cache coherency. Using mark() also implies to back out > my small PMC patches. All the advantages of smaller scalars are gone > then.
All of this stuff for freezing is going to end up killing the small PMC patch anyway, unfortunately, since we're going to have to be able to traverse PMCs in the destruction phase, which means we have to have the means of traversal at hand as we can't guarantee that we can allocate more PMCs or resize the PMCs ext data.
> While freeze() and friends have to pull in each PMC into the cache, just > setting the live bit on a PMC hasn't. Further: Lukes proposal for > speeding up timely destruction puts objects either in front or at the > end of the next_for_GC chain. This IMHO implies that mark() is unusable > as your general and solely iterator.
We may end up playing macro games with the code. I can live with multiple instantiations of the code, but I want only a single chunk of source to be maintained.
> > ... This is something that's hidden under a number of layers > > of API, so regardless of the outcome it doesn't affect the assembly, PMC, > > or runtime API.
> So when its hidden, I really don't understand, why you are insisting on > an (IMHO) suboptimal design.
YHO, in this case, turns out to not consider all the issues involved. Either way, we do it this way for now. Once the implementation of the specified API is in, we can fight over reworking the internals.
On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote: > At 08:21 -0400 10/21/03, Dan Sugalski wrote: > > > I find the notion of an "XML header" a bit confusing, given Dan's > >> statement to the effect that it was a throw to XML folks.
> >> I think anything "XML folks" will be interested in will entail > >> *wrapping* stuff, not *prefixing* it.
> >Nah, I expect what they'll want is for the entire data stream of > >serialized objects to be in XML format. Which is fine--they can have that. > >(It's why I mentioned the serialization routines can be overridden)
> >For an XML stream the header might be <xml parrot format='xml' > >version=1.0> with the rest of the stream in XML. A YAML stream would start > ><xml parrot format='yaml' version=1.0> with the rest in YAML, and teh > >binary format as <xml parrot format='binary' version=1.0>. Or something > >like that, modulo actual correct XML.
> If you want that to be looking like valid XML, it would have to be different:
> error: Specification mandate value for attribute parrot > <xml parrot/> > ^ > Better in my opinion would be something like:
> <parrot format="xml" version="1.0"/>data yadda yadda yadda
I'm not an XML guy, and I'm making all this up as I go along. If that's better, fine with me. :)
> >This way we have a single, fixed-format type/version header, which makes > >the initial identification easier and less error-prone. (Possibly even > >fit for file and programs of its ilk to note) The binary format won't > >care, and teh YAML format shouldn't care (as long as the indenting's > >right) but the XML format would, so it seems to make sense to use the XML > >stuff for the initial header.
> So are we talking about a header or a wrapper? If it is really a > header, it's not XML and then it's prettyy useless from an XML point > of view.
We're talking about the first thing in a file (or stream, or whatever). I was under the impression that XML files should be entirely composed of valid XML, hence the need for the stream type marker being valid XML. YAML doesn't care as much, so far as I understand, and for our own internal binary format we cna do whatever we want. If that's not true, then we can go for a more compact header.
Note that the serialized stream will be different depending on the encoder chosen. If you have the structure:
Only not inevitably horribly broken, invalid, and poorly done. :) The YAML form might look like
<XML type=parrot-yaml version=1.0> PMC: foo type: PerlArray values: pmc: bar string: Baz PMC: bar type: PerlInt values: integer:1
Once again, modulo my limited and inevitably incorrect YAML knowledge. So if the header says it's XML the whole thing is valid XML, while if it doesn't the rest of the stream doesn't have to be. (Just enough of the header so that an XML processing program can examine the stream and decide that the valid XML chunk at the beginning says that the rest of the stream's not XML)
Basically we want some nice, fixed (mostly) thing at the head of the stream that doesn't vary regardless of the way the stream is encoded, and XML seemed to be the most restrictive of the forms I know people will clamor for. (I know, it means the stream can't be valid Lisp-style sexprs, but XML's more widespread :)
Dan Sugalski <d...@sidhe.org> wrote: > On Tue, 21 Oct 2003, Leopold Toetsch wrote:
[ thaw ]
>> This should IMHO be able to create constant PMCs out of metadata, e.g. >> for subroutine objects. So there should be some means to tell thaw() to >> create PMC(s) in the constant_pmc_pool. > There should be a way to put PMCs in the constant pool in general. I was > thinking a constant op would work--something like > constant Ix, [SP]y > to make the string or PMC Y a constant at slot X in the constant pool.
You can append items to the constant table. You can't declare existing items as constant, because you can't change the underlying object pool, where the object was allocated. This would change the objects address.
> Passing in the PMC header to be filled in also works, though both fail if > you want full PMC trees marked as constants since thawing out a PMC stream > may involve creating multiple PMCs. (In which case we might be better > temporarily switching allocation pools at constant creation time, rather > than passing in PMCs)
These are either serious shortcomings or unneeded workarounds. An extra parameter to relevant vtables can take care of such special cases.
>> I dunno, why chill() is superior to dump() or pretty_print(), but the >> name doesn't really matter. > The important thing is that it's not a vtable method.
Ah, that's the difference. How shall the system pretty-print dynamically loaded PMCs then, when only a bytecode-stream is available? IMHO only a vtable in the class can perform that job.
>> > 1) Freezing at the destruction level may *not* use any additional memory >> > for object traversal > It puts a number of unpleasant constraints on the core freeze routines.
Constructing the frozen stream definitely needs memory. I don't see the difference, to memory consumed by a seen hash. Can you please elaborate a bit more on this.
> The only thing that mark does that the general traversal doesn't, in the > abstract, is flip the object's live flag. Everything else is an > optimization of code which we can, if we need, discard.
Yes, mark() can be written in terms of a general traverse, which gets a vtable function (and a data pointer). mark is basically traverse(mark, 0). But this isn't true the other way round. You can't do freeze based on the mark iterator. How do you pass the desired output format?
>> mark() is called permanently in a running interpreter, that does non >> trivial things. There are shortcuts for scalars, DOD is highly optimized >> not to destroy cache coherency. Using mark() also implies to back out >> my small PMC patches. All the advantages of smaller scalars are gone >> then. > All of this stuff for freezing is going to end up killing the small PMC > patch anyway, unfortunately, since we're going to have to be able to > traverse PMCs in the destruction phase, which means we have to have the > means of traversal at hand as we can't guarantee that we can allocate more > PMCs or resize the PMCs ext data.
A scalar can't contain or reference other PMCs, so it can't be a potential source of freeze loops. If I now spit out (PMC: Int, ID=xy, value=5) twice or (PMC: ID=other) doesn't really matter. thaw() can take care of duplicates, if needed. Other PMCs have the next_for_GC pointer.
Albeit I'm not convinced, that we can't have a seen hash.
> YHO, in this case, turns out to not consider all the issues involved.
That might very well be true, yes. So it would be fine, if you could fill the gaps.
On Tue, 21 Oct 2003, Leopold Toetsch wrote: > Dan Sugalski <d...@sidhe.org> wrote: > > On Tue, 21 Oct 2003, Leopold Toetsch wrote:
> [ thaw ]
> >> This should IMHO be able to create constant PMCs out of metadata, e.g. > >> for subroutine objects. So there should be some means to tell thaw() to > >> create PMC(s) in the constant_pmc_pool.
> > There should be a way to put PMCs in the constant pool in general. I was > > thinking a constant op would work--something like
> > constant Ix, [SP]y
> > to make the string or PMC Y a constant at slot X in the constant pool.
> You can append items to the constant table. You can't declare existing > items as constant, because you can't change the underlying object pool, > where the object was allocated. This would change the objects address.
The object's address should be irrelevant for the constant table. PMCs are referenced in the opstream by table offset. This offset can be into a PMC pool, or into a pointer table. While the pointer table has an extra level of indirection to it it adds flexibility and takes some pressure off of the ordering of PMCs for instantiated constants.
> > Passing in the PMC header to be filled in also works, though both fail if > > you want full PMC trees marked as constants since thawing out a PMC stream > > may involve creating multiple PMCs. (In which case we might be better > > temporarily switching allocation pools at constant creation time, rather > > than passing in PMCs)
> These are either serious shortcomings or unneeded workarounds. An extra > parameter to relevant vtables can take care of such special cases.
Not necessarily, no. The number of PMCs that are reconstituted for a set of constant frozen PMCs is indeterminate. If we're instantiating bytecode with constant PMCs in it it's possible the class that backs those PMCs has changed and things instantiate differently than they might otherwise do.
If we've frozen 20 PMCs, all we can guarantee is that when we unthaw them that we've got at least 20 PMCs, though we may have more, and the extras arguably should be allocated from the constant PMC arena (though not given slots in the constant table) so we can skip scanning the constant arenas for dead objects needing cleanup.
> >> I dunno, why chill() is superior to dump() or pretty_print(), but the > >> name doesn't really matter.
> > The important thing is that it's not a vtable method.
> Ah, that's the difference. How shall the system pretty-print dynamically > loaded PMCs then, when only a bytecode-stream is available? IMHO only a > vtable in the class can perform that job.
If the dynamically loaded PMC class doesn't have a backing Parrot class, you can't, and get the default, relatively primitive dump.
> >> > 1) Freezing at the destruction level may *not* use any additional memory > >> > for object traversal
> > It puts a number of unpleasant constraints on the core freeze routines.
> Constructing the frozen stream definitely needs memory. I don't see the > difference, to memory consumed by a seen hash. Can you please elaborate > a bit more on this.
Constructing the frozen stream will need some memory, yes. At the moment all it needs is a chunk of random memory and that's it, so we may well fail because we're out of memory. We may, however, have general pool memory handy. We can't guarantee that we have *any* headers, however, since we can legitimately be called from within the destruct phase of a DOD run, which may have been triggered by an out-of-headers condition.
Depending on how we flesh things out freezing may also not require any additional memory--if we relax the requirement for freezing to allow the output to be a PMC, we may be backed directly to a file or other storage that doesn't involve RAM allocation.
> > The only thing that mark does that the general traversal doesn't, in the > > abstract, is flip the object's live flag. Everything else is an > > optimization of code which we can, if we need, discard.
> Yes, mark() can be written in terms of a general traverse, which gets a > vtable function (and a data pointer). mark is basically traverse(mark, > 0). But this isn't true the other way round. You can't do freeze based > on the mark iterator. How do you pass the desired output format?
What does the desired output format have to do with any of this? All marking does is put things on the list of PMCs to be visited if it hasn't already been visited, so we get to that PMC at some point as we walk the visited list. In the context of the DOD sweep it also sets the live flag, but we could, if we chose, skip that and use the presence of a non-NULL value in the mark chain address for a PMC as an indicator of liveness. (Though yes, I realize that this means potentially skipping some of the optimizations, so I'm not proposing it as a requirement for the DOD sweep implementation)
> >> mark() is called permanently in a running interpreter, that does non > >> trivial things. There are shortcuts for scalars, DOD is highly optimized > >> not to destroy cache coherency. Using mark() also implies to back out > >> my small PMC patches. All the advantages of smaller scalars are gone > >> then.
> > All of this stuff for freezing is going to end up killing the small PMC > > patch anyway, unfortunately, since we're going to have to be able to > > traverse PMCs in the destruction phase, which means we have to have the > > means of traversal at hand as we can't guarantee that we can allocate more > > PMCs or resize the PMCs ext data.
> A scalar can't contain or reference other PMCs, so it can't be a > potential source of freeze loops. If I now spit out (PMC: Int, ID=xy, > value=5) twice or (PMC: ID=other) doesn't really matter. thaw() can take > care of duplicates, if needed. Other PMCs have the next_for_GC pointer.
Thaw can only properly take care of duplicates if the duplicates are correctly indicated in the serialization stream. Identical end-values are *not* sufficient to note multiple references to the same PMC.
> Albeit I'm not convinced, that we can't have a seen hash.
It takes an insane amount of memory and requires header allocation. We can't allocate headers, and the memory requirements are extreme. Been there, done that, it was a bad idea. Consider this arbitrarily and unconditionally ruled out if you're unwilling to believe the stats that were previously posted about this.
> > > The chill and warm runtime methods take a PMC or a frozen representation > > > of a PMC (respectively) and provide a human readable version of that PMC.
> > I dunno, why chill() is superior to dump() or pretty_print(), but the > > name doesn't really matter.
> The important thing is that it's not a vtable method. It's a function that > belongs in the freeze/thaw API as it's just an alternate encoding or > decoding. (Arguably it ought not be a separate API entry at all and just > another encoding scheme, but that requires transcoding serialization > forms, and I'd rather not get into that)
This is really just a naming problem. Dan wants to call the vtable-function freeze and have different encodings for all kinds of dumping/pretty_printing/marking. Leo calls the function traverse and controlls it by callbacks.
My personal opinion on this naming problem is: traverse describes more generaly what the function does. Marking live objects by freezing them in an encoding that does return nothing just sounds plain wrong.
Freeze should be just a user of the general traverse function. (And this does mean it is also no vtable function)
or even the freeze_encodings are callback_sets: freeze_xml, freeze_yaml, freeze_binary, whatever.
>>> 1) Freezing at the destruction level may *not* use any additional memory >>> for object traversal
>> What is "Freezing at the destruction level"? Is this anyhow related to >> destruction ordering?
> No. There are some valid cases where an object, after having been declared > dead by the DOD, wants to serialize itself. Persistent object stores > apparently do this, and it makes a certain amount of sense--when the > object goes out of scope the current state is flushed to disk.
This is a question of what is allowed at destruction time. You don't want to allow memory allocation, but allow freezing. That gets hard, because you need at least allocate the STRING where you want to put your frozen stream.
> It puts a number of unpleasant constraints on the core freeze routines. > User code can violate them and take the consequences, but we can't.
We can call (hopefully) arbitary user code in destruction routines. So this argument does not count
>>> Note that I do *not* want to have multiple object traversal systems in >>> parrot! We have one for DOD, and proposals have ranged upwards from there. >>> No. That is *not* happening--the chance for error is significant, the >>> side-effects of the error annoying and tough to track down for complex >>> cases (akin to the trouble with tracking down GC issues), and just not >>> necessary. (Perhaps desirable for speed/space reasons, but desirable >>> isn't necessary)
Freeze is just another traversal method. Just calling it freeze instead of traverse does not change this fact. You can limit the power of encodings, but this does not change the fact that you need to walk all children
>> DOD's mark() routine has different requirements then a general >> traverse() for freeze(), chill(), clone(), and destruction ordering. >> Using just mark() will have these side effects that you want to avoid.
> The only thing that mark does that the general traversal doesn't, in the > abstract, is flip the object's live flag. Everything else is an > optimization of code which we can, if we need, discard.
mark() may be implemented in form of a general traverse. Let the profiler decide if a special purpose mark() or a general traverse is better.
>> A general traverse() can be depth first of breadth first, mark() isn't >> required do have any specific ordering as long as it sets live bits >> everywhere.
> I'm pretty sure that with a singly linked list we can get a generally > properly-ordered flattened tree without having to do an insane number of > passes across the dead object store. I may be incorrect in this, but I > don't think so, and for our purposes the live bit can be safely ignored if > we end up setting it, though potentially with another pass over the dead > store, which may end up prohibitively expensive. We'll see.
I'm pretty sure that a singly linked list is not enough. I had done some experiments with this. One pass my be enough, but you need to keep track of the tree-traversal and of the partial ordered list. These to things don't play well together. Maybe this can be cut down to two lists, or one list and one bit per pmc.
>> mark() is called permanently in a running interpreter, that does non >> trivial things. There are shortcuts for scalars, DOD is highly optimized >> not to destroy cache coherency. Using mark() also implies to back out >> my small PMC patches. All the advantages of smaller scalars are gone >> then.
> All of this stuff for freezing is going to end up killing the small PMC > patch anyway, unfortunately, since we're going to have to be able to > traverse PMCs in the destruction phase, which means we have to have the > means of traversal at hand as we can't guarantee that we can allocate more > PMCs or resize the PMCs ext data.
Destruction ordering just enforces that small PMCs can't have destructors. If they have destructors they must be big, big enough to construct the ordered list of objects without allocating any memory.
If you think about it: The call to the destructors is done after free_unused_pobjects completed. The memory of the objects without destructors is already freed. If you are still in an out of memory situation when the destructors are run, then its also very likely that you are in a not much better situation afterwards.
>> While freeze() and friends have to pull in each PMC into the cache, just >> setting the live bit on a PMC hasn't. Further: Lukes proposal for >> speeding up timely destruction puts objects either in front or at the >> end of the next_for_GC chain. This IMHO implies that mark() is unusable >> as your general and solely iterator.
> We may end up playing macro games with the code. I can live with multiple > instantiations of the code, but I want only a single chunk of source to be > maintained.
You know we already have two versions of pobject_lives lying around.
On Tue, 21 Oct 2003, Juergen Boemmels wrote: > Dan Sugalski <d...@sidhe.org> writes:
> [...]
> > > > The chill and warm runtime methods take a PMC or a frozen representation > > > > of a PMC (respectively) and provide a human readable version of that PMC.
> > > I dunno, why chill() is superior to dump() or pretty_print(), but the > > > name doesn't really matter.
> > The important thing is that it's not a vtable method. It's a function that > > belongs in the freeze/thaw API as it's just an alternate encoding or > > decoding. (Arguably it ought not be a separate API entry at all and just > > another encoding scheme, but that requires transcoding serialization > > forms, and I'd rather not get into that)
> This is really just a naming problem. Dan wants to call the > vtable-function freeze and have different encodings for all kinds of > dumping/pretty_printing/marking. Leo calls the function traverse and > controlls it by callbacks.
It's more than just a naming issue (or if it is, then traverse is the wrong name). The traversal must be done externally, since we can't be recursive.
Mark puts a PMC on the list of PMCs to be frozen. Freeze dumps the PMC being frozen (and *only* that PMC) to the stream. The freeze routine for a PMC must mark (generally indirectly by calling the "add this pmc to the stream" api function) any PMCs that it needs to be in the stream.
The external function that traverses this list of PMCs to be dumped is responsible for making sure there are no duplicates--the easiest way is to do what the DOD sweep does and note that a PMC has already been put on the list and thus not mark it.
Mark and freeze are separate, though related by the subsystems that use them.
> This is a question of what is allowed at destruction time. You don't > want to allow memory allocation, but allow freezing. That gets hard, > because you need at least allocate the STRING where you want to put > your frozen stream.
It's more a question of what we we require the engine to do, vs what user code is allowed to do. A user program is allowed to write code that can fail at destroy time, however the infrastructure we provide (including, in this case, freezing--while I don't like it there's no choice) can't fail that way. It's the reason the DOD and GC systems don't allocate memory (or didn't--they shouldn't) when they run. The engine's not allowed to have failure modes in critical sections.
Basically the engine may fail because of user code, but user code can't fail because of the engine. It makes some things annoyingly restrictive, but some problems are inherently annoyingly restrictive.
> > It puts a number of unpleasant constraints on the core freeze routines. > > User code can violate them and take the consequences, but we can't.
> We can call (hopefully) arbitary user code in destruction routines. So > this argument does not count
> >> A general traverse() can be depth first of breadth first, mark() isn't > >> required do have any specific ordering as long as it sets live bits > >> everywhere.
> > I'm pretty sure that with a singly linked list we can get a generally > > properly-ordered flattened tree without having to do an insane number of > > passes across the dead object store. I may be incorrect in this, but I > > don't think so, and for our purposes the live bit can be safely ignored if > > we end up setting it, though potentially with another pass over the dead > > store, which may end up prohibitively expensive. We'll see.
> I'm pretty sure that a singly linked list is not enough. I had done > some experiments with this. One pass my be enough, but you need to > keep track of the tree-traversal and of the partial ordered > list. These to things don't play well together. Maybe this can be cut > down to two lists, or one list and one bit per pmc.
There may be a little more infrastructure--I've not dug out the algorithms books and gone hunting. The common algorithms tend to cheat by just dodging the whole problem. :)
> Destruction ordering just enforces that small PMCs can't have > destructors. If they have destructors they must be big, big enough to > construct the ordered list of objects without allocating any memory.
Can't have destructors *or* refer to PMCs that may either have a destructor or (indirectly) refer to a PMC that has a destructor.
If we have 2 PMCs with destructors they may be connected by a chain of 100 PMCs that don't, but we still need to walk that chain.
> If you think about it: The call to the destructors is done after > free_unused_pobjects completed. The memory of the objects without > destructors is already freed.
Then we reorder. This can't happen, and it didn't used to happen--if that's how it works now then there's a bug in the DOD system. *All* destructors *must* be called before any headers are collected.
> >> While freeze() and friends have to pull in each PMC into the cache, just > >> setting the live bit on a PMC hasn't. Further: Lukes proposal for > >> speeding up timely destruction puts objects either in front or at the > >> end of the next_for_GC chain. This IMHO implies that mark() is unusable > >> as your general and solely iterator.
> > We may end up playing macro games with the code. I can live with multiple > > instantiations of the code, but I want only a single chunk of source to be > > maintained.
> You know we already have two versions of pobject_lives lying around.
>>> Note that I do *not* want to have multiple object traversal systems >>> in >>> parrot! We have one for DOD, and proposals have ranged upwards from >>> there. >>> No. That is *not* happening--the chance for error is significant, the >>> side-effects of the error annoying and tough to track down for >>> complex >>> cases (akin to the trouble with tracking down GC issues), and just >>> not >>> necessary. (Perhaps desirable for speed/space reasons, but desirable >>> isn't necessary)
>> DOD's mark() routine has different requirements then a general >> traverse() for freeze(), chill(), clone(), and destruction ordering. >> Using just mark() will have these side effects that you want to avoid.
> The only thing that mark does that the general traversal doesn't, in > the > abstract, is flip the object's live flag. Everything else is an > optimization of code which we can, if we need, discard.
I don't believe that is quite true. There are a couple of important differences between traversal-for-GC and traversal-for-serialization, which will be a challenge to reconcile in the one-true-traversal:
1) Serialization traversals need to "take note" of logical int and float slots (e.g., as used in perlint.pmc and perlnum.pmc) so that they can be serialized, but for GC you only need to worry about GC-able objects. It's difficult to come up with a reasonable callback which can take either int, float, or PObj arguments.
2) It's reasonable for an object to have a pointer to some sort of cache object, which is not logically part of the object, and shouldn't be serialized along with it. This needs to be traversed for GC purposes, but needs to not be traversed for serialization. (Situations such as this--physical but not logical membership--are the origin of the "mutable" keyword in C++.)
3) Traversal for GC needs to do loop detection, but can just stop going down a particular branch of the object graph once it encounters an object it's seen before. Serialization traversals would need to have a way, upon encountering an object seen before, to include in the serialization stream an indication that the current object has already been serialized, and enough information to enable deserialization code to go find it and recreate the loop. The only options I see here are either for serialization to involve the allocation of unbounded additional memory, or to expand the PObj structure to include a slot for a UUID which can be used as a back-reference in a stream, or to have serialization break loops (so that deserialized structures never have loops).
I'm not 100% convinced that a single approach can't handle both applications, but it's looking as though their requirements are different enough that it may not work well.
Two other questions/concerns/comments/issues:
1) I assume that ultimately a user-space iterator would end up calling the traversal code, right? If so, you can't reasonably mandate that only one traversal be in progress at one time. That would be the canonical way to compare two ordered collections--get an iterator for each, and compare element-by-element.
2) I don't see it as a huge problem that serialization code could end up creating additional objects if called from a destroy() method. (Though yes, it would be a problem for GC infrastructure code to.) I say that for two reasons: (a) destroy() methods can really do anything they want, and if that task involves allocating additional memory, that just makes it a risk to perform that task in a destroy() method--it may fail due to out-of-memory conditions. I think that Java design experts tend to argue against doing things like serialization in finalization methods. It sounds elegant, but it's problematic. One reason for this is that you tend to want to serialize structures as a whole, not piece-by-piece as they are garbage-collected. The second reason it is not always a problem in practice is that (b) a DOD run may be triggered by an out-of-headers conditions, but that doesn't mean that an additional chunk of memory for headers can't be allocated. If it can't be, then this is no more problematic that it would be in other user code--think of the case where I have some big tree of objects I want to make some sort of copy of, with the intention of then letting go of the original when I'm done. I'll be freeing up headers at the end of that process, but if I run out of memory part-way-through, then I'm just stuck.
3) I assume that not every object is assumed to be serializable? For instance, an object representing a filehandle can't really be serialized in a useful way. So I'm not sure of what sort of "fidelity" is required of a generic serialization method--that is, how similar a deserialized structure is guaranteed to be to the original.
Dan Sugalski <d...@sidhe.org> wrote: > On Tue, 21 Oct 2003, Leopold Toetsch wrote: >> You can append items to the constant table. You can't declare existing >> items as constant, because you can't change the underlying object pool, >> where the object was allocated. This would change the objects address. > The object's address should be irrelevant for the constant table. PMCs are > referenced in the opstream by table offset.
Only in the opstream. But not when such PMCs are used then. I.e. when constant Sub PMC is refered to in the global stash.
>> Ah, that's the difference. How shall the system pretty-print dynamically >> loaded PMCs then, when only a bytecode-stream is available? IMHO only a >> vtable in the class can perform that job. > If the dynamically loaded PMC class doesn't have a backing Parrot class, > you can't, and get the default, relatively primitive dump.
I was thinking of plain PMCs, that where loaded to provide some special functionality. Parrot doesn't know anything about these, so will be unable to pretty print the opstream. Loaded classes OTOH as based on ParrotClass and should be printable.
>> Constructing the frozen stream definitely needs memory. I don't see the >> difference, to memory consumed by a seen hash. Can you please elaborate >> a bit more on this. > Constructing the frozen stream will need some memory, yes. At the moment > all it needs is a chunk of random memory and that's it, so we may well > fail because we're out of memory.
So, with the same argument I can say, (destructor level) freezing will need *system* memory for the stream plus the hash. So we may well fail. I don't see any difference. The hash hasn't to be a "fat" PerlHash.
If we don't want a hash one bit inside the objects arena flags should be able to serve the same functionality - this PMC already got serialized.
Anyway - how does/would freezing at destructor level look like from HLL POV? Shortly before, there ought to be a full DOD run (or all possible garbage would be frozen). At this time, the amount of still active and then to be serialized PMCs is known (an upper boundary is always known). So it should be possible to work around such constraints.
> ... We may, however, have general pool > memory handy. We can't guarantee that we have *any* headers, however, > since we can legitimately be called from within the destruct phase of a > DOD run, which may have been triggered by an out-of-headers condition.
I really doubt, that thawing a program (or some data of it), that died in middle of some non trivial operation, because it ran out of headers, will be of any use.
>> A scalar can't contain or reference other PMCs, so it can't be a >> potential source of freeze loops. If I now spit out (PMC: Int, ID=xy, >> value=5) twice or (PMC: ID=other) doesn't really matter. thaw() can take >> care of duplicates, if needed. Other PMCs have the next_for_GC pointer. > Thaw can only properly take care of duplicates if the duplicates are > correctly indicated in the serialization stream. Identical end-values are > *not* sufficient to note multiple references to the same PMC.
Sorry I thought of PMC IDs, which are the address of the frozen PMCs.
>> Albeit I'm not convinced, that we can't have a seen hash. > It takes an insane amount of memory and requires header allocation.
A PerlHash takes more memory, and yes. But we just need a hash of PMC addresses, or a bit inside the objects arena.
We have several different traverse-like functions:
* mark (DOD): called frequently, should get all possible speed * freeze (destruction): no speed issues, can't take Parrot resources * freeze (user): rarely used, can take resources * destruction ordering: only active objects to be visited * clone: can take resources thaw(freeze()), or separate vtable * dump/pretty-print: no vtable? * thaw: special class method, is different anyway
The first 2 critical items have diametral usage patterns. This does not really imply, that they should be implemented based on the same scheme.
> ... We > can't allocate headers, and the memory requirements are extreme. Been > there, done that, it was a bad idea. Consider this arbitrarily and > unconditionally ruled out if you're unwilling to believe the stats that > were previously posted about this.
You are speaking of Storable.pm? I'm not aware of any stats regarding that. But I'm not thinking of using a full fledged hash for such a special case.
Melvin Smith <melv...@us.ibm.com> wrote: >> Albeit I'm not convinced, that we can't have a seen hash. > A seen hash most likely would: > 1) Kill GC performance especially in pathological cases. The GC > should be quiet and invisible. > 2) Cause memory usage to double upon a mark run.
GC isn't involved. A mark() run sets the live bit in the PMCs arena. No hash is needed for both cases. I have stated several times, that I don't like to mix mark() and the other traverse functions.
>> At 08:21 -0400 10/21/03, Dan Sugalski wrote: >>>> I find the notion of an "XML header" a bit confusing, given Dan's >>>> statement to the effect that it was a throw to XML folks.
>>>> I think anything "XML folks" will be interested in will entail >>>> *wrapping* stuff, not *prefixing* it.
>>> Nah, I expect what they'll want is for the entire data stream of >>> serialized objects to be in XML format. Which is fine--they can have >>> that. >>> (It's why I mentioned the serialization routines can be overridden)
>>> For an XML stream the header might be <xml parrot format='xml' >>> version=1.0> with the rest of the stream in XML. A YAML stream would >>> start >>> <xml parrot format='yaml' version=1.0> with the rest in YAML, and teh >>> binary format as <xml parrot format='binary' version=1.0>. Or >>> something >>> like that, modulo actual correct XML.
>> If you want that to be looking like valid XML, it would have to be >> different:
>> error: Specification mandate value for attribute parrot >> <xml parrot/> >> ^ >> Better in my opinion would be something like:
>> <parrot format="xml" version="1.0"/>data yadda yadda yadda
> I'm not an XML guy, and I'm making all this up as I go along. If that's > better, fine with me. :)
Yeah, you can't put extra things in the "<?xml..." at the start of a document, and you can't create a tag of your own whose name starts with "XML" or "xml".
>> So are we talking about a header or a wrapper? If it is really a >> header, it's not XML and then it's prettyy useless from an XML point >> of view.
> We're talking about the first thing in a file (or stream, or > whatever). I > was under the impression that XML files should be entirely composed of > valid XML, hence the need for the stream type marker being valid XML.
No, XML _documents_ must be XML, but that doesn't mean that document == file. (For another example where this comes up, consider an XML document transmitted over HTTP. There are headers and other textual things in the stream along with the xml, and it's the HTTP protocol which determines where the document begins and ends, not xml's.) You can certainly have more than one XML document in a single file, but something needs to decide where an xml document begins and ends, and hand only that data to the xml parser.
> YAML doesn't care as much, so far as I understand, and for our own > internal > binary format we cna do whatever we want. If that's not true, then we > can > go for a more compact header.
Yes, if you want the whole serialized steam to count as a well-formed xml document, then you can't but arbitrary binary data in the middle. See my previous post for why.
> Once again, modulo my limited and inevitably incorrect YAML knowledge. > So > if the header says it's XML the whole thing is valid XML, while if it > doesn't the rest of the stream doesn't have to be. (Just enough of the > header so that an XML processing program can examine the stream and > decide > that the valid XML chunk at the beginning says that the rest of the > stream's not XML)
Most XML parsers aren't expecting to handle this. That is, there's no such thing as a valid half-of-an-xml document, from the perspective of the xml spec, and in many cases you'd have trouble getting a parser to stop before hitting something problematic and blowing up. In other words, you can't rely on an xml parser to process something which starts out looking like xml, but isn't.
> Basically we want some nice, fixed (mostly) thing at the head of the > stream that doesn't vary regardless of the way the stream is encoded, > and > XML seemed to be the most restrictive of the forms I know people will > clamor for. (I know, it means the stream can't be valid Lisp-style > sexprs, > but XML's more widespread :)
Yeah, if you're just needing to tag the stream with a label to indicate the type plus a version number, then xml's on the one hand overkill and on the other hand not necessarily a big help to xml proponents.
> Yeah, if you're just needing to tag the stream with a label to indicate > the type plus a version number, then xml's on the one hand overkill and > on the other hand not necessarily a big help to xml proponents.
So, in a nutshell, throwing an XML format type tag at the beginning buys us nothing regardless of whether it's an XML stream or not?
In that case, nuts to that. It's already terribly obvious I'm going to mess it up if I try, so we'll just skip it and move on to the next headache. :)
(FWIW, with respect to binary data in the output stream--if an encoded format doesn't allow binary data then the encoder is responsible for changing it to a non-binary format. So for XML and YAML (and any other text encoding format, I expect) that'll likely be a base64 encoding or something)
On Tue, 21 Oct 2003, Jeff Clites wrote: > I don't believe that is quite true. There are a couple of important > differences between traversal-for-GC and traversal-for-serialization, > which will be a challenge to reconcile in the one-true-traversal:
> 1) Serialization traversals need to "take note" of logical int and > float slots (e.g., as used in perlint.pmc and perlnum.pmc) so that they > can be serialized, but for GC you only need to worry about GC-able > objects. It's difficult to come up with a reasonable callback which can > take either int, float, or PObj arguments.
That's not an issue for us. A PMC is responsible for serializing itself, so if its got a string, float, or int component then it must take respnsibility for dumping those components to the serialization stream. Basically PMCs *must* dump themselves out completely, but the engine provides support to defer dumping of PMCs so that we don't get into recursive dumping and blow stack, as well as to make sure that we properly maintain multiple references to the same PMC.
> 2) It's reasonable for an object to have a pointer to some sort of > cache object, which is not logically part of the object, and shouldn't > be serialized along with it. This needs to be traversed for GC > purposes, but needs to not be traversed for serialization. (Situations > such as this--physical but not logical membership--are the origin of > the "mutable" keyword in C++.)
That's what custom mark routines are for, though it does argue that we should have a separate mark for freezing.
> 3) Traversal for GC needs to do loop detection, but can just stop going > down a particular branch of the object graph once it encounters an > object it's seen before. Serialization traversals would need to have a > way, upon encountering an object seen before, to include in the > serialization stream an indication that the current object has already > been serialized, and enough information to enable deserialization code > to go find it and recreate the loop. The only options I see here are > either for serialization to involve the allocation of unbounded > additional memory, or to expand the PObj structure to include a slot > for a UUID which can be used as a back-reference in a stream, or to > have serialization break loops (so that deserialized structures never > have loops).
The loop breaking needs for freezing are the same as for DOD sweeps, though with freezing we're at an advantage as we know where the tree starts.
In all cases (I made sure this was in the example, but it might not have been clear) we only include a marker for child PMCs in the parent PMC's serialized data, and serialize the child PMCs later on in the stream. So if PMC1 has a pointer to PMC2, the stream has PMC1 dumped to it but in the place of PMC2's data is just a marker saying "refer to PMC2 here" and then after the end of PMC1's data in the stream we dump out PMC2's data.
> 1) I assume that ultimately a user-space iterator would end up calling > the traversal code, right? If so, you can't reasonably mandate that > only one traversal be in progress at one time. That would be the > canonical way to compare two ordered collections--get an iterator for > each, and compare element-by-element.
While it could, I think it's infeasable to use the serialization iterator for normal user-space iteration, if only because the limits that have to be on the serialization iterator for use in restricted circumstances are a bit onerous for general use.
I'm not entirely sure that parrot's going to provide this form of iteration as it stands anyway--it's not necessary for the core langauge support and while it'd be really useful there's a limit to the number of Big Problems I'm up to solving. (Having said that there may, probably will, be enough introspective capabilites to do this without engine support)
> 2) I don't see it as a huge problem that serialization code could end > up creating additional objects if called from a destroy() method.
User code may, parrot may not. The reasons are twofold--while parrot will let you shoot yourself in the foot, it provides the gun, not the foot. It should also be possible for carefully written destroy methods to serialize but not eat any headers or memory. (I can see this being the case in some embedded applications or systems) If we make it so freezing is not a guaranteed possibility at destroy time then this can't happen and it lessens the utility of the system some.
We can, if we choose, loosen the restriction later if sufficient reason is presented. Can't really tighten it, though, so for now...
> 3) I assume that not every object is assumed to be serializable? For > instance, an object representing a filehandle can't really be > serialized in a useful way. So I'm not sure of what sort of "fidelity" > is required of a generic serialization method--that is, how similar a > deserialized structure is guaranteed to be to the original.
No fidelity is required at the moment, as we've not put any requirements at all on what goes in the output stream. It could, I suppose, consist of a near-infinite stream of fnords or something. That's the next bridge to burn, but I don't think I'm done being cooked over the current one :)