Marshal won't dump a Proc

HarryO

October 3, 2001, 09:55:31

I would really like to be able to dump a block of code via Marshal#dump,
but I get an error saying "can't dump Proc".

I have a terrible feeling this means it's not possible to do so, but
thought I'd ask, just in case it's actually possible and I'm just doing
something wrong.

What, exactly, determines whether something can be dumped/loaded via the
Marshal module? Ie, what are its limitations?

Michael Neumann

October 3, 2001, 11:03:12

You cannot dump

Proc, Thread, IO (File, Socket ...), Continuation, Method
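
A quick way to check this list yourself is to try dumping one object of each kind and catch the error. This is only a minimal sketch: the helper name dumpable? is made up for illustration, and Continuation is left out to keep the example free of extra libraries.

```ruby
# Returns true if Marshal can serialize the object,
# false if Marshal.dump raises TypeError.
def dumpable?(object)
  Marshal.dump(object)
  true
rescue TypeError
  false
end

samples = {
  "String" => "foo",
  "Array"  => [1, 2, 3],
  "Proc"   => proc { 1 },
  "Thread" => Thread.current,
  "IO"     => $stdout,
  "Method" => 1.method(:+),
}

samples.each do |name, object|
  verdict = dumpable?(object) ? "can be dumped" : "can't be dumped"
  puts "#{name}: #{verdict}"
end
```

Plain data such as strings and arrays should be reported as dumpable, while the Proc, Thread, IO and Method entries should not.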

Regards,

Michael

--
Michael Neumann
merlin.zwo InfoDesign GmbH
http://www.merlin-zwo.de

HarryO

October 3, 2001, 11:30:59

In article <200110031...@michael.neumann.all>, "Michael Neumann"
<neu...@s-direktnet.de> wrote:

> You cannot dump
>
> Proc, Thread, IO (File, Socket ...), Continuation, Method

I can understand why Thread and the IOs can't be dumped (because there
are underlying operating system structures associated with them), but
can't see why Proc, Continuation and Method couldn't be dumped/loaded ...
although I could see that it might not be possible in the special case
where their bindings included a Thread or an IO of some form.

Is it just that it's considered too hard to get right (and I'm not saying
it would be easy!) or is there some other reason?

Will Conant

October 3, 2001, 11:27:27

For the sake of clarity, I thought I'd give an explanation.

You can't dump any of those things because they are either handles to system
level resources or compiled (parsed) code.

In the first case, the reasons are obvious. How do you serialize a socket
between you and some other endpoint? That channel of communication could not
orthogonally be deserialized and brought back to the state it was in when it
was serialized in the first place.

In the second case, the reason, I assume, has mostly to do with lexical
scoping and closures. A Proc object doesn't just contain a chunk of parsed
ruby code, it is also dependent on the scope in which it was created. If you
serialized a Proc, you'd have to serialize the entire state of the running
application in order to maintain all of the scopes involved.

For example:

a = "foo"
myproc = Proc.new {
  puts a
}

Serializing (marshaling) myproc would mean saving the state of its
enclosing scope as well. It's not that such a thing is impossible (there is
a Java VM implementation that lets you save the state of a program and
resume it right where you left off); it's just not all that practical.
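
To make the point about scope concrete, here is a small sketch (the method name make_counter is invented for illustration). Two Procs created in the same method keep sharing that method's local variable after the method has returned, so neither Proc is just "a chunk of code" that could be written out on its own.

```ruby
# Both Procs close over the same local variable, count.
def make_counter
  count = 0
  increment = Proc.new { count += 1 }
  read      = Proc.new { count }
  [increment, read]
end

increment, read = make_counter
increment.call
increment.call
puts read.call   # prints 2: read sees the changes made through increment
```

A faithful dump of increment would have to capture count as well, and do so in a way that keeps it shared with read.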

If you really want to store some Ruby code and pass it around, save it as
text and use eval.
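
A minimal sketch of that approach, assuming the code really is self-contained (the constant and method names are made up for the example). Keep in mind that eval runs whatever the string contains, so this is only reasonable for data you trust.

```ruby
# The code is kept as a plain String, which Marshal dumps like any other String.
DOUBLE_SOURCE = "|n| n * 2"

# Turns a stored block body back into a callable Proc.
def rebuild_proc(source)
  eval("proc { #{source} }")
end

dumped   = Marshal.dump(DOUBLE_SOURCE)
restored = rebuild_proc(Marshal.load(dumped))
puts restored.call(21)   # prints 42
```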

-Will Conant

-----Original Message-----
From: Michael Neumann [mailto:neu...@s-direktnet.de]
Sent: Wednesday, October 03, 2001 9:03 AM
To: ruby...@ruby-lang.org
Subject: [ruby-talk:22011] Re: Marshal won't dump a Proc

HarryO wrote:
> I would really like to be able to dump a block of code via Marshal#dump,
> but I get an error saying "can't dump Proc".
>
> I have a terrible feeling this means it's not possible to do so, but
> thought I'd ask, just in case it's actually possible and I'm just doing
> something wrong.
>
> What, exactly, determines whether something can be dumped/loaded via the
> Marshal module? Ie, what are its limitations?

You cannot dump

Proc, Thread, IO (File, Socket ...), Continuation, Method

Yukihiro Matsumoto

October 3, 2001, 12:38:31

Hi,

In message "[ruby-talk:22015] Re: Marshal won't dump a Proc"
on 01/10/04, "HarryO" <har...@zipworld.com.au> writes:

|I can understand why Thread and the IOs can't be dumped (because there
|are underlying operating system structures associated with them), but
|can't see why Proc, Continuation and Method couldn't be dumped/loaded ...
|although I could see that it might not be possible in the special case
|where their bindings included a Thread or an IO of some form.
|
|Is it just that it's considered too hard to get right (and I'm not saying
|it would be easy!) or is there some other reason?

Procs, Bindings and Continuations contain references to C stack
information, which is not portable. Methods have references to C
functions. None of the above information is portable.

matz.

HarryO

October 3, 2001, 16:36:33

In article <1002126975.600438...@ev.netlab.jp>, "Yukihiro
Matsumoto" <ma...@ruby-lang.org> wrote:

> Procs, Bindings and Continuations contain references to C stack
> information which is not portable. Methods have references to C
> function. All of above information is not portable.

Thanks to matz and Will for those explanations. The Proc I wanted to
dump/restore is a block that's provided when an instance of my class is
created. It specifies how to get the nth element of a sequence and hence
can be pretty much anything. Hence, asking the user to write it as a
string that's going to be eval-ed would make using my class a little
messy.

I've provided a mechanism whereby the user can ask for the current set of
values to be saved between sessions. This is useful for sequences where
the calculation time is significant, eg generating prime numbers.
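
Saving the values themselves is the easy part, since an Array of numbers marshals without trouble. A rough sketch of what such a save/restore pair could look like (save_values and load_values are hypothetical names, not the ones from my class):

```ruby
require "tmpdir"

# Writes the already-computed values to disk in Marshal's binary format.
def save_values(path, values)
  File.binwrite(path, Marshal.dump(values))
end

# Reads the values back, or starts empty if nothing has been saved yet.
def load_values(path)
  return [] unless File.exist?(path)
  Marshal.load(File.binread(path))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "primes.dump")
  save_values(path, [2, 3, 5, 7, 11])
  p load_values(path)
end
```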

I then thought it would be nice to be able to store the Proc that defined
how to get the next element, so then you could load one of these
sequences without having to have the original code. Ie, the dump file
would carry all the information required to continue from where the last
program that used that sequence left off.

I can see now that storing a Proc would be much more complex than I'd
originally thought. While the block of code I want to save is generally
going to be self-contained, I guess there's no way for the serialisation
of a Proc object to know that.

In any case, I guess that even the simplest piece of code, e.g. "{ |i| i }",
uses Object, and serialising that could presumably be quite complex,
especially since someone could have used the dynamic features of ruby to
modify Object by adding or changing methods.

Nothing's ever as simple as it seems, is it :-) ??

What I might be able to do is provide two means of constructing the
object, one by attaching a block and the other by passing a string that's
eval-ed into a Proc for run-time use and dumped/restored for persistence.

Thanks again for the lucid explanations.

Paul Brannan

October 3, 2001, 17:35:19

How about the following? The only major disadvantage is that it will be a
little bit slower than using a real Proc:

require 'delegate'

class DumpableProc < DelegateClass(Proc)
  def initialize(str)
    eval "@proc = proc { #{str} }"
    @str = str
    super(@proc)
  end

  def _dump(limit)
    @str
  end

  def self._load(str)
    self.new(str)
  end
end

dp = DumpableProc.new("puts 'foo!'")
dp.call()

dump_str = Marshal.dump(dp)
puts dump_str

dp2 = Marshal.load(dump_str)
dp2.call()

Phil Tomson

October 3, 2001, 18:35:13

In article <20011004.063633....@zipworld.com.au>,

HarryO <har...@zipworld.com.au> wrote:
>In article <1002126975.600438...@ev.netlab.jp>, "Yukihiro
>Matsumoto" <ma...@ruby-lang.org> wrote:
>
>
>> Procs, Bindings and Continuations contain references to C stack
>> information which is not portable. Methods have references to C
>> function. All of above information is not portable.
>

This seems reasonable, but it also means that I can't use Proc's with
dRuby since dRuby uses Marshalling. :-( I was thinking of sending Proc's
over to objects running on different machines (using dRuby).

>
>What I might be able to do is provide two means of constructing the
>object, one by attaching a block and the other by passing a string that's
>eval-ed into a Proc for run-time use and dumped/restored for persistence.
>

I'm not sure I totally understand what you are doing, but it seems like
you're having the user supply a function to your class (a Proc) that is
used to supply a sequence of numbers (am I right?). What if you had a
module called UserSupplied (for example) that got included into your
class? The user would have to define a function in the UserSupplied
module.

Something like:
#file: UserSupplied.rb
module UserSupplied

  def myUserFunction
    puts "In myUserFunction"
  end
  #user can edit this file to add or change functions in this
  #namespace.
end

#file: YourClass.rb
class YourClass
  include UserSupplied

  def initialize
    #this would ensure that if the user changes
    #UserSupplied.rb after the program began to
    #run that the new or changed functions in
    #the UserSupplied module would get mixed in.
    #(at least I think it should work, I haven't
    #tried it :)
    load "UserSupplied.rb"
  end
end

Would this work for you? I guess the drawback is that your users would
have to know Ruby, but even with your other method where they supply a
string with Ruby code they would have to know Ruby, so I guess it doesn't
matter.

Phil

Mathieu Bouchard

October 3, 2001, 19:05:45

On Thu, 4 Oct 2001, Paul Brannan wrote:
> How about the following? The only major disadvantage is that it will be a
> little bit slower than using a real Proc:

I wrote this program with Guy Hurst on April 6th. You indicate that a proc
should keep its source by putting a % sign just before the
opening brace.

class Proc
  attr_accessor :source
  class << self
    def _load(aString)
      foo=eval(aString)
      foo.source=aString
      foo
    end
  end
  def _dump(aDepth)
    source or raise "sourceless Proc cannot be dumped"
  end
end

module Kernel
  alias :old_proc :proc
  def Kernel.proc(string=nil,&b)
    if string and b then raise "bad argument, dude" end
    if string then
      Proc._load("proc{#{string}}")
    else
      old_proc(&b)
    end
  end
  def proc(string=nil,&b); Kernel.proc(string,&b); end
end

________________________________________________________________
Mathieu Bouchard http://hostname.2y.net/~matju

Florian G. Pflug

October 3, 2001, 19:59:41

On Thu, Oct 04, 2001 at 05:46:08AM +0900, HarryO wrote:
> Thanks to matz and Will for those explanations. The Proc I wanted to
> dump/restore is a block that's provided when an instance of my class is
> created. It specifies how to get the nth element of a sequence and hence
> can be pretty much anything. Hence, asking the user to write it as a
> string that's going to be eval-ed would make using my class a little
> messy.

Did you know that " %{ .... }" is a string literal in ruby?

By using this, from the user's point of view the only difference from passing a
"real" block, instead of a string, would be that he has to write a "%"
before the curly brace. (and this it would act like a closure, e.g. giving
you access to the surrounding variables).
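
A small sketch of the string-literal form, with an assumed example body. Balanced braces inside the literal are part of the string, so a nested block is fine. Since %{ } interpolates like a double-quoted string, the sketch uses %q{ }, which leaves the text untouched.

```ruby
# Balanced braces inside %q{ ... } belong to the string itself.
SOURCE = %q{ |i| [i, i + 1].map { |n| n * 2 } }

# The string only becomes code once it is wrapped in "proc { ... }" and eval-ed.
doubler = eval("proc { #{SOURCE} }")
p doubler.call(3)   # prints [6, 8]
```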

greetings, Florian Pflug

HarryO

October 4, 2001, 06:48:34

In article <200110040...@perception.phlo.org>, "Florian G. Pflug"
<f...@phlo.org> wrote:

> Did you know that " %{ .... }" is a string literal in ruby?

Yes, I did know that, but I'm fairly sure that when I tried it, I got
parse errors, presumably due to the appearance of "}" in the middle of
some of the more complex blocks I wanted to use.

However, I did find that using a "here" document worked just fine.
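
For the record, this is roughly what the here-document version looks like. It is a sketch with a made-up block body; the quoted delimiter keeps Ruby from interpolating or unescaping anything in the text, and unbalanced braces are no problem because the string ends at the delimiter line.

```ruby
# A quoted heredoc delimiter ('RUBY') turns off interpolation and escapes.
PRIME_SOURCE = <<~'RUBY'
  |s, i|
  candidate = s[i - 1] + 1
  candidate += 1 while s.any? { |prime| (candidate % prime).zero? }
  candidate
RUBY

next_prime = eval("proc { #{PRIME_SOURCE} }")
p next_prime.call([2, 3, 5], 3)   # prints 7
```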

> By using this, from the users point of view the only different to
> passing a "real" block, instead of a string, would be that he has to
> write a "%" before the curly brace. (and this it would act like a
> closure, e.g. giving you access to the surrounding variables).

I'm going to have another try using "%{ ... }", just in case the parse
error I got was due to some other problem I had at the time, because I agree,
this would be a nicer syntax.

Thanks for the comments.

Michael Neumann

October 4, 2001, 06:51:09

Phil Tomson wrote:
> >> Procs, Bindings and Continuations contain references to C stack
> >> information which is not portable. Methods have references to C
> >> function. All of above information is not portable.
> >
>
> This seems reasonable, but it also means that I can't use Proc's with
> dRuby since dRuby uses Marshalling. :-( I was thinking of sending Proc's
> over to objects running on different machines (using dRuby).

It is possible to use Procs and code blocks with remote objects using dRuby.
All objects that cannot be marshalled are passed by reference. To pass your own
"marshallable" objects as references, just include the module DRb::DRbUndumped.

To demonstrate the usage of iterators, the following example iterates over a remote file.

# ----------- server ------------
require "drb/drb"
primary = DRb::DRbServer.new("druby://0.0.0.0:5555", File.new(ARGV.shift, "r"))
primary.thread.join

# ----------- client ------------
require "drb/drb"
primary = DRb::DRbServer.new
ro = DRb::DRbObject.new(nil, ARGV.shift)
ro.each {|line|
  puts line
}

HarryO

October 4, 2001, 07:03:41

In article <BoMu7.1295$T%4.13...@sjcpnn01.usenetserver.com>, "Phil
Tomson" <pt...@shell1.aracnet.com> wrote:

> I'm not sure I totally understand what you are doing, but it seems like
> you're having the user supply a function to your class (a Proc) that is
> used to supply a sequence of numbers (am I right?).

Yes. That's basically right. The concept came from some perl code that
was discussed here a while back. The idea is to have what looks like a
stream of numbers (although, I guess it could potentially be anything,
really) that looks like an array, in that when you ask for the next item
out of the array it looks like it's "just there". What actually happens
is that when you ask for an item, the block that was passed when
constructing this type of object is called however many times are
necessary to calculate the array entries up to the one the user
requested.

This, in itself, isn't overly useful, other than providing an abstraction
for anyone who wants to use that sequence. However, the other neat idea
was to merge a number of these types of sequences together, thus
obtaining things in order. The classic example that was what opened the
discussion last time, was to generate all of the numbers that are
2^i * 3^j * 5^k, in order.
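
For readers who haven't seen that problem, here is one common way to produce those numbers in order. To be clear, this is not the stream-merging code discussed above, just a self-contained sketch of the usual three-index merge, with a made-up method name.

```ruby
# Returns the first +count+ numbers of the form 2^i * 3^j * 5^k in ascending order.
def hamming_numbers(count)
  values = [1]
  i2 = i3 = i5 = 0
  while values.size < count
    # Each candidate is an earlier value multiplied by 2, 3 or 5.
    c2 = values[i2] * 2
    c3 = values[i3] * 3
    c5 = values[i5] * 5
    smallest = [c2, c3, c5].min
    values << smallest
    # Advance every index that produced the smallest value, which avoids duplicates.
    i2 += 1 if smallest == c2
    i3 += 1 if smallest == c3
    i5 += 1 if smallest == c5
  end
  values
end

p hamming_numbers(10)   # prints [1, 2, 3, 4, 5, 6, 8, 9, 10, 12]
```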

The pair of classes I've created provide this functionality in ruby (not
quite ready for public consumption yet, though).

>What if you had a
> module called UserSupplied (for example) that got included into your
> class? The user would have to define a function in the UserSupplied
> module.

This is a nice idea, except that I think it separates things more than
I'd like. I basically want the user to be able to say something like
(and this is currently how it works) ... note that this isn't a
particularly useful example, I just want to keep it simple ...

powersOf2 = Stream.new([1]) { |s, i| s[i - 1] * 2 }

The array parameter is whatever necessary initial values are required to
start the sequence off. For some cases, there won't be any, for some
there will be more than one value (see the example ahead).

Ie, the block is passed a reference to the Stream, plus the index of the
item that is required and its return value is that array item. It
can refer to any of the previous entries in order to calculate it. For
example,

fibonacci = Stream.new([1, 1]) { |s, i| s[i - 2] + s[i - 1] }

This is definitely not the most efficient way to generate such a
sequence, since there's a function call overhead for the generation of
each item. However, the facility to merge streams together provides for
neat approaches to solve some problems.
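
Since the classes themselves aren't posted yet, here is a stripped-down sketch of the idea so the examples above can be tried. It is a guess at a minimal implementation, not the real code: it only supports s[i] and caches each value once it has been calculated.

```ruby
# Hypothetical minimal Stream: an array-like object that fills itself in on demand.
class Stream
  def initialize(initial = [], &generator)
    @values = initial.dup      # the seed values that start the sequence off
    @generator = generator     # called as generator.call(stream, index)
  end

  # Returns element +index+, computing and caching any missing entries first.
  def [](index)
    @values << @generator.call(self, @values.size) while @values.size <= index
    @values[index]
  end
end

powers_of_2 = Stream.new([1]) { |s, i| s[i - 1] * 2 }
fibonacci   = Stream.new([1, 1]) { |s, i| s[i - 2] + s[i - 1] }

p powers_of_2[10]   # prints 1024
p fibonacci[9]      # prints 55
```

Because entries are generated strictly in order, the block can always rely on every earlier entry already being in the cache.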

I assume you can see now why I think it's nicer to keep the block that
does the calculations close to where the stream is defined. It just
makes it easier for someone reading the code, rather than having to go
off to look at another module.

I'll have more of a think about what you've suggested, though, since it
may be another orthogonal way of providing the specification of the
calculation and there's no reason not to allow people to have more than
one way to do things, even if that is another language's idiom :-).

> Would this work for you? I guess the drawback is that your users would
> have to know Ruby,

This is intended for the ruby literate, so that's not an issue for me.

HarryO

October 4, 2001, 07:09:37

In article <Pine.LNX.4.40.011003...@zaphod.atdesk.com>,
"Paul Brannan" <pbra...@atdesk.com> wrote:

> How about the following? The only major disadvantage is that it will be
> a little bit slower than using a real Proc:

As I've already said to Paul in a separate reply, I like this approach,
since it's a nice class in itself, and it makes for a neat, orthogonal
way of handling the issue I had.

I hacked his code slightly, to make it not inherit from DelegateClass, but
simply implement a call() method itself. I haven't heard back from Paul
yet as to whether there's a particular reason he went in that direction,
so if anyone else can explain what I'd be missing by having a call()
method directly in DumpableProc I'd appreciate it.
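
Roughly, the non-delegating variant looks like the sketch below (reconstructed for illustration, so treat the details as assumptions rather than the exact code). The obvious trade-off is that only call is available; other Proc methods such as arity would each need their own wrapper. As far as I know, later versions of delegate.rb also define their own marshal_dump, which Marshal prefers over _dump, so a plain class like this may behave more predictably on newer Rubies.

```ruby
class DumpableProc
  attr_reader :source

  def initialize(source)
    @source = source
    @proc = eval("proc { #{source} }")
  end

  # The only Proc behaviour this wrapper exposes.
  def call(*args)
    @proc.call(*args)
  end

  # Marshal.dump asks for a String representation of the object...
  def _dump(_level)
    @source
  end

  # ...and Marshal.load hands that String back to rebuild it.
  def self._load(source)
    new(source)
  end
end

square = DumpableProc.new("|n| n * n")
copy = Marshal.load(Marshal.dump(square))
puts copy.call(7)   # prints 49
```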

HarryO

October 4, 2001, 07:17:33

In article <Pine.LNX.4.21.0110031855001.27045-100000@relayer>, "Mathieu
Bouchard" <ma...@sympatico.ca> wrote:

> I wrote this program with Guy Hurst on April 6th. You say that a proc
> should keep its source by prepending a % sign just before the
> opening-brace.

If I understand what you're getting at, the extension you've made to the
kernel module allows me to do either of these ...

p1 = proc { puts "hello, world" }
p2 = proc %{ puts "hello, world" }

Is that right?

Assuming you've read my long-winded reply, explaining what I'm trying
to achieve, I could do something like ...

fibonacci = Stream.new([1, 1]) %{ |s, i| s[i - 2] + s[i - 1] }

If this is how it works, it would be wonderful if that facility could be
added to the core kernel module, since I'm sure other people will come up
with ways they could use dumpable procs.

HarryO

October 4, 2001, 07:26:16

In article <20011004.211732...@zipworld.com.au>, "HarryO"
<har...@zipworld.com.au> wrote:

Sorry, I made a typo there. What I meant to say was that, if I
understood correctly, with that kernel module extension, I'd be able to do ...

fibonacci = Stream.new([1, 1]) proc %{ |s, i| s[i - 2] + s[i - 1] }

Paul Brannan

October 4, 2001, 10:26:39

The reason I went with the DelegateClass was that it is safer; if a new
method is added to Proc, then (assuming the DumpableProc class is created
after the new method is added) the new method will also be available to
the DumpableProc. The method_missing hack would also work (and as
discussed before, works with methods added at run-time), but is not as
concise.
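
For comparison, a sketch of what the method_missing version might look like (the class name ForwardingProc is invented for this example). Anything the wrapper doesn't define itself is forwarded to the wrapped Proc at call time, which is why it also picks up methods added to Proc later.

```ruby
class ForwardingProc
  def initialize(source)
    @source = source
    @proc = eval("proc { #{source} }")
  end

  # Forward any public method the wrapped Proc understands.
  def method_missing(name, *args, &block)
    if @proc.respond_to?(name)
      @proc.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keeps respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    @proc.respond_to?(name, include_private) || super
  end
end

add_one = ForwardingProc.new("|x| x + 1")
puts add_one.call(2)   # prints 3
puts add_one.arity     # prints 1
```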

Another good reason for using the DelegateClass is that it works
when you pass the function a block. It is a bit slower as a result, but
the speed difference is not significant for most applications.

Deriving from Proc does not work, since the constructor is expecting a
block.

I particularly like matju's method, since it allows one to make a proc
dumpable with only a few changes. It is less safe, since it modifies Proc
and Kernel, and that also means a small speed hit, even when you don't
want to make your proc dumpable. But just being able to add a % to make
the proc dumpable is an interesting ability.

Paul

Mathieu Bouchard

October 6, 2001, 00:12:45

On Thu, 4 Oct 2001, HarryO wrote:

> In article <Pine.LNX.4.21.0110031855001.27045-100000@relayer>, "Mathieu
> Bouchard" <ma...@sympatico.ca> wrote:
> If I understand what you're getting at, the extension you've made to the
> kernel module allows me to do either of these ...
> p1 = proc { puts "hello, world" }
> p2 = proc %{ puts "hello, world" }
> Is that right?

yes.

> Assuming you've read my long-winded reply, explaining what I'm trying
> to achieve, I could do something like ...
> fibonacci = Stream.new([1, 1]) %{ |s, i| s[i - 2] + s[i - 1] }

On Thu, 4 Oct 2001, HarryO wrote:

> fibonacci = Stream.new([1, 1]) proc %{ |s, i| s[i - 2] + s[i - 1] }

Neither is correct. The trick only works with proc, but you may use this
identity:

foo(a,b,c) { d }

is functionally equivalent to:

foo a,b,c,&proc { d }
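
A runnable illustration of that identity, using a made-up method apply_twice in place of foo. The three calls differ only in how the block reaches the method.

```ruby
# Yields the value, then yields the result again.
def apply_twice(value)
  yield(yield(value))
end

add_one = proc { |n| n + 1 }

p apply_twice(3) { |n| n + 1 }          # literal block
p apply_twice(3, &add_one)              # existing Proc passed with &
p apply_twice(3, &proc { |n| n + 1 })   # Proc built inline and passed with &
```

All three calls print 5.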

Oh, and by the way, if you need to access outer local scopes, you can't
use the percent trick: it won't pick up the correct scope (not even the
correct self). You'd need the "binding.caller" feature, which is in an RCR
that I submitted some time ago; as a substitute using current means, you'd
have to write:

foo a,b,c,proc binding, %{ d }

in Ruby 1.6, or:

foo a,b,c,proc(binding, %{ d })

in Ruby 1.7 (which is slightly more cumbersome). Alternately you could get
away with:

foo a,b,c,proc(%{ d }){}

where you use the empty proc as a binding (iirc).
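
The scope problem can be shown without patching Kernel at all. In this sketch (proc_from_source and scope_demo are hypothetical names) the source string is eval-ed either in a binding supplied by the caller or, by default, in the top-level binding, where the caller's local variable does not exist.

```ruby
# Builds a Proc from source text, evaluated in the given binding.
def proc_from_source(source, scope = TOPLEVEL_BINDING)
  eval("proc { #{source} }", scope)
end

def scope_demo
  factor = 10
  with_scope    = proc_from_source("|n| n * factor", binding)
  without_scope = proc_from_source("|n| n * factor")

  unscoped_result =
    begin
      without_scope.call(2)
    rescue NameError
      :name_error   # factor is not visible from the top-level binding
    end

  [with_scope.call(2), unscoped_result]
end

p scope_demo   # prints [20, :name_error]
```

Passing binding explicitly is the price for getting closure-like behaviour out of a string.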