Here's some history (for the benefit of the list archives):
There are quite a few projects that have employed various techniques
for going beyond Java's standard reflection mechanism for instantiating
new objects.
The typical cases for doing this are:
- When a class doesn't have a public constructor.
- When a class doesn't have a default constructor.
- When code in the constructor needs to be ignored.
- When the object needs to be manipulated in a way that reflection
doesn't allow (e.g. setting final fields).
For most applications this is not necessary. However, there are a few
specialist cases where it is really useful. For example:
- Serialization
- Remoting
- Persistence layers
- Dynamic mocks
- AOP engines
Many of us have solved the problem in weird and wonderful ways, often
unaware that others had done the same. Objenesis is an attempt to
consolidate our efforts.
I'd like to leave this thread open (forever) for anyone to comment on
approaches they've tried and what they learned from it.
cheers
-Joe
XStream is a simple (to use) library for serializing objects to XML
(and other formats) and back again. http://xstream.codehaus.org/
To deserialize an object from a stream, XStream has to instantiate it
and set its fields.
Initially I lived with the constraint that in order to do this, your
objects needed a default constructor because XStream had no idea what
the arguments passed to the constructor meant. Class.newInstance() did
the business.
Then I started poking around inside the java.io.ObjectInputStream code to
figure out how Java reflection did it and noticed it sneakily used an
internal undocumented Sun class:
sun.reflect.ReflectionFactory
What this class (apparently) did was generate a new default constructor
for a class to complement the 'real' constructor, which could then be
called as usual. However, this only seemed to work on certain Java
runtimes: namely 1.4 and onwards, and only if they used the Sun classes
(i.e. not the IBM JDK).
Nevertheless, this was still useful, so I created a little
ObjectFactory abstraction in XStream, allowing the strategy for
construction to be plugged in, using the best mechanism available on
the JVM.
The code as it stood in 2003 can be seen here:
http://fisheye.codehaus.org/browse/xstream/trunk/xstream/src/java/com/thoughtworks/xstream/objecttree/reflection/SunReflectionObjectFactory.java?r=2
Although this was limited to a specific set of JVMs, this covered most
of my needs.
-Joe
First of all, congratulations on the initiative :)
Second, I'd just like to make a quick comment:
Joe Walnes wrote:
> - When the object needs to be manipulated in a way that reflection
> doesn't allow (e.g. setting final fields).
As far as I know, reflection does allow the setting of final fields,
except when they become compile time constants. For instance, the code
below:
---
package test.pack;

import java.lang.reflect.Field;

class FinalTest {
    final int x = 19;
    final int y;
    final StringBuffer str = new StringBuffer("You can't alter me!");

    public FinalTest() {
        y = 0;
    }

    @Override
    public String toString() {
        return "x="+x+",y="+y+", str="+str;
    }
}

public class AcessTest {
    public static void modify(FinalTest result) throws Exception {
        Field fx = result.getClass().getDeclaredField("x");
        fx.setAccessible(true);
        fx.set(result, 1);
        Field fy = result.getClass().getDeclaredField("y");
        fy.setAccessible(true);
        fy.set(result, 2);
        Field fstr = result.getClass().getDeclaredField("str");
        fstr.setAccessible(true);
        fstr.set(result, new StringBuffer("Yes I can!"));
    }

    public static void main(String[] args) throws Exception {
        FinalTest ft = new FinalTest();
        System.out.println("Before: "+ ft.toString());
        modify(ft);
        System.out.println("After: "+ ft.toString());
    }
}
----
Outputs:
Before: x=19,y=0, str=You can't alter me!
After: x=19,y=2, str=Yes I can!
Cheers,
Leonardo
About a year later, XStream had formed a good-sized user base.
Unfortunately this meant there were more and more people using it on
non-Sun 1.4 JVMs - hence more complaints.
It dawned on me that there was already a JVM neutral way of
instantiating objects and skipping the constructor... using standard
serialization.
That is: if you have an object and serialize it (using
java.io.ObjectOutputStream) to a sequence of bytes, you can then
instantiate it as many times as you want without the default
constructor by sending those bytes to java.io.ObjectInputStream.
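
To make the round trip concrete, here is a minimal sketch of the idea.
The Point class and the helper names are invented for this illustration
and are not taken from XStream. The constructor bumps a counter so you
can see that deserializing does not run it again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class RoundTripDemo {
    // Hypothetical class, used only for this illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0;
        final int x;

        Point(int x) {
            constructorCalls++;
            this.x = x;
        }
    }

    // Serialize an existing instance to a byte sequence.
    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(o);
        }
        return buffer.toByteArray();
    }

    // Each call produces a fresh instance from the same bytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] template = toBytes(new Point(7)); // the constructor runs once, here
        Point a = (Point) fromBytes(template);
        Point b = (Point) fromBytes(template);
        System.out.println("distinct instances: " + (a != b));              // true
        System.out.println("constructor calls: " + Point.constructorCalls); // 1
    }
}
```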
The problem is, for XStream I didn't necessarily have an original
object instance to generate the bytes from. So I needed to hand craft
the exact byte sequence to be sent to ObjectInputStream.
I tried reverse engineering the standard Java code, and poring over the
serialization specification but just couldn't get it right.
I was chatting to Chris Nokleberg (from the CGLib project
http://cglib.sourceforge.net/) on IRC about this, but by this point I'd
given up. Luckily, Chris didn't. The next morning, he'd figured it out
and written all about it:
http://sixlegs.com/blog/java/skipping-constructors.html
This was great! I integrated this as another strategy into XStream.
It still had a constraint though. Because it used standard
serialization, it could only instantiate classes that
implemented Serializable. Nonetheless, combined with the dodgy Sun-only
strategy, this allowed XStream to be used in many more situations.
To date, these are still the only two strategies used by XStream, but
the users seem to be happy :)
-Joe
Hi Leonardo. Welcome aboard.
> > - When the object needs to be manipulated in a way that reflection
> > doesn't allow (e.g. setting final fields).
>
> As far as I know, reflection does allow the setting of final fields,
> except when they become compile time constants. For instance, the code
> below:
[snip]
Very interesting! That's exactly the kind of thing I would like
Objenesis to cover :)
-Joe
My name is Leonardo Mesquita, and I am the author of the JSerial project.
This is a project intended to make serialization faster by using
bytecode generation of serialization code. In a nutshell, it works by
inspecting classes with reflection and generating code that tries to
access fields directly.
In order to deserialize objects, I have bumped into the Object
Instantiation problem. So far, I haven't found a single solution that
works for every case *and* is portable.
I'll list the methods I have discovered so far, and in time I'll
comment on the pros and cons of each approach:
* The default serialization approach: how Sun does it.
* The "sun.reflect.ReflectionFactory.newConstructorForSerialization"
approach: the entrails.
* The "java.io.ObjectStreamClass.newInstance" approach: perhaps a
viable trade-off?
* The "feed an ObjectInputStream" approach: closer to portability.
Leonardo
"A Serializable class must do the following:
(...)
* Have access to the no-arg constructor of its first
nonserializable superclass"
(http://java.sun.com/j2se/1.5.0/docs/guide/serialization/spec/serial-arch.html#4539)
While it may seem arbitrary, the explanation for that restriction
can be found in the java.io.Serializable documentation:
"To allow subtypes of non-serializable classes to be serialized, the
subtype may assume responsibility for saving and restoring the state of
the supertype's public, protected, and (if accessible) package fields.
The subtype may assume this responsibility only if the class it extends
has an accessible no-arg constructor to initialize the class's state. It
is an error to declare a class Serializable if this is not the case. The
error will be detected at runtime.
During deserialization, the fields of non-serializable classes will
be initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable. The fields of serializable subclasses will be restored
from the stream"
(http://java.sun.com/j2se/1.5.0/docs/api/java/io/Serializable.html)
So, contrary to popular belief, Java does not bypass the constructors: it
just calls an appropriate (i.e., non-serializable) ancestor constructor.
Since most simple serialization objects actually extend
java.lang.Object, it *looks* as if no constructors were called. To see
that this is not the case, you can make a simple test class that extends
a non-serializable class that prints something in the no-arg constructor.
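
A minimal sketch of such a test, assuming nothing beyond the standard
java.io classes. The SillyAncestor and SillyClass names are only
illustrative, and a log string stands in for printing so the result is
easy to check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class AncestorConstructorDemo {
    static final StringBuilder log = new StringBuilder();

    // Not Serializable: its no-arg constructor runs during deserialization.
    static class SillyAncestor {
        SillyAncestor() {
            log.append("SillyAncestor;");
        }
    }

    // Serializable: its own constructor is skipped during deserialization.
    static class SillyClass extends SillyAncestor implements Serializable {
        private static final long serialVersionUID = 1L;

        SillyClass() {
            log.append("SillyClass;");
        }
    }

    // Returns the names of the constructors that ran while deserializing.
    static String constructorsRunOnDeserialize() throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new SillyClass());
        }
        log.setLength(0); // forget the calls made by "new SillyClass()"
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            in.readObject();
        }
        return log.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(constructorsRunOnDeserialize()); // SillyAncestor;
    }
}
```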
(to be continued...)
The Java statements:
package test;
public class SillyClass extends SillyAncestor implements
java.io.Serializable
...
SillyClass obj = new SillyClass();
...
generate the following bytecode (dump with javap -c):
0: new #1; //class test/SillyClass
3: dup
4: invokespecial #20; //Method "test/SillyClass.<init>":()V
Essentially, the "new" instruction allocates the needed memory for
the object, and the "invokespecial" calls the constructor.
The trick used by the Sun VM to create instances for serialization is
to generate opcodes that call "invokespecial" for the first
non-serializable ancestor of the serialized object's class. Something like:
0: new #1; //class test/SillyClass
3: dup
4: invokespecial #14; //Method "test/SillyAncestor.<init>":()V
(In order to completely bypass constructors, one could call the
no-arg constructor of java.lang.Object (which is what is done by
xstream's Sun14ReflectionProvider). It is a convenient way to avoid all
constructors, although it is a way to instantiate objects that is not
totally compatible with the Java Serialization spec.)
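
For illustration, here is a hedged sketch of that "call Object's
constructor" trick. It relies on the internal, unsupported
sun.reflect.ReflectionFactory class and its newConstructorForSerialization
method, so it only works on JVMs that ship that class. Looking the
class up by name (instead of importing it) is my own choice so the code
still compiles elsewhere. The Guarded class is hypothetical.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

public class SkipConstructorsDemo {
    // Hypothetical class whose only constructor refuses to run.
    static class Guarded {
        int value = 42; // field initializers are part of the constructor, so they are skipped too

        Guarded(String required) {
            throw new IllegalStateException("constructor ran");
        }
    }

    // Creates an instance of 'type' while running only java.lang.Object's constructor.
    static <T> T allocate(Class<T> type) throws Exception {
        // Internal class: throws ClassNotFoundException on JVMs that do not provide it.
        Class<?> factoryClass = Class.forName("sun.reflect.ReflectionFactory");
        Object factory = factoryClass.getMethod("getReflectionFactory").invoke(null);
        Method newCtor = factoryClass.getMethod(
                "newConstructorForSerialization", Class.class, Constructor.class);

        Constructor<?> objectCtor = Object.class.getDeclaredConstructor();
        Constructor<?> munged = (Constructor<?>) newCtor.invoke(factory, type, objectCtor);
        munged.setAccessible(true);
        return type.cast(munged.newInstance());
    }

    public static void main(String[] args) throws Exception {
        Guarded g = allocate(Guarded.class);
        System.out.println("value = " + g.value); // 0, because the initializer never ran
    }
}
```

Note that the object comes back with every field at its default value,
which is exactly why this is not totally compatible with the
serialization spec.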
But beware, oh foolish mortals!!! If you try to generate these
bytecodes yourself, the VM's bytecode verifier yells:
"java.lang.VerifyError: (class: test/SillyClass, method: create
signature: ()Ltest/SillyClass;) Call to wrong initialization method"
Then, you would ask, how do the sun.* classes do it?
The answer is quite simple: THEY CHEAT!!!
That's it. Sun's Java VM has a small "hack" that bypasses bytecode
verification for subclasses of sun.reflect.MagicAccessorImpl.
Bottom line is: Sun doesn't encourage the indiscriminate creation of
objects without calling constructors. There is currently no reflection
mechanism for doing this, and they resort to hacks to do it internally
for the very few cases where they believe it is necessary.
This makes our task all the harder. We'll probably have to write
very different approaches for different VM implementations, and perhaps
even for different versions of each one.
I believe this calls for a JSR, if there isn't one already. There
ought to be a standard way to create objects through reflection without
calling constructors, or even allowing the selection of an ancestor's
constructor to call, so that there is a possibility to completely
customize serialization. Besides, there are all other reasons for
bypassing constructors that were previously mentioned on this forum.
Next, I intend to write about the currently known strategies for
creating objects.
I ask you to comment on this thread. Is it helpful? Does it clarify
the issue we are trying to overcome?
Cheers,
Leonardo Mesquita
Henri Tremblay wrote:
> A JSR would be really helpful. Bypassing the constructor was made easy
> in JDK 1.4 because they did a refactoring of the object serialization
> code. So the so-called munged constructor was easy to obtain. In JDK
> 1.3, from what I remember, it's way more painful if not impossible.
> Since the munged constructor is quite useful for other things than
> object serialization, putting it out of the sun packages sounds like a
> cool JSR. For mocking and alternative serializing (like XStream), it's
> needed.
>
> For the rest of the mail, really interesting but I need to experiment
> by myself to be able to comment :-)
>
> On 11/5/06, *Leonardo Mesquita* < mrb...@gmail.com
I've just committed SunReflectionFactoryInstantiator, which uses the
sun.reflect.ReflectionFactory approach. The build is a bit smarter now
and only attempts to build it if the right JDK is available. Follow
the pattern in build.xml and you should be able to add 1.3 specific
stuff without breaking the build on other JDKs.
-j
By the way, I'd like to write a TCK for the alternative API that I
proposed, but I haven't worked for a collaborative project in the way I
am doing now, so I don't know if it could be considered "impolite" if I
modified your TCK to include my tests. What would you prefer me to do,
write a separate TCK and leave integration for later, or just go ahead
and make the changes myself?
Leonardo
Cool!
I'll have a go at fixing up the build.
> * The attemptToRegisterInstantiator crashes upon Errors, like
> java.lang.ExceptionInInitializerError. Maybe it is not a good practice
> to throw exceptions in a static block, which I did in Sun13Instantiator...
Yeah. In fact it's a very bad practice :) as you often have no control
over where the exception will be thrown.
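
A small sketch of why this bites, using a made-up BrokenInstantiator
rather than the real Sun13Instantiator: the exception thrown in the
static block reaches the caller as an Error, so a catch (Exception)
around the registration code never sees it.

```java
public class StaticInitDemo {
    // Hypothetical stand-in for an instantiator whose static block fails.
    static class BrokenInstantiator {
        static {
            if (true) { // a static block must be able to complete normally to compile
                throw new IllegalStateException("required JVM class not found");
            }
        }

        static void touch() {
        }
    }

    // Hypothetical registration routine, showing which catch clause fires.
    static String attemptToRegister() {
        try {
            BrokenInstantiator.touch(); // first use triggers the static block
            return "registered";
        } catch (Exception e) {
            return "caught Exception"; // not reached: the failure arrives as an Error
        } catch (LinkageError e) {
            // ExceptionInInitializerError on the first attempt,
            // NoClassDefFoundError on any later attempt in the same JVM.
            return "caught " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(attemptToRegister());
    }
}
```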
> By the way, I'd like to write a TCK for the alternative API that I
> proposed, but I haven't worked for a collaborative project in the way I
> am doing now, so I don't know if it could be considered "impolite" if I
> modified your TCK to include my tests. What would you prefer me to do,
> write a separate TCK and leave integration for later, or just go ahead
> and make the changes myself?
Go ahead and make the changes. Please switch all the code over to
using the new API.
My usual policy is to feel free to make whatever changes you think are
suitable, commit them, and then open them up for discussion. Also, try and make
frequent small changes, syncing regularly as it reduces the chance of
conflict.
cheers
-j
Alright, I moved everything to the new API. I made sure every test was
running, and fixed a subtle gotcha in TextReporterTest that was making
it fail on Windows. Apparently, "\n" won't get you the same String as
PrintStream.println().
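
A minimal sketch of the gotcha, not the actual TextReporterTest code:
println() ends the line with the platform line separator, so comparing
against a hard-coded "\n" only passes where the separator happens to be
"\n". System.lineSeparator() needs Java 7 or later; on older JDKs
System.getProperty("line.separator") gives the same value.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class LineSeparatorDemo {
    // Captures what PrintStream.println() actually writes for one line.
    static String printed(String line) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer);
        out.println(line);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        String actual = printed("ok");
        // Portable expectation: true on every platform.
        System.out.println(actual.equals("ok" + System.lineSeparator()));
        // Hard-coded expectation: false wherever the separator is not "\n" (e.g. Windows).
        System.out.println(actual.equals("ok\n"));
    }
}
```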
Also, I changed the catch(RuntimeException) (which I dislike) in
NewInstanceInstantiator for catch(Exception) (which I dislike even
more, but seems to be more appropriate for what you were trying to do).
I am not quite sure if the "return null on error" is the best policy.
It's too C++ish ;-)
Leonardo
Cool.
> Also, I changed the catch(RuntimeException) (which I dislike) in
> NewInstanceInstantiator for catch(Exception) (which I dislike even
> more, but seems to be more appropriate for what you were trying to do).
> I am not quite sure if the "return null on error" is the best policy.
> It's too C++ish ;-)
Yeah, it made me feel dirty too :)