Welcome to Objenesis... How we got here

Joe Walnes

Nov 2, 2006, 2:27:32 PM

to Objenesis developers

Hello, here's the Objenesis project!

Here's some history (for the benefit of the list archives):

There are quite a few projects that have employed various techniques
for going beyond Java's standard reflection mechanism for instantiating
new objects.

The typical cases for doing this are:
- When a class doesn't have a public constructor.
- When a class doesn't have a default constructor.
- When code in the constructor needs to be ignored.
- When the object needs to be manipulated in a way that reflection
doesn't allow (e.g. setting final fields).

For most applications this is not necessary; however, there are a few
specialist cases where it is really useful. For example:
- Serialization
- Remoting
- Persistence layers
- Dynamic mocks
- AOP engines

Many of us have solved the problem in weird and wonderful ways, often
unaware that others had done the same. Objenesis is an attempt to
consolidate our efforts.

I'd like to leave this thread open (forever) for anyone to comment on
approaches they've tried and what they learned from it.

cheers
-Joe

Joe Walnes

Nov 2, 2006, 2:46:17 PM

to Objenesis developers

So, here's what happened on XStream.

XStream is a simple (to use) library for serializing objects to XML
(and other formats) and back again. http://xstream.codehaus.org/

To deserialize an object from a stream, XStream has to instantiate it
and set its fields.

Initially I lived with the constraint that in order to do this, your
objects needed a default constructor because XStream had no idea what
the arguments passed to the constructor meant. Class.newInstance() did
the business.
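
To make that constraint concrete, here is a minimal sketch (the class names are invented for illustration, not taken from XStream) of what Class.newInstance() does with and without a no-arg constructor:

```java
class NewInstanceDemo {
    // Has an implicit no-arg constructor.
    static class WithDefault {
        String name = "default";
    }

    // Only constructor takes an argument, so there is nothing for newInstance() to call.
    static class WithoutDefault {
        final String name;

        WithoutDefault(String name) {
            this.name = name;
        }
    }

    /** Returns a new instance, or null if the class has no usable no-arg constructor. */
    @SuppressWarnings("deprecation") // Class.newInstance() is deprecated since Java 9
    static Object tryNewInstance(Class<?> type) {
        try {
            return type.newInstance();
        } catch (InstantiationException | IllegalAccessException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("WithDefault created:    " + (tryNewInstance(WithDefault.class) != null));
        System.out.println("WithoutDefault created: " + (tryNewInstance(WithoutDefault.class) != null));
    }
}
```

The second call fails with an InstantiationException, which is exactly the situation the rest of this thread is trying to get around.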

Then I started poking inside the java.io.ObjectInputStream code to
figure out how Java did it and noticed it sneakily used an
internal undocumented Sun class:
sun.reflect.ReflectionFactory

What this class (apparently) did was generate a new default constructor
for a class to complement the 'real' constructor, which could then be
called as usual. However, this only seemed to work on certain Java
runtimes: namely 1.4 and onwards, and only if they used the Sun classes
(i.e. not the IBM JDK).

Nevertheless, this was still useful, so I created a little
ObjectFactory abstraction in XStream, allowing the strategy for
construction to be plugged in, using the best mechanism available on
the JVM.
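
The idea can be sketched roughly as follows. This is not the XStream code: the interface shape, the class names and the fallback logic are assumptions made for illustration. The only real API it relies on is sun.reflect.ReflectionFactory, which is looked up reflectively so the code still compiles and runs on JVMs that lack it.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

class ObjectFactoryDemo {
    /** Hypothetical strategy interface, named after the abstraction described above. */
    interface ObjectFactory {
        Object newInstance(Class<?> type) throws Exception;
    }

    /** Portable strategy: needs a no-arg constructor and runs it. */
    static class DefaultConstructorFactory implements ObjectFactory {
        public Object newInstance(Class<?> type) throws Exception {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        }
    }

    /** Sun-specific strategy: uses sun.reflect.ReflectionFactory if the JVM has it. */
    static class SunReflectionFactory implements ObjectFactory {
        private final Object reflectionFactory;
        private final Method newConstructorForSerialization;

        SunReflectionFactory() throws Exception {
            Class<?> rf = Class.forName("sun.reflect.ReflectionFactory");
            reflectionFactory = rf.getMethod("getReflectionFactory").invoke(null);
            newConstructorForSerialization = rf.getMethod(
                    "newConstructorForSerialization", Class.class, Constructor.class);
        }

        public Object newInstance(Class<?> type) throws Exception {
            // Ask for a constructor that allocates `type` but only runs Object's constructor.
            Constructor<?> objectCtor = Object.class.getDeclaredConstructor();
            Constructor<?> munged = (Constructor<?>)
                    newConstructorForSerialization.invoke(reflectionFactory, type, objectCtor);
            munged.setAccessible(true);
            return munged.newInstance();
        }
    }

    /** Picks the most capable strategy the current JVM offers. */
    static ObjectFactory bestAvailable() {
        try {
            return new SunReflectionFactory();
        } catch (Exception | LinkageError e) {
            return new DefaultConstructorFactory();
        }
    }

    /** Example class with no default constructor. */
    static class Account {
        boolean constructorRan = false;

        Account(String owner) {
            constructorRan = true;
        }
    }

    public static void main(String[] args) throws Exception {
        // On a JVM without sun.reflect.ReflectionFactory the fallback is chosen
        // and this call fails, because Account has no no-arg constructor.
        Account a = (Account) bestAvailable().newInstance(Account.class);
        System.out.println("constructor ran: " + a.constructorRan);
    }
}
```

Keeping the JVM-specific lookup inside one strategy class means the rest of the library never touches sun.* classes directly.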

The code as it stood in 2003 can be seen here:
http://fisheye.codehaus.org/browse/xstream/trunk/xstream/src/java/com/thoughtworks/xstream/objecttree/reflection/SunReflectionObjectFactory.java?r=2

Although this was limited to a specific set of JVMs, this covered most
of my needs.

-Joe

Leonardo Mesquita

Nov 2, 2006, 2:52:57 PM

to objene...@googlegroups.com

Hello!

First of all, congratulations on the initiative :)
Second, I'd just like to make a quick comment:

Joe Walnes wrote:
> - When the object needs to be manipulated in a way that reflection
> doesn't allow (e.g. setting final fields).

As far as I know, reflection does allow the setting of final fields,
except when they become compile time constants. For instance, the code
below:
---
package test.pack;

import java.lang.reflect.Field;

class FinalTest {
    final int x = 19; // compile-time constant: javac inlines reads of x
    final int y;      // assigned in the constructor, so not a constant

    final StringBuffer str = new StringBuffer("You can't alter me!");

    public FinalTest() {
        y = 0;
    }

    @Override
    public String toString() {
        return "x=" + x + ",y=" + y + ", str=" + str;
    }
}

public class AcessTest {
    // Overwrites all three final fields through reflection.
    public static void modify(FinalTest result) throws Exception {
        Field fx = result.getClass().getDeclaredField("x");
        fx.setAccessible(true);
        fx.set(result, 1);

        Field fy = result.getClass().getDeclaredField("y");
        fy.setAccessible(true);
        fy.set(result, 2);

        Field fstr = result.getClass().getDeclaredField("str");
        fstr.setAccessible(true);
        fstr.set(result, new StringBuffer("Yes I can!"));
    }

    public static void main(String[] args) throws Exception {
        FinalTest ft = new FinalTest();
        System.out.println("Before: " + ft.toString());
        modify(ft);
        System.out.println("After: " + ft.toString());
    }
}
----

Outputs:
Before: x=19,y=0, str=You can't alter me!
After: x=19,y=2, str=Yes I can!
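
The reason x still prints as 19 is that the field itself does change, but javac has already replaced reads of the constant with the literal value. A small sketch (class and method names invented here) that compares a read compiled from source with a reflective read; behaviour may be more restrictive on newer runtimes that tighten mutation of final fields:

```java
import java.lang.reflect.Field;

class ConstantInliningDemo {
    static class Holder {
        final int x = 19; // constant variable: javac inlines reads of x

        int readInSource() {
            return x; // compiled as the literal 19, not as a field read
        }
    }

    static int readReflectively(Holder h) throws Exception {
        Field fx = Holder.class.getDeclaredField("x");
        fx.setAccessible(true);
        return fx.getInt(h); // reads the actual field storage
    }

    static void setReflectively(Holder h, int value) throws Exception {
        Field fx = Holder.class.getDeclaredField("x");
        fx.setAccessible(true);
        fx.setInt(h, value);
    }

    public static void main(String[] args) throws Exception {
        Holder h = new Holder();
        setReflectively(h, 1);
        System.out.println("source read:     " + h.readInSource());
        System.out.println("reflective read: " + readReflectively(h));
    }
}
```

The two reads disagree after the reflective write, which is why code that relies on this trick should avoid final fields initialized with constant expressions.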

Cheers,
Leonardo

Joe Walnes

Nov 2, 2006, 3:04:18 PM

to Objenesis developers

XStream continued...

About a year later, XStream had built up a good-sized user base.
Unfortunately this meant there were more and more people using it on
JVMs other than Sun 1.4 - hence more complaints.

It dawned on me that there was already a JVM neutral way of
instantiating objects and skipping the constructor... using standard
serialization.

That is: if you have an object and serialize it (using
java.io.ObjectOutputStream) to a sequence of bytes, you can then
instantiate it as many times as you want without the default
constructor by sending those bytes to java.io.ObjectInputStream.

The problem is, for XStream I didn't necessarily have an original
object instance to generate the bytes from. So I needed to hand craft
the exact byte sequence to be sent to ObjectInputStream.

I tried reverse engineering the standard Java code, and poring over the
serialization specification, but just couldn't get it right.

I was chatting to Chris Nokleberg (from the CGLib project
http://cglib.sourceforge.net/) on IRC about this, but by this point I'd
given up. Luckily, Chris didn't. The next morning, he'd figured it out
and written all about it:

http://sixlegs.com/blog/java/skipping-constructors.html

This was great! I integrated this as another strategy into XStream.

It still had a constraint though. Because it used standard
serialization, it could only instantiate classes that implemented
Serializable. Nonetheless, combined with the dodgy Sun-only
strategy, this allowed XStream to be used in many more situations.
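
For readers of the archive, here is a rough sketch of the "hand-crafted bytes" idea. It is not Chris's code or the XStream strategy; it just follows the stream layout from the Java Object Serialization Specification, declaring a class descriptor with zero fields so that ObjectInputStream allocates the object and leaves every field at its default value. The Point class is invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

class SerializationInstantiatorDemo {
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 5;
        boolean constructorRan;

        Point(int x) {
            this.x = x;
            constructorRan = true;
        }
    }

    /** Builds a minimal serialization stream for `type` and reads one instance back. */
    static <T extends Serializable> T instantiate(Class<T> type) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(ObjectStreamConstants.STREAM_MAGIC);
            out.writeShort(ObjectStreamConstants.STREAM_VERSION);
            out.writeByte(ObjectStreamConstants.TC_OBJECT);
            out.writeByte(ObjectStreamConstants.TC_CLASSDESC);
            out.writeUTF(type.getName());
            out.writeLong(ObjectStreamClass.lookup(type).getSerialVersionUID());
            out.writeByte(ObjectStreamConstants.SC_SERIALIZABLE);
            out.writeShort(0); // zero fields in the stream: all fields keep default values
            out.writeByte(ObjectStreamConstants.TC_ENDBLOCKDATA);
            out.writeByte(ObjectStreamConstants.TC_NULL); // no superclass descriptor
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return type.cast(in.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        Point p = instantiate(Point.class);
        System.out.println("constructor ran: " + p.constructorRan + ", x=" + p.x);
    }
}
```

Neither the Point constructor nor its field initializer runs, so x comes back as 0 rather than 5. The type bound on the method mirrors the constraint above: this only works for Serializable classes.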

To date, these are still the only two strategies used by XStream, but
the users seem to be happy :)

-Joe

Joe Walnes

Nov 2, 2006, 3:08:00 PM

to Objenesis developers

Leonardo Mesquita wrote:
> Hello!

Hi Leonardo. Welcome aboard.

> > - When the object needs to be manipulated in a way that reflection
> > doesn't allow (e.g. setting final fields).
>
> As far as I know, reflection does allow the setting of final fields,
> except when they become compile time constants. For instance, the code
> below:

[snip]

Very interesting! That's exactly the kind of thing I would like
Objenesis to cover :)

-Joe

Leonardo Mesquita

Nov 2, 2006, 3:48:37 PM

to objene...@googlegroups.com

Hello,

My name is Leonardo Mesquita, I am the author of project JSerial.
This is a project intended to make serialization faster by using
bytecode generation of serialization code. In a nutshell, it works by
inspecting classes with reflection and generating code that tries to
access fields directly.

In order to deserialize objects, I have bumped into the Object
Instantiation problem. So far, I haven't found a single solution that
works for every case *and* is portable.

I'll list the methods I have discovered so far, and in time I'll
comment on the pros and cons of each approach:
* The default serialization approach: how Sun does it.
* The "sun.reflect.ReflectionFactory.newConstructorForSerialization"
approach: the entrails.
* The "java.io.ObjectStreamClass.newInstance" approach: perhaps a
viable trade-off?
* The "feed an ObjectInputStream" approach: closer to portability.

Leonardo

Henri Tremblay

Nov 2, 2006, 5:25:17 PM

to objene...@googlegroups.com

What happened on EasyMock class extension...

I was working on EasyMock class extension. After devising lots of complex and tricky ways to instantiate mocks by calling one constructor (like having a dedicated class loader hack the bytecode when loading the class), I realized that what I really needed was to bypass the constructor entirely. I started to wonder how Sun's serialization code did it.

Luckily, at the same time, the project I was working on in my day job required serializing to XML instead of using normal object serialization. So I looked on the net, and XStream seemed to be the best thing around. One of the nice features was that the serialized class didn't need to have a no-args constructor or even any constructor at all...

You probably know where I'm going now. I cross-linked some information in my head and borrowed some code from XStream which was doing the exact same thing I was trying to do. Since then, EasyMock class extension can mock a class without calling any constructor... on Sun JVM 1.4...

From there, Joe realized that EasyMock and XStream had the same needs and the same problems in working on all available JVMs, and that the ObjectFactory might also be useful to other projects. Objenesis was created.
-
Henri

Leonardo Mesquita

Nov 2, 2006, 5:44:56 PM

to objene...@googlegroups.com

The Java serialization specification states that:

"A Serializable class must do the following:
(...)
* Have access to the no-arg constructor of its first
nonserializable superclass"
(http://java.sun.com/j2se/1.5.0/docs/guide/serialization/spec/serial-arch.html#4539)

While it may seem arbitrary, the explanation for that restriction
can be found in the java.io.Serializable documentation:

"To allow subtypes of non-serializable classes to be serialized, the
subtype may assume responsibility for saving and restoring the state of
the supertype's public, protected, and (if accessible) package fields.
The subtype may assume this responsibility only if the class it extends
has an accessible no-arg constructor to initialize the class's state. It
is an error to declare a class Serializable if this is not the case. The
error will be detected at runtime.

During deserialization, the fields of non-serializable classes will
be initialized using the public or protected no-arg constructor of the
class. A no-arg constructor must be accessible to the subclass that is
serializable. The fields of serializable subclasses will be restored
from the stream"
(http://java.sun.com/j2se/1.5.0/docs/api/java/io/Serializable.html)

So, contrary to popular belief, Java does not bypass the constructors: it
just calls the no-arg constructor of an appropriate (i.e., non-serializable) ancestor.
Since most simple serialization objects actually extend
java.lang.Object, it *looks* as if no constructors were called. To see
that this is not the case, you can make a simple test class that extends
a non-serializable class that prints something in the no-arg constructor.
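
A minimal version of that test might look like the sketch below. The class names follow the SillyClass / SillyAncestor naming used elsewhere in this thread; the counters are only there so the effect can be checked programmatically.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class AncestorConstructorDemo {
    static int ancestorCalls;
    static int childCalls;

    // Not Serializable: its no-arg constructor runs during deserialization.
    static class SillyAncestor {
        SillyAncestor() {
            ancestorCalls++;
            System.out.println("SillyAncestor constructor called");
        }
    }

    // Serializable: its own constructor is skipped during deserialization.
    static class SillyClass extends SillyAncestor implements Serializable {
        private static final long serialVersionUID = 1L;

        SillyClass() {
            childCalls++;
            System.out.println("SillyClass constructor called");
        }
    }

    /** Serializes the object to memory and reads a fresh copy back. */
    static Object roundTrip(Serializable original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SillyClass original = new SillyClass(); // runs both constructors
        System.out.println("--- deserializing ---");
        roundTrip(original);                    // runs only SillyAncestor's constructor
        System.out.println("ancestor calls: " + ancestorCalls + ", child calls: " + childCalls);
    }
}
```

During the round trip only the SillyAncestor message is printed, so the ancestor count goes up by one while the SillyClass count does not.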

(to be continued...)

Leonardo Mesquita

Nov 5, 2006, 1:24:21 PM

to objene...@googlegroups.com

When creating an object, the compiler usually generates bytecodes as
follows:

The Java statements:
package test;
public class SillyClass extends SillyAncestor implements
java.io.Serializable
...
SillyClass obj = new SillyClass();
...

generate the following bytecodes (dumped with javap -c):

0: new #1; //class test/SillyClass
3: dup
4: invokespecial #20; //Method "test/SillyClass.<init>":()V

Essentially, the "new" instruction allocates the needed memory for
the object, and the "invokespecial" calls the constructor.

The trick used by the Sun VM to create instances for serialization is
to generate opcodes that call "invokespecial" for the first
non-serializable ancestor of the serialized object's class. Something like:

0: new #1; //class test/SillyClass
3: dup
4: invokespecial #14; //Method "test/SillyAncestor.<init>":()V

(In order to completely bypass constructors, one could call the
no-arg constructor of java.lang.Object (which is what is done by
xstream's Sun14ReflectionProvider). It is a convenient way to avoid all
constructors, although it is a way to instantiate objects that is not
totally compatible with the Java Serialization spec.)

But beware, oh foolish mortals!!! If you try to generate these
bytecodes yourself, the VM's bytecode verifier yells:

"java.lang.VerifyError: (class: test/SillyClass, method: create
signature: ()Ltest/SillyClass;) Call to wrong initialization method"

Then, you would ask, how do the sun.* classes do it?

The answer is quite simple: THEY CHEAT!!!

That's it. Sun's Java VM has a small "hack" that bypasses bytecode
verification for subclasses of sun.reflect.MagicAccessorImpl.

Bottom line is: Sun doesn't encourage the indiscriminate creation of
objects without calling constructors. There is currently no reflection
mechanism for doing this, and they resort to hacks to do it internally
for the very few cases where they believe it is necessary.

This makes our task all the harder. We'll probably have to write
very different approaches for different VM implementations, and perhaps
even for different versions of each one.

I believe this calls for a JSR, if there isn't one already. There
ought to be a standard way to create objects through reflection without
calling constructors, or even allowing the selection of an ancestor's
constructor to call, so that there is a possibility to completely
customize serialization. Besides, there are all other reasons for
bypassing constructors that were previously mentioned on this forum.

Next, I intend to write about the currently known strategies for
creating objects.

I ask you to comment on this thread. Is it helpful? Does it clarify
the issue we are trying to overcome?

Leonardo Mesquita

Henri Tremblay

Nov 5, 2006, 6:30:45 PM

to objene...@googlegroups.com

A JSR would be really helpful. Bypassing the constructor was made easy in JDK 1.4 because they did a refactoring of the object serialization code. So the so-called munged constructor was easy to obtain. In JDK 1.3, from what I remember, it's way more painful if not impossible. Since the munged constructor is quite useful for other things than object serialization, putting it out of the sun packages sounds like a cool JSR. For mocking and alternative serializing (like XStream), it's needed.

For the rest of the mail, really interesting but I need to experiment by myself to be able to comment :-)

Joe Walnes

Nov 6, 2006, 7:13:00 AM

to Objenesis developers

Ahhhh.... that's really cunning. I could never quite get my head around
what Sun's ReflectionFactory was doing. You've explained it really well.

Leonardo Mesquita

Nov 7, 2006, 6:43:09 AM

to objene...@googlegroups.com

Alright, I found out how to do it in JDK1.3:
ObjectInputStream defines the private static native method
"allocateNewObject(Class aclass, Class initclass)", where the first
parameter is the class you want to instantiate, and the second parameter
is the superclass that has the appropriate constructor.
It is private... But nothing that our good old friend reflection
can't solve for us (given the SecurityManager allows it, of course).
Whenever it is possible, I'll test the idea. I haven't had much time
for that...

Cheers,
Leonardo Mesquita

Henri Tremblay wrote:
> A JSR would be really helpful. Bypassing the constructor was made easy
> in JDK 1.4 because they did a refactoring of the object serialization
> code. So the so-called munged constructor was easy to obtain. In JDK
> 1.3, from what I remember, it's way more painful if not impossible.
> Since the munged constructor is quite useful for other things than
> object serialization, putting it out of the sun packages sounds like a
> cool JSR. For mocking and alternative serializing (like XStream), it's
> needed.
>
> For the rest of the mail, really interesting but I need to experiment
> by myself to be able to comment :-)
>

> On 11/5/06, *Leonardo Mesquita* < mrb...@gmail.com

Joe Walnes

Nov 7, 2006, 8:58:00 AM

to objene...@googlegroups.com

Oooh nice.

I've just committed SunReflectionFactoryInstantiator, which uses the
sun.reflect.ReflectionFactory approach. The build is a bit smarter now
and only attempts to build it if the right JDK is available. Follow
the pattern in build.xml and you should be able to add 1.3 specific
stuff without breaking the build on other JDKs.

-j

Leonardo Mesquita

Nov 9, 2006, 9:59:39 AM

to objene...@googlegroups.com

I've committed the Sun13Instantiator, which tested successfully, but
I found some issues in the build:
* The java-to-jar macro fails in Windows because the "destjar"
attribute comes with a full path, and the
<mkdir dir="${tmp.dir}/@{destjar}.contents"/> command tries
to create a dir with a name like "C:\tmpdir\C:\destdir", which is invalid.
* I still have to devise a way to detect the presence of a Sun 1.3
JDK VM in the platform.check target.
* The attemptToRegisterInstantiator crashes upon Errors, like
java.lang.ExceptionInInitializerError. Maybe it is not a good practice
to throw exceptions in a static block, which I did in Sun13Instantiator...

By the way, I'd like to write a TCK for the alternative API that I
proposed, but I haven't worked on a collaborative project like this
before, so I don't know if it would be considered "impolite" if I
modified your TCK to include my tests. What would you prefer me to do:
write a separate TCK and leave integration for later, or just go ahead
and make the changes myself?

Leonardo

Joe Walnes

Nov 9, 2006, 2:56:13 PM

to objene...@googlegroups.com

On 11/9/06, Leonardo Mesquita <mrb...@gmail.com> wrote:
>
>
> I've committed the Sun13Instantiator, which tested successfully, but
> I found some issues in the build:

Cool!

I'll have a go at fixing up the build.

> * The attemptToRegisterInstantiator crashes upon Errors, like
> java.lang.ExceptionInInitializerError. Maybe it is not a good practice
> to throw exceptions in a static block, which I did in Sun13Instantiator...

Yeah. In fact it's a very bad practice :) as you often have no control
over where the exception will be thrown.
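
To illustrate why a failing static block slips past ordinary exception handling, here is a small sketch. The class and the attemptToRegister helper are stand-ins invented for this example, not the Objenesis code. The point is that the failure surfaces as an Error (ExceptionInInitializerError), so a catch (Exception) would not see it.

```java
class StaticInitDemo {
    /** Stand-in for an instantiator whose static block fails on an unsupported JVM. */
    static class FragileInstantiator {
        static {
            probe();
        }

        private static void probe() {
            throw new IllegalStateException("required JVM internals not available");
        }
    }

    /** Hypothetical registration helper: reports how loading the class went. */
    static String attemptToRegister(String className) {
        try {
            Class.forName(className); // loads the class and runs its static initializer
            return "registered";
        } catch (ClassNotFoundException e) {
            return "not found";
        } catch (LinkageError e) {
            // ExceptionInInitializerError on the first attempt and NoClassDefFoundError
            // on later ones are both Errors, not Exceptions.
            return "skipped: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String name = FragileInstantiator.class.getName();
        System.out.println(attemptToRegister(name));
        System.out.println(attemptToRegister(name));
        System.out.println(attemptToRegister("no.such.Instantiator"));
    }
}
```

Moving the probing out of the static block and into a constructor or factory method would let the caller see an ordinary exception at a predictable point.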

> By the way, I'd like to write a TCK for the alternative API that I
> proposed, but I haven't worked on a collaborative project like this
> before, so I don't know if it would be considered "impolite" if I
> modified your TCK to include my tests. What would you prefer me to do:
> write a separate TCK and leave integration for later, or just go ahead
> and make the changes myself?

Go ahead and make the changes. Please switch all the code over to
using the new API.

My usual policy is: feel free to make whatever changes you think are
suitable, commit them and then open them up for discussion. Also, try to
make frequent small changes, syncing regularly, as it reduces the chance
of conflict.

cheers
-j

Leonardo Mesquita

Nov 10, 2006, 9:07:28 AM

to Objenesis developers

Joe Walnes wrote:
>
> Go ahead and make the changes. Please switch all the code over to
> using the new API.
>

Alright, I moved everything to the new API. I made sure every test was
running, and fixed a subtle gotcha in TextReporterTest that was making
it fail in Windows. Apparently, "\n" won't get you the same String as
PrintStream.println().
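
For anyone hitting the same gotcha, a small sketch of the difference (this is not the TextReporterTest code, just an illustration): println() appends the platform line separator, which is "\r\n" on Windows, so expected strings in tests are safer built with System.lineSeparator().

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class LineSeparatorDemo {
    /** Captures what PrintStream.println() actually writes for the given line. */
    static String printed(String line) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(bytes);
        out.println(line);
        out.flush();
        return bytes.toString();
    }

    public static void main(String[] args) {
        String actual = printed("ok");
        // Platform dependent: true on Unix-like systems, false on Windows.
        System.out.println("matches \"ok\\n\": " + actual.equals("ok\n"));
        // Portable comparison.
        System.out.println("matches lineSeparator: " + actual.equals("ok" + System.lineSeparator()));
    }
}
```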

Also, I changed the catch(RuntimeException) (which I dislike) in
NewInstanceInstantiator for catch(Exception) (which I dislike even
more, but seems to be more appropriate for what you were trying to do).
I am not quite sure if the "return null on error" is the best policy.
It's too C++ish ;-)

Leonardo

Joe Walnes

Nov 10, 2006, 11:12:24 AM

to objene...@googlegroups.com

On 11/10/06, Leonardo Mesquita <mrb...@gmail.com> wrote:
> Alright, I moved everything to the new API. I made sure every test was
> running, and fixed a subtle gotcha in TextReporterTest that was making
> it fail in Windows. Apparently, "\n" won't get you the same String as
> PrintStream.println().

Cool.

> Also, I changed the catch(RuntimeException) (which I dislike) in
> NewInstanceInstantiator for catch(Exception) (which I dislike even
> more, but seems to be more appropriate for what you were trying to do).
> I am not quite sure if the "return null on error" is the best policy.
> It's too C++ish ;-)

Yeah, it made me feel dirty too :)
