Pain point: getting the classes of a Java package

Frank Wierzbicki

Mar 20, 2009, 9:35:03 AM
to jvm-la...@googlegroups.com
I had a short exchange with John Rose about the lack of getClasses()
on java.lang.Package. Jython has a need for something like this, and
it has a somewhat awkward way of getting at this information. I
thought I'd flesh out the question and the use cases on this list,
since others here may have similar needs.

To start with, I think I understand why getClasses isn't "just there"
on java.lang.Package. Even within one classloader, packages can come
from a whole host of places, and since packages have merging behavior
in Java, it is really hard to know that you have all of the classes
for a particular package. Python gets around this worry since it has
a "winner take all" package importing semantic (the downside is that
you can't really use the reverse url namespaces favored by Java).

John asked if a getLoadedClasses isolated to one classloader would
work. Though this might help make things better (and I don't think
the single classloader is a problem), I don't think it will be enough
to replace Jython's current functionality.

So here are the most important use cases from Jython:

1. A Java package can be imported and then the classes can be listed.

2. You can "import *" from a Java package and drop all of the classes
from that package into the root namespace (though this is considered
poor style by both the Python and Jython communities).

>>> from java import nio
>>> dir(nio)
['Buffer', 'BufferOverflowException', 'BufferUnderflowException',
'ByteBuffer', 'ByteOrder', 'CharBuffer', 'DoubleBuffer',
'FloatBuffer', 'IntBuffer', 'InvalidMarkException', 'LongBuffer',
'MappedByteBuffer', 'ReadOnlyBufferException', 'ShortBuffer',
'__name__', 'channels', 'charset']
>>> from java.util import *
>>> x = ArrayList([1,2,3])

So I suspect that getLoadedClasses() would only give you those classes
that had already been used in some way and so wouldn't be sufficient
for these use cases. To continue the conversation -- would it be
possible to get a list or array of Strings that would represent the
classes in packages that a Classloader already knows about? I can do
the rest from there.

The method that Jython uses goes something like this:

The first time Jython is started, it walks through all of the jars and
class directories on the filesystem that it knows about (including,
for example, classes.jar, the jars in the ext dir, etc.) and creates a
filesystem cache of all of the package names and corresponding class
names. Each time it starts up again it will check the filesystem
timestamps vs. the cache timestamps and adjust the caches as needed.
It also updates the caches when jars or Java classes are added to
Jython's path at runtime. If it doesn't find a particular package it
will also check java.lang.Package.getPackages().
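
The scanning step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Jython's actual packagecache code: the class name PackageIndex and its methods are hypothetical, the index lives in memory rather than in a filesystem cache, and the timestamp comparison against an on-disk cache is left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not Jython's packagecache.
final class PackageIndex {

    /** Maps each package name to the simple names of the classes found in it. */
    static Map<String, SortedSet<String>> scan(List<Path> classpathEntries) {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (Path entry : classpathEntries) {
            try {
                if (Files.isDirectory(entry)) {
                    scanDirectory(entry, index);
                } else if (entry.toString().endsWith(".jar") && Files.isRegularFile(entry)) {
                    scanJar(entry, index);
                }
            } catch (IOException e) {
                throw new UncheckedIOException("cannot scan " + entry, e);
            }
        }
        return index;
    }

    private static void scanDirectory(Path root, Map<String, SortedSet<String>> index)
            throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(Files::isRegularFile)
                 // Normalize to '/'-separated resource paths, as used inside jars.
                 .map(f -> root.relativize(f).toString().replace(File.separatorChar, '/'))
                 .forEach(resource -> record(index, resource));
        }
    }

    private static void scanJar(Path jarPath, Map<String, SortedSet<String>> index)
            throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                record(index, entries.nextElement().getName());
            }
        }
    }

    /** Records a resource path such as "java/nio/ByteBuffer.class"; ignores everything else. */
    static void record(Map<String, SortedSet<String>> index, String resource) {
        if (!resource.endsWith(".class") || resource.startsWith("META-INF/")) return;
        String binaryName = resource.substring(0, resource.length() - ".class".length());
        int slash = binaryName.lastIndexOf('/');
        String simpleName = binaryName.substring(slash + 1);
        // Skip nested and anonymous classes (Outer$Inner) and package-info/module-info.
        if (simpleName.contains("$") || simpleName.endsWith("-info")) return;
        String pkg = slash < 0 ? "" : binaryName.substring(0, slash).replace('/', '.');
        index.computeIfAbsent(pkg, k -> new TreeSet<>()).add(simpleName);
    }
}
```

Calling PackageIndex.scan with the jars and class directories of interest returns a map from package name to class names, which is the kind of data needed to answer dir(nio) or "import *". Nothing is loaded while scanning; only file and entry names are read.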

For those that are interested, I do plan to get this mechanism in
shape for use outside of Jython -- there is a start to this extraction
here:
https://jvm-language-runtime.googlecode.com/svn/trunk/packagecache but
it will need lots of work yet to become a general solution usable
outside of Jython. In the next couple of months I hope to have time
to help get this code in shape for use by JRuby. Getting this
mechanism usable across two projects should be enough to get it
nicely isolated from language specifics. I started
this extraction at Charlie Nutter's request (the request is sadly
going on two years now).

-Frank

John Cowan

Mar 20, 2009, 11:37:17 AM
to jvm-la...@googlegroups.com
On Fri, Mar 20, 2009 at 9:35 AM, Frank Wierzbicki <fwier...@gmail.com> wrote:

> The first time Jython is started, it walks through all of the jars and
> class directories on the filesystem that it knows about (including,
> for example classes.jar, the jars in the ext dir, etc) and creates a
> filesystem cache of all of the package names and corresponding class
> names.

Great stuff! I could really use this part for my static compiler, so
it can discover what the current JRE contains (and therefore what your
code might reference). In my case it would be standalone, since I
don't want to bother with this every time I start up.

> https://jvm-language-runtime.googlecode.com/svn/trunk/packagecache

That's password protected.

--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures

Jim Baker

Mar 20, 2009, 11:41:05 AM
to jvm-la...@googlegroups.com

Neil Bartlett

Mar 20, 2009, 12:46:55 PM
to jvm-la...@googlegroups.com
One word of caution. It sounds like this approach would not work in a
variety of classloading scenarios, such as J(2)EE or OSGi, since you
make some assumptions about JAR files and the visibility of the
application "classpath" etc.

I do a lot of work in OSGi and the proposal from John to get all
classes in a package according to one classloader would probably work
well, since OSGi strongly discourages "split packages", i.e. packages
provided by more than one module and therefore classloader.

Regards,
Neil

John Wilson

Mar 20, 2009, 1:00:36 PM
to jvm-la...@googlegroups.com
2009/3/20 Neil Bartlett <njbar...@gmail.com>:

>
> One word of caution. It sounds like this approach would not work in a
> variety of classloading scenarios, such as J(2)EE or OSGi, since you
> make some assumptions about JAR files and the visibility of the
> application "classpath" etc.
>
> I do a lot of work in OSGi and the proposal from John to get all
> classes in a package according to one classloader would probably work
> well, since OSGi strongly discourages "split packages", i.e. packages
> provided by more than one module and therefore classloader.

Certainly it is not possible to answer the question "Please give me
the names of all the classes in this package" in all cases.

I have a ClassLoader which dynamically generates classes "on demand"
(it's producing adapter classes with characteristics determined by the
name of the class asked for). This results in a package with an
infinite number of classes.

John Wilson

Colin Walters

Mar 20, 2009, 1:28:01 PM
to JVM Languages
On Mar 20, 9:35 am, Frank Wierzbicki <fwierzbi...@gmail.com> wrote:
> I had a short exchange with John Rose about the lack of getClasses()
> on java.lang.Package.  Jython has a need for something like this, and
> it has a somewhat awkward way of getting at this information.  I
> thought I'd flesh out the question and the use cases on this list,
> since others here may have similar needs.

Can integration be done in a nicer way on top of Jigsaw?

Frank Wierzbicki

Mar 20, 2009, 5:08:58 PM
to jvm-la...@googlegroups.com
On Fri, Mar 20, 2009 at 12:46 PM, Neil Bartlett <njbar...@gmail.com> wrote:
>
> One word of caution. It sounds like this approach would not work in a
> variety of classloading scenarios, such as J(2)EE or OSGi, since you
> make some assumptions about JAR files and the visibility of the
> application "classpath" etc.
Agreed -- I know well that Jython's method has some powerful limitations.

> I do a lot of work in OSGi and the proposal from John to get all
> classes in a package according to one classloader would probably work
> well, since OSGi strongly discourages "split packages", i.e. packages
> provided by more than one module and therefore classloader.

I think John's method could work well for us. Given that the name is
"getLoadedClasses" I worry about classes that could be reached by the
class loader but have not yet been loaded. I am on shaky ground here
I must admit, but my understanding is that classes are not guaranteed
to be loaded until they are used -- I presume that this means some
code somewhere has triggered a ClassLoader#loadClass -- so would
looking for them in this manner kick them into being loaded?

The javadoc for java.lang.ClassLoader#loadClass gives me hope:

Returns the class with the given name if this loader has been recorded
by the Java virtual machine as an initiating loader of a class with
that name.

-Frank

Charles Oliver Nutter

Mar 20, 2009, 5:55:12 PM
to jvm-la...@googlegroups.com
Frank Wierzbicki wrote:
> On Fri, Mar 20, 2009 at 12:46 PM, Neil Bartlett <njbar...@gmail.com> wrote:
>> One word of caution. It sounds like this approach would not work in a
>> variety of classloading scenarios, such as J(2)EE or OSGi, since you
>> make some assumptions about JAR files and the visibility of the
>> application "classpath" etc.
> Agreed -- I know well that Jython's method has some powerful limitations.
>
>> I do a lot of work in OSGi and the proposal from John to get all
>> classes in a package according to one classloader would probably work
>> well, since OSGi strongly discourages "split packages", i.e. packages
>> provided by more than one module and therefore classloader.
> I think John's method could work well for us. Given that the name is
> "getLoadedClasses" I worry about classes that could be reached by the
> class loader but have not yet been loaded. I am on shaky ground here
> I must admit, but my understanding is that classes are not guaranteed
> to be loaded until they are used -- I presume that this means some
> code somewhere has triggered a ClassLoader#loadClass -- so would
> looking for them in this manner kick them into being loaded?

I'm not sure how useful getLoadedClasses would even be. Most classes
load so late that you'd never see them in this list, and since the
purpose of doing the import is to get access to classes you haven't
accessed yet...they're almost certain to not be loaded.

I guess I don't see a good solution other than the Jython packagecache,
since in all cases the purpose of getting the list of classes is to
discover classes we haven't loaded yet. There's simply no way for
anything that depends on classes being loaded *first* to give us a
useful list.

- Charlie

Frank Wierzbicki

Mar 20, 2009, 6:33:29 PM
to jvm-la...@googlegroups.com
On Fri, Mar 20, 2009 at 5:08 PM, Frank Wierzbicki <fwier...@gmail.com> wrote:
> The javadoc for java.lang.ClassLoader#loadClass gives me hope:
>
> Returns the class with the given name if this loader has been recorded
> by the Java virtual machine as an initiating loader of a class with
> that name.
Sorry, that was ClassLoader#findLoadedClass
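
For reference, findLoadedClass is a protected method, so it can only be reached from a ClassLoader subclass. The sketch below (LoadProbe is a hypothetical name, used only for illustration) shows one way to expose it. Per its javadoc the method only reports what the JVM has already recorded for this loader and returns null otherwise, so asking does not by itself cause a class to be loaded.

```java
// Hypothetical probe class for illustration only.
final class LoadProbe extends ClassLoader {

    LoadProbe(ClassLoader parent) {
        super(parent);
    }

    /**
     * True if the JVM has recorded this loader as an initiating loader
     * of a class with the given binary name. This is a pure query.
     */
    boolean hasLoaded(String binaryName) {
        return findLoadedClass(binaryName) != null;
    }
}
```

A freshly created LoadProbe answers false for every name, including classes that its parent could load without trouble, which is why a query of this kind cannot discover classes that have not been used yet.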

-Frank

John Rose

Mar 20, 2009, 8:57:14 PM
to jvm-la...@googlegroups.com
On Mar 20, 2009, at 2:55 PM, Charles Oliver Nutter wrote:

> I'm not sure how useful getLoadedClasses would even be. Most classes
> load so late that you'd never see them in this list, and since the
> purpose of doing the import is to get access to classes you haven't
> accessed yet...they're almost certain to not be loaded.

Yes, if this is for emulating things like import pkg.*,
getLoadedClasses is pretty useless.

It sounds like you mainly want the class loader to confess what it
knows about finding classes, in a way that lets you predict (more or
less) whether it would load a class of some given name. (Plus a way
of enumerating such names.)

Class loaders can do arbitrary magic, so such an API would have to
have a big Caveat Emptor on it. Still, a few possibilities suggest
themselves; see below.

I wonder if there is some way to cobble together the functionality
from ClassLoader.getResources(pkgName) plus some new functionality (or
guarantees) on directory URLs and listing. Again, caveats are needed,
but something useful might be possible.
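
The getResources idea can be approximated today for the two most common URL kinds. What follows is a rough sketch under assumptions, not a proposed API: ResourceLister and classesIn are hypothetical names, only "file" and "jar" URLs are handled, and a jar that was built without directory entries may not be found by getResources at all. The result is advisory in exactly the sense described above.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Enumeration;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; the result is advisory only.
final class ResourceLister {

    /** Lists simple class names the loader appears able to find directly in pkgName. */
    static SortedSet<String> classesIn(ClassLoader loader, String pkgName) {
        String path = pkgName.replace('.', '/');
        SortedSet<String> names = new TreeSet<>();
        try {
            Enumeration<URL> urls = loader.getResources(path);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                if ("file".equals(url.getProtocol())) {
                    listDirectory(new File(url.toURI()), names);
                } else if ("jar".equals(url.getProtocol())) {
                    listJar((JarURLConnection) url.openConnection(), names);
                }
                // Other protocols are skipped: there is no general way to list them.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    private static void listDirectory(File dir, SortedSet<String> names) {
        File[] files = dir.listFiles();
        if (files == null) return; // not a directory, or unreadable
        for (File f : files) {
            if (f.isFile()) addIfClass(f.getName(), names);
        }
    }

    private static void listJar(JarURLConnection conn, SortedSet<String> names)
            throws IOException {
        // Avoid the shared cache so that closing the JarFile cannot affect other users.
        conn.setUseCaches(false);
        String entry = conn.getEntryName();
        String prefix = entry == null ? "" : (entry.endsWith("/") ? entry : entry + "/");
        try (JarFile jar = conn.getJarFile()) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Keep only entries directly inside the package, not in subpackages.
                if (name.startsWith(prefix) && name.indexOf('/', prefix.length()) < 0) {
                    addIfClass(name.substring(prefix.length()), names);
                }
            }
        }
    }

    private static void addIfClass(String fileName, SortedSet<String> names) {
        if (fileName.endsWith(".class") && !fileName.contains("$")) {
            names.add(fileName.substring(0, fileName.length() - ".class".length()));
        }
    }
}
```

Because getResources returns every matching location the loader knows of, the names from all of them are merged, which mirrors the package-merging behavior discussed earlier in the thread.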

-- John

/**
 * Disclose the source(s) that this loader is likely to use in the
 * future for resolving class names.
 * The result is purely advisory, as the loader may actually use
 * different sources when actually requested to load a class.
 */
List<URL> ClassLoader.getSourceURLs();

/**
 * Given a package qualifier, return the names (minus the qualifier)
 * of all classes which this loader is likely to be able to load in
 * the future.
 * If includeSubpackages, include in the list the names (partially
 * qualified) of classes in subpackages also.
 * Passing the arguments ("", true) will produce a list of all
 * loadable classes in all packages, including the anonymous package.
 * The result is purely advisory, as the loader may actually produce
 * different names when actually requested to load a class.
 */
List<String> ClassLoader.listPackageClasses(String pkgName,
        boolean includeSubpackages);

(Actually, List might have to be Enumeration for coherence with
existing CL methods.)

Frank Wierzbicki

Mar 20, 2009, 10:05:17 PM
to jvm-la...@googlegroups.com
On Fri, Mar 20, 2009 at 8:57 PM, John Rose <John...@sun.com> wrote:

> /** Disclose the source(s) that this loader is likely to use in the
> future for resolving class names.
>  *  The result is purely advisory, as the loader may actually use
> different sources when actually requested to load a class.
>  */
> List<URL> ClassLoader.getSourceURLs();
>
> /** Given a package qualifier, return the names (minus the qualifier)
> of all classes which this loader is likely to be able to load in the
> future.
>  *  If includeSubpackages, include in the list the names (partially
> qualified) of classes in subpackages also.
>  *  Passing the arguments ("", true) will produce a list of all
> loadable classes in all packages, including the anonymous package.
>  *  The result is purely advisory, as the loader may actually produce
> different names when actually requested to load a class.
>  */
> List<String> ClassLoader.listPackageClasses(String package, boolean
> includeSubpackages);
>
> (Actually, List might have to be Enumeration for coherence with
> existing CL methods.)

That would be just about perfect (caveats are fine and expected), and
it would not take very much bending at all to get the packagecache
mechanism to provide a similar API for the cases that it covers (so
that older JDKs could use something like this).
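
As a rough illustration of how a prebuilt package index could answer a listPackageClasses-style query, here is a minimal sketch. IndexedClassLister is a hypothetical name, the index is assumed to be a plain map from package name to simple class names, and the semantics follow the javadoc quoted above as closely as I can read it; it is not the packagecache implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;

// Hypothetical adapter for illustration; the index format is an assumption.
final class IndexedClassLister {

    private final Map<String, SortedSet<String>> index;

    IndexedClassLister(Map<String, SortedSet<String>> index) {
        this.index = index;
    }

    /**
     * Returns class names in pkgName without the qualifier. With
     * includeSubpackages, classes in subpackages are added with names
     * qualified relative to pkgName, e.g. "charset.Charset" for "java.nio".
     */
    List<String> listPackageClasses(String pkgName, boolean includeSubpackages) {
        List<String> result = new ArrayList<>();
        String prefix = pkgName.isEmpty() ? "" : pkgName + ".";
        for (Map.Entry<String, SortedSet<String>> e : index.entrySet()) {
            String pkg = e.getKey();
            if (pkg.equals(pkgName)) {
                result.addAll(e.getValue());
            } else if (includeSubpackages && pkg.startsWith(prefix)) {
                String relative = pkg.substring(prefix.length());
                for (String className : e.getValue()) {
                    result.add(relative + "." + className);
                }
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Passing ("", true) walks the whole index and yields fully qualified names for everything except classes in the unnamed package, which come back bare.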

-Frank

Charles Oliver Nutter

Mar 20, 2009, 10:16:24 PM
to jvm-la...@googlegroups.com
Frank Wierzbicki wrote:
> That would be just about perfect (caveats are fine and expected), and
> it would not take very much bending at all to get the packagecache
> mechanism to provide a similar API for the cases that it covers (so
> that older JDKs could use something like this).

Yeah, reading John's response made me think "hey, that's essentially
what packagecache gives us, in another form". Walking resources and
having access to the same smarts about what's a class and what isn't
would get us most of the way, I think.

For what it's worth, JRuby handles import pkg.* by adding a
const_missing hook to that namespace that handles missing constants by
trying to classload them, prepending the package name. So it's sort of a
lazy way to add a package into the search sequence for a constant. We'd
rather use something like packagecache, since adding hooks like
const_missing is a little invasive (and there's always a possibility
someone else has installed their own we would be interfering with).
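
The lazy lookup described above can be transposed to Java roughly as follows. This is a sketch of the general idea only, not JRuby's const_missing code: LazyImports and its methods are hypothetical names, and the lookup simply tries each imported package in order until a class loads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of a lazy "import pkg.*"; not JRuby's implementation.
final class LazyImports {

    private final ClassLoader loader;
    private final List<String> packages = new ArrayList<>();

    LazyImports(ClassLoader loader) {
        this.loader = loader;
    }

    /** Registers a package to be searched when an unknown simple name is looked up. */
    void importAll(String pkgName) {
        packages.add(pkgName);
    }

    /** Tries pkg + "." + simpleName for each imported package, in import order. */
    Optional<Class<?>> resolve(String simpleName) {
        for (String pkg : packages) {
            try {
                // initialize=false: find the class without running its static initializers.
                Class<?> found = Class.forName(pkg + "." + simpleName, false, loader);
                return Optional.of(found);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not in this package (or not loadable); try the next one.
            }
        }
        return Optional.empty();
    }
}
```

The trade-off is the one noted in the message: no list of classes is ever needed, but nothing can be enumerated either, so dir()-style listings and eager "import *" remain out of reach with this approach alone.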

- Charlie

Frank Wierzbicki

Mar 20, 2009, 10:31:15 PM
to jvm-la...@googlegroups.com
On Fri, Mar 20, 2009 at 10:16 PM, Charles Oliver Nutter
<charles...@sun.com> wrote:
> For what it's worth, JRuby handles import pkg.* by adding a
> const_missing hook to that namespace that handles missing constants by
> trying to classload them, prepending the package name. So it's sort of a
> lazy way to add a package into the search sequence for a constant. We'd
> rather use something like packagecache, since adding hooks like
> const_missing is a little invasive (and there's always a possibility
> someone else has installed their own we would be interfering with).
That's interesting, I might want to take a look at JRuby's
const_missing hook at some point. We allow users to turn the
packagecache mechanism off (there are lots of situations where it is
undesirable), but this causes code like

from java.nio import *

to work when it is on, but fail in an ugly way when off. It would be
nice to have a backup plan.

-Frank

Brian Frank

Mar 21, 2009, 9:11:06 AM
to JVM Languages
> The first time Jython is started, it walks through all of the jars and
> class directories on the filesystem that it knows about (including,
> for example classes.jar, the jars in the ext dir, etc) and creates a
> filesystem cache of all of the package names and corresponding class
> names.

I feel your pain - we are doing the exact same thing for Fan for the
Java FFI. Although I want to know all the packages too - basically a
model of the entire type namespace (limited of course to classes which
have been statically compiled).

Frank Wierzbicki

Mar 21, 2009, 9:24:11 AM
to jvm-la...@googlegroups.com
On Sat, Mar 21, 2009 at 9:11 AM, Brian Frank <brian...@gmail.com> wrote:
> I feel your pain - we are doing the exact same thing for Fan for the
> Java FFI.  Although I want to know all the packages too - basically a
> model of the entire type namespace (limited of course to classes which
> have been statically compiled).
If I'm understanding you, Jython does the same (it keeps a tree of
PyJavaPackage -- our internal representation of java.lang.Package).

-Frank
