Guice, Guava, and ProGuard

417 views
Skip to first unread message

Stuart McCulloch

unread,
Jul 5, 2011, 3:20:52 PM7/5/11
to google...@googlegroups.com
Hi folks,

Background:

The Guice project tree used to include a sub-set of Guava source code, manually refactored under "core/src/com/google/inject/internal/util". Guava is an internal implementation detail, just like CGLIB and ASM, which is why it is embedded under an internal package inside Guice. Recently this code was removed and replaced with a dependency to guava-r09.jar whose classes are now refactored and shrunk by jarjar at build time. Unfortunately, because of a limitation in jarjar (see an earlier discussion under http://code.google.com/p/google-guice/issues/detail?id=264) it has trouble shrinking Guava down to the subset of classes which are strictly required by Guice. This means that the current Guice jar built from trunk is now just over 1Mb compared to just under 700Kb for Guice 3, which might not be an issue for most desktop developers but could be a concern for mobile developers.

Enter ProGuard:

There are several jar shrinking tools around, one of the better ones being ProGuard http://proguard.sourceforge.net/ However it doesn't appear to have the capability to preserve the classname when you ask it to refactor packages, instead it renames them to short names such as A, B, etc. which isn't good for debugging. The best solution seems to be a combination of ProGuard (for shrinking) and jarjar (for package refactoring) and with a bit of tweaking I can get the Guice jar down to around 830kb (still not as good as when we manually copied in the subset of Guava source, but not bad).

Fragility Concerns:

While adding ProGuard to my prototype Guice build I ran into a couple of cases where extensions and tests relied on Guava methods that weren't used in Guice core. Because ProGuard (and jarjar) operate on Guice core before the tests or extensions are built, it has no knowledge that these methods should be retained, which leads to odd test failures. While I can add explicit instructions to tell ProGuard to keep these methods and classes around, I'm concerned that this makes the build much more fragile. An extension developer could easily depend on a new method in Guava (since it's on the project classpath) and this new method wouldn't be repackaged inside Guice unless the ProGuard instructions were updated (or unless Guice core just happened to also use the same method).

Note this wasn't an issue when we manually included the sub-set of Guava source code, because the extensions built against the classes under "core/src/com/google/inject/internal/util" and not against the whole of Guava.

Possible Options?

1) Do nothing (and live with the larger jar size where we basically embed all of Guava)

2) Add ProGuard to the core build (and live with having to update the instructions when extensions use new Guava methods)

3) Revert to including the sub-set of Guava source code (harder to update, but perhaps more manageable / less fragile?)

4) Build Guice core + extensions without any jarjar'ing or ProGuard'ing, only do that as a final distribution / packaging step

5) Make Guava an external dependency, since a lot of people already use Guava (leads to potential versioning issues)

Note option 4) would be moving more towards the model where everything is built as simple Java projects and we only apply jarjar, etc. to get a "nodeps" binary (to borrow the phrase from CGLIB) at the end. The downside of this approach is that the unit tests would run against the un-jarjar'd binaries, but there's no reason why we couldn't do an extra round of tests against the final jarjar'd artifact. The other worry is that people might take the interim jars which haven't been jarjar'd by mistake (but then again there have been requests in the past from users to produce such artifacts, see http://code.google.com/p/google-guice/issues/detail?id=289).

Would like to hear people's thoughts on these options, since I don't think there's an obvious "correct" choice*

--
Cheers, Stuart

(* of course if everyone used OSGi then we wouldn't need to worry about embedding and could just declare we needed the appropriate versions of ASM/CGLIB/Guava ;)

Sam Berlin

unread,
Jul 5, 2011, 3:47:30 PM7/5/11
to google...@googlegroups.com
Thanks for looking into this, Stuart!  Some replies inline..

On Tue, Jul 5, 2011 at 3:20 PM, Stuart McCulloch <mcc...@gmail.com> wrote:
Hi folks,

Background:

The Guice project tree used to include a sub-set of Guava source code, manually refactored under "core/src/com/google/inject/internal/util". Guava is an internal implementation detail, just like CGLIB and ASM, which is why it is embedded under an internal package inside Guice. Recently this code was removed and replaced with a dependency to guava-r09.jar whose classes are now refactored and shrunk by jarjar at build time. Unfortunately, because of a limitation in jarjar (see an earlier discussion under http://code.google.com/p/google-guice/issues/detail?id=264) it has trouble shrinking Guava down to the subset of classes which are strictly required by Guice. This means that the current Guice jar built from trunk is now just over 1Mb compared to just under 700Kb for Guice 3, which might not be an issue for most desktop developers but could be a concern for mobile developers.

Enter ProGuard:

There are several jar shrinking tools around, one of the better ones being ProGuard http://proguard.sourceforge.net/  However it doesn't appear to have the capability to preserve the classname when you ask it to refactor packages, instead it renames them to short names such as A, B, etc. which isn't good for debugging. The best solution seems to be a combination of ProGuard (for shrinking) and jarjar (for package refactoring) and with a bit of tweaking I can get the Guice jar down to around 830kb (still not as good as when we manually copied in the subset of Guava source, but not bad).

I haven't messed around with ProGuard much, but looking at the reference page @ http://proguard.sourceforge.net/index.html#/manual/refcard.html shows a few options that seem like they'd get the job done for not renaming: --keepnames (and a few others with that prefix), --dontoptimize, --dontobfuscate, --keeppackagenames, --keepparameternames.
 

Fragility Concerns:

While adding ProGuard to my prototype Guice build I ran into a couple of cases where extensions and tests relied on Guava methods that weren't used in Guice core. Because ProGuard (and jarjar) operate on Guice core before the tests or extensions are built, it has no knowledge that these methods should be retained, which leads to odd test failures. While I can add explicit instructions to tell ProGuard to keep these methods and classes around, I'm concerned that this makes the build much more fragile. An extension developer could easily depend on a new method in Guava (since it's on the project classpath) and this new method wouldn't be repackaged inside Guice unless the ProGuard instructions were updated (or unless Guice core just happened to also use the same method).

Note this wasn't an issue when we manually included the sub-set of Guava source code, because the extensions built against the classes under "core/src/com/google/inject/internal/util" and not against the whole of Guava.

Possible Options?

1)  Do nothing (and live with the larger jar size where we basically embed all of Guava)

If we need to, I think this is a valid choice.  The jar for mobile folks will be smaller anyway, since it won't include cglib or asm.  Also, the only bits of Guava that are kept are any referenced (transitively) classfiles.  In practice, that ends up as most of com.google.common.base & com.google.common.collect, plus a few scattered com.google.common.io classes.  For the non-mobile version, it's about a ~400k increase the main jar size.
 

2)  Add ProGuard to the core build (and live with having to update the instructions when extensions use new Guava methods)

IMO, this is the best option.  The extensions aren't updated all that often, and it would be rare that one needs to use a method/class that the core isn't using.  An alternative here would be to make the extensions ProGuard/jarjar their own Guavas.  That would slightly increase the size of extensions that use them, but not by much.
 

3)  Revert to including the sub-set of Guava source code (harder to update, but perhaps more manageable / less fragile?)

Definitely -1 here.  Pointing to Guava gives us a lot of nice things, such as making it a whole lot easier to update when new Guava releases come out (such as the soon-to-be-shipped r10 which removes FinalizableReferenceQueue entirely, fixing issue 288 for us just by dropping in a new jar).
 

4)  Build Guice core + extensions without any jarjar'ing or ProGuard'ing, only do that as a final distribution / packaging step

This sounds like an an implementation detail and we'd still have to resolve #2 or #3... and I'd very much like to avoid people accidentally building against the wrong version.
 

5)  Make Guava an external dependency, since a lot of people already use Guava (leads to potential versioning issues)

I'm -0.5 on this, because I don't really want to be tied to Guava versions.  Depending on it directly is kind of like making it part of the public API, and I'd rather not do that.

sam
 

Note option 4) would be moving more towards the model where everything is built as simple Java projects and we only apply jarjar, etc. to get a "nodeps" binary (to borrow the phrase from CGLIB) at the end. The downside of this approach is that the unit tests would run against the un-jarjar'd binaries, but there's no reason why we couldn't do an extra round of tests against the final jarjar'd artifact. The other worry is that people might take the interim jars which haven't been jarjar'd by mistake (but then again there have been requests in the past from users to produce such artifacts, see http://code.google.com/p/google-guice/issues/detail?id=289).

Would like to hear people's thoughts on these options, since I don't think there's an obvious "correct" choice*

--
Cheers, Stuart

(* of course if everyone used OSGi then we wouldn't need to worry about embedding and could just declare we needed the appropriate versions of ASM/CGLIB/Guava ;)

--
You received this message because you are subscribed to the Google Groups "google-guice" group.
To post to this group, send email to google...@googlegroups.com.
To unsubscribe from this group, send email to google-guice...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-guice?hl=en.


James Moore

unread,
Jul 5, 2011, 4:15:57 PM7/5/11
to google...@googlegroups.com
Speaking as a Guice + Guava (on Android) user, I'd vote for #5, make
Guava an external dependency.

1) Do nothing (and live with the larger jar size where we basically
embed all of Guava)

I think this is a problem from the Android point of view. What
happens when you're still on Guava r09, and I want r10? (That's the
situation right now, I think, since I'm switching to trunk Guava/r10 +
Guice). That's a lot of extra duplicated code to have installed.

--
James Moore
ja...@restphone.com
http://jamesmoorecode.blogspot.com/

Stuart McCulloch

unread,
Jul 5, 2011, 4:26:03 PM7/5/11
to google...@googlegroups.com
On 5 Jul 2011, at 20:47, Sam Berlin wrote:

Thanks for looking into this, Stuart!  Some replies inline..

On Tue, Jul 5, 2011 at 3:20 PM, Stuart McCulloch <mcc...@gmail.com> wrote:
Hi folks,

Background:

The Guice project tree used to include a sub-set of Guava source code, manually refactored under "core/src/com/google/inject/internal/util". Guava is an internal implementation detail, just like CGLIB and ASM, which is why it is embedded under an internal package inside Guice. Recently this code was removed and replaced with a dependency to guava-r09.jar whose classes are now refactored and shrunk by jarjar at build time. Unfortunately, because of a limitation in jarjar (see an earlier discussion under http://code.google.com/p/google-guice/issues/detail?id=264) it has trouble shrinking Guava down to the subset of classes which are strictly required by Guice. This means that the current Guice jar built from trunk is now just over 1Mb compared to just under 700Kb for Guice 3, which might not be an issue for most desktop developers but could be a concern for mobile developers.

Enter ProGuard:

There are several jar shrinking tools around, one of the better ones being ProGuard http://proguard.sourceforge.net/  However it doesn't appear to have the capability to preserve the classname when you ask it to refactor packages, instead it renames them to short names such as A, B, etc. which isn't good for debugging. The best solution seems to be a combination of ProGuard (for shrinking) and jarjar (for package refactoring) and with a bit of tweaking I can get the Guice jar down to around 830kb (still not as good as when we manually copied in the subset of Guava source, but not bad).

I haven't messed around with ProGuard much, but looking at the reference page @ http://proguard.sourceforge.net/index.html#/manual/refcard.html shows a few options that seem like they'd get the job done for not renaming: --keepnames (and a few others with that prefix), --dontoptimize, --dontobfuscate, --keeppackagenames, --keepparameternames.

yeah, those either keep the entire name (ie. package + simple name) or let you relocate classes to a different package but doesn't keep the simple class name

I've not (yet) found a ProGuard option that can match what jarjar does, which is relocate the package but retain the simple name (which helps when debugging)

Yuri de Wit

unread,
Jul 5, 2011, 5:42:35 PM7/5/11
to google...@googlegroups.com
+1 for #5. There are legit cases for needing different dependency
versions in larger projects and having that flexibility is a big plus.
Besides, for the default case, a maven or regular zip distributions
could automatically point or include the official, tested Guava
version for a given Guice version.

-- yuri

cowwoc

unread,
Jul 5, 2011, 5:59:08 PM7/5/11
to google...@googlegroups.com

I vote for making the common case easy and advanced case possible.
Large projects absolutely need the flexibility to choose which
dependency versions gets used. Simple projects prefer an all-in-use JAR
file. I believe #4 will make everyone happy.

When Maven is thrown into the picture it's even more difficult for
users to screw things up. You get the correct versions of dependencies
by default but you can override them if you need to. Guice should
provide the same model...

Gili

Sam Berlin

unread,
Jul 5, 2011, 6:03:03 PM7/5/11
to google...@googlegroups.com
I don't understand this need for large projects to know which version gets used.  It's an implementation detail within Guice.  There won't be any version conflicts.  Why does a "large project" have to be able to change the version?

sam

To unsubscribe from this group, send email to google-guice+unsubscribe@googlegroups.com.

For more options, visit this group at http://groups.google.com/group/google-guice?hl=en.


--
You received this message because you are subscribed to the Google Groups "google-guice" group.
To post to this group, send email to google...@googlegroups.com.
To unsubscribe from this group, send email to google-guice+unsubscribe@googlegroups.com.

James Moore

unread,
Jul 5, 2011, 7:29:10 PM7/5/11
to google...@googlegroups.com
On Tue, Jul 5, 2011 at 2:59 PM, cowwoc <cow...@bbs.darktech.org> wrote:
>
>    I vote for making the common case easy and advanced case possible. Large
> projects absolutely need the flexibility to choose which dependency versions
> gets used. Simple projects prefer an all-in-use JAR file. I believe #4 will
> make everyone happy.

I think I would make a case for a different "common case." To me, the
common case is that Guice depends on Guava. That's an incredibly
common relationship. I certainly don't want every library I use to
start shipping all of its dependencies compiled into that library,
obfuscated in some way. i just want to know the list of things that
LibraryX depends on.

Providing other options (a single jar built with special sauce so you
don't need other jars) is fine, but it's special and extra, not a
default.

Guice in particular I think doesn't need this. Are there users who a)
understand and want the complexities and utility of Guice, and b) are
clueless about libraries with dependencies? I'd be surprised if there
were anyone in that set.

Yuri de Wit

unread,
Jul 5, 2011, 7:22:25 PM7/5/11
to google...@googlegroups.com
If all the dependencies we use in our apps were to embed their own
private dependencies there would be a lot of code duplication. True,
that would not be a show stopper nor a problem in some cases, but I
guess I don't necessarily see a lot of benefit in even worrying about
that. If the size becomes a problem for my app in my context, I am
still able to perform my own jarjar/proguard on the integrated
dependency tree to squeeze the size of the final app.
-- yuri

>>>> google-guice...@googlegroups.com.


>>>> For more options, visit this group at
>>>> http://groups.google.com/group/google-guice?hl=en.
>>>>
>>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "google-guice" group.
>> To post to this group, send email to google...@googlegroups.com.
>> To unsubscribe from this group, send email to

>> google-guice...@googlegroups.com.


>> For more options, visit this group at
>> http://groups.google.com/group/google-guice?hl=en.
>>
>
> --
> You received this message because you are subscribed to the Google Groups
> "google-guice" group.
> To post to this group, send email to google...@googlegroups.com.
> To unsubscribe from this group, send email to

> google-guice...@googlegroups.com.

limpb...@gmail.com

unread,
Jul 5, 2011, 8:44:23 PM7/5/11
to google-guice
On Jul 5, 12:20 pm, Stuart McCulloch <mccu...@gmail.com> wrote:
> 5)  Make Guava an external dependency, since a lot of people already use Guava (leads to potential versioning issues)

If we avoid @Beta APIs, the versioning downside of a direct dependency
on Guava is quite small. The Guava Authors have committed to keeping
all of their non-beta APIs for deprecation+18 months.

I'm currently leaning towards shipping Guava as a public dependency.
Where that's a problem the application itself should use ProGuard or
jarjar to shrink their complete binary.

cowwoc

unread,
Jul 5, 2011, 9:05:23 PM7/5/11
to google...@googlegroups.com

I agree 100%!

To answer Sam's question about why controlling dependencies is
important, it's two-fold:

1) I don't trust JarJar. It's a great tool, but like Proguard, it'll
never work perfectly. There will always be problems where embedded
dependencies interfere with libraries outside of Guice. The only way to
avoid this is to JarJar all dependencies recursively and even then you
have to worry about non-class resource files (stuff in or around
manifest files), classes loaded by Reflection, etc.

2) We should be able to upgrade dependencies for security/performance
reasons. We don't have the luxury of waiting for the next Guice release
(averaging once every 2 years) to fix bugs in dependencies.

But most importantly of all, it's not clear to me what problem
you're trying to solve here. As James mentioned, it is highly unlikely
that users will simultaneously understand the complexities of Guice and
be clueless about libraries with external dependencies, especially when
Guice's deployment model is the exception to the rule. Most libraries do
not embed their dependencies. In the many years of reading mailing list
posts I can only recall about three incidents that were linked to mixing
incompatible dependencies. There's no need to baby us. Honest :)

Gili

Stuart McCulloch

unread,
Jul 5, 2011, 9:15:01 PM7/5/11
to google...@googlegroups.com
On 6 Jul 2011, at 01:44, "je...@swank.ca" <limpb...@gmail.com> wrote:

> On Jul 5, 12:20 pm, Stuart McCulloch <mccu...@gmail.com> wrote:
>> 5) Make Guava an external dependency, since a lot of people already use Guava (leads to potential versioning issues)
>
> If we avoid @Beta APIs, the versioning downside of a direct dependency
> on Guava is quite small. The Guava Authors have committed to keeping
> all of their non-beta APIs for deprecation+18 months.
>
> I'm currently leaning towards shipping Guava as a public dependency.

Yes, that would also be my preferred option - it seems odd to hide away a complete copy of Guava inside Guice when a lot of people will combine Guava+Guice anyway

> Where that's a problem the application itself should use ProGuard or
> jarjar to shrink their complete binary.
>

Colin Decker

unread,
Jul 5, 2011, 9:25:03 PM7/5/11
to google...@googlegroups.com
I'm in favor of this as well. Guava wasn't around when Guice first came out, but now that it is it makes sense as a public dependency. It's consistent with most libraries, will prevent redundancy for Guice/Guava users and, as mentioned, ProGuard on the application can eliminate unused parts of Guava. If Guice already depended on Guava directly, *users* would be able to fix issue 288 just by upgrading to Guava r10 when it comes out, with no need to do anything on the Guice end! Guice does need to avoid @Beta APIs of course.

-- 
Colin


Bob Lee

unread,
Jul 5, 2011, 9:30:42 PM7/5/11
to google...@googlegroups.com
+1 to making Guava a public dependency! It sounds like we're almost all in agreement.

Bob

Alen Vrečko

unread,
Jul 6, 2011, 6:14:45 AM7/6/11
to google-guice
The Nr.1 thing I enjoy about Guice is that is an AK-47 type of
library, just put it on the CP and it will work, no hassle. It just
Works. Fool proof.

Adding an external library will make it less It Just Works - It is
Fool Proof kind of thing. Maybe I am an extreme case here, but I'd
very much prefer X KB more stuff and 0% chance of dependency problems
than X KB less stuff and greater than 0% chance of dependency
problems.

I'd go with Option 4 (patch jarjar/proguard if needed to tweak
things) ;)

Cheers
Alen

Sam Berlin

unread,
Jul 6, 2011, 10:12:19 AM7/6/11
to google...@googlegroups.com
So it does look like a tidal wave of support for making it a public dependency, with a few lone dissenters.  I'll try to lay out the reasons I think making it a public dependency isn't as great as it sounds, but that said -- I won't stand in the way of doing it.

First, the only reasons why a public dependency is good: It reduces code duplication.

To me, that's the only reason I understand.  I frankly don't understand any of the other reasons.  Here's a few that were listed:

a) Wanting to upgrade the packaged dependencies.
    In my head, this is equivalent to saying, "I'd like to take the Guice source and modify it a bit."  The fact it uses Guava is entirely, 100% an implementation detail.  Why would you ever need to upgrade the packaged dependency?  It's not something that should matter to a user.

b) A reason for wanting to upgrade dependencies: security/performance.
   Again, this seems like it applies equally true to the core Guice code itself.  If you're going to be inspecting implementation details of a library, you may as well go into the source & build your own distribution of it.

c) Not trusting Jarjar.
    Sorry, this just isn't a valid reason to me.  JarJar works.  It does exactly what you're saying: recursive dependency analysis.  It reads the bytecode for classes you're asking to move and makes sure that any referenced classes are found in the right place.  Of course it fails on reflection, but you just don't use it for things that need reflection.  I haven't worked with ProGuard much, but AFAIK, a whole ton of apps use it, and it works fine.

d) A complete copy of Guava is inside Guice
  
If this were the case, I'd be inclined to go along with the prevailing sentiment.. but it's not true.  Guava itself is 1.1mb.  Guice after the change to use JarJar'd Guava is ~1mb.  That's clearly not the whole of Guava.  In fact, it's just bits of two packages:  com.google.common.{base,collect} and two classes from com.google.common.io.  A tiny fraction of Guava.

The reasons why I think exposing the dependency is a bad idea:

1) Where do we stop?  Should we expose cglib & asm as real dependencies that can be upgraded too (real question)?  That exposes the means by which Guice is doing its bytecode manipulation for grabbing line numbers & for AOP.  If Guice decides to change how it does it, then people may have a false dependency/requirement on cglib & asm.

 2) It's as if com.google.inject.internal APIs were exposed -- its something that Guice should be free to change without any user having to take any action at all, but now we're forcing them to think & act on it.

 3) Guice has averaged 2 year releases (for better or worse).  Guava gives ~1.5 year leeway before removing any APIs  Should we be forced to release a new version of Guice because a dependency changed its API?  If we don't release a new one, that gets into some severe versioning hell.

sam


On Tue, Jul 5, 2011 at 9:30 PM, Bob Lee <craz...@crazybob.org> wrote:
+1 to making Guava a public dependency! It sounds like we're almost all in agreement.

Bob

--

cowwoc

unread,
Jul 6, 2011, 10:40:27 AM7/6/11
to google...@googlegroups.com
Hi Sam,

On 06/07/2011 10:12 AM, Sam Berlin wrote:
> 1) Where do we stop? Should we expose cglib & asm as real
> dependencies that can be upgraded too (real question)? That exposes
> the means by which Guice is doing its bytecode manipulation for
> grabbing line numbers & for AOP. If Guice decides to change how it
> does it, then people may have a false dependency/requirement on cglib
> & asm.

For Maven users this is a non-issue. If Guice removes a dependency
and the user's code does not depend on it directly, it gets removed from
their project automatically.
For non-Maven users, you simply document the change of
dependencies. Other libraries have done this for years and the sky
didn't fall. Why is this such a big deal? As well, you need to consider
that you won't be changing your dependencies very often (the minor
version numbers might change but the actual libraries will not).

> 2) It's as if com.google.inject.internal APIs were exposed -- its
> something that Guice should be free to change without any user having
> to take any action at all, but now we're forcing them to think & act
> on it.

I am not proposing that com.google.inject.internal APIs be exposed.
There is a world of difference between hiding Guice-specific code and
hiding 3rd-party libraries that Guice links against.

> 3) Guice has averaged 2 year releases (for better or worse). Guava
> gives ~1.5 year leeway before removing any APIs Should we be forced
> to release a new version of Guice because a dependency changed its
> API? If we don't release a new one, that gets into some severe
> versioning hell.

This is a legitimate concern. Here's the thing: there is nothing
wrong with making Guice releases more iterative. If anything, Guice
users have been asking for this. I'd love to see a major release (with
new APIs) every 2 years so long as you have smaller bug-fix releases
along the way. Upgrading dependencies would fall under the category of
bug fixing.

Gili

Tim Peierls

unread,
Jul 6, 2011, 10:41:27 AM7/6/11
to google...@googlegroups.com
I'm with Sam and Alen. 

I use Guice and Guava in everything. To program otherwise, imho, is to tie your hands behind your back. So you'd think I wouldn't mind having one depend on the other, but I persist in seeing them as completely independent tools. The fact that one uses bits of the other internally doesn't matter to me, as long as it can be arranged that it doesn't contribute greatly to the jar size.

--tim

TJ Rothwell

unread,
Jul 6, 2011, 11:28:41 AM7/6/11
to google...@googlegroups.com
+1 Sam (use proguard to hide implementation details)

-- TJ

Fred Faber

unread,
Jul 6, 2011, 12:53:15 PM7/6/11
to google...@googlegroups.com
I think about the trade-off between the flexibility hiding the libraries gives with respect to the maintenance of Guice (as seen by its active maintainers, such as Sam) versus the arguments of standardization / transparency (note I really don't buy the "code size" argument here).

I think it'd be rare to need a security patch for Guava if Guice continues in the mode of not aggressively using Guava from head, or its latest release.  This implies that a security hole would need to be identified to a stable version of Guava, relatively long after it was released.

I do appreciate the arguments that proguarding for sake of hiding the internals from the (savvy) users is probably unneeded.

Bumping to the top of the thread:  mcculls was talking about the 300k difference in Guice from head versus 3.0 as a main motivator for using proguard (over jarjar) as one motivation to move to application-level proguard, and the fragility argument as another.  About the latter:  what extensions and tests were failing here (guice tests, I'm assuming)?  Would it be sufficient to include Guava as an external dependency for the test suite, treating them as a client of Guice? Similarly for extensions?

Fred

Bob Lee

unread,
Jul 6, 2011, 4:08:03 PM7/6/11
to google...@googlegroups.com
I use Guice in an Android app so I am sensitive to size.

On Wed, Jul 6, 2011 at 7:12 AM, Sam Berlin <sbe...@gmail.com> wrote:
The reasons why I think exposing the dependency is a bad idea:

1) Where do we stop?  Should we expose cglib & asm as real dependencies that can be upgraded too (real question)?  That exposes the means by which Guice is doing its bytecode manipulation for grabbing line numbers & for AOP.  If Guice decides to change how it does it, then people may have a false dependency/requirement on cglib & asm.

Cglib and ASM have been known to make incompatible changes. I'm pretty confident that Guava will not from now on.

Bob

Anthony MULLER

unread,
Jul 6, 2011, 4:14:36 PM7/6/11
to google...@googlegroups.com
Hello,

Just to be sure to understand: why it is not possible to update jarjar if it's tool currently have an issue?

If Guice only use a few Guava classes, why to not copy them at build time? It's probably not the clever solution but it could be the most efficient.

Anthony

--

Sam Berlin

unread,
Jul 6, 2011, 4:16:12 PM7/6/11
to google...@googlegroups.com
This is exactly how JarJar works and what it does now.  What JarJar doesn't do (and what ProGuard does do) is trim out methods that are unused, avoiding further dependencies used only by those methods.

sam

Anthony MULLER

unread,
Jul 6, 2011, 4:25:30 PM7/6/11
to google...@googlegroups.com
Ok, so I choose this option:


4)  Build Guice core + extensions without any jarjar'ing or ProGuard'ing, only do that as a final distribution / packaging step

Size is important for me and Guava is only an implementation detail (as CGLIB, etc...)

Cheers,
Anthony

Stuart McCulloch

unread,
Jul 6, 2011, 6:09:34 PM7/6/11
to google...@googlegroups.com
On 6 Jul 2011, at 17:53, Fred Faber wrote:

I think about the trade-off between the flexibility hiding the libraries gives with respect to the maintenance of Guice (as seen by its active maintainers, such as Sam) versus the arguments of standardization / transparency (note I really don't buy the "code size" argument here).

I think it'd be rare to need a security patch for Guava if Guice continues in the mode of not aggressively using Guava from head, or its latest release.  This implies that a security hole would need to be identified to a stable version of Guava, relatively long after it was released.

I do appreciate the arguments that proguarding for sake of hiding the internals from the (savvy) users is probably unneeded.

Bumping to the top of the thread:  mcculls was talking about the 300k difference in Guice from head versus 3.0 as a main motivator for using proguard (over jarjar) as one motivation to move to application-level proguard, and the fragility argument as another.  About the latter:  what extensions and tests were failing here (guice tests, I'm assuming)?

Both Guice tests and (official) Guice extensions. This is because the shrinking and embedding of Guava is currently done in the core sub-project, before any of the tests or extensions have been built.

To take a real example, Guice core doesn't use Guava's "Lists.newArrayList(Object)" method but the grapher extension does. So when we use ProGuard on Guice core, it thinks it can throw this method away when embedding Guava. The build then continues on to build the extensions, and update any references it finds there to Guava classes to match those embedded in Guice core. Note that it doesn't check to see if the updated package/method actually exists in the core - it just renames packages on the bytecode level. It's only when you test the final grapher extension jar against the final Guice core jar that you see a linkage error (because the method the grapher expects to be there has been removed). What's worse is that you may only see this error when you exercise the relevant code, hence my concern about fragility (of course this could always be handled with additional tooling to check all method references from extensions exist in the core).

Note that this could conceivably happen right now if an extension used a class from Guava that wasn't referenced (directly or indirectly) from Guice core. But the odds turn out to be much lower because jarjar only ever removes whole classes not methods, which is why we end up embedding most of the common Guava collection classes - which then covers the methods referenced from the extensions.

The cleanest approach imho would be to postpone any jarjar / Proguard until the very end, then you can apply it to everything in one shot. Maybe also have another round of integration tests to verify.

Would it be sufficient to include Guava as an external dependency for the test suite, treating them as a client of Guice? Similarly for extensions?

You mean make Guava an external dependency of the extensions, and not jarjar any Guava references found in them? That would be another option, but I think it would confuse people since they'd need to remember to add Guava as well as the extension they actually wanted.

Adrian Cole

unread,
Jul 6, 2011, 6:13:17 PM7/6/11
to google...@googlegroups.com
I receive complaints about the size of jclouds dependencies. Our
only dependencies of any size are guava and guice. These happen in a
context of embedding or android, most recently a chat with Glassfish
guys.

I'd prefer a solution that minimizes size, and takes a couple hundred
extra kilobytes seriously, conceding not all care about a couple
hundred k.

Provided that Guava @Beta methods aren't in use by Guice, a dependency
on Guava is the preferred option for jclouds.

my 2p,
-Adrian

Stuart McCulloch

unread,
Jul 6, 2011, 6:31:07 PM7/6/11
to google...@googlegroups.com
On 6 Jul 2011, at 21:14, Anthony MULLER wrote:

Hello,

Just to be sure to understand: why it is not possible to update jarjar if it's tool currently have an issue?

jarjar only removes unused classes, it doesn't remove unused methods from those classes

Often this distinction doesn't matter - but Guava has some utility classes like Maps that a) touch a wide range of collections classes and b) are themselves used by a lot of classes.

I did submit a potential patch to let jarjar remove unused static methods (http://code.google.com/p/jarjar/issues/detail?id=22) but there wasn't much interest in adding that feature.

If Guice only use a few Guava classes, why to not copy them at build time? It's probably not the clever solution but it could be the most efficient.

Well you'd want to find the minimal set of classes to copy, but that is what jarjar does. The problem of just copying classes is that you may only call the ImmutableMap.of() method... but that class uses Maps, which uses Synchronized, which uses ForwardingIterator, etc... So you'd still end up including a lot of Guava classes that were never actually used by Guice, but just happened to be referenced from classes that were used by Guice.

Stuart McCulloch

unread,
Jul 6, 2011, 6:37:23 PM7/6/11
to google...@googlegroups.com
On 6 Jul 2011, at 23:29, Fred Faber wrote:

On Wed, Jul 6, 2011 at 6:09 PM, Stuart McCulloch <mcc...@gmail.com> wrote:
On 6 Jul 2011, at 17:53, Fred Faber wrote:
Would it be sufficient to include Guava as an external dependency for the test suite, treating them as a client of Guice? Similarly for extensions?

You mean make Guava an external dependency of the extensions, and not jarjar any Guava references found in them? That would be another option, but I think it would confuse people since they'd need to remember to add Guava as well as the extension they actually wanted.

Yes.  And mind you I'm enjoying a game of devil's advocate right now,

no worries, I think it's important we consider all views :)

but folks would have to remember to add Guava as a dep when using Guice proper, and so I don't think doing so when adding extensions would be much worse.  Especially given extensions could declare other deps as well. 

... which then raises the question that if it's ok to have Guava as an extension dependency, why can't it be a core dependency?


Fred

Stuart McCulloch

unread,
Jul 6, 2011, 6:58:02 PM7/6/11
to google...@googlegroups.com
On 6 Jul 2011, at 21:25, Anthony MULLER wrote:

Ok, so I choose this option:

4)  Build Guice core + extensions without any jarjar'ing or ProGuard'ing, only do that as a final distribution / packaging step

Of course even if we decided to make Guava an external dependency we could always provide a separate "nodeps" flavour of Guice that embeds Guava (like CGLIB has a nodeps jar that embeds ASM)

Fred Faber

unread,
Jul 6, 2011, 6:29:10 PM7/6/11
to google...@googlegroups.com
On Wed, Jul 6, 2011 at 6:09 PM, Stuart McCulloch <mcc...@gmail.com> wrote:
On 6 Jul 2011, at 17:53, Fred Faber wrote:

I think about the trade-off between the flexibility hiding the libraries gives with respect to the maintenance of Guice (as seen by its active maintainers, such as Sam) versus the arguments of standardization / transparency (note I really don't buy the "code size" argument here).

I think it'd be rare to need a security patch for Guava if Guice continues in the mode of not aggressively using Guava from head, or its latest release.  This implies that a security hole would need to be identified to a stable version of Guava, relatively long after it was released.

I do appreciate the arguments that proguarding for sake of hiding the internals from the (savvy) users is probably unneeded.

Bumping to the top of the thread:  mcculls was talking about the 300k difference in Guice from head versus 3.0 as a main motivator for using proguard (over jarjar) as one motivation to move to application-level proguard, and the fragility argument as another.  About the latter:  what extensions and tests were failing here (guice tests, I'm assuming)?

Both Guice tests and (official) Guice extensions. This is because the shrinking and embedding of Guava is currently done in the core sub-project, before any of the tests or extensions have been built.

To take a real example, Guice core doesn't use Guava's "Lists.newArrayList(Object)" method but the grapher extension does. So when we use ProGuard on Guice core, it thinks it can throw this method away when embedding Guava. The build then continues on to build the extensions, and update any references it finds there to Guava classes to match those embedded in Guice core. Note that it doesn't check to see if the updated package/method actually exists in the core - it just renames packages on the bytecode level. It's only when you test the final grapher extension jar against the final Guice core jar that you see a linkage error (because the method the grapher expects to be there has been removed). What's worse is that you may only see this error when you exercise the relevant code, hence my concern about fragility (of course this could always be handled with additional tooling to check all method references from extensions exist in the core).

Note that this could conceivably happen right now if an extension used a class from Guava that wasn't referenced (directly or indirectly) from Guice core. But the odds turn out to be much lower because jarjar only ever removes whole classes not methods, which is why we end up embedding most of the common Guava collection classes - which then covers the methods referenced from the extensions.

The cleanest approach imho would be to postpone any jarjar / Proguard until the very end, then you can apply it to everything in one shot. Maybe also have another round of integration tests to verify.

Would it be sufficient to include Guava as an external dependency for the test suite, treating them as a client of Guice? Similarly for extensions?

You mean make Guava an external dependency of the extensions, and not jarjar any Guava references found in them? That would be another option, but I think it would confuse people since they'd need to remember to add Guava as well as the extension they actually wanted.

Yes.  And mind you I'm enjoying a game of devil's advocate right now, but folks would have to remember to add Guava as a dep when using Guice proper, and so I don't think doing so when adding extensions would be much worse.  Especially given extensions could declare other deps as well.

jhulford

unread,
Jul 7, 2011, 10:07:13 AM7/7/11
to google-guice


On Jul 6, 6:58 pm, Stuart McCulloch <mccu...@gmail.com> wrote:
> On 6 Jul 2011, at 21:25, Anthony MULLER wrote:
>
> > Ok, so I choose this option:
>
> > 4)  Build Guice core + extensions without any jarjar'ing or ProGuard'ing, only do that as a final distribution / packaging step
>
> Of course even if we decided to make Guava an external dependency we could always provide a separate "nodeps" flavour of Guice that embeds Guava (like CGLIB has a nodeps jar that embeds ASM)

This would have been my suggestion. Have 2 release jars, 1 that's
solely guice code and another "nodeps" one with embedded deps (guava,
cgilib, asm) that you''ve jarjar and proguard'ed. There's likely very
little tooling work you'd need to modify to make the simple guice only
jar and then use the nodeps one for all the internal build testing.
That doesn't solve the extensions issue wrt Guava, but it would seem
to appease the two main camps of opinions about how to deploy the jar.

Stuart McCulloch

unread,
Jul 22, 2011, 10:01:45 AM7/22/11
to google...@googlegroups.com

FYI, for those interested... I'm experimenting with making Guava an external dependency in the sisu-guice repo (https://github.com/sonatype/sisu-guice)

Reply all
Reply to author
Forward
0 new messages