Re: [google-collections] Interner


Kevin Bourrillion

Jan 6, 2010, 5:21:38 PM
to Blair Zajac, Dimitris Andreou, guava-...@googlegroups.com
On Mon, Jan 4, 2010 at 9:25 PM, Kevin Bourrillion <kev...@google.com> wrote:

If you wanted multiple interners, which I'll need for different types, I don't want to carry around too many, so I'm thinking of an interface like this:

public interface Interner
{
   <E> E intern(E sample);
}

You're absolutely right!  I am dismayed to find out that the code I attached still uses the wrong API.  I thought I had updated it.  

I just remembered what the problem was!


Then InternReference would extend FinalizableWeakReference<Object> and at the end of intern() there would be a check like this:

         Object canonical = sneakyRef.get();
         if (canonical != null) {
           return sample.getClass().cast(canonical);
         }

Unfortunately this check is not quite enough.  One could intern a List<ArrayList<String>> and get back a List<LinkedList<String>>, and this check wouldn't catch it; it would show up to bite you sometime later.
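
To make the failure concrete, here is a minimal sketch of the hole described above. It is not the code attached to this thread: `UntypedInterner` and `HeapPollutionDemo` are hypothetical names, and a plain `HashMap` with strong references stands in for the weak-reference machinery. Only the final `getClass().cast(...)` check mirrors the snippet quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an interner with a method-level type parameter.
final class UntypedInterner {
    private final Map<Object, Object> canonicals = new HashMap<>();

    @SuppressWarnings("unchecked")
    <E> E intern(E sample) {
        Object canonical = canonicals.putIfAbsent(sample, sample);
        if (canonical == null) {
            return sample; // first time this value is seen
        }
        // The runtime check only sees the erased outer class (ArrayList here),
        // not the element type, so it cannot reject the mismatch below.
        return (E) sample.getClass().cast(canonical);
    }
}

final class HeapPollutionDemo {
    static String run() {
        UntypedInterner interner = new UntypedInterner();

        List<LinkedList<String>> linked = new ArrayList<>();
        linked.add(new LinkedList<>(Arrays.asList("a")));
        interner.intern(linked);

        List<ArrayList<String>> arrays = new ArrayList<>();
        arrays.add(new ArrayList<>(Arrays.asList("a")));

        // List.equals ignores the implementation class, so the two lists are
        // equal and the interner hands back `linked` under the wrong type.
        List<ArrayList<String>> result = interner.intern(arrays);
        try {
            ArrayList<String> first = result.get(0); // implicit cast fails here
            return "no failure: " + first;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }
}
```

The `intern` call itself succeeds; the exception only surfaces later, when an element is read through the wrongly typed reference.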
 
Granted, it's hard to imagine this actually happening in real life!  But still, it punches a hole in the type system, which I hate to do.  I'm thinking we should probably just stick with Interner<E>, and if you want to use it heterogeneously, just use Interner<Object> and do the casting yourself; maybe even wrap a simple convenience method around that -- point is, *you* would be the one to decide to suppress the resulting warning, not me ;-)
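
A rough sketch of what that convenience wrapper might look like follows. Everything here is an assumption made for illustration: the `Interner<E>` interface is re-declared locally so the example is self-contained, `MapInterner` is a hypothetical strong-reference implementation rather than the library's weak one, and `HeterogeneousInterning` is an invented name.

```java
import java.util.HashMap;
import java.util.Map;

// Typed interface, as proposed in this thread (declared locally for the sketch).
interface Interner<E> {
    E intern(E sample);
}

// Hypothetical strong-reference implementation, for illustration only.
final class MapInterner<E> implements Interner<E> {
    private final Map<E, E> canonicals = new HashMap<>();

    @Override
    public synchronized E intern(E sample) {
        E previous = canonicals.putIfAbsent(sample, sample);
        return previous != null ? previous : sample;
    }
}

final class HeterogeneousInterning {
    private static final Interner<Object> SHARED = new MapInterner<>();

    // The unchecked cast lives in user code, so the user decides whether
    // suppressing the warning is acceptable for the types being interned.
    @SuppressWarnings("unchecked")
    static <T> T intern(T sample) {
        return (T) SHARED.intern(sample);
    }
}
```

With this in place, `HeterogeneousInterning.intern(new String("x"))` called twice should return the same instance both times. The nested-generics caveat described above still applies; the difference is that the caller has knowingly accepted it.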


--
Kevin Bourrillion @ Google
internal:  http://go/javalibraries
external: guava-libraries.googlecode.com

Blair Zajac

Jan 9, 2010, 12:48:51 PM
to Dimitris Andreou, Kevin Bourrillion, guava-...@googlegroups.com
[resending to guava-discuss instead]

Dimitris Andreou wrote:
> 2010/1/5 Blair Zajac <bl...@orcaware.com>:
>> On 1/4/10 7:28 PM, Dimitris Andreou wrote:
>>> 2010/1/5 Kevin Bourrillion<kev...@google.com>:
>>>> On Mon, Jan 4, 2010 at 6:40 PM, Dimitris Andreou<jim.a...@gmail.com>
>>>> wrote:
>>>>
>>> At some point someone, or me, should perhaps compare String.intern()
>>> with a (weak) Interner<String>, to have a feel of its relative
>>> performance. (Of course it's unfair, but still).
>> Also, I understand once you use String#intern() it will never be GCed, so
>> even if Interner<String> is slower it won't consume all available memory.
>>
>> Blair
>>
>
> Various sources claim the opposite. In any case, I can't make the
> following throw an OOME:
>
> public static void main(String[] args) {
>     int sum = 0;
>     for (int i = 0; i < 1000000; i++) {
>         // Build a 640-char string whose content depends on i.
>         char[] c = new char[640];
>         for (int j = 0; j < c.length; j++) {
>             c[j] = (char) i;
>         }
>
>         // Intern it and fold the hash into sum so the result is used.
>         sum += new String(c).intern().hashCode();
>     }
>     System.out.println(sum);
> }
>
> Dimitris
>

Found this good post:

http://www.codeinstructions.com/2009/01/busting-javalangstringintern-myths.html

Strings are interned in the permanent generation but can be garbage collected.

I tested the code and watched Java in jconsole on Mac OS X 10.6 with JDK 6, and
the article's statements are accurate. So using Google's interner with
Strings would avoid permanent generation problems.

I would be interested to see the speed difference between String#intern() and
Interner<String>.

Blair
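
For anyone who wants to try the comparison, a very rough timing sketch follows. It makes several assumptions: `WeakStringInterner` is a hypothetical `WeakHashMap`-based stand-in rather than the google-collections implementation, `InternTiming` is an invented name, and a single `System.nanoTime()` pass is not a rigorous benchmark (no warm-up, no GC control), so the numbers are only a first impression.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.UnaryOperator;

// Hypothetical weak interner: keys and values are both weakly held, so an
// entry can be collected once nothing else references the canonical string.
final class WeakStringInterner {
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    synchronized String intern(String sample) {
        WeakReference<String> ref = pool.get(sample);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            pool.put(sample, new WeakReference<>(sample));
            canonical = sample;
        }
        return canonical;
    }
}

final class InternTiming {
    // Interns `count` distinct strings and returns the elapsed nanoseconds.
    static long time(UnaryOperator<String> interner, int count) {
        long start = System.nanoTime();
        int sink = 0;
        for (int i = 0; i < count; i++) {
            sink += interner.apply("key-" + i).hashCode();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.print(""); // keeps `sink` live so the loop is not dead code
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int count = 200_000;
        WeakStringInterner weak = new WeakStringInterner();
        System.out.println("String#intern:      " + time(String::intern, count) + " ns");
        System.out.println("WeakStringInterner: " + time(weak::intern, count) + " ns");
    }
}
```

Results will vary with JVM version and heap settings; a harness such as JMH would be the better tool for anything beyond a quick look.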


Nikolas Everett

Jan 9, 2010, 4:52:00 PM
to guava-...@googlegroups.com, Dimitris Andreou, Kevin Bourrillion
If I remember correctly, String#intern isn't super fast.  It seemed to be O(log n) instead of O(1).  That was just from an offhand observation about a year and a half ago, though.