Are keywords and symbols garbage-collected?


samppi

Jun 24, 2009, 11:31:19 AM
to Clojure
Are keywords and symbols garbage-collected? If I generated a lot of
keywords or symbols, put them into a collection, and then removed
them, would they disappear and free up space? I'm wondering if they're
similar to Ruby symbols, which are never garbage collected.

Stephen C. Gilardi

Jun 24, 2009, 12:22:54 PM
to clo...@googlegroups.com


Symbol objects are subject to garbage collection, but the "namespace"
and "name" strings that identify them are not. Those strings are
"interned" via the "intern" method on java.lang.String. Once a String
is interned, there exists a single canonical String object that
represents it throughout the remaining lifetime of the JVM instance. Any
two Symbols of the same namespace and name will reference those
canonical namespace and name strings and will thus have identical
namespaces and names.
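
As a rough illustration of the String side of this, here is a small plain-Java
sketch. It is not Clojure's source: the class name InternDemo and the helper
fresh are made up for the example, and only String.intern is a real JDK call.
It shows two separately built strings with the same contents resolving to one
canonical object once interned.

```java
// Hypothetical demo class; String.intern() is the only API being illustrated.
class InternDemo {
    // Builds a new String at runtime so it is not a compile-time constant.
    static String fresh(String a, String b) {
        return new StringBuilder(a).append(b).toString();
    }

    public static void main(String[] args) {
        String x = fresh("na", "me");
        String y = fresh("na", "me");
        System.out.println(x == y);                   // two distinct objects
        System.out.println(x.equals(y));              // with equal contents
        System.out.println(x.intern() == y.intern()); // one canonical object
    }
}
```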

Not interning the Symbols themselves turns out to be important in
Clojure because it allows two Symbols with identical namespace and
name to have different metadata.

Keyword objects are interned by Clojure and are not garbage collected.
When a Keyword is created, it's placed in a ConcurrentHashMap that
maps a Symbol of the same name and namespace to the Keyword object. In
a given JVM instance, there is at most one Keyword object with a given
name and namespace. Whenever the reader reads a Keyword's text
representation, it returns the unique Keyword object associated with it.
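
The interning scheme described above can be sketched in plain Java roughly as
follows. This is a simplified stand-in, not clojure.lang.Keyword: the class Kw
and its methods are invented for the example, and the table is keyed by a
String rather than by a Symbol. Because the map holds strong references, every
entry stays reachable for as long as the table does.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an interned keyword type.
final class Kw {
    // Strong references: entries are never removed, so they are never collected.
    private static final ConcurrentHashMap<String, Kw> TABLE = new ConcurrentHashMap<>();

    final String name;

    private Kw(String name) { this.name = name; }

    // Returns the single Kw for this name, creating it on first use.
    static Kw intern(String name) {
        return TABLE.computeIfAbsent(name, Kw::new);
    }

    static int tableSize() { return TABLE.size(); }

    public static void main(String[] args) {
        // Same name, same object.
        System.out.println(Kw.intern("a/b") == Kw.intern("a/b"));
        // Every distinct name adds an entry that stays in the table.
        for (int i = 0; i < 1000; i++) Kw.intern("gen" + i);
        System.out.println(Kw.tableSize());
    }
}
```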

Here's a REPL session that supports this:

user=> (identical? 'a/b 'a/b)
false
user=> (identical? (name 'a/b) (name 'a/b))
true
user=> (identical? (namespace 'a/b) (namespace 'a/b))
true
user=> (= (quote #^{:tagged true} a/b) (quote a/b))
true
user=> (= (meta (quote #^{:tagged true} a/b)) (meta (quote a/b)))
false
user=> (identical? :a/b :a/b)
true

--Steve

Four of Seventeen

Jun 24, 2009, 1:30:56 PM
to Clojure
On Jun 24, 12:22 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
> On Jun 24, 2009, at 11:31 AM, samppi wrote:
>
> > Are keywords and symbols garbage-collected? If I generated a lot of
> > keywords or symbols, put them into a collection, and then removed
> > them, would they disappear and free up space? I'm wondering if they're
> > similar to Ruby symbols, which are never garbage collected.
>
> Symbol objects are subject to garbage collection, but the "namespace"  
> and "name" strings that identify them are not. Those strings are  
> "interned" via the "intern" method on java.lang.String. Once a String  
> is interned, there exists a single canonical String object that  
> represents it throughout the remaining lifetime the JVM instance.

I'm not sure this is correct. I think recent Sun JVMs can GC
unreferenced, interned strings.

> Keyword objects are interned by Clojure and are not garbage collected.  
> When a Keyword is created, it's placed in a ConcurrentHashMap that  
> maps a Symbol of the same name and namespace to the Keyword object.

In principle, this could be fixed by using WeakReference and a few
other tricks in the implementation. In practice, it will tend not to
matter, since only a small number of distinct keyword objects are
likely to be used by a typical system.
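
One way the WeakReference idea might look, as a plain-Java sketch under my own
assumptions: WeakKw, Ref, intern and purge are invented names, the table is
keyed by String for brevity, and none of this is taken from Clojure's source.
The table holds only weak references, so a keyword nobody else references can
be collected, and a ReferenceQueue is used to drop the stale entries.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a weakly interned keyword type.
final class WeakKw {
    // A weak reference that remembers its key so its stale entry can be removed.
    private static final class Ref extends WeakReference<WeakKw> {
        final String key;
        Ref(String key, WeakKw value, ReferenceQueue<WeakKw> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private static final ConcurrentHashMap<String, Ref> TABLE = new ConcurrentHashMap<>();
    private static final ReferenceQueue<WeakKw> QUEUE = new ReferenceQueue<>();

    final String name;

    private WeakKw(String name) { this.name = name; }

    // Returns the live WeakKw for this name, creating one if none is reachable.
    static WeakKw intern(String name) {
        purge();
        while (true) {
            Ref ref = TABLE.get(name);
            if (ref != null) {
                WeakKw kw = ref.get();
                if (kw != null) return kw;
                TABLE.remove(name, ref); // referent collected, entry not yet purged
                continue;
            }
            WeakKw created = new WeakKw(name);
            if (TABLE.putIfAbsent(name, new Ref(name, created, QUEUE)) == null) {
                return created;
            }
            // Another thread won the race; loop and use its entry.
        }
    }

    // Removes entries whose keyword has already been garbage collected.
    static void purge() {
        Reference<? extends WeakKw> r;
        while ((r = QUEUE.poll()) != null) {
            Ref ref = (Ref) r;
            TABLE.remove(ref.key, ref);
        }
    }

    public static void main(String[] args) {
        WeakKw a = WeakKw.intern("a/b");
        // While "a" is strongly held, interning again yields the same object.
        System.out.println(a == WeakKw.intern("a/b"));
    }
}
```

Identity is still guaranteed among live references; what changes is only that
an entry no longer keeps its keyword alive by itself.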

It does suggest that, for the time being, you be circumspect about calling
"(keyword foo)" outside, perhaps, of def-style macros, particularly where
"foo" can take an open-ended set of values originating from user input,
I/O, or other sources. The same goes for eval'ing or read'ing strings,
files, or other data containing occurrences of ":foo" with an open-ended
set of values for "foo".

Stephen C. Gilardi

Jun 24, 2009, 1:52:25 PM
to clo...@googlegroups.com

On Jun 24, 2009, at 1:30 PM, Four of Seventeen wrote:
>
> On Jun 24, 12:22 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
>> Symbol objects are subject to garbage collection, but the "namespace"
>> and "name" strings that identify them are not. Those strings are
>> "interned" via the "intern" method on java.lang.String. Once a String
>> is interned, there exists a single canonical String object that
>> represents it throughout the remaining lifetime the JVM instance.
>
> I'm not sure this is correct. I think recent Sun JVMs can GC
> unreferenced, interned strings.

Right you are. On reading further, I see that unreferenced interned
Strings can be collected in Java 1.2+ because the interning mechanism
holds only a weak reference to them.

Thanks for the correction.

--Steve

samppi

Jun 24, 2009, 6:19:49 PM
to Clojure
Thanks for the answers. I need to generate symbols to distinguish them
from strings in a parser. It seems, then, that it's better to use
symbols rather than keywords in this case.