Just a dumb question: what is the real difference between { :aKey =>
"aValue" } and { "aKey" => "aValue" }? I know the first key is a
symbol and the latter is a string. I like string keys, so why should I
use symbols? Why are symbols worth using as keys?
Thanks,
Gábor
Because symbols
- are faster and
- save you one byte in your rb file.
Malte
Always use symbols for situations like these. The reason is that a
symbol is immutable and also that no new string needs to be created for
it if used more than once. Also, using strings as keys and then
having the string altered will force a rehash of the table. It's all
about memory savings and execution speed,
nikolai
--
::: name: Nikolai Weibull :: aliases: pcp / lone-star / aka :::
::: born: Chicago, IL USA :: loc atm: Gothenburg, Sweden :::
::: page: www.pcppopper.org :: fun atm: gf,lps,ruby,lisp,war3 :::
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}
Symbols take up less memory space (only allocated once for the same
Symbol) and have a faster #hash function (#object_id, not computed).
'x' == 'x' # => true
'x'.object_id == 'x'.object_id # => false
:x == :x # => true
:x.object_id == :x.object_id # => true
--
Eric Hodel - drb...@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04
> Also, using strings as keys and then having the string altered will
> force a rehash of the table.
You mean this?
key = 'foo'
hash = {}
hash[key] = 5
key.gsub!(/foo/, 'bar')
In this case, hash.rehash does not need to be called because Ruby
copies String hash keys:
hash.keys.first.object_id == key.object_id # => false
Also, String keys are frozen, so you can't modify them:
hash.keys.first.gsub!(/foo/, 'bar') # => raises TypeError (FrozenError on Ruby 2.5 and later)
[basically saying that this isn't so]
OK, so this strengthens the argument for using symbols even further, as
String keys will be copied. Thanks for pointing this out,
I'd rather make the distinction on the semantic level: for example, if you
write an initializer for a class that accepts a hash to init any number of
instance fields, I'd prefer to use symbols there. The same goes if only a
certain fixed set of values is allowed. I use strings if they are read from
some source and I don't know beforehand what they might be.
Incidentally, it's typical for key-like things to occur rather often,
which fits nicely with the memory and speed savings incurred by symbols.
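To make that distinction concrete, here is a minimal sketch. The Connection class, its option names and the parse_headers helper are all made up for illustration; they are not from any library.

```ruby
# Hypothetical class: the option names are a fixed set chosen by the
# programmer, so symbols are a natural fit for the keys.
class Connection
  ALLOWED = [:host, :port, :timeout].freeze
  attr_reader :host, :port, :timeout

  def initialize(options = {})
    unknown = options.keys - ALLOWED
    raise ArgumentError, "unknown options: #{unknown.inspect}" unless unknown.empty?
    @host    = options.fetch(:host, "localhost")
    @port    = options.fetch(:port, 80)
    @timeout = options.fetch(:timeout, 30)
  end
end

conn = Connection.new(:host => "example.org", :port => 8080)

# Hypothetical helper: the keys come from outside the program and are not
# known beforehand, so they stay strings.
def parse_headers(text)
  headers = {}
  text.each_line do |line|
    name, value = line.chomp.split(": ", 2)
    headers[name] = value if value
  end
  headers
end

headers = parse_headers("Content-Type: text/plain\nX-Mailer: mutt\n")
```

Here conn.port is 8080 and headers["X-Mailer"] is "mutt"; a misspelled option such as :prot raises ArgumentError right away.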
Kind regards
robert
Regards,
Peter
Oh, no...not immutable vs. mutable strings again...
Well, if strings were immutable, then that would mean that strings could
share contents, and thus immutable strings wouldn't fill up memory. I
have suggested on the ruby-core list that Ruby should provide a second
data structure that acts like a string, namely the _rope_, and that it
be implemented in a way that allows for it to be used for tasks where
immutable "strings" are desired.
A rope is basically a string represented by a tree. Leaves of the tree
point to the subsequences of the whole string. These subsequences can
be shared with other ropes and can be generated lazily, i.e., from IO or
other generators. All that is needed is the length of the subsequence.
Every internal node keeps track of its own size and the size of its left
child. Thus, the offset of a node in the tree is the size of its left
child plus its ancestors. Ropes can be used to represent long strings
efficiently and many operations on ropes are O(1) where they are O(n) on
a string. This is offset by the fact that lookup in a rope is O(lg n)
versus O(1) for a string, but in many cases this isn't a problem.
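As a rough illustration of the idea, here is a minimal sketch in Ruby. The class names Leaf and Concat are invented for this example; it is not Boehm's implementation, it does no rebalancing, and lookup is only O(lg n) if the tree happens to be balanced.

```ruby
# Leaf: holds an immutable piece of text.
class Leaf
  attr_reader :length

  def initialize(text)
    @text = text.dup.freeze
    @length = @text.length
  end

  def char_at(i)
    @text[i]
  end

  def to_s
    @text
  end
end

# Concat: an internal node. It remembers its own total length and the
# length of its left child, which is enough to route an index lookup.
class Concat
  attr_reader :length

  def initialize(left, right)
    @left, @right = left, right
    @left_length = left.length
    @length = left.length + right.length
  end

  def char_at(i)
    i < @left_length ? @left.char_at(i) : @right.char_at(i - @left_length)
  end

  def to_s
    @left.to_s + @right.to_s
  end
end

hello = Leaf.new("Hello, ")
world = Leaf.new("world")
rope  = Concat.new(hello, world)  # O(1): no characters are copied
twice = Concat.new(rope, rope)    # the same subtree is shared twice
```

Concatenation only allocates one node, while char_at walks down the tree, which is the trade-off described above.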
Anyway, the rope data structure is further described in [1]. Boehm has
actually implemented this in C for his garbage collector, so see that
package for an example implementation (note though that it uses a lot of
C-hacks which make it undesirable to use as-is). There's also a rope
data structure in STL, but it's limited to only using ropes and strings,
not IO,
nikolai (the rope and piece table lover)
[1] Hans-J Boehm, "Ropes: an Alternative to Strings", Software--Practice
and Experience, vol. 25(12), 1315--1330, Dec. 1995. Available at
http://rubyurl.com/2FRbO.
Some people (such as Guido) dislike mutable strings.
Others (such as Matz, and incidentally me) like them.
Personally, my limited Java experience juggling String and
StringBuffer was enough to convince me that strings should
be mutable.
Hal
Why?
Personally I think :symbols are great. They make it much clearer when you
are reading code that you are representing something else, rather than
storing a piece of data. And you can use them without having to define
them as constants beforehand. Great :)
Faster to type too.
Douglas
Use Strings for their content. Use Symbols for their arbitrary uniqueness.
--
-- Jim Weirich j...@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)
I used to do this, but ran into problems.
Symbols are great for things related to ruby because the :bar form for symbol
literals accepts the same kind of chars as ruby identifiers. I use them by
preference in interacting with ruby's meta-programming APIs.
They start to fall down outside of this. For example, I tried to use them
with MIME types:
:text
==>:text
:video
==>:video
:octet-stream
NameError: undefined local variable or method `stream' for main:Object
from (irb):3
'octet-stream'.intern
==>:"octet-stream"
You CAN use them for things outside of the domain of ruby names, but it gets
painful. If the names of those things are arbitrarily unique but have "-"
characters in them, you first have to create a String!
You can get around this by creating constants:
OCTETSTREAM = 'octet-stream'.intern
TEXT = :text
etc., but that might not fit your API goals very well.
Anyhow, I moved back to using strings instead of symbols. The need to create a
string and intern it for things that are logically symbols but have a "-" in
them was too painful.
That was my experience, anyhow.
Cheers,
Sam
I believe you can do things like :"octet-stream" -- but I grant
that is not much better.
Hal
Why couldn't you do :octet_stream ? If your answer is because the dash comes
from outside ruby, then I would suggest that the content ("_" vs "-") is
important ... indicating that you should use strings.
> Sam Roberts wrote:
>> [...]
>> Anyhow, I moved back to using strings instead of symbols. The need to
>> create a string and intern it for things that are logically symbols
>> but have a "-" in them was too painful.
>>
>> That was my experience, anyhow.
>
> I believe you can do things like :"octet-stream" -- but I grant
> that is not much better.
And there's also the %s(octet-stream) family.
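For reference, a small sketch of the ways to spell a symbol whose name is not a valid identifier. All of these are standard Ruby syntax or String methods; the variable names are just for illustration.

```ruby
# Four spellings of the same symbol. Only the last two build a String first.
quoted   = :"octet-stream"
percent  = %s(octet-stream)
via_sym  = "octet-stream".to_sym
interned = "octet-stream".intern

# All four refer to the one and only symbol with that name.
same = [percent, via_sym, interned].all? { |sym| sym.equal?(quoted) }

puts quoted.inspect  # prints :"octet-stream"
puts same            # prints true
```

The quoted and %s forms are literals, so no intermediate String object needs to be written out by hand.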
But a little better, I didn't know that, thanks.
Sam
Maybe I don't know what you mean by "arbitrarily unique".
"_" vs "-" is no more (or less) important than "a" vs. "z".
Cheers,
Sam
If the choice of symbol names is arbitrary, then I can change the name of
the symbol everywhere that references it without changing the semantics of
the program.
For example, if any of the following choices are equally valid:
:octetstream, :OctetStream, :octet_stream, :stream_of_octets, :octets,
:fido, then the choice of name is arbitrary. Of course, some choices are
more transparent and convey meaning better, but the program will still
work even if we call the symbol :xyzzy. That's what it means to be
arbitrary.
If the choice of letters is constrained by some outside force, then it is
not arbitrary. For example, it might come to you as an attribute in an
XML message. Or perhaps you need to write it to a file, and other
programs expect that exact sequence of characters. In all these cases, the
content (sequence of letters) is important and cannot be changed without
breaking the program. When the content of the item is important, use a
string.
Ah. Then, no, it's not really arbitrary. More specifically, I can make it
arbitrary, but then I might be forced to make it more and more
arbitrary! If I map:
x-mailer => :xmailer
Then somebody decides to make a header
xmailer
I have to map:
xmailer => :zz_xmailer
etc. I guess I could make a mapping table, hashing strings to
symbols, but at this point symbols aren't making my code clearer or
easier to use.
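Such a mapping table might look roughly like the sketch below. The constant HEADER_SYMBOLS, the method header_symbol and the choice of symbol names are all hypothetical.

```ruby
# Hypothetical table from external header names (strings, fixed by the
# outside world) to internal symbols (names chosen freely by the program).
HEADER_SYMBOLS = {
  "x-mailer"     => :x_mailer,
  "xmailer"      => :xmailer,
  "content-type" => :content_type
}.freeze

# Returns the internal symbol for a header name, or nil if it is unknown.
# Header names are matched case-insensitively.
def header_symbol(name)
  HEADER_SYMBOLS[name.downcase]
end

header_symbol("X-Mailer")  # => :x_mailer
header_symbol("xmailer")   # => :xmailer
header_symbol("Subject")   # => nil
```

Every new header needs a new table entry, which is the extra bookkeeping being complained about here.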
In the example of MIME types, I probably could use arbitrary symbols.
Anybody who decides to make a new MIME type called application/octet_stream,
given that application/octet-stream is a standard name, deserves to be
publicly humiliated. So I could use :octetstream, arbitrarily.
I just wanted to use symbols for the efficiency, and to emphasize their
uniqueness in terms of case-sensitivity. It seemed to fit, but for
several reasons I discovered it didn't.
Cheers,
Sam