[Facets] Bug in Hash#rekey(!)

7 views
Skip to first unread message

Calamitas

unread,
Aug 6, 2010, 9:09:10 AM8/6/10
to Facets
Hash#rekey! has a bug when the set of new keys intersects with the set
of old keys. Short example:

{ 1 => :a, 2 => :b }.rekey! { |k| k + 1 } #=> {3 => :a}

This is because the hash is changed in-place and 1 is mapped to 2, the
value for 2 is overwritten.

If I reordered the keys, the problem does not occur:

{ 2 => :b, 1 => :a }.rekey! { |k| k + 1 } #=> {3=>:b, 2=>:a}

Because Hash#rekey calls Hash#rekey!, it also has this bug.

A possible solution is to put a non-destructive the algorithm in
Hash#rekey and implement Hash#rekey! as replace(rekey(&blk)).

Peter
_______________________________________________
facets-universal mailing list
facets-u...@rubyforge.org
http://rubyforge.org/mailman/listinfo/facets-universal

Trans

unread,
Aug 12, 2010, 10:25:46 PM8/12/10
to facets-u...@rubyforge.org

On Aug 6, 9:09 am, Calamitas <calamita...@gmail.com> wrote:
> Hash#rekey! has a bug when the set of new keys intersects with the set
> of old keys. Short example:
>
>   { 1 => :a, 2 => :b }.rekey! { |k| k + 1 } #=> {3 => :a}
>
> This is because the hash is changed in-place and 1 is mapped to 2, the
> value for 2 is overwritten.
>
> If I reordered the keys, the problem does not occur:
>
>   { 2 => :b, 1 => :a }.rekey! { |k| k + 1 } #=> {3=>:b, 2=>:a}
>
> Because Hash#rekey calls Hash#rekey!, it also has this bug.
>
> A possible solution is to put a non-destructive the algorithm in
> Hash#rekey and implement Hash#rekey! as replace(rekey(&blk)).

Ouch. Nice catch. I'll get the fix in place soon. I've been working
all week on getting v3.0 out the door. I'm almost there, but I have to
go out of town for a week. So it'll have to wait for when I get back.

Thanks for reporting.

Trans

unread,
Aug 21, 2010, 1:17:52 PM8/21/10
to facets-u...@rubyforge.org

On Aug 6, 9:09 am, Calamitas <calamita...@gmail.com> wrote:

> Hash#rekey! has a bug when the set of new keys intersects with the set
> of old keys. Short example:
>
>   { 1 => :a, 2 => :b }.rekey! { |k| k + 1 } #=> {3 => :a}
>
> This is because the hash is changed in-place and 1 is mapped to 2, the
> value for 2 is overwritten.
>
> If I reordered the keys, the problem does not occur:
>
>   { 2 => :b, 1 => :a }.rekey! { |k| k + 1 } #=> {3=>:b, 2=>:a}
>
> Because Hash#rekey calls Hash#rekey!, it also has this bug.
>
> A possible solution is to put a non-destructive the algorithm in
> Hash#rekey and implement Hash#rekey! as replace(rekey(&blk)).

Okay, I reworked #rekey. In the process I decided that a key-map hash
is better than the 2-argument alias-like interface. So...

# Rekey a hash, ...
#
# rekey()
# rekey(from_key => to_key, ...)
# rekey{|from_key| to_key}
#
# If no key map or block is given, then all keys are converted
# to Symbols.
#
# If a key map is given, then the first key is changed to the second
key.
#
# foo = { :a=>1, :b=>2 }
# foo.rekey(:a=>'a') #=> { 'a'=>1, :b=>2 }
# foo.rekey(:b=>:x) #=> { :a =>1, :x=>2 }
# foo.rekey('foo'=>'bar') #=> { :a =>1, :b=>2 }
#
# If a block is given, converts all keys in the Hash according to
the
# given block procedure. If the block returns +nil+ for given key,
# then that key will be left intact.
#
# foo = { :name=>'Gavin', :wife=>:Lisa }
# foo.rekey{ |k| k.to_s } #=> { "name"=>"Gavin", "wife"=>:Lisa }
# foo #=> { :name =>"Gavin", :wife=>:Lisa }
#
# Note that if both a +key_map+ and a block are given, the +key_map+
is
# applied first then the block.

def rekey(key_map={}, &block)
if key_map.empty? && !block
block = lambda{|k| k.to_sym}
end

hash = {}

(keys - key_map.keys).each do |key|
hash[key] = self[key]
end

key_map.each do |from, to|
hash[to] = self[from]
end

hash2 = {}

if block
hash.each do |k, v|
nk = block[k]
hash2[nk || k] = v
end
else
hash2 = hash
end

hash2
end

See any other issues with this? I'm just getting back from vacation,
my minds not 100% in flow yet. I'm sure some golf can be played with
the code.

Calamitas

unread,
Aug 25, 2010, 11:07:08 AM8/25/10
to Facets

Can you use nil as default value for the key_map parameter instead? If
the code that calls this method generates that key map based on other
input data, it may end up being empty and the calling code would have
to handle this case separately. So start with:

def rekey(key_map=nil, &block)
if !key_map && !block


block = lambda{|k| k.to_sym}
end

key_map ||= {}

This allows to make the distinction between not given, and given but empty.

I'm also not entirely happy with the fact that a nil on the key_map
and a nil returned by the block have a different meaning. It can make
sense to have nil as a key, so I'm not sure what is best.

Peter

Trans

unread,
Aug 26, 2010, 12:12:36 PM8/26/10
to facets-u...@rubyforge.org

On Aug 25, 11:07 am, Calamitas <calamita...@gmail.com> wrote:

> Can you use nil as default value for the key_map parameter instead? If
> the code that calls this method generates that key map based on other
> input data, it may end up being empty and the calling code would have
> to handle this case separately. So start with:
>
>   def rekey(key_map=nil, &block)
>     if !key_map && !block
>       block = lambda{|k| k.to_sym}
>     end
>     key_map ||= {}
>
> This allows to make the distinction between not given, and given but empty.

Yep. Done.

> I'm also not entirely happy with the fact that a nil on the key_map
> and a nil returned by the block have a different meaning. It can make
> sense to have nil as a key, so I'm not sure what is best.

I agree. I've been thinking about how best to get rid of that too. I
decided I will replace it with NA or NULL --a "dummy" class I will add
to core to be used in these cases. But I haven't figured out the best
name/implementation for that yet. My first take was:

class NA < ArgumentError
def method_missing(*); self; end
end

Then I remembered the optional NullClass library Facets had in the
more library:

# = Nullclass
#
# NullClass is essentially NilClass but it differs in one
# important way. When a method is called against it that it
# deoesn't have, it will simply return null value rather then
# raise an error.
#
# TODO: Perhaps NullClass should be called NackClass?

class NullClass #< NilClass
class << self
def new
@null ||= NullClass.allocate
end
end
def inspect ; 'null' ; end
def nil? ; true ; end
def null? ; true ; end
def [](key); nil; end
def method_missing(sym, *args)
return nil if sym.to_s[-1,1] == '?'
self
end
end

module Kernel
def null
NullClass.new
end
end

class Object
def null?
false
end
end

What do you suggest for name/implementation?

Trans

unread,
Aug 26, 2010, 7:43:45 PM8/26/10
to facets-u...@rubyforge.org

On Aug 26, 12:12 pm, Trans <transf...@gmail.com> wrote:

>   class NA < ArgumentError
>     def method_missing(*); self; end
>   end

that should be

class NA < ArgumentError
class << self


def method_missing(*); self; end
end

end

Reply all
Reply to author
Forward
0 new messages