Hash weirdness
Jared Nedzel  
From: Jared Nedzel <jned...@broad.mit.edu>
Date: Wed, 14 May 2008 16:55:38 -0400
Local: Wed, May 14 2008 4:55 pm
Subject: Hash weirdness
Folks:

I'm getting some weird behavior that I don't understand.  I'm probably
doing something really noobish.  I've subclassed Hash:

class ResultHash < Hash

   def process_line(line, value_type)
     # do some stuff that isn't important here....
   end
end

There's nothing else in my ResultHash class.

In a different class, I create a ResultHash instance and call process_line
repeatedly, which creates a multi-level hash of hashes.

Then I want to iterate over the top level of keys:

result_hash = ResultHash.new()
# add a bunch of stuff to it using process_line

# loop over it
keys = result_hash.keys
keys.each do |well_key|
   # do some stuff here
end

At the keys.each call, I get the exception "wrong number of arguments (1
for 0)"

In the debugger I can see that the "keys" temporary variable is an
instance of Array with the 96 elements that I expect.

I did a small test case using just bare Hash:

   def test_hash
     a_hash = {"A" => 1, "B" => 2, "C" => 3}
     keys = a_hash.keys
     keys.each do |key|
       puts "key: " + key + " value: " + a_hash[key].to_s
     end
   end

and this works as expected.

I was already thinking about changing my design to get rid of my
ResultHash methods (since all it adds is a single method that can
logically go elsewhere).  But I'd like to know what I'm doing wrong.

Any ideas?

--
Jared Nedzel
Cancer Genomics Informatics
Broad Institute
7 Cambridge Center
Cambridge, MA 02142

617-324-4825
jned...@broad.mit.edu
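
One way to narrow down an error like the one reported above is to catch it inside the loop and print the innermost backtrace frame, since the line the debugger stops on is not always the line that raised. The following is only a sketch: `each_key_with_trace` and `takes_nothing` are hypothetical names that do not appear in the thread, and the deliberately wrong call is a stand-in for whatever the real loop body does.

```ruby
# Hypothetical helper (not from the thread): run a block once per key and
# report where an ArgumentError was raised, if one was.
def each_key_with_trace(hash)
  hash.keys.each do |key|
    begin
      yield key
    rescue ArgumentError => e
      # backtrace.first is the innermost frame of the failed call.
      return "#{e.message} at #{e.backtrace.first}"
    end
  end
  nil
end

# Stand-in for the real loop body: a method that takes no arguments,
# called with one, raises a "wrong number of arguments" ArgumentError.
def takes_nothing
  :ok
end

report = each_key_with_trace({ "A1" => {} }) { |key| takes_nothing(key) }
puts report
```

If the reported frame points somewhere other than the `keys.each` line, the loop itself is probably not the culprit.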


Ron Newman  
From: "Ron Newman" <rnew...@thecia.net>
Date: Wed, 14 May 2008 17:09:05 -0400
Local: Wed, May 14 2008 5:09 pm
Subject: Re: Hash weirdness
I created a Ruby source file with exactly your code, and did not get any exception:

class ResultHash < Hash

   def process_line(line, value_type)
     # do some stuff that isn't important here....
   end
end

result_hash = ResultHash.new()
# add a bunch of stuff to it using process_line
result_hash[3] = 8
result_hash['foo'] = 'bar'

# loop over it
keys = result_hash.keys
keys.each do |well_key|
   # do some stuff here
   puts well_key
end

It prints
   foo
   3
as expected.


Doug Pfeffer  
From: "Doug Pfeffer" <doug.pfef...@gmail.com>
Date: Wed, 14 May 2008 17:23:02 -0400
Local: Wed, May 14 2008 5:23 pm
Subject: Re: Hash weirdness
Is process_line() returning some kind of funky object? Maybe it's just
not a hash?

Doug


Jared Nedzel  
From: Jared Nedzel <jned...@broad.mit.edu>
Date: Wed, 14 May 2008 17:46:43 -0400
Local: Wed, May 14 2008 5:46 pm
Subject: Re: Hash weirdness
No, process_line just adds things to the hash:

   def process_line(line, value_type)
     if (line == nil || line.empty?)
       return
     end

     well_location = line[0]
     return if well_location.empty?

     well_hash = nil
     if (!self.has_key?(well_location))
       self[well_location] = Hash.new()
     end
     well_hash = self[well_location]

     bead_num = 1
     for i in 2..(line.length() - 3)
       value = line[i]

       if (!well_hash.has_key?(bead_num))
           self[well_location][bead_num] = Hash.new()
       end
       bead_hash = self[well_location][bead_num]

       bead_hash[value_type] = value
       bead_num += 1
     end
     self
   end
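
For comparison, here is a minimal sketch of the same three-level nesting built with `Hash.new` default blocks, so the explicit `has_key?` checks go away. It is an assumption-laden variant, not the code from this post: `new_result_hash` and `add_line` are hypothetical names, the sample line is shortened to two data columns, and a plain Hash is filled instead of a Hash subclass.

```ruby
# Hypothetical constructor: a missing well yields a new hash, and a
# missing bead inside that well yields a new empty hash.
def new_result_hash
  Hash.new do |wells, well|
    wells[well] = Hash.new { |beads, bead| beads[bead] = {} }
  end
end

# Hypothetical stand-in for process_line that fills a hash passed in.
def add_line(results, line, value_type)
  return results if line.nil? || line.empty?

  well_location = line[0]
  return results if well_location.nil? || well_location.empty?

  # Columns 2 .. length-3 hold the values for beads 1, 2, 3, ...
  (line[2..-3] || []).each_with_index do |value, index|
    results[well_location][index + 1][value_type] = value
  end
  results
end

results = new_result_hash
add_line(results, ["A1", "", "37.8", "15.4", "13306", "Sample Empty"], :trimmed_mean)
puts results["A1"][1][:trimmed_mean]
```

One caveat with this design: merely reading a missing key through `[]` creates it, so lookups on possibly absent wells should use `key?` or `fetch` instead.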

I'm parsing a file that has a bunch of lines that look like this:

A1,,"37.8108108108108","15.4789473684211","9.33838383838384","32.9772727272727","24.7336956521739","21.9666666666667","11.7808219178082","558.337662337662","272.688073394495","10.1146788990826","27.2953020134228","7.55294117647059","6.39285714285714","16.1666666666667","12.3673469387755","26.8061224489796","9.72222222222222","11.3552631578947","13.4054054054054","19.4487179487179","100.627272727273","16.2857142857143","22.1777777777778","15.4556962025316","35.962962962963","13.8641975308642","28.546875","9.65384615384615","41.4260869565217","12.2441314553991","25.8511627906977","16.7647058823529","85.6335877862595","18.4748603351955","23.2871794871795","157.963235294118","231.4","6.22685185185185","22.5991189427313","9.04878048780488","7.35","21.1284403669725","22.3235294117647","50.7471264367816","32.7946428571429","8.60869565217391","65.4065934065934","24.71","31.9396551724138","19.952380952381","59.9272727272727","10.86","70.4367816091954","41.5462962962963","138.794117647059","56.61","26.3736263736264","37.6865671641791","49.5733333333333","35.1063829787234","46.2661290322581","20.3575418994413","35.0940170940171","58.7946428571429","23.3913043478261","53.1682242990654","38.1470588235294","36.075","54.5555555555556","47.6172839506173","16.9107142857143","25.1111111111111","22.5185185185185","55.8602150537634","65.0422535211268","21.6463414634146","14.5483870967742","49.6526315789474","43.5","22.2020202020202","20.1401869158878","18.7246376811594","63.1971830985916","187.8","26.5294117647059","86.8807339449541","27.9791666666667","100.045454545455","40.0561797752809","52.6736842105263","66.1166666666667","1825.46875","323.371681415929","30.6818181818182","61.0157480314961","39.4190476190476","2108.79816513761","100.670454545455","24785.1619047619","150.12037037037",13306,"Sample Empty"
A2,,"39.7594501718213","15.5533596837945","8.5764192139738","34.6787003610108","30.2253521126761","25.5748031496063","11.8037735849057","729.758293838863","336.157024793388","10.2933333333333","27.4210526315789","7.71341463414634","7.3047619047619","19.2368421052632","12.7637795275591","25.0181818181818","11.6813186813187","9.56818181818182","17.25","19.7375","202.242424242424","17.5494505494506","24.3737373737374","13.4235294117647","35.256880733945","15.0238095238095","31.9158878504673","7.56521739130435","44.8888888888889","14.2791666666667","32.1936936936937","20.7925531914894","74.9831932773109","17.8461538461538","27.8851674641148","209.347826086957","328.926829268293","7.21611721611722","24.7683397683398","10.0648148148148","8.92771084337349","27.3092105263158","28.5042016806723","60.1477272727273","40.603305785124","9.77931034482759","95.1428571428571","23.1868131868132","30.0952380952381","24","59.6639344262295","12.979797979798","73.2903225806452","47.3898305084746","149.336363636364","66.1415094339623","24.4646464646465","42.9736842105263","60.4556962025316","40.6282051282051","49.5658914728682","19.8011695906433","58.7155172413793","64.9259259259259","31.2260869565217","46","40.625","44.8260869565217","56.7241379310345","63.1022727272727","16.4444444444444","21.6060606060606","22.7916666666667","72.6194690265487","52.9425287356322","24.6714285714286","15.2992125984252","51.4536082474227","36.2293577981651","25.5714285714286","20.2053571428571","20.5604395604396","56.68","386.684931506849","31.7361111111111","80.9404761904762","30.0943396226415","123.491525423729","65.1964285714286","59.4945054945055","62.2592592592593","2181.97674418605","519.81512605042","35.0625","61.0747663551402","43.6020408163265","2829.99212598425","98.5454545454545","25799.5365853659","132.223140495868",14255,"Sample Empty"
...

Column 0 (e.g., A1 on the first row) is a well location on a 96-well
plate (rows A-H, columns 1-12).  Columns 2 through 101 are data columns,
representing the values for bead_nums 1-100.

The file has repeating blocks of this pattern, with each block
containing the values for a particular value_type.

So I'm creating a structure that looks like:

<well location> --> <bead_name_hash> --> <value_type_hash> --> value

For example,

A1 --> 1 --> trimmed_mean --> 37.8108108108108
          --> peak --> 51.2
          ...
    --> 2 --> trimmed_mean --> 15.4789473684211
    ...
    --> 100 --> trimmed_mean --> 150.1203704
A2 --> 1 --> trimmed_mean --> 39.7594501718213

This works for a small test case.  But when I run on a real data file
(96 wells x 100 beads per well x 11 value_types) I get this behavior.

I've tried refactoring the process_line method to a different class and
just using an instance of Hash, and I'm getting the same behavior.

Thanks,

Jared



Tyler McMullen  
From: "Tyler McMullen" <tbmcmul...@gmail.com>
Date: Wed, 14 May 2008 18:00:20 -0400
Local: Wed, May 14 2008 6:00 pm
Subject: Re: Hash weirdness

You probably already looked into this, but is there any chance the line that
it is failing on (I assume it's the same line every time) is out of the
ordinary in some way?  Specifically in number of data points.  I don't
immediately see how something like this would cause it to fail, but if there
is something out of the ordinary it could certainly point you in the right
direction...

Also, because I'm a succinctness nazi...

well_hash = nil
if (!self.has_key?(well_location))
  self[well_location] = Hash.new()
end
well_hash = self[well_location]

... can be refactored into ...

well_hash = self[well_location] || {}

And...

if (!well_hash.has_key?(bead_num))
  self[well_location][bead_num] = Hash.new()
end
bead_hash = self[well_location][bead_num]

... can be refactored into ...

bead_hash = self[well_location][bead_num] ||= {}

Sorry if the refactoring was too forward. :)

tyler


Tyler McMullen  
From: "Tyler McMullen" <tbmcmul...@gmail.com>
Date: Wed, 14 May 2008 18:01:30 -0400
Local: Wed, May 14 2008 6:01 pm
Subject: Re: Hash weirdness

And I just realized that first succinctness tweak is wrong and should look
more like the second one.  Sorry.

tyler
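
To make the correction concrete, here is a small sketch of why `||` and `||=` behave differently when the key is missing. The method names `fetch_with_or` and `fetch_with_or_assign` are hypothetical and exist only for this illustration.

```ruby
# `||` only picks a value: when the key is missing, the fresh hash is
# returned but never stored, so anything written to it is lost.
def fetch_with_or(wells, key)
  wells[key] || {}
end

# `||=` stores the fresh hash under the key first, then returns it.
def fetch_with_or_assign(wells, key)
  wells[key] ||= {}
end

wells = {}
fetch_with_or(wells, "A1")[1] = "lost"
puts wells.key?("A1")

fetch_with_or_assign(wells, "A1")[1] = "kept"
puts wells["A1"][1]
```

With the `||` form, every bead written for a new well would land in a throwaway hash, which is why the first tweak needs `||=` as well.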



Jared Nedzel  
From: Jared Nedzel <jned...@broad.mit.edu>
Date: Wed, 14 May 2008 18:36:19 -0400
Local: Wed, May 14 2008 6:36 pm
Subject: Re: Hash weirdness
Oddly enough, while looping over the hash this way breaks:

keys.each do |well_key|
   well_hash = result_hash[well_key]
   puts "processing well: " + well_key + " size: " +
      well_hash.size.to_s
end

This way works (please excuse the many temp variables in here, I put
them in so I could easily see what was going on in the debugger):

keys = result_hash.keys
length = keys.length
for i in 0..(length - 1)
   key = keys[i]
   well_hash = result_hash[key]
   puts "processing well: " + key + " size: " + well_hash.size.to_s
end

That's pretty nasty, but since it is working, I'll live with it for
now.  Thanks for the suggestions.

Jared
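
As an aside, the index-based loop can usually be written by iterating over the hash's key/value pairs directly, with string interpolation in place of `+` and `to_s`. This is only a sketch: `well_report` and the sample data are hypothetical, and it does not explain why the original `keys.each` version raised.

```ruby
# Hypothetical helper: build one report line per well.
def well_report(result_hash)
  lines = []
  # each yields the key and its value together, so no second lookup.
  result_hash.each do |well_key, well_hash|
    lines << "processing well: #{well_key} size: #{well_hash.size}"
  end
  lines
end

# Made-up sample data in the shape described in this thread.
sample = { "A1" => { 1 => {}, 2 => {} }, "A2" => { 1 => {} } }
puts well_report(sample)
```

Interpolation also sidesteps the `String + non-String` errors that the `+` form can produce if a key ever turns out not to be a String.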



Ron Newman  
From: "Ron Newman" <rnew...@thecia.net>
Date: Wed, 14 May 2008 18:41:20 -0400
Local: Wed, May 14 2008 6:41 pm
Subject: Re: Hash weirdness