using narray and mmap with HUGE data sets
Newsgroups: comp.lang.ruby
From: "Ara.T.Howard" <Ara.T.How...@noaa.gov>
Date: Sat, 2 Oct 2004 09:21:46 -0600
Local: Sat, Oct 2 2004 11:21 am
Subject: using narray and mmap with HUGE data sets

scientific rubyists-

i have played with using mmap with narray before with some success, but thought
this was so neat i'd post it for posterity:

   - i have a huge satellite mosaic (actually the ones i'm using are 4 times as
     big as this one - eg just under a gig each):

       jib:~/fa/thailand > ls -ltar etm_mosaics/L72001195_19720020208_b70.mos
       -rw-rw-r--    1 ahoward  ahoward  184816141 Oct  1 13:32 etm_mosaics/L72001195_19720020208_b70.mos

   - here's a little program that takes a list of scanlines and shows how many
     elements were non zero and zero:

       jib:~/fa/thailand > cat mmap_narray_test.rb
       require 'mmap'
       require 'narray'

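       path, samples, lines = ARGV.shift, Integer(ARGV.shift), Integer(ARGV.shift)

       # map the file read-only
       mmap = Mmap::new path, 'r', Mmap::MAP_SHARED

       # narray's first dimension varies fastest, so a file stored scanline by
       # scanline (assumed here) has shape (samples, lines)
       narray = NArray::to_na mmap.to_str, NArray::BYTE, samples, lines

       while((line = ARGV.shift))
         line = Integer line
         scanline = narray[true, line]            # all samples of one scanline
         non_zero, zero = scanline.ne(0).where2   # indices where true / where false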

         puts <<-yaml
         -
           scanline: #{ line }
             elements : #{ samples }
             non_zero : #{ non_zero.size }
             zero     : #{ zero.size }
         yaml
       end

   - running is plenty fast

     jib:~/fa/thailand > time ruby mmap_narray_test.rb \
       etm_mosaics/L72001195_19720020208_b70.mos 10441 17701 183 184 185

       -
         scanline: 183
           elements : 10441
           non_zero : 6100
           zero     : 4341
       -
         scanline: 184
           elements : 10441
           non_zero : 6100
           zero     : 4341
       -
         scanline: 185
           elements : 10441
           non_zero : 6102
           zero     : 4339

     real    0m0.809s
     user    0m0.200s
     sys     0m0.610s
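
the shape arguments have to account for every byte of the file, which is easy to check by hand. a minimal sketch of that arithmetic, assuming one byte per sample, no header, and line-by-line storage; `check_dimensions` and `scanline_offset` are hypothetical helpers, not part of the program above:

```ruby
# hypothetical helper: confirm that samples x lines accounts for the whole file
def check_dimensions(file_size, samples, lines, bytes_per_sample = 1)
  expected = samples * lines * bytes_per_sample
  unless file_size == expected
    raise ArgumentError, "size mismatch: file has #{file_size} bytes, expected #{expected}"
  end
  expected
end

# hypothetical helper: byte offset of the first sample of a scanline,
# assuming the raster is stored scanline by scanline
def scanline_offset(line, samples, bytes_per_sample = 1)
  line * samples * bytes_per_sample
end

# figures taken from the ls listing and the command line above
puts check_dimensions(184_816_141, 10_441, 17_701)  # 184816141
puts scanline_offset(183, 10_441)                   # 1910703
```

with real files, `File.size(path)` would supply the first argument.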

obviously you could do this specific task using some intelligent seeking and
some expensive unpacking - i just thought it was really cool that both mmap
and narray could work together so well.  using them together gives you nice
logical access to huge datasets without performing any un-needed i/o while
offering all of narray's capabilities.  for example i'm using this to maintain
a set of about 10 files that total 10GB for some code that needs to loop over
sections of these files in relatively small chunks (10000 x 10000 tiles)
gathering some stats along the way.  using mmap and narray enabled me to write
the code in about 15 minutes while completely ignoring the fact that the machine
i'm running on only has 4GB of ram.  could be faster in c but that wouldn't
take me 15 minutes to write.
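
for comparison, the seek-and-unpack route mentioned above might look roughly like this in plain ruby. it is only a sketch: `scanline_counts` is a hypothetical helper, and it assumes an 8-bit raster stored scanline by scanline with no header.

```ruby
# hypothetical helper: count non-zero and zero samples in one scanline of a
# raw 8-bit raster (samples bytes per scanline, no header assumed)
def scanline_counts(path, samples, line)
  File.open(path, 'rb') do |f|
    f.seek(line * samples)        # jump straight to the start of the scanline
    bytes = f.read(samples)       # read only that scanline
    if bytes.nil? || bytes.bytesize < samples
      raise ArgumentError, "scanline #{line} is out of range"
    end
    values = bytes.unpack('C*')   # unpack into an array of unsigned 8-bit ints
    zero = values.count(0)
    { non_zero: samples - zero, zero: zero }
  end
end
```

calling `scanline_counts(path, samples, 183)` would cover a single scanline; anything beyond simple counts means hand-writing what narray already provides.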

cheers.

-a
--
===============================================================================
| EMAIL   :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE   :: 303.497.6469
| A flower falls, even though we love it;
| and a weed grows, even though we do not love it.
|   --Dogen
===============================================================================

