Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Message from discussion OT: Binary Tree Lookups for Dummies
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
"Choate, Paul@DDS"  
View profile  
 More options Apr 14 2006, 12:32 pm
Newsgroups: comp.soft-sys.sas
From: pcho...@DDS.CA.GOV ("Choate, Paul@DDS")
Date: Fri, 14 Apr 2006 09:32:15 -0700
Local: Fri, Apr 14 2006 12:32 pm
Subject: OT: Binary Tree Lookups for Dummies
One goals for the last SUGI was to learn about and start applying more optimal lookup and merging methods.  Of course Paul Dorfman's most excellent papers on hashing offering a most interesting place to start.

A couple weeks later, still not nearly caught up at work, but noodling a little about something heard - binary tree searching.  Hadn't thought about it much before, but the light in my brain flashed on for a brief second.

So here it is - my home-brewed binary tree lookup.  Sorry if this is too pre-school to any über geeks out there, please have mercy.

Any comments? Grossly missed efficiencies?  How this is worse (or better?) than datastep lookups out there in practice?

Does this always work?  Does the ceil function keeps me out of trouble, hmmmmm?  It worked on a few dozen tests.

The distance between my target and sample could be interpolated rather than simply halved with each iteration. Attempting interpolation seemed inelegant, and this had already had become uglier than anticipated.

Tests show that in a set of .6 million records it takes fewer than 20 lookups to find the target.

The data are sorted on UCI, a seven digit character string.

data found_it;
     find='0010524';
     *using Zeno's arrow to hit a target - iterate up & down by nobs/2**n;
     tries+1;
     direction+0;
     sample+sum(direction=0,direction)*(ceil(nobs/(2**tries))<>1);
     sample=sample><nobs; *if outside 1<->nobs reset to min or max;
     sample=sample<>1;

     set big_data(keep=uci) point=sample nobs=nobs;

     direction= sum(((uci>find)*(-1)),((uci<find)*1));
     if uci=find or ceil(nobs/(2**tries))=1 then do;
          record=i;
          if uci=find then output;
                        else put / / 'Not Found:' (_all_)(=);
          stop;
     end;
run;

TIA

Paul Choate
DDS Data Extraction
(916) 654-2160


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.