apt-file rewrite done

Showing 1-2 of 2 messages
apt-file rewrite done Nick Black 3/24/13 9:32 PM
Howdy!

I've got raptorial-file in a state for people to play with. The wins weren't
as impressive as apt-show-versions(1) for the common cases (more like 25%),
though some uncommon cases saw reductions in total time of >90%.

By the way, you'll see some different numbers from wl here; that's because
raptorial-file(1) doesn't filter common paths. If I have two distributions
offering a package with a file, I want to see it twice, as opposed to only
seeing it once when I have one distribution offering said package. Throwing
the output through uniq will, of course, restore the old behavior for those
who want it.

On my quadcore 2600K for a common case:

[skynet](130) $ for i in `seq 1 10` ; do /usr/bin/time ./raptorial-file bin/rapt | wc -l ; done
2.57user 0.03system 0:01.42elapsed 183%CPU (0avgtext+0avgdata 160792maxresident)k 0inputs+0outputs (0major+41728minor)pagefaults 0swaps
5
2.69user 0.03system 0:01.94elapsed 140%CPU (0avgtext+0avgdata 132196maxresident)k 0inputs+0outputs (0major+48057minor)pagefaults 0swaps
5
2.58user 0.06system 0:00.95elapsed 276%CPU (0avgtext+0avgdata 169748maxresident)k 0inputs+0outputs (0major+29355minor)pagefaults 0swaps
5
2.60user 0.03system 0:01.16elapsed 227%CPU (0avgtext+0avgdata 168184maxresident)k 0inputs+0outputs (0major+27978minor)pagefaults 0swaps
5
2.54user 0.04system 0:01.53elapsed 169%CPU (0avgtext+0avgdata 134156maxresident)k 0inputs+0outputs (0major+43645minor)pagefaults 0swaps
5
2.63user 0.05system 0:01.74elapsed 153%CPU (0avgtext+0avgdata 169760maxresident)k 0inputs+0outputs (0major+42352minor)pagefaults 0swaps
5
2.58user 0.05system 0:01.70elapsed 155%CPU (0avgtext+0avgdata 134336maxresident)k 0inputs+0outputs (0major+46362minor)pagefaults 0swaps
5
2.62user 0.05system 0:01.08elapsed 247%CPU (0avgtext+0avgdata 168136maxresident)k 0inputs+0outputs (0major+26693minor)pagefaults 0swaps
5
2.52user 0.03system 0:01.25elapsed 204%CPU (0avgtext+0avgdata 147208maxresident)k 0inputs+0outputs (0major+35132minor)pagefaults 0swaps
5
2.61user 0.03system 0:01.51elapsed 175%CPU (0avgtext+0avgdata 171856maxresident)k 0inputs+0outputs (0major+32406minor)pagefaults 0swaps
5
[skynet](0) $

More variance in runtimes than I'd like to see, but a nice low one (<1s) and
very constant total user times, suggesting this is a matter of how many cpus
are truly available (I didn't have anything heavy going on, but I did have
my web browser, mplayer, mpd, etc). Average of 1.428s; if we throw out the
low and bottom, we get an average of...1.423s.

[skynet](0) $ for i in `seq 1 10` ; do /usr/bin/time apt-file search bin/rapt | wc -l ; done
1.79user 0.10system 0:01.63elapsed 116%CPU (0avgtext+0avgdata 11676maxresident)k 0inputs+0outputs (0major+19630minor)pagefaults 0swaps
4
1.79user 0.13system 0:01.65elapsed 116%CPU (0avgtext+0avgdata 11672maxresident)k 0inputs+0outputs (0major+19629minor)pagefaults 0swaps
4
1.80user 0.12system 0:01.66elapsed 116%CPU (0avgtext+0avgdata 11676maxresident)k 0inputs+0outputs (0major+19621minor)pagefaults 0swaps
4
1.76user 0.13system 0:01.63elapsed 116%CPU (0avgtext+0avgdata 11672maxresident)k 0inputs+0outputs (0major+19633minor)pagefaults 0swaps
4
1.76user 0.12system 0:01.63elapsed 116%CPU (0avgtext+0avgdata 11676maxresident)k 0inputs+0outputs (0major+19631minor)pagefaults 0swaps
4
1.78user 0.15system 0:01.66elapsed 116%CPU (0avgtext+0avgdata 11676maxresident)k 0inputs+0outputs (0major+19638minor)pagefaults 0swaps
4
1.79user 0.10system 0:01.63elapsed 115%CPU (0avgtext+0avgdata 11676maxresident)k 0inputs+0outputs (0major+19624minor)pagefaults 0swaps
4
1.76user 0.14system 0:01.64elapsed 116%CPU (0avgtext+0avgdata 11672maxresident)k 0inputs+0outputs (0major+19626minor)pagefaults 0swaps
4
1.81user 0.11system 0:01.65elapsed 116%CPU (0avgtext+0avgdata 11672maxresident)k 0inputs+0outputs (0major+19621minor)pagefaults 0swaps
4
1.83user 0.10system 0:01.66elapsed 116%CPU (0avgtext+0avgdata 11676maxresident)k 0inputs+0outputs (0major+19632minor)pagefaults 0swaps
4
[skynet](0) $

Pretty well-defined average at call it 1.63s. Only a 13% improvement in
total runtime. Well, it's tough to beat grep at its own game. Now a typical
query with significantly more hits:

[skynet](0) $ for i in `seq 1 10` ; do /usr/bin/time ./raptorial-file compiz | wc -l ; done
2.60user 0.03system 0:01.23elapsed 213%CPU (0avgtext+0avgdata 172168maxresident)k 0inputs+0outputs (0major+39465minor)pagefaults 0swaps
1191
2.64user 0.03system 0:01.09elapsed 246%CPU (0avgtext+0avgdata 171572maxresident)k 0inputs+0outputs (0major+34647minor)pagefaults 0swaps
1191
2.60user 0.03system 0:01.05elapsed 251%CPU (0avgtext+0avgdata 171760maxresident)k 0inputs+0outputs (0major+31924minor)pagefaults 0swaps
1191
2.53user 0.05system 0:01.65elapsed 156%CPU (0avgtext+0avgdata 158992maxresident)k 0inputs+0outputs (0major+32338minor)pagefaults 0swaps
1191
2.67user 0.05system 0:01.76elapsed 154%CPU (0avgtext+0avgdata 169472maxresident)k 0inputs+0outputs (0major+42855minor)pagefaults 0swaps
1191
2.53user 0.07system 0:01.79elapsed 145%CPU (0avgtext+0avgdata 161980maxresident)k 0inputs+0outputs (0major+42540minor)pagefaults 0swaps
1191
2.83user 0.04system 0:01.27elapsed 225%CPU (0avgtext+0avgdata 170388maxresident)k 0inputs+0outputs (0major+38347minor)pagefaults 0swaps
1191
2.57user 0.06system 0:01.10elapsed 239%CPU (0avgtext+0avgdata 170008maxresident)k 0inputs+0outputs (0major+35109minor)pagefaults 0swaps
1191
2.57user 0.06system 0:01.45elapsed 181%CPU (0avgtext+0avgdata 132300maxresident)k 0inputs+0outputs (0major+24178minor)pagefaults 0swaps
1191
2.60user 0.05system 0:01.70elapsed 155%CPU (0avgtext+0avgdata 132680maxresident)k 0inputs+0outputs (0major+45375minor)pagefaults 0swaps
1191
[skynet](0) $

An average of 1.41, very similar to our first results. Some nice, low
numbers in there.

[skynet](0) $ for i in `seq 1 10` ; do /usr/bin/time apt-file search compiz | wc -l ; done
2.08user 0.16system 0:01.82elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k
0inputs+0outputs (0major+19929minor)pagefaults 0swaps
1112
2.06user 0.16system 0:01.82elapsed 122%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19926minor)pagefaults 0swaps
1112
2.03user 0.23system 0:01.83elapsed 123%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19914minor)pagefaults 0swaps
1112
2.04user 0.22system 0:01.84elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19938minor)pagefaults 0swaps
1112
1.97user 0.27system 0:01.83elapsed 122%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19927minor)pagefaults 0swaps
1112
2.01user 0.17system 0:01.79elapsed 122%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19918minor)pagefaults 0swaps
1112
1.94user 0.27system 0:01.80elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19936minor)pagefaults 0swaps
1112
1.98user 0.29system 0:01.84elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19928minor)pagefaults 0swaps
1112
1.97user 0.26system 0:01.81elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19943minor)pagefaults 0swaps
1112
1.86user 0.31system 0:01.78elapsed 122%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19933minor)pagefaults 0swaps
1112
[skynet](0) $

Here we see the crappiness of apt-file(1)'s postprocessing at work. It blows
up to call it 1.78. That's a little over 20% reduction in runtime -- nothing
to write home about, but not too shabby for a weekend's work. apt-file(1)
started off much better than apt-show-versions(1) did from last week.

Let's take a look at a more atypical case:

[skynet](0) $ for i in `seq 1 3` ; do /usr/bin/time ./raptorial-file a | wc -l ; done
6.60user 7.35system 0:03.47elapsed 401%CPU (0avgtext+0avgdata 230600maxresident)k 0inputs+0outputs (0major+35387minor)pagefaults 0swaps
4887112
6.51user 10.29system 0:03.49elapsed 481%CPU (0avgtext+0avgdata 252104maxresident)k 0inputs+0outputs (0major+37747minor)pagefaults 0swaps
4887112
6.24user 6.63system 0:03.36elapsed 382%CPU (0avgtext+0avgdata 228724maxresident)k 0inputs+0outputs (0major+35701minor)pagefaults 0swaps
4887112
[skynet](0) $

Worst-case of 3.49s. I didn't want to sit around for 3 runs of "apt-file
search a", so I canceled after the first:

[skynet](0) $ for i in `seq 1 3` ; do /usr/bin/time apt-file search a | wc -l ; done
99.89user 3.88system 1:38.81elapsed 105%CPU (0avgtext+0avgdata 2159292maxresident)k 0inputs+0outputs (0major+1225930minor)pagefaults 0swaps
4424513
[skynet](130) $

I'll take a 96.5% reduction in runtime even for just the pathological cases.
Here's a more plausible one; let's count our binaries:

[skynet](130) $ for i in `seq 1 3` ; do /usr/bin/time apt-file search /bin/ | wc -l ; done
3.72user 0.19system 0:02.88elapsed 135%CPU (0avgtext+0avgdata 29356maxresident)k 0inputs+0outputs (0major+31174minor)pagefaults 0swaps
35353
3.78user 0.18system 0:02.90elapsed 136%CPU (0avgtext+0avgdata 29356maxresident)k 0inputs+0outputs (0major+31157minor)pagefaults 0swaps
35353
3.51user 0.20system 0:02.74elapsed 135%CPU (0avgtext+0avgdata 29356maxresident)k 0inputs+0outputs (0major+31163minor)pagefaults 0swaps
35353
[skynet](0) $ for i in `seq 1 3` ; do /usr/bin/time ./raptorial-file /bin/ | wc -l ; done
2.58user 0.04system 0:01.89elapsed 139%CPU (0avgtext+0avgdata 133412maxresident)k 0inputs+0outputs (0major+33123minor)pagefaults 0swaps
39821
2.58user 0.07system 0:01.75elapsed 151%CPU (0avgtext+0avgdata 134356maxresident)k 0inputs+0outputs (0major+46199minor)pagefaults 0swaps
39821
2.55user 0.04system 0:01.87elapsed 138%CPU (0avgtext+0avgdata 133776maxresident)k 0inputs+0outputs (0major+31806minor)pagefaults 0swaps
39821
[skynet](0) $

RAPTORIAL average: 1.84s
apt-file average: 2.84s
CHAMPION: RAPTORIAL (64.8% of runtime)

I can't find a case on this machine where apt-file(1) beats raptorial-file,
or even ties it. That said, the gain on typical searches is only a 1.1x--1.2x
speedup.

That might not be worth replacing the well-tested apt-file(1) codebase with
raptorial-file(1) by itself. However:

 (1) raptorial-file(1) uses the same codebase as rapt-show-versions(1), so
      why not have one if you have the other?

 (2) raptorial-file(1) is 84 lines of C to apt-file's 779 lines of perl

 (3) raptorial-file(1) will beat apt-file(1) more and more as cores are added

 (4) raptorial-file(1) is pretty standard C; there's no tricky code. maybe
      apt-file(1) is pretty standard perl. i wouldn't know. IMHO 779 lines of
      perl is unmaintainable perl, no offense to the apt-file(1) authors.

 (5) the apt-file(1) maintainer has recently requested to hand off the
      package, while the likely raptorial-file(1) maintainer is agitating
      for his package's inclusion :).

I hope this builds excitement regarding RAPTORIAL and convinces folks that
I'm on to something here. Hack on!

--rigorously, nick
  Hacker-in-Charge, SprezzOS Project

--
nick black     http://www.sprezzatech.com -- unix and hpc consulting
to make an apple pie from scratch, you need first invent a universe.
Re: apt-file rewrite done Nick Black 3/25/13 12:15 AM
nick black left as an exercise for the reader:
> I've got raptorial-file in a state for people to play with. The wins weren't
> as impressive as apt-show-versions(1) for the common cases (more like 25%),
> though some uncommon cases saw reductions in total time of >90%.

Spoke too soon! After fixing a stupid bug [0], improvement is much more
stark, closer to a 2x speedup. the case for raptorial-file(1) just became
much more convincing:

[skynet](130) $ for i in `seq 1 10` ; do /usr/bin/time ./raptorial-file compiz > /dev/null ; done
4.07user 1.29system 0:00.95elapsed 565%CPU (0avgtext+0avgdata 194304maxresident)k 0inputs+0outputs (0major+37572minor)pagefaults 0swaps
3.74user 1.89system 0:01.07elapsed 526%CPU (0avgtext+0avgdata 183116maxresident)k 0inputs+0outputs (0major+39856minor)pagefaults 0swaps
3.91user 1.21system 0:00.99elapsed 515%CPU (0avgtext+0avgdata 193564maxresident)k 0inputs+0outputs (0major+38071minor)pagefaults 0swaps
3.93user 1.31system 0:01.06elapsed 495%CPU (0avgtext+0avgdata 186644maxresident)k 0inputs+0outputs (0major+38400minor)pagefaults 0swaps
4.03user 1.54system 0:01.01elapsed 551%CPU (0avgtext+0avgdata 192504maxresident)k 0inputs+0outputs (0major+39826minor)pagefaults 0swaps
3.64user 1.03system 0:01.05elapsed 442%CPU (0avgtext+0avgdata 186892maxresident)k 0inputs+0outputs (0major+37151minor)pagefaults 0swaps
3.87user 1.84system 0:01.10elapsed 516%CPU (0avgtext+0avgdata 193324maxresident)k 0inputs+0outputs (0major+34636minor)pagefaults 0swaps
4.03user 1.84system 0:01.01elapsed 582%CPU (0avgtext+0avgdata 192576maxresident)k 0inputs+0outputs (0major+39498minor)pagefaults 0swaps
3.97user 1.65system 0:01.04elapsed 538%CPU (0avgtext+0avgdata 194552maxresident)k 0inputs+0outputs (0major+38737minor)pagefaults 0swaps
3.85user 1.18system 0:01.03elapsed 485%CPU (0avgtext+0avgdata 190348maxresident)k 0inputs+0outputs (0major+36283minor)pagefaults 0swaps
[skynet](0) $

[skynet](1) $ for i in `seq 1 10` ; do /usr/bin/time apt-file search compiz
> /dev/null ; done
2.06user 0.12system 0:01.77elapsed 123%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19894minor)pagefaults 0swaps
2.07user 0.13system 0:01.80elapsed 122%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19909minor)pagefaults 0swaps
2.10user 0.12system 0:01.82elapsed 122%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19901minor)pagefaults 0swaps
2.03user 0.15system 0:01.78elapsed 122%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19904minor)pagefaults 0swaps
2.17user 0.11system 0:01.85elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19895minor)pagefaults 0swaps
2.15user 0.12system 0:01.84elapsed 123%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19894minor)pagefaults 0swaps
2.04user 0.16system 0:01.79elapsed 122%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19890minor)pagefaults 0swaps
2.03user 0.13system 0:01.77elapsed 122%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19897minor)pagefaults 0swaps
2.05user 0.12system 0:01.77elapsed 122%CPU (0avgtext+0avgdata 12080maxresident)k 0inputs+0outputs (0major+19895minor)pagefaults 0swaps
2.08user 0.13system 0:01.80elapsed 122%CPU (0avgtext+0avgdata 12076maxresident)k 0inputs+0outputs (0major+19899minor)pagefaults 0swaps
[skynet](0) $

RAPTORIAL: 1.01s
apt-file: 1.77s
CHAMPION: RAPTORIAL (57%)

now *THAT'S* what i'm talkin' about! w00t w00t! man look at those cpus
getting pegged! rawwwwwwhide!

--hack on, nick
  Hacker-in-Charge, SprezzOS Project

[0] https://github.com/dankamongmen/raptorial/commit/6818a945733f246511a408165d99620352b19963