
Fwd: Fast lookup of 234 possible 1/2/3 character codes


Robert Prins

Mar 20, 2023, 4:06:22 PM
On 2023-03-09 15:28, Robert Prins wrote:
> On 2023-03-09 10:19, Terje Mathisen wrote:
>> This is much better!
>
> Thank you!
>
>> I still think you should pay more attention to the cost of branches, i.e. a
>> sequential scan of a shortish segment of the country array is very cheap, on
>> the order of 2-3 clock cycles/country, while a single mispredicted branch cost
>> 8-20 cycles depending upon the CPU you are using.
>
> An AMD FX8150 and an Intel i7-4710MQ.
>
>> Assuming this really is so important, why not "waste" a bit more lookup table
>> RAM for a full 2-char index?
>
> It's Utterly Unimportant (with two huge "U"s): the program as-is runs in 0.32
> seconds clock-time, but it's just interesting to see how far I can go in pure
> Pascal, to then translate that into in-line assembler, where I'm in many
> cases harking back to Turbo Pascal 3 times, coding post-Pentium instructions
> as "db" sequences. The original code dates back to 1994, and pretty amazingly,
> many of my original record structures are very AVX-friendly!
>
> Anyway, given that my ['A'..'Z'] array had one position open, I've used that to
> replace the general last-matched country with a per-initial-character
> last-matched country, and that has cut the chop count even more:
>
> Original full range                 : 40,362
> Low/high per initial character (pic): 14,907
> As previous + first most used pic   :  2,871
> As previous + cache pic             :  1,747
>
> And now I'm going to think slightly outside the box, although I will have to
> knock up a bit of REXX to actually test this: counter-intuitively, upping the
> low and/or high value of some ranges into the previous/next (or, unlikely,
> but never say never, pre-previous or over-next) range, so that an initial
> first lookup immediately finds the second most-used country, without
> generating an excessive number of additional lookups for even less used
> countries.
>
> E.g. for countries starting with "L" I currently have only three, in order of
> use: "LT", "LV" (74 lookups), and "L" (45 lookups), the latter two requiring
> three chops each, for a total of 357 chops. If I were to extend the "L" range
> into the "M" range so that a first chop returns "LV", I save 148 chops, which
> means that even if I need up to three more chops for "L", I still save time.
> For this specific case, extending the "Lx" range to "MEX" gives me single-chop
> hits on "LV" and still the same three-chop hits on "L", reducing the number of
> chops to 209. In this specific case I could go even further, and take the
> lo-range back into the "K" countries to get to "L" in just two chops.
>
> Doing the same for the three additional (after "GB") "G*" countries, which
> are also easy to try, saves me another 96 chops if I put "GR", the second
> most used, in the initial middle of the range (by extending the high value to
> "IR"), without affecting the "GE" and "GH" chop counts.
>
> I'm going to see how this works out before doing anything else, as this change
> would be easy to implement in Pascal and in my PL/I version of the same on
> z/OS. I keep the two in step, and they should give the same output; if they
> don't, I know that I've screwed up somewhere!
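
The effect of widening a chop interval can be sketched in Python. This is a toy nine-entry slice of the sorted code table, not the real 234-entry array; the weights 74 and 45 are the "LV"/"L" lookup counts quoted above, and "LT" is left out of the totals because it is caught before the chop.

```python
def chops(keys, lo, hi, target):
    """Count the probes ("chops") a binary search needs to find
    target within keys[lo..hi] (inclusive bounds)."""
    n = 0
    while lo <= hi:
        n += 1
        mid = (lo + hi) // 2
        if keys[mid] == target:
            return n
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    raise KeyError(target)

# Toy slice of the sorted country-code array (indices 0..8).
keys = ["K", "KS", "KZ", "L", "LT", "LV", "M", "MA", "MEX"]

# Tight 'L' interval: keys[3..5].  Widened: hi pushed into the 'M'
# range so that the very first midpoint lands on "LV".
tight = 74 * chops(keys, 3, 5, "LV") + 45 * chops(keys, 3, 5, "L")
wide  = 74 * chops(keys, 3, 7, "LV") + 45 * chops(keys, 3, 7, "L")
print(tight, wide)  # → 238 164  ("LV" drops from 2 chops to 1; "L" stays at 2)
```

The exact counts differ from the post's 357/209 because the toy array is much smaller, but the mechanism is the same: the widened interval buys a first-chop hit on the second-most-used code.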

And the optimal chop count for the current (and probably near-future) set of data is now

1,173

I did write some REXX to scan both the input data and the data that was passed
into the "srch_cnty" routine. I then extended each chop interval by its
length + 20 on both the left and the right (obviously not going below 0 or
above 234), let that run, and recorded the lowest number of chops and the
shortest intervals achieving it, and that gave me the above number, just over
2.9% of the original.
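
That brute-force search over widened intervals can be sketched like this (Python rather than REXX, and the same toy nine-entry array as before; `best_bounds` is my name for the idea, not a routine from the post):

```python
def chops(keys, lo, hi, target):
    # Probe count for a binary search of keys[lo..hi], inclusive bounds.
    n = 0
    while lo <= hi:
        n += 1
        mid = (lo + hi) // 2
        if keys[mid] == target:
            return n
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    raise KeyError(target)

def best_bounds(keys, weights, lo0, hi0):
    """Try every widening of the tight interval keys[lo0..hi0] by up to
    interval-length + 20 positions either side (clamped to the array) and
    keep the (lo, hi) pair with the lowest weighted chop count, preferring
    the shortest interval on ties."""
    slack = (hi0 - lo0 + 1) + 20
    best = None
    for lo in range(max(0, lo0 - slack), lo0 + 1):
        for hi in range(hi0, min(len(keys) - 1, hi0 + slack) + 1):
            total = sum(w * chops(keys, lo, hi, k)
                        for k, w in weights.items())
            if (best is None or total < best[0]
                    or (total == best[0] and hi - lo < best[2] - best[1])):
                best = (total, lo, hi)
    return best

keys = ["K", "KS", "KZ", "L", "LT", "LV", "M", "MA", "MEX"]
# "LT" is handled by the +++/-1- slots, so only "LV" and "L" get chopped.
print(best_bounds(keys, {"LV": 74, "L": 45}, 3, 5))  # → (164, 3, 7)
```

For the real table this search would run once per initial letter, with the weights taken from the scanned lookup frequencies.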

I'm going to leave it at that for now. The new intervals are, as I said,
unlikely to change much in the near future; I may encounter a few more
nationalities of drivers and visit a few more countries, but in small numbers
they won't materially affect the current chopping intervals.

This is what I got out of the per-interval scans:

+++ -1- min max 1/2
// 'A' 15 A A AFG AZ 8 (ARK)
// 'B' 18 B B AZ D 32 (BY ) -2-
// 'C' 15 CZ <> CH C DJI 42 (CO )
// 'D' 6 D D CYM EAK 51 (DK ) -2-
// 'E' 12 E E DZ GB 64 (EST) -2-
// 'F' 6 F F FIN FSM 70 (FL ) -2-
// 'G' 14 GB GB G IR 84 (GR ) -2-
// 'H' 5 H <> HR H IR 91 (HR )
// 'I' 7 I <> IL IL J 96 (IRL) -2-
// 'J' 2 J J JA JA 100 (JA )
// 'K' 10 KZ KZ IRQ KZ 103 (KGZ)
// 'L' 7 LT <> L KS MNE 117 (LV ) -2-
// 'M' 15 MA MAL M NA 126 (MK )
// 'N' 11 NL NL LS PNG 133 (N ) -2-
// 'O' 1 OM OM 144 (OM )
// 'P' 11 PL <> PK NZ PY 149 (PK )
// 'Q' 1 Q Q 156 (Q )
// 'R' 24 RO <> RL RA SGP 171 (RMM)
// 'S' 19 S S RMM T 185 (SF ) -2-
// 'T' 12 TR TR T USA 204 (TJ )
// 'U' 5 UA <> USA UA V 214 (USA) -2-
// 'V' 3 VN VN V VU 218 (VN )
// 'W' 9 WAN WAN UZ WV 222 (WAN)
// 'X'
// 'Y' 3 YU YU YAR YV 230 (YU )
// 'Z' 3 ZA ZA Z ZW 233 (ZA )

In other words, if it's not the last scanned value (-1-), I next try the most
frequent (+++) value, and after that I chop the sometimes longer interval that
is created to (mostly) find the second most occurring value on the first chop,
and in some cases even the third most on the second chop; I've manually marked
the ones I know of with "-2-" above. Obviously, the last found value is saved
in the "-1-" position.

And yes, now that I have the code, it wouldn't be too hard to adapt it to
include only the 95 nationalities of drivers and countries I have actually
visited, and reduce the chop count even more. Obviously it is highly unlikely
to ever beat a look-up table, but the advantage of this approach is that it's
easily portable between x86 (little-endian) and z/OS (big-endian).

Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather - https://prino.neocities.org/
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html
