String compare limits

Werner

Oct 20, 2012, 4:06:43 AM
Anyone else notice that comparing strings (or SORTing them) does not work when they are longer than 2047 chars? (48G,49G)

"... A"
"... B"
<

returns 0 when the strings are 2048 chars or longer, but 1 when the strings are less than 2048 chars in length.
Why on earth would anyone impose an arbitrary limit on the length of a string? Let memory decide how long your strings can grow...

Han

Oct 20, 2012, 10:13:44 AM
Looks like some poor design choices in the algorithm for comparing two strings. Whoever wrote the algorithm for =$>$?_ decided to limit the string length to what fits in the X (exponent) field (notice the A=0 M and C=0 M opcodes). Had they just left those out, the strings could be as long as 1/2*(16^5) characters (you would run out of memory before you could even store such a string, but that's beside the point). By design, strings can be at most about 1/2*(16^5) characters long because their length field is only 5 nibbles wide. Here's what Jazz sees in ROM. L12D90 is just a jump to =SAVPTR.


GOSUB L12D90
C=DAT1 A
D0=C
D0=D0+ 5
D1=D1+ 5
C=DAT1 A
D1=C
D1=D1+ 5
LC(5) 5
A=DAT0 A
D0=D0+ 5
A=A-C A
A=0 M
ASRB
B=C A
ABEX A
C=DAT1 A
D1=D1+ 5
C=C-A A
C=0 M
CSRB
D=C A
D=D-1 A
GOC L12E23
B=B-1 A
GOC L12E43
L12E03 A=DAT0 B
C=DAT1 B
D0=D0+ 2
D1=D1+ 2
?C>A B
GOYES L12E43
?C<A B
GOYES L12E23
B=B-1 A
GOC L12E3E
D=D-1 A
GONC L12E03
L12E23 LC(5) =%0
A=C A
GOSBVL =GETPTR
D1=D1+ 10
D=D+1 A
D=D+1 A
PC=(A)
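
The effect of the A=0 M / ASRB pair on the length can be modelled off the calculator. The Python sketch below is not a translation of the ROM routine; it only mimics the length arithmetic, assuming that a string's length field holds the body size in nibbles plus 5 (as the LC(5) 5 / A=A-C A sequence suggests) and that A=0 M clears nibbles 3-14. The function names are made up for illustration.

```python
A_FIELD_MASK = 0xFFFFF  # A field: nibbles 0-4 (20 bits)
X_FIELD_MASK = 0xFFF    # X field: nibbles 0-2, all that A=0 M leaves of the A field


def length_field_for(chars: int) -> int:
    """Hypothetical helper: length field of a string with `chars` characters."""
    return 2 * chars + 5  # 2 nibbles per character, plus the 5-nibble length field


def rom_char_count(length_field: int) -> int:
    """Character count as the ROM listing appears to compute it."""
    nibbles = (length_field - 5) & A_FIELD_MASK  # LC(5) 5 / A=A-C A
    nibbles &= X_FIELD_MASK                      # A=0 M
    return nibbles >> 1                          # ASRB


for chars in (2047, 2048, 2049):
    print(chars, rom_char_count(length_field_for(chars)))
```

In this model a 2047-character string keeps its count, a 2048-character string ends up with a count of 0, and a 2049-character string with a count of 1. With a count of 0, the D=D-1 A / GOC L12E23 pair in the listing branches straight to the code that loads %0, which would explain the result reported for strings of 2048 characters.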

Werner

Oct 22, 2012, 2:51:50 AM
This counts as a bug, not a poor design choice.
Just moving the A=0.M instruction right before A=DAT0.A (and do the same for the C=0.M) would've avoided the problem, at no extra cost.
Well, we'll have to live with it, I gather ;-)
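
A rough Python model of that reordering, for anyone who wants to check it without a calculator. It treats the register as 64 bits (16 nibbles) with the A field in nibbles 0-4 and the M field in nibbles 3-14, and it assumes that A=DAT0 A and A=A-C A touch only the A field while ASRB shifts the whole register. The helper names are hypothetical.

```python
REG_MASK = (1 << 64) - 1          # 16 nibbles
A_FIELD = 0xFFFFF                 # nibbles 0-4
M_FIELD = ((1 << 48) - 1) << 12   # nibbles 3-14


def load_a(reg: int, value: int) -> int:
    """A=DAT0 A: replace only the A field."""
    return (reg & ~A_FIELD & REG_MASK) | (value & A_FIELD)


def sub_a(reg: int, n: int) -> int:
    """A=A-C A: subtract within the A field only."""
    return (reg & ~A_FIELD & REG_MASK) | ((reg - n) & A_FIELD)


def clear_m(reg: int) -> int:
    """A=0 M."""
    return reg & ~M_FIELD & REG_MASK


def count_rom_order(reg: int, length_field: int) -> int:
    reg = sub_a(load_a(reg, length_field), 5)
    reg = clear_m(reg)            # also clears nibbles 3-4 of the length
    return (reg >> 1) & A_FIELD   # ASRB


def count_reordered(reg: int, length_field: int) -> int:
    reg = clear_m(reg)            # moved ahead of the load
    reg = sub_a(load_a(reg, length_field), 5)
    return (reg >> 1) & A_FIELD   # ASRB; nibble 5 is already zero


garbage = REG_MASK                # worst case: every bit set beforehand
length_field = 2 * 2048 + 5       # 2048-character string
print(count_rom_order(garbage, length_field), count_reordered(garbage, length_field))
```

Under these assumptions the ROM order yields 0 characters for a 2048-character string, while the reordered sequence yields 2048, with the same number of instructions.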

Han

Oct 22, 2012, 3:21:19 PM
Call it what you will, but the A=0.M or C=0.M is actually completely unnecessary _if_ the original intent was to have longer string comparisons. Moving it before A=DAT0.A does nothing since all the relevant opcodes specify the A field when counting down. However, there is really no reason to zero out the A field since the length of a string object is 5 nibbles wide anyway. If the original intent was to have strings as long as possible (within design limits, of course), then why use the M field unless you ONLY want to zero out the top 2 nibbles (non-exponent) of the A field? Using the M field with A=0 or C=0 is the quickest way to zero out those top two nibbles in the A field. That is why I think it was a poor design choice, not a bug.

But you're right -- either way, it's still bad since I don't think any ROM update will ever happen at this point.

Han

Oct 22, 2012, 4:17:28 PM
Here is a shorter and faster alternative for $>$?_ without the "bug"; it also uses one fewer register.

CODE
GOSBVL =SAVPTR

C=DAT1 A
D0=C
D1=D1+ 5
C=DAT1 A
D1=C

D0=D0+ 5
D1=D1+ 5
A=DAT0 A * A[A] = len of $1 + 5
C=DAT1 A * C[A] = len of $2 + 5
D0=D0+ 5 * point D0,D1 to string bodies
D1=D1+ 5

?C>A A * quick test of lengths
GOYES push%0
?C<A A
GOYES push%1

C=C-CON A,5 * lengths equal at this point
CSRB.F A * C[A] = num of chars
B=C A * use B[A] as counter for both
loop B=B-1 A
GOC push%1 * also handles empty strings
A=DAT0 B
C=DAT1 B
D0=D0+ 2
D1=D1+ 2
?C<A B
GOYES push%1
?C=A B
GOYES loop

push%0 LC(5) =%0 * fall through to here if not > or =
GONC exit * only reached here with CC

push%1 LC(5) =%1

exit A=C A
GOSBVL =GETPTR
D1=D1+ 10
D=D+1 A
D=D+1 A
PC=(A)
ENDCODE

Han

Oct 22, 2012, 4:19:28 PM
Bah... should be A=DAT1.A and C=DAT0.A so that A[A] = len $1 and C[A] = len $2 resp.

Werner

Oct 24, 2012, 2:43:38 AM
The A=0.M instruction is needed to zero out the top bit that ASRB introduces in the A field. It's a common error to forget it.
Or, they might've used ASRB.A but I think early versions of the Saturn did not have that instruction yet.
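
The stray bit is easy to reproduce in a small model. This is only a sketch in Python, assuming ASRB shifts all 64 bits of the register while ASRB.A shifts just the A field (nibbles 0-4); the function names and the leftover value in nibble 5 are invented for the example.

```python
A_FIELD = 0xFFFFF  # nibbles 0-4 of a 64-bit register


def asrb(reg: int) -> int:
    """ASRB: shift the whole register right by one bit."""
    return reg >> 1


def asrb_a(reg: int) -> int:
    """ASRB.A: shift only the A field, leaving the other nibbles alone."""
    return (reg & ~A_FIELD) | ((reg & A_FIELD) >> 1)


nibbles = 2 * 2048   # body length of a 2048-character string
stale = 0x1 << 20    # leftover low bit in nibble 5 from earlier work
reg = stale | nibbles

print(hex(asrb(reg) & A_FIELD))    # 0x80800: top bit of the A field is polluted
print(hex(asrb_a(reg) & A_FIELD))  # 0x800: 2048, as intended
```

Clearing nibble 5 before the shift, or using the field-limited shift, both avoid the polluted count in this model.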

Han

Oct 24, 2012, 8:29:51 AM
On Wednesday, October 24, 2012 2:43:39 AM UTC-4, Werner wrote:
> The A=0.M instruction is needed to zero out the top bit that ASRB introduces in the A field. It's a common error to forget it.
> Or, they might've used ASRB.A but I think early versions of the Saturn did not have that instruction yet.

The code is exactly the same as found in the HP48 series, which did have the opcode ASRB.A (a.k.a. ASRB.F A in HP syntax). I am not sure how far back the code goes, but I imagine probably only as far back as the HP28SX series since that is the earliest model I can think of in which strings are actually accessible to users. The HP42S has "primitive" uses of strings but I am not sure if its ROM was used as a basis for the HP28 or HP48 series.