LTrim(), AllTrim(), Trim(), Rtrim() Bug

carpen...@gmail.com

Sep 18, 2021, 2:03:22 AM
to Harbour Users
I'm using Harbour 3.0.0 (Rev. 16951)

Thought I'd pass this along, after spending way too much time assuming I KNEW what these functions did - until  I finally began testing my assumptions which HAD to be right - but weren't...

I've done TRIM(LTRIM(field or variable)) for DECADES.  I KNOW what it does!  But now, I've got a dbf field containing a leading and trailing TAB [Chr(9)] character which I expect to SURVIVE any kind of "trimming".  You know what they say about assumptions, right?

First of all, Trim() does exactly what I expect.  The trailing Tab byte survives. 

NOT SO MUCH with LTrim() and AllTrim().  In Harbour, the leading Tab character is deleted.

I still have Clipper, Summer '87 so I decided to test that version.  I don't know if it's better or worse, BUT in Clipper the AllTrim() ONLY removes leading and trailing SPACES.  The Tab characters survive.  However, it turns out that LTrim() actually DOES destroy leading Tab characters.  Trim() does what is expected, and the trailing Tabs survive.

Every so often I find myself beating my head against a stone wall, looking for an "error" in my logic which doesn't seem to exist.  Finally I start checking the things which I'm SURE I know how they work - except they don't QUITE do what I expect.  Thought I'd take a moment to pass along this bug or "feature".  It's true that I've never run into this problem in all these years mainly because things like Tab characters don't belong in dbf files.  But now they are in the records and I need them to survive "trimming".  Now that I know the problem I can work around it with Substr().  But I spent a lot of time being POSITIVE that trimming could NOT destroy Tab characters.  Turns out I was only PARTIALLY correct about that.

Tom.

Anand Gupta

Sep 18, 2021, 6:43:14 AM
to harbou...@googlegroups.com
Thanks for the heads-up, Tom.

As far as I know, we should avoid characters below 32 and above 127 in dbf data, otherwise all sorts of problems occur with OEM settings and indexing.
I use Clipper, Harbour, Xbase++ and also ADS with the same DBF file, so I have faced many hurdles and settled on that rule.

Anand
Working from Home


--
You received this message because you are subscribed to the Google
Groups "Harbour Users" group.
Unsubscribe: harbour-user...@googlegroups.com
Web: http://groups.google.com/group/harbour-users

carpen...@gmail.com

Sep 18, 2021, 4:32:08 PM
to Harbour Users
What I sent previously is accurate, but I wanted to do some more formal testing - including the "gold standard" for these very old functions - dBase III Plus.

I created a test dbf file with one field of type character with a width of 15 bytes. It contains one record, and the contents of that record are as follows:

3 spaces + 2 tab bytes + "ABCDE" + 2 tab bytes + 3 spaces

I never changed the contents of this file, and used it for all of my testing.
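
For anyone who wants to repeat the test without the dbf file, here is a minimal sketch that rebuilds the same 15-byte value in memory and prints the length after each trim. It is an assumption that a memory variable behaves like the field contents here; the variable names are made up for the example.

```harbour
// Sketch only: rebuilds the 15-byte test value in memory
// instead of reading it from the dbf field.
PROCEDURE Main()

   LOCAL cTab  := Chr( 9 )
   LOCAL cTest := Space( 3 ) + Replicate( cTab, 2 ) + "ABCDE" + ;
                  Replicate( cTab, 2 ) + Space( 3 )

   ? "original", Len( cTest )
   ? "LTrim   ", Len( LTrim( cTest ) )
   ? "Trim    ", Len( Trim( cTest ) )
   ? "AllTrim ", Len( AllTrim( cTest ) )

   RETURN
```

If only Chr(32) is trimmed, the three trimmed lengths work out to 12, 12 and 9. If leading tabs are also stripped, LTrim() gives 10 and AllTrim() gives 7, while Trim() stays at 12.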

I started with dBase III Plus, and just worked at the dot prompt.  At that time the ONLY trim functions were TRIM() and LTRIM().  The bottom line is that these two functions performed PERFECTLY.  None of the tab bytes were destroyed by either function.  I suspect this is some part of the reason that it took me so long to become suspicious of these functions.  I also had used other functions in my program and tested them first.  I.e. I had to eliminate other possibilities before I was FORCED to test these old trim functions.  ("Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth."  Sherlock Holmes)

So, dBase III Plus had NO DIFFICULTY dealing with the tab bytes contained in the dbf record.  Those "difficulties" have been "extra added features" of later iterations of those functions...

Moving on, I created a short source code program to be compiled by both Clipper, Summer '86, and then by Harbour 3.0.0 (Rev. 16951).  I used the exact same source program to create both exe files.  This incorporated essentially the same commands I had performed at the dBase dot prompt: checking the LEN() of various TRIMmed and LTRIMmed strings using the same dbf file I had used with dBase III Plus.  The only additions were some similar commands using the ALLTRIM() function which had never existed in dBase III Plus.

Again, the results were exactly as I had previously described.  The Clipper LTRIM() function WILL destroy leading tab bytes; however, for some perverse reason ALLTRIM() works perfectly, as does the TRIM() function.  Neither of those last two functions destroys ANY leading or trailing tab bytes.

Then there's Harbour.  At least it's "consistent" - sort of.  Any LEADING tab bytes WILL BE DESTROYED, either by LTRIM() or ALLTRIM().  Leading tab bytes are DOOMED.  The TRIM() function is the only one which does NOT destroy tab bytes - so long as they are trailing.

What began as a very simple and useful set of functions in dBase III Plus has become more "complicated" - and not in a GOOD way.  I really have enjoyed using Harbour.  It's a very powerful tool.  Maybe that's another reason that I had to be FORCED to the conclusion that there were some problems with these very old functions.

Tom.

Auge & Ohr

Sep 18, 2021, 8:14:24 PM
to Harbour Users
hi,

You might be right ... but why use Chr(9) together with Trim() / LTrim()?

I guess you are talking about the ANSI version. Have you tried a Unicode function?

Jimmy
P.S. There was no "Clipper, Summer '86"; it was "Clipper Autumn '86".

Klas Engwall

Sep 19, 2021, 7:22:55 PM
to harbou...@googlegroups.com
Hi Tom,
I have in front of my eyes right now the Clipper and Harbour results of
a simple Trimtest() function in two different windows, and the results
are identical between the two.

procedure Main()
   local cTest := chr( 9 ) + ' ' + 'X' + ' ' + chr( 9 )
   ? 'ltrim ', len( ltrim( cTest ) ) // 3
   ? 'rtrim ', len( rtrim( cTest ) ) // 5
   ? 'alltrim', len( alltrim( cTest ) ) // 3
   return

The reason for that is that Clipper's behavior has been investigated
thoroughly and duplicated in Harbour. So the difference between left and
right trimming in Harbour is because of Clipper compatibility (the most
important Harbour cornerstone). Why Clipper's left and right trimming
functions are different regarding white space is anyone's guess ...

In harbour\src\rtl\trim.c you can see the logic behind how the trimming
is done:

Ltrim() uses the C function hb_strLTrim() which trims ALL white space
characters. The two leftmost characters are lost.

Rtrim() uses the hb_strRTrimLen() C function which trims either only
chr(32) or all white space characters depending on the 3rd argument
passed from the PRG function calling it. In the case of Rtrim() and
Alltrim() that argument is to only trim chr(32). So Rtrim() loses no
characters in the example above.

Alltrim() calls both C level functions, so the two leftmost characters
are lost.

The xhb_* variants of the PRG level trimming functions all have that
extra argument as an extension so the application programmer can choose
one or the other trimming options.

This is the macro that is used to determine what is white space in the
cases where all white space characters are trimmed:

#define HB_ISSPACE( c ) ( ( c ) == ' ' || \
                          ( c ) == HB_CHAR_HT || \
                          ( c ) == HB_CHAR_LF || \
                          ( c ) == HB_CHAR_CR )
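
The two rules can also be illustrated at PRG level. The sketch below is not the code from trim.c; IsHbSpace(), LTrimAnyWhite() and RTrimSpaces() are hypothetical names that only mimic the logic described above, so the results can be compared with the built-in functions.

```harbour
// Sketch only: PRG-level imitation of the two trimming rules.
// IsHbSpace(), LTrimAnyWhite() and RTrimSpaces() are hypothetical names.
PROCEDURE Main()

   LOCAL cTest := Chr( 9 ) + " " + "X" + " " + Chr( 9 )

   ? Len( LTrimAnyWhite( cTest ) )   // 3: the leading tab and space are dropped
   ? Len( RTrimSpaces( cTest ) )     // 5: the trailing tab blocks any trimming

   // Compare against the built-in functions on your own build
   ? Len( LTrimAnyWhite( cTest ) ) == Len( LTrim( cTest ) )
   ? Len( RTrimSpaces( cTest ) ) == Len( RTrim( cTest ) )

   RETURN

// Same character set as the HB_ISSPACE macro: space, tab, LF, CR
FUNCTION IsHbSpace( cChar )
   RETURN cChar == " " .OR. cChar == Chr( 9 ) .OR. ;
          cChar == Chr( 10 ) .OR. cChar == Chr( 13 )

// Left trim that removes every leading white space character
FUNCTION LTrimAnyWhite( cText )

   LOCAL nPos := 1

   DO WHILE nPos <= Len( cText ) .AND. IsHbSpace( SubStr( cText, nPos, 1 ) )
      nPos++
   ENDDO

   RETURN SubStr( cText, nPos )

// Right trim that removes trailing Chr( 32 ) only
FUNCTION RTrimSpaces( cText )

   LOCAL nLen := Len( cText )

   DO WHILE nLen > 0 .AND. SubStr( cText, nLen, 1 ) == " "
      nLen--
   ENDDO

   RETURN Left( cText, nLen )
```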

Regards,
Klas

carpen...@gmail.com

Sep 20, 2021, 2:34:05 PM
to Harbour Users
Klas,

I appreciate your response, and I can understand the value of controlling what kind of white space beyond Chr(32) is "trimmed" away.  I'm not a C programmer, so I would prefer not to alter any of the underlying C code except perhaps as a last resort.

For now, I've built my own User Defined Function which will do exactly what LTrim() did in dBase III Plus, which was to trim only Chr(32) bytes.
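
A UDF of that kind might look like the sketch below. This is not the actual function from the post; LTrimSpaces() is a hypothetical name, and the sketch assumes the argument is always a character value, with no error handling yet.

```harbour
// Sketch only: LTrimSpaces() is a hypothetical name.
// Assumes cText is a character value; no argument checking.
PROCEDURE Main()

   LOCAL cTest := Space( 3 ) + Chr( 9 ) + "ABCDE"

   ? Len( cTest )                  // 9
   ? Len( LTrimSpaces( cTest ) )   // 6: three spaces removed, the tab is kept

   RETURN

FUNCTION LTrimSpaces( cText )

   LOCAL nPos := 1

   // Past the end of the string SubStr() returns "", which ends the loop
   DO WHILE SubStr( cText, nPos, 1 ) == " "
      nPos++
   ENDDO

   RETURN SubStr( cText, nPos )
```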

I'm hoping you can point me in the direction of some Harbour functions which will help with the error handling in that UDF.  I'm aware of ProcName() and ProcLine(), and the native Harbour error handling reports "Called from" information going back to the line in the source code which called the failing function.  How can I retrieve the same "Called from" information from inside my UDF?  (The basic error would be an invalid data argument such as trimming a numerical value.)

I appreciate any help you can give me on this.

Thanks again.

Tom.

carpen...@gmail.com

Sep 20, 2021, 5:46:34 PM
to Harbour Users
I think I've got it.  Started wondering what kind of ARGUMENT might work with ProcName() and ProcLine().  I found some references online which talked about a "Level" as an argument.  (Numbers?  Positive?  Negative?)  So I ran some tests.  I get the current level with NO argument - which is also what I get with a zero as the argument.  Next I tried a ONE.  (Positive, not negative.) and I got lucky!!  I got the information for one level UP.  The Calling routine.  Then, just to know how it "breaks", I put in a TWO and ran again.  The name was Nil, and the line number was zero.

So, I think I've figured out how to work with these two functions.
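
To make the level argument concrete, here is a small sketch that walks the call stack from the current routine upward until ProcName() returns nothing. Outer() and ShowCallers() are hypothetical names used only to create a few levels, and line numbers are only reported if they are compiled into the program.

```harbour
// Sketch only: Outer() and ShowCallers() are hypothetical names
// that exist just to build a small call stack.
PROCEDURE Main()

   Outer()

   RETURN

PROCEDURE Outer()

   ShowCallers()

   RETURN

PROCEDURE ShowCallers()

   LOCAL nLevel := 0

   // Level 0 is this routine, level 1 its caller, and so on.
   // The loop stops at the first level with no routine name.
   DO WHILE ! Empty( ProcName( nLevel ) )
      ? nLevel, ProcName( nLevel ), ProcLine( nLevel )
      nLevel++
   ENDDO

   RETURN
```

Run as written, the listing should show SHOWCALLERS at level 0, OUTER at level 1 and MAIN at level 2, each with the line number it was executing.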

Thanks again!

Tom.

Klas Engwall

Sep 20, 2021, 7:02:48 PM
to harbou...@googlegroups.com
Hi Tom,

> I think I've got it.  Started wondering what kind of ARGUMENT might work
> with ProcName() and ProcLine().  I found some references online which
> talked about a "Level" as an argument.  (Numbers?  Positive?
> Negative?)  So I ran some tests.  I get the current level with NO
> argument - which is also what I get with a zero as the argument.  Next I
> tried a ONE.  (Positive, not negative.) and I got lucky!!  I got the
> information for one level UP.  The Calling routine.  Then, just to know
> how it "breaks", I put in a TWO and ran again.  The name was Nil, and
> the line number was zero.
>
> So, I think I've figured out how to work with these two functions.

The standard error handler exists in src\rtl\errorsys.prg - you can make
your own error handler based on it and make the new one the standard
error handler by using the ErrorSys() procedure in the same file. If you
want more detail the xHarbour error handler exists in
contrib\xhb\xhberr.prg which also includes logging to disk. They are
both quite educational :-)
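
As a starting point, a stripped-down handler in the same spirit might look like the sketch below. It is not a copy of errorsys.prg; MyErrorHandler() and FormatError() are hypothetical names, the handler is installed directly with ErrorBlock(), and the sketch assumes the error object's subsystem, description and operation members hold character values, as they do for a runtime argument error.

```harbour
// Sketch only: MyErrorHandler() and FormatError() are hypothetical names.
PROCEDURE Main()

   LOCAL xBad := 123

   // Install the replacement handler
   ErrorBlock( {| oError | MyErrorHandler( oError ) } )

   ? LTrim( xBad )   // wrong argument type, so the handler is invoked

   RETURN

FUNCTION MyErrorHandler( oError )

   LOCAL nLevel := 1

   ? FormatError( oError )

   // Levels 0 and 1 are this handler and the code block that called it,
   // so the "Called from" walk starts at level 2
   DO WHILE ! Empty( ProcName( ++nLevel ) )
      ? "Called from " + ProcName( nLevel ) + ;
        "(" + LTrim( Str( ProcLine( nLevel ) ) ) + ")"
   ENDDO

   ErrorLevel( 1 )
   QUIT

   RETURN .F.

// Assumes subsystem, description and operation are character values
FUNCTION FormatError( oError )
   RETURN oError:subsystem + "/" + LTrim( Str( oError:subCode ) ) + ;
          " " + oError:description + ": " + oError:operation
```

The handler quits after printing because an argument error cannot sensibly be retried; logging to disk, as the xHarbour handler does, could be added at the same spot.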

Regards,
Klas