
Text File for uploading to DBF delimited CRLF vs LF (Windows)


poopall

Aug 25, 2022, 5:01:18 AM
I recently received a txt file which appears to have LF as the end-of-line marker; Windows requires CRLF.

Is there a function call to easily test for this condition before attempting an append "delimited"?

It would be nice if there were an automatic conversion.

dlzc

Aug 25, 2022, 4:32:36 PM
On Thursday, August 25, 2022 at 2:01:18 AM UTC-7, poopall wrote:
> I recently received a txt file which appears to have LF as the end
> of line marker; Windows requires CRLF.
>
> Is there a function call to easily test for this condition before
> attempting an append "delimited"?

Read in the "first 1000" characters, and see if any chr(13) are in the string. That should tell you.

> It would be nice if there were an automatic conversion.

There is, at the source.
https://www.hanselman.com/blog/carriage-returns-and-line-feeds-will-ultimately-bite-you-some-git-tips

Might be a "RUN" command / call...
https://waterlan.home.xs4all.nl/dos2unix.html
... you'd want the reverse. Might find out what would happen if you always pushed the file (Linux or DOS) through dos2unix first, then back through unix2dos. MIGHT save even doing a check.

David A. Smith

poopall

Aug 25, 2022, 9:21:26 PM
The problem is that it's not apparent that the .txt file received is in UNIX format, as CR and LF are hidden symbols.

Being Windows-centric, one tends to assume that any txt file handled on Windows uses CRLF.

I wasted many hours thinking I had lost my mind, as only the first line could be read.

The simple solution for me was to use Notepad++ and convert the EOL to Windows.

It would be nice if there was a function to check for this condition in a file (hoping someone will create something).


dlzc

Aug 26, 2022, 10:15:00 AM
Dear poopall:

On Thursday, August 25, 2022 at 6:21:26 PM UTC-7, poopall wrote:

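#include "fileio.ch"

local cBuffer := space( 1000 )   // fread() needs a buffer at least as long as the read
local hasCR := .F.
local nBytes
local nReadMe := fopen( "Suspect.txt", FO_READ + FO_DENYNONE )
if nReadMe != F_ERROR
   nBytes := fread( nReadMe, @cBuffer, 1000 )   // hopefully less than 100 character records
   hasCR := ( at( chr(13), left( cBuffer, nBytes ) ) > 0 )
   fclose( nReadMe )
endif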

... hasCR is true if there is a carriage return.

David A. Smith

poopall

Aug 28, 2022, 10:19:56 PM
Thank you. How about a routine that changes all of the line endings to CRLF if they are only LF?

dlzc

Aug 29, 2022, 1:11:37 PM
Dear poopall:

On Sunday, August 28, 2022 at 7:19:56 PM UTC-7, poopall wrote:
> On Saturday, 27 August 2022 at 12:15:00 am UTC+10, dlzc wrote:
...
> Thank you. How about a routine that changes all of the line endings
> to CRLF if they are only LF?

Open the file
Read (up to 1000 characters) until CR is found
If found, terminate and proceed with your APPEND
If eof() or 1000 characters reached without finding a CR:
   create temporary file (errors?)
   loop:
      read 100 characters (or up to EOF())
      place each character in buffer unless it is a LF (and previous was not CR), then place CRLF in buffer and continue
      write buffer to second file
      if .not. eof() loop
   if errors stop, do not allow APPEND
   close source
   close copy
   delete source (errors?)
   rename copy to source name
Release whatever variables you don't need.
Proceed with APPEND.
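
If the file is small enough to hold in memory, the steps above can be collapsed into two StrTran() passes instead of a character loop. This is only a sketch under assumptions: it relies on Harbour's hb_MemoRead() and hb_MemoWrit() (plain Clipper's MemoRead()/MemoWrit() behave differently around the Ctrl-Z end-of-file character), NormalizeEol() is a made-up helper name, and it rewrites the file in place rather than going through a temporary copy.

```harbour
// Hypothetical helper (not a library routine): rewrite cFile so that every
// line ends in CRLF. Assumes Harbour and a file that fits in memory.
// Returns .T. if the file is usable for APPEND afterwards, .F. otherwise.
FUNCTION NormalizeEol( cFile )
   LOCAL cText, cFixed

   IF ! File( cFile )
      RETURN .F.
   ENDIF

   cText := hb_MemoRead( cFile )

   // Collapse existing CRLF to LF first, so CRLF never turns into CR+CR+LF
   cFixed := StrTran( cText, Chr( 13 ) + Chr( 10 ), Chr( 10 ) )
   // Then expand every LF to CRLF
   cFixed := StrTran( cFixed, Chr( 10 ), Chr( 13 ) + Chr( 10 ) )

   IF cFixed == cText
      RETURN .T.          // already CRLF throughout; nothing to write
   ENDIF

   RETURN hb_MemoWrit( cFile, cFixed )
```

Call NormalizeEol( "Suspect.txt" ) before the APPEND and skip the APPEND if it returns .F. A lone CR (the old Apple convention) is left untouched by this sketch.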

Now you could add making sure no blank lines are added, and try squirting characters into the same strings (rather than reallocating string space), but that is an exercise for another day.

Me, I'd use the utilities I pointed out to you earlier. I've got a 6000+ record database that describes key elements of movie files, and you can be sure I didn't learn how to parse all possible movie formats...

David A. Smith

Dan

Aug 29, 2022, 5:53:02 PM
I know of 3 possibilities: CR is line end (classic Mac OS), LF is line
end (*nix) and CRLF is line end (Win).

fopen() the txt. Read one char at a time. Use 3 vars: cr_found,
lf_found, crlf_found. When you encounter CR or LF you consider the
adjacent character and determine the EOL.
You can have a CR with no LF after it (CR), a CR followed by an LF
(CRLF), or an LF with no CR before it (LF).
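
A rough sketch of that check, for illustration only. Instead of reading one char at a time with three flags, it reads one block and uses At() to compare the positions of the first CR and the first LF. DetectEol() is a made-up name, the 4096-byte block size is arbitrary, and it judges the file by its first line end only, so a file with mixed line ends is not detected. It assumes fileio.ch supplies FO_READ, FO_DENYNONE and F_ERROR.

```harbour
#include "fileio.ch"

// Hypothetical helper: returns "CRLF", "LF" or "CR" according to the first
// line end found in the first 4096 bytes of cFile, or "" if the file cannot
// be opened or contains no line end in that block.
FUNCTION DetectEol( cFile )
   LOCAL nHdl := FOpen( cFile, FO_READ + FO_DENYNONE )
   LOCAL cBuf := Space( 4096 )      // FRead() needs a pre-sized buffer
   LOCAL nRead, nCR, nLF

   IF nHdl == F_ERROR
      RETURN ""
   ENDIF
   nRead := FRead( nHdl, @cBuf, 4096 )
   FClose( nHdl )
   cBuf := Left( cBuf, nRead )      // keep only the bytes actually read

   nCR := At( Chr( 13 ), cBuf )     // position of first CR, 0 if none
   nLF := At( Chr( 10 ), cBuf )     // position of first LF, 0 if none

   DO CASE
   CASE nCR == 0 .AND. nLF == 0
      RETURN ""                     // no line end seen
   CASE nCR == 0
      RETURN "LF"                   // LF and no CR at all
   CASE nLF == nCR + 1
      RETURN "CRLF"                 // CR immediately followed by LF
   CASE nLF > 0 .AND. nLF < nCR
      RETURN "LF"                   // an LF line end comes before any CR
   ENDCASE

   RETURN "CR"                      // CR with no LF right after it
```

One known gap: if the first CR happens to be the very last byte of the block, a following LF is not seen and the sketch reports "CR".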

Now you can read line by line and write to a new normalized file with
the desired EOL.

readline - reads a line from a txt file
Usage: nRet = readline(<nhdl>,@cBuffer [,<eol>])
Returns: 0 if eof not encountered, -1 if eof

// nHdl  = handle of the file FOPENed
// cLine = buffer for line (pass by reference)
// eol   = end-of-line string (defaults to the platform EOL)
function readline( nHdl, cLine, eol )
   local RetVal := 0
   local byte := " "
   if empty( eol )
      eol := hb_osnewline()   // or hb_eol()
   endif
   cLine := ""
   do while .t.
      if fread( nHdl, @byte, 1 ) == 0
         // EOF: report -1 only when nothing was read for this line
         if len( cLine ) == 0
            RetVal := -1
         endif
         exit
      endif
      cLine += byte
      if right( cLine, len( eol ) ) == eol
         // strip the terminator from the end of the line
         cLine := left( cLine, len( cLine ) - len( eol ) )
         exit
      endif
   enddo
   return RetVal

-----
#include "fileio.ch"

nHdlIn := fopen( "suspect.txt", FO_READ )
nHdlOut := fcreate( "normalized.txt", FC_NORMAL )   // 2nd argument is a file attribute, not an open mode
buffer := ""
EOLIn := chr(10)
EOLOut := chr(13)+chr(10)

do while readline( nHdlIn, @buffer, EOLIn ) == 0
   fwrite( nHdlOut, buffer + EOLOut )
enddo
fclose( nHdlIn )
fclose( nHdlOut )

(UNTESTED)

HTH
Dan