On Saturday, January 1, 2022 at 9:24:34 PM UTC+3, Anton Ertl wrote:
> SLURP-FILE takes more work to replace. I posted an implementation in
> 1998 <72v4vm$g6p$1...@reader3.wxs.nl> (interestingly, it was not included
> in Gforth 0.5 (2000), but only in 0.6 (2003)), and have mentioned this
> word a number of times since then (34 occurrences in my Articles file).
> Apparently non-Gforth users do not find it as useful as I do,
> otherwise I would have expected it to spread to other systems. And of
> course, with only one system having it, the chances of standardization
> are slim (and nobody has tried).
I've investigated several "SLURP-FILE" implementations and I'm really
depressed.
This is the Gforth version (from stuff.fs); my comments start with \ !!
: slurp-file ( c-addr1 u1 -- c-addr2 u2 ) \ gforth
\G @var{c-addr1 u1} is the filename, @var{c-addr2 u2} is the file's contents
r/o bin open-file throw >r
\ !! Resource leak here. File not closed. (twice)
r@ file-size throw abort" file too large"
\ !! Resource leak here. File not closed.
dup allocate throw swap
\ !! Resource leak here. File not closed. Memory not freed
2dup r@ read-file throw over <> abort" could not read whole file"
\ !! Resource leak here. Memory not freed.
r> close-file throw ;
This makes the Gforth "slurp-file" implementation unusable in real
applications.
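For comparison, here is a minimal sketch (lightly reasoned through, not battle-tested) of a variant that closes the file and frees the buffer on every error path. It assumes only standard words from the File, Memory-Allocation and Exception word sets. The helper name read-whole is my own invention, and the throw codes -11 (result out of range) and -39 (unexpected end of file) are simply my choice from the standard table.

```forth
\ Hypothetical helper: read an open file into freshly allocated memory.
\ Never throws; on failure c-addr and u are meaningless and nothing is
\ left allocated.
: read-whole ( fid -- c-addr u ior )
  >r
  r@ file-size ?dup if  r> drop exit  then        \ failed: ( x x ior )
  if  0 -11 r> drop exit  then                    \ size does not fit in one cell
  dup allocate ?dup if  r> drop exit  then        \ failed: ( x x ior )
  swap 2dup r> read-file                          \ ( addr u u2 ior )
  dup if  nip  else  drop over <> -39 and  then   \ short read becomes -39
  dup if  >r drop free drop 0 0 r>  then ;        \ free the buffer on error

: slurp-file ( c-addr1 u1 -- c-addr2 u2 )
  r/o bin open-file throw >r
  r@ read-whole                   \ ( addr u ior1 )
  r> close-file                   \ ( addr u ior1 ior2 ) file is closed either way
  swap throw                      \ read error: buffer was already freed
  ?dup if  >r drop free drop r> throw  then ;     \ close error: free, then throw
```

The design choice is the same one the ruvim code below makes: the inner word reports errors through an ior instead of throwing, so the outer word always reaches close-file.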
KForth:
1024 1024 * 64 * value MAX_SLURP \ 64 MB limit
0 ptr slurp_buf
variable slurp_fid
variable slurp_size
: slurp-file ( c-addr1 u1 -- c-addr2 u2)
0 slurp_size !
R/O BIN open-file
if -69 throw
then dup slurp_fid !
file-size
if -66 throw
then
0<> over MAX_SLURP > or \ !! U> should be used here.
if 1 throw \ File too large
then dup slurp_size ! allocate
if -59 throw
then to slurp_buf
0 s>d slurp_fid @ reposition-file
if -73 throw
then slurp_buf slurp_size @ slurp_fid @ read-file
if -70 throw
then slurp_size @ over <>
if 2 throw \ Slurp size and read size do not match
then slurp_buf swap
slurp_fid @ close-file
if -62 throw
then ;
1. Bad practice. Global variables.
2. reposition-file -- useless? file-size shouldn't modify the file position.
3. Same resource leaks as in the Gforth version. In theory the user can
clean up in a catch handler, but they must clear the globals before the
call. Otherwise the handler may act on garbage left over from previous calls.
4. Minor point: the use of raw numeric throw codes should be discouraged.
Symbolic names should be used instead, e.g. ERROR-CLOSE-FILE instead of -62.
5. Minor point: I don't understand the reason for MAX_SLURP at all.
6. Lines like "close-file if -62 throw then" hide the real error
reasons. Why not just "close-file throw"?
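On the U> remark in the code comments above: the low cell of a file size is an unsigned number, so a signed > silently accepts any size whose top bit is set. A tiny illustration, using -1 as a stand-in for such a size (64 is just an arbitrary small limit):

```forth
\ -1 has all bits set: negative when read as signed,
\ the largest possible value when read as unsigned.
-1 64 >   .   \ signed compare:   prints 0  (the "too large" check would not fire)
-1 64 u>  .   \ unsigned compare: prints -1 (the check fires)
```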
The whole design is radically different from Gforth's and is much worse.
I.e. one has to write something like this (not tested):
: gforth-compat-slurp-file ( c-addr u -- addr u )
-1 TO MAX_SLURP \ assuming we use U>
0 TO slurp_buf -1 slurp_fid ! \ clear globals (slurp_buf is a ptr, so TO, not !)
['] slurp-file catch ?DUP IF
slurp_buf 0<> IF slurp_buf FREE DROP THEN
slurp_fid @ -1 <> IF slurp_fid @ CLOSE-FILE DROP THEN
THROW
THEN
;
That is what it takes to get Gforth-compatible behaviour.
Now VFX variant from (Examples/Lin32/PowerNet/Services/Pages.fth):
: Slurp-File \ fileid addr -- len ior
\ *G Read the open file into memory, close it and return the
\ ** length read and an ior (0=success).
over file-size 2drop \ -- fileid addr len
rot >r \ -- addr len ; R: -- fileid
tuck r@ read-file nip \ -- len ior ; R: -- fileid
r> close-file drop
;
Just great. No error checking at all... Also, the output buffer size is
never checked, so we can easily get a buffer overflow.
It should be noted that in PowerNet v4 this word was removed and
"data-file" is used instead, which also leaks resources.
This is from the "data-file" word in "kernel.fth":
r@ file-size #-414 ?throw drop \ -- size ; R: -- handle
here over r@ read-file #-414 ?throw drop \ -- size ; R: -- handle
I.e. the file is not closed on error...
As a bonus -- "slurp-fid" from gforth:
: slurp-fid ( fid -- addr u ) \ gforth
\G @var{addr u} is the content of the file @var{fid}
{ fid }
0 0 begin ( awhole uwhole )
\ !! memory leak, extend-mem can throw
dup 1024 + dup >r extend-mem ( anew awhole uwhole R: unew )
\ !! memory leak
rot r@ fid read-file throw ( awhole uwhole uread R: unew )
r> 2dup =
while ( awhole uwhole uread unew )
2drop
repeat
\ !! memory leak
- + dup >r resize throw r> ;
Also, this code uses the "old style" approach of reading the file in
pieces. That was cool 20-30 years ago, but now it is almost useless, at
least with such a small buffer size.
And I'm not sure that "extend-mem" won't cause performance issues
compared to a single "ALLOCATE"...
Second bonus (ruvim code):
: FILE-CONTENT ( h-file -- addr u ior )
\ addr should be freed via FREE on success
>R
R@ FILE-SIZE DUP IF RDROP EXIT THEN DROP
( d-size ) IF 0
-1005 RDROP EXIT THEN \ too big
DUP ALLOCATE DUP IF RDROP EXIT THEN DROP SWAP ( addr u )
2DUP R@ READ-FILE-EXACT DUP IF
( addr u ior )
2>R FREE DROP 0 2R>
THEN ( addr u 0 )
RDROP
;
: FILENAME-CONTENT ( d-txt-filename -- addr u )
R/O OPEN-FILE-SHARED THROW >R
R@ FILE-CONTENT
\ !! memory leak here if close-file fails.
R> CLOSE-FILE SWAP THROW THROW
;
Well, this is the best implementation. Only one leak, which can be
easily fixed.
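A possible fix, as an untested sketch: it relies on the FILE-CONTENT definition above and on OPEN-FILE-SHARED from the same system, and on the fact that FILE-CONTENT leaves nothing allocated whenever it returns a non-zero ior.

```forth
: FILENAME-CONTENT ( d-txt-filename -- addr u )
  R/O OPEN-FILE-SHARED THROW >R
  R@ FILE-CONTENT                 ( addr u ior1 )
  R> CLOSE-FILE                   ( addr u ior1 ior2 )
  SWAP THROW                      \ read error: nothing is allocated at this point
  ?DUP IF  >R DROP FREE DROP R> THROW  THEN  \ close error: free the buffer first
;
```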
P.S. VFX "Lib/FileHacks.fth" is not interesting at all, because it just
aborts on any error.