improve "dbReadLines"

Qian Yun

Apr 5, 2024, 10:17:13 PM
to fricas-devel
I'd like to improve "dbReadLines" from br-search.boot
into the following:

--------
read_text_stream stream ==
    if stream then
        [read_line stream while not EOFP stream]

read_text_file filename ==
    handle_input_file(filename, function read_text_stream, [])
--------

It's more correct (it closes the stream when an exception happens) and
the function name is more generic (not limited to db operations).
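
For illustration only, the same pattern can be sketched in Python. This
is a minimal sketch under assumptions: the Python names merely mirror the
Boot ones, and treating a missing file as "no stream" is a guess, not a
statement about how FriCAS's handle_input_file actually behaves. The point
is that the helper owns the stream and closes it even when the callback
raises.

```python
def read_text_stream(stream):
    """Return all lines of an open text stream, without trailing newlines."""
    if stream is None:
        return []
    return [line.rstrip("\n") for line in stream]


def handle_input_file(filename, fun, args):
    """Open filename, call fun(stream, *args), and always close the stream."""
    try:
        stream = open(filename, "r", encoding="utf-8")
    except FileNotFoundError:
        # Assumption: a missing file is passed on to the callback as "no stream".
        return fun(None, *args)
    with stream:
        # The with-block closes the stream on normal exit and on exceptions.
        return fun(stream, *args)


def read_text_file(filename):
    """Read a whole text file into a list of lines."""
    return handle_input_file(filename, read_text_stream, [])
```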

However I can't find a proper place to put this function.

I think the current naming scheme under src/interp/ lacks tiering:
there should be files that are libraries used by other modules,
then there should be files making up modules A/B/C, then there
should be higher-level structures like the compiler/interpreter
that depend on modules A/B/C.

Anyway, I suggest adding a new file under src/interp/ that
contains commonly used library functions, including this one.

- Qian

Waldek Hebisch

Apr 6, 2024, 8:21:58 PM
to fricas...@googlegroups.com
I admit that my long term plan is somewhat different: to do
substantial developments at Spad level. That means for example
better file handling on Spad level. Of course, there is the question
of how much time the Spad plan will take. FriCAS is more than 15 years
old and there is still a lot of Boot code, so realistically we can
not expect fast change here. OTOH Spad code is slowly but
steadily growing, while Boot and Lisp oscillate around a fixed level:
new things are implemented but also things are removed or moved
to Spad code. _My_ main focus when working on Boot code is to
prepare the transition to Spad. More precisely, IMO a big advantage
of Spad code is that it forces clearer structure. And the main
difficulty is that current Boot code has a rather convoluted
structure, for example code is reused in strange ways. The main
intent of changes to Boot code is to clarify structure. When
the structure is clear enough, the transition to Spad will be
relatively easy.

Concerning tiering in 'src/interp', this subdirectory contains
several subsystems:
- runtime support for Spad
- interpreter proper
- HyperDoc
- Spad compiler

They share various parts. For example, part of type handling is
shared between Spad compiler and interpreter. IMO ideally
there would be common type handling. Currently file handling
and scanner are common to interpreter and Spad compiler,
pile handling is partially common (Spad compiler has a special
part), and parsers are separate. HyperDoc partially uses common
type machinery but parts are special and use separate
databases ('libdb.text', 'comdb.text').

Some sharing is undesirable, but IMO to avoid massive code
duplication we should have significant code sharing between
subsystems. Which IMO means that trying to divide this between
directories is mostly futile: when working on a single issue
one will probably have to look both at code specific to the
given part and at shared code. So either we would have a split
into many very specific directories and most work would involve
multiple directories, or the directory structure would be in loose
relation to functionality. To say it differently, I feel
that having a single 'src/interp' is better than the alternatives.
Note that file names try to indicate the subsystem: 'br-*' are
main HyperDoc files, 'bc-*' are builtin HyperDoc pages,
'i-*' are interpreter files, but shared functionality makes
this division fuzzy.

To put it differently, I think that we should try to make sure
that a given part works with appropriate "types" and in particular
that reuse respects types. The main cause of problems with reuse is
that different users of a given functionality have different structure
of data (effectively a different "type"). Boot does not have
type declarations, but one possible way forward is to add assertions,
that is, to signal errors when data has unexpected structure.
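
As a rough illustration of such assertions, here is a Python sketch (Boot
itself is not shown). The helper name assert_shape and its small "shape"
notation are hypothetical, invented for this example only; nothing like it
is claimed to exist in FriCAS. It signals an error as soon as data does not
have the structure the caller expects.

```python
def assert_shape(value, shape, where="value"):
    """Raise TypeError unless value matches shape; return value otherwise.

    A shape is a type (e.g. str), a one-element list [s] meaning
    "list whose items all match s", or a tuple (s1, ..., sn) meaning
    "record with exactly n fields, the i-th matching s_i".
    """
    if isinstance(shape, type):
        if not isinstance(value, shape):
            raise TypeError(f"{where}: expected {shape.__name__}, got {value!r}")
    elif isinstance(shape, list):
        if not isinstance(value, list):
            raise TypeError(f"{where}: expected a list, got {value!r}")
        for i, item in enumerate(value):
            assert_shape(item, shape[0], f"{where}[{i}]")
    elif isinstance(shape, tuple):
        if not isinstance(value, tuple) or len(value) != len(shape):
            raise TypeError(
                f"{where}: expected a {len(shape)}-field record, got {value!r}")
        for i, (item, sub) in enumerate(zip(value, shape)):
            assert_shape(item, sub, f"{where}[{i}]")
    else:
        raise ValueError(f"unsupported shape {shape!r}")
    return value
```

A routine that expects a list of (name, arity) pairs could start with
assert_shape(ops, [(str, int)], "ops"), so a caller passing differently
structured data fails at the boundary rather than deep inside shared code.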

Coming back to your specific question, we can add a new file
for utility-type routines, but I do not expect such a file to
get big and I prefer to keep 'src/interp' as a flat directory.

--
Waldek Hebisch

Qian Yun

Apr 7, 2024, 8:28:25 AM
to fricas...@googlegroups.com


On 4/7/24 08:21, Waldek Hebisch wrote:
>
> I admit that my long term plan is somewhat different: to do
> substantial developments at Spad level. That means for example

I agree with this long term goal.

> better file handling on Spad level. Of course, there is the question
> of how much time the Spad plan will take. FriCAS is more than 15 years
> old and there is still a lot of Boot code, so realistically we can
> not expect fast change here. OTOH Spad code is slowly but
> steadily growing, while Boot and Lisp oscillate around a fixed level:
> new things are implemented but also things are removed or moved
> to Spad code. _My_ main focus when working on Boot code is to
> prepare the transition to Spad. More precisely, IMO a big advantage
> of Spad code is that it forces clearer structure. And the main
> difficulty is that current Boot code has a rather convoluted
> structure, for example code is reused in strange ways. The main
> intent of changes to Boot code is to clarify structure. When
> the structure is clear enough, the transition to Spad will be
> relatively easy.

Agree.

On this subject, I find that you can call Spad functions from Boot.
I wonder if i-output.boot is a good target to port to Spad first.

>
> Some sharing is undesirable, but IMO to avoid massive code
> duplication we should have significant code sharing between
> subsystems. Which IMO means that trying to divide this between
> directories is mostly futile

I'm not talking about adding new sub-directories. What I meant is to
extract commonly used functionality into new files.

>
> To put it differently, I think that we should try to make sure
> that a given part works with appropriate "types" and in particular
> that reuse respects types. The main cause of problems with reuse is
> that different users of a given functionality have different structure
> of data (effectively a different "type"). Boot does not have
> type declarations, but one possible way forward is to add assertions,
> that is, to signal errors when data has unexpected structure.
>
> Coming back to your specific question, we can add a new file
> for utility-type routines, but I do not expect such a file to
> get big and I prefer to keep 'src/interp' as a flat directory.
>

I'll modify "dbReadLines" in-place then. And I'll come up with
a list of functions that should exist in this new file "util.boot".

- Qian

Ralf Hemmecke

Apr 7, 2024, 11:46:34 AM
to fricas...@googlegroups.com
On 4/7/24 14:28, Qian Yun wrote:
> On this subject, I find that you can call Spad functions from Boot.
> I wonder if i-output.boot is a good target to port to Spad first.

Have you seen

https://github.com/fricas/fricas/blob/master/src/input/outputtest.input

? Right, Format2D is not exactly doing everything that i-output.boot is
doing (it is basically 2d-output written from scratch), but it should
cover nearly every output that i-output.boot can do.

The biggest problem is that Format2D contains no way to break lines.

Ralf

Martin Baker

Apr 7, 2024, 12:16:34 PM
to fricas...@googlegroups.com
On 07/04/2024 13:28, Qian Yun wrote:
> I wonder if i-output.boot is a good target to port to Spad first.

About 10 years ago I wrote some SPAD code to implement the functionality
of the 2D output (algebraFormat) code in 'i-output.boot':

https://github.com/martinbaker/multivector/blob/master/monospace.spad.pamphlet

Waldek was not interested in this at the time. I think this was because,
even if you debugged it extremely carefully, you could never 100% guarantee
that its output would be identical to the output of the Boot code.

I assume this requirement is the reason that progress is so slow.

Martin

Waldek Hebisch

Apr 8, 2024, 4:23:59 PM
to fricas...@googlegroups.com
On Sun, Apr 07, 2024 at 08:28:20PM +0800, Qian Yun wrote:
>
>
> On 4/7/24 08:21, Waldek Hebisch wrote:
> >
> > I admit that my long term plan is somewhat different: to do
> > substantial developments at Spad level. That means for example
>
> I agree with this long term goal.
<snip>
> On this subject, I find that you can call Spad functions from Boot.
> I wonder if i-output.boot is a good target to port to Spad first.

It is a reasonably good target. The problem as of today is that
code in 'i-output.boot' is our most robust formatter and
ideally a replacement would be more robust. Ralf has written
a few formatters and Format2D in most cases works fine.
But robustness is about the remaining cases. Also, Ralf uses
a different way to control formatters. ATM it is not clear
to me how to hook his formatter so that it presents a control
interface like earlier formatters _and_ works like Ralf wants
it to work...

There are other possible targets. I mentioned 'sfsfun.boot'.
It is code that really should have been written in Spad and
at first glance conversion would be quite easy. Except for the
fact that code in 'sfsfun.boot' refuses to work for some
reasonable arguments and for some arguments has significant
errors. I do not want to just convert troubles to Spad,
so this waits for resolution of numeric problems (I wrote
more about this in another mail). Now I have a good replacement
for about 75% of functionality in 'sfsfun.boot', but the remaining
part is harder...

Just now I am looking at HyperDoc. There are aspects which
may cause serious trouble, OTOH logically this code is
independent from the rest of the interpreter. And in principle
a significant part of HyperDoc support could be common with
other front ends.

> > Some sharing is undesirable, but IMO to avoid massive code
> > duplication we should have significant code sharing between
> > subsystems. Which IMO means that trying to divide this between
> > directories is mostly futile
>
> I'm not talking about adding new sub-directories. What I meant is to
> extract commonly used functionality into new files.

I see. For me it is convenient to keep original code exactly
where it was in NAG sources, that way it is easy to compare
and know if bugs were in original code or are due to our
modifications. Of course, once code is substantially changed
or rewritten, this no longer applies. Another thing is that
I am trying to keep the number of files at a manageable level. So
one or a few new files are fine, but I would like to avoid
having many tiny files.

> >
> > Coming back to your specific question, we can add a new file
> > for utility-type routines, but I do not expect such a file to
> > get big and I prefer to keep 'src/interp' as a flat directory.
> >
>
> I'll modify "dbReadLines" in-place then. And I'll come up with
> a list of functions that should exist in this new file "util.boot".

Probably 'n_util.boot'. 'util.boot' would lead to 'util.fasl' which
conflicts with 'util.fasl' from 'util.lisp'.
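
The clash can be checked mechanically. A small Python sketch, assuming
every source file in one directory compiles to '<stem>.fasl'; the function
name and the file list in the usage note are hypothetical, not taken from
the FriCAS build:

```python
from collections import defaultdict
from pathlib import PurePosixPath


def fasl_collisions(sources):
    """Map each .fasl name produced by more than one source to those sources."""
    by_output = defaultdict(list)
    for src in sources:
        # Assumption: the object file keeps the stem and swaps the suffix.
        out = PurePosixPath(src).with_suffix(".fasl").name
        by_output[out].append(src)
    return {out: srcs for out, srcs in by_output.items() if len(srcs) > 1}
```

With the illustrative list ["util.boot", "util.lisp", "n_util.boot"], the
first two are reported as sharing 'util.fasl', while 'n_util.boot' is not
reported.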

--
Waldek Hebisch

Ralf Hemmecke

Apr 8, 2024, 4:46:19 PM
to fricas...@googlegroups.com
On 4/8/24 22:23, Waldek Hebisch wrote:
> It is a reasonably good target. The problem as of today is that code
> in 'i-output.boot' is our most robust formatter and ideally a
> replacement would be more robust.

I agree here. Even though I am pretty sure that Format2D gives better
output than i-output.boot (and src/input/outputtest.input seems to
confirm this), Format2D lacks a reasonable line-breaking algorithm.
If someone could help with that, it would be progress.

> Ralf has written a few formatters and Format2D in most cases works
> fine. But robustness is about the remaining cases.

"Remaining" means what here? One problem probably was that i-output.boot
is not only used for the output from src/algebra, but also to format
constructors when errors occur.

Assuming the line-breaking stuff is settled, one could perhaps manage
to let i-output.boot only do the constructor printing, i.e. first split
it and let Format2D take over the algebra part.

> Also, Ralf uses a different way to control formatters. ATM it is not
> clear to me how to hook his formatter so that it presents a control
> interface like earlier formatters _and_ works like Ralf wants it to
> work...

Yes, I want these formatters to stay as flexible as they are, but I do
not really understand what you mean by "hook" or what the "control
interface" is. I basically wrote those formatters from scratch, because
it was too hard to do a new format by a simple change of a few lines of
code.

If it ever becomes relevant, we should start a discussion about the
goals and then how to get there.

Ralf