On Fri, Mar 29, 2024 at 11:24:59AM +1100, Hill Strong wrote:
> On Thu, Mar 28, 2024 at 6:02 PM Ralf Hemmecke <ra...@hemmecke.org> wrote:
>
> >
> >
> > OK the answer to the above is not soo important to me, but I think Greg
> > is right that the builtin 'error' should automatically show the function
> > name in which it is called. Doesn't the compiler have all this information?
> >
> > Such an improvement would really be great.
> >
>
> The compiler should have enough information to generate appropriate code
> for the builtin function error to be able to generate the
> domain/category/package specification as well as the function in which the
> builtin function is called as well as the relevant parameters used for the
> calling function.

The compiler knows the function name and the constructor name. Note,
however, that FriCAS types are parametrized, and type parameters are
known only at runtime. Of course, the compiler knows how to access
type parameters. Also note that functions can be overloaded, so
to know which function was called you need its signature (types
of return value and parameters).

> The code generation phase of the compiler should be able
> to generate relevant code that obtains all the necessary runtime
> information and display it. It is far better to get more information
> relating to the error than less.

It depends. Actually, in FriCAS I have rarely met situations where
information cannot be displayed. The problem usually is that
there is way too much info and it is hard to find the relevant
part. For example, trace generates a lot of info, but if
one traces a "popular" constructor one gets a lot of irrelevant
stuff before important things happen.

And for debugging errors I normally look at the backtrace. This
means that for reproducible errors I get way more info than
is reasonable to put into an error message. The main drawback of
the backtrace is that it works at Lisp level, so the info is unreadable
to an ordinary FriCAS user. At least using sbcl it is possible
to have a user handler for the backtrace, so in principle we could
print this info in a more convenient way. But that is some
work to do at the lowest level in FriCAS. I consider this important,
but other things have higher priority for me. And other folks
have their own priorities. In a volunteer open source project telling
other people to do some work is usually useless: if a person
can and wants to do something, he/she would do it anyway.
And if a person cannot or does not want to do something,
then talk is useless.

> More helpful information reduces the time
> it takes to work out what has gone wrong and what you need to fix that
> specific error.
>
> Yes, I do understand just how much work is involved in doing this and yes
> many people do not see the need for this level of work until it comes and
> bites them on the posterior. If that information is not there, users
> quickly get frustrated (especially new users) and it is always good to have
> new users picking up your systems and becoming enthusiastic proponents of
> the system.
>
> If you go through all calls to error in the FriCAS code base, the current
> error messages are often quite cryptic and of little help in analysing any
> problem that does arise. Since error messages are not the normal mode of
> operation, any system generated code for enhancing error messages will
> not/should not affect any of the normal code being run and should not
> affect how fast the normal mode code runs.
>
> If this is left to the programmer to adjust the various error messages
> throughout the code base, this will be a very large task.

Well, concerning "cryptic error messages", let us look at the recent
example posted by Ralf. We got:
>> System error:
The variable $$$ is unbound.
If you consider this carefully, it contains useful information.
First, 'System error' tells you that the error was detected at the lowest
level (by the Lisp system in this case). And "The variable $$$ is unbound."
tells you exactly what the problem is: the code attempted to access
a variable named '$$$', but no such variable exists at this
time ("unbound" in Lisp terminology).

Of course, ordinary users may consider this cryptic, but IMO the
real trouble was that there was no apparent connection with user
input: from the user's point of view there was a sequence of commands
that executed OK at the command line, but failed inside a function body.
I am familiar with the FriCAS code and I recently wrote the RootSimplification
package that triggered the trouble, so I was able to guess part of
the explanation. But let us pretend that a developer tries to solve
this starting from "first principles". The first thing to do is to
write
)set break break
After that we get one extra piece of info:
(SB-EVAL::GET-VARIABLE $$$ #<SB-EVAL::ENV {10035A8D23}>)
which may look cryptic, but is the actual Lisp code that failed.
And this confirms that trouble is due to access to '$$$'.

The next thing is to get a backtrace. A backtrace may be long, so normally
one wants only part of it. This time
backtrace 50
(which requests 50 positions) is the right thing, but one knows this
only after solving the problem. The normal way is to look at a smaller
part and request a bigger one if needed. What do we see in the backtrace?
First is a sequence of calls, starting at
25: (|RootSimplification|)
RootSimplification is a constructor name, so this is the call which
initializes it. We see:
22: (|INFORM;interpret;%A;9| (SEQ (LET |eI| (|Expression| |Integer|)) (LET |a| (|::| (QUOTE |a|) |eI|)) (LET |b| (|::| (QUOTE |b|) |eI|)) (LET |c| (|::| (QUOTE |c|) |eI|)) (LET |r| (|::| (QUOTE |r|) |eI|)) (LET |e| (|::| (QUOTE |e|) |eI|)) (+ (+ (* (* # #) (^ |r| 2)) (* (+ # #) |r|)) (+ (* (+ # #) (^ |e| 4)) (* (* # |b|) (^ |e| 3))))) #<unavailable argument>)
23: (|RSIMP;str_to_expr| "eI := Expression(Integer); ...
This means that 'str_to_expr' in RootSimplification called 'interpret',
which is a call from Spad code to the interpreter. So we now know that
the error happened when the interpreter was evaluating code passed as a
string from 'str_to_expr'. There are several calls which are not
interesting; the relevant one is:
9: (|evalFormMkValue| #(^ NIL NIL NIL ((|totalArgs| . 2) (|argumentNumber| . 2) (|callingFunction| . *))) (SPADCALL (QUOTE ((1 #S(SPAD_KERNEL :OP # :ARG NIL :NEST 1) (1 . #1=#)) . #1#)) 2 (ELT $$$ 4)) (|Expression| (|Integer|)))
^^^^^^^^^^^
Here we see a piece of Lisp code generated by FriCAS which tries to
access '$$$'. So we can guess that this code is wrong, and we need
to find out which part is responsible for this code. One
hint is the part before initialization of RootSimplification (after all
we know that on the command line initialization of RootSimplification
works without error). We have:
42: (|compileDeclaredMap| |nicer| ...
which means that the interpreter is compiling 'nicer' and needs to
initialize RootSimplification as part of compilation. If you
look at 'compileDeclaredMap' you will see that it changes the values
of some global variables. So now a reasonable guess is that the error
is due to wrong settings of global variables: they are set for
compiling code, while to initialize RootSimplification we
need evaluation. It took me some time to find the responsible
variable (it was '$insideCompileBodyIfTrue'). However, the
point is that the error message + info from the backtrace was sufficient
to get a reasonable idea about the nature of the problem. I do not
think that a different error message could do better. And
actually, the main drawback of the backtrace is that it is long.
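To make the failure mode above concrete, here is a rough Python analogy (not FriCAS or Lisp source). It only illustrates the general pattern: a global "compile mode" flag that is still set when a nested step needs plain evaluation. All names (inside_compile_body, generate_access, initialize_package, compile_function_*) are hypothetical, and the "fixed" variant is just one possible shape of a fix, not necessarily what was done in FriCAS.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a global flag like $insideCompileBodyIfTrue.
inside_compile_body = False


@contextmanager
def compile_mode(value):
    """Temporarily set the global flag and restore the old value afterwards."""
    global inside_compile_body
    saved = inside_compile_body
    inside_compile_body = value
    try:
        yield
    finally:
        inside_compile_body = saved


def generate_access(index):
    """Describe how a constant is accessed, depending on the global mode."""
    if inside_compile_body:
        # Compile mode: refer to a placeholder that only exists inside
        # the compiled function body.
        return f"(ELT $$$ {index})"
    # Evaluation mode: use the value directly.
    return f"value-of-{index}"


def initialize_package():
    """Initialization step that implicitly expects evaluation mode."""
    return generate_access(4)


def compile_function_buggy():
    with compile_mode(True):
        # The nested initialization inherits compile mode by accident.
        return initialize_package()


def compile_function_fixed():
    with compile_mode(True):
        # Rebind the flag for the nested evaluation only.
        with compile_mode(False):
            return initialize_package()
```

In this sketch the buggy variant produces a reference to the placeholder '$$$' outside of any compiled body, which is the analogue of the unbound variable seen in the backtrace.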

Coming back to error messages, I think that the fundamental
problem is that users have trouble relating low level
errors to high level actions. That is a hard problem:
a backtrace can help linking a low level error to higher
level calls, but fundamentally the problem is hard. We can
try to predict possible errors and add more checks, but
this requires more effort.

Concerning the 'error' routine: automatically printing more
info could help sometimes. But most of the time the extra
info will be irrelevant, so IMO if extra info is printed
it should be either under user control or by explicit
request from the programmer invoking 'error'.
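As a rough illustration of that last point, here is a minimal Python sketch (not Spad, and not a proposal for the actual FriCAS 'error' implementation). The names VERBOSE_ERRORS, SketchError, with_context, divide and checked_sqrt are all hypothetical. The caller's name is added to the message only when the user has switched on a global setting or the programmer asks for it at the call site.

```python
import inspect

# Hypothetical user-controlled setting; off by default.
VERBOSE_ERRORS = False


class SketchError(Exception):
    """Error type used only in this sketch."""


def error(message, with_context=False):
    """Raise an error; prefix the caller's name only when requested."""
    if with_context or VERBOSE_ERRORS:
        # Frame 0 is error() itself, frame 1 is the function that called it.
        caller = inspect.stack()[1].function
        message = f"{caller}: {message}"
    raise SketchError(message)


def divide(a, b):
    if b == 0:
        error("division by zero")  # plain message
    return a / b


def checked_sqrt(x):
    if x < 0:
        error("negative argument", with_context=True)  # explicit request
    return x ** 0.5
```

With the default setting, divide(1, 0) reports just the message, while checked_sqrt(-1) also names the function that raised the error.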
--
Waldek Hebisch