How to tell if MPFR is active (GAWK)....?

Kenny McCormack

2017/05/19 10:17:15

I.e., how to tell whether the -M command line option was supplied, and,
more generally, that MPFR is up and working.

I know there are 4 entries in PROCINFO that pertain to the existence of
MPFR (i.e., that both MPFR and GMP were compiled in), but that doesn't tell
you whether or not it is currently active.

The following test, taken from the GAWK manual, does work, but I was hoping
for something more directly testable (like a specific variable to test or a
PROCINFO entry to examine).

The following will return 0 unless the -M option is used:

$ gawk [-M] -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'

--
"Every time Mitt opens his mouth, a swing state gets its wings."

(Should be on a bumper sticker)

Ed Morton

2017/05/20 15:27:51

On 5/19/2017 9:17 AM, Kenny McCormack wrote:
> I.e., how to tell whether the -M command line option was supplied, and,
> more generally, that MPFR is up and working.
>
> I know there are 4 entries in PROCINFO that pertain to the existence of
> MPFR (i.e., that both MPFR and GMP were compiled in), but that doesn't tell
> you whether or not it is currently active.
>
> The following test, taken from the GAWK manual, does work, but I was hoping
> for something more directly testable (like a specific variable to test or a
> PROCINFO entry to examine).
>
> The following will return 0 unless the -M option is used:
>
> $ gawk [-M] -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'
>

Doesn't look like it:

$ cat tst.awk
function walk_array(arr, name,    i)
{
    for (i in arr) {
        if (isarray(arr[i]))
            walk_array(arr[i], (name "[" i "]"))
        else
            printf("%s[%s] = %s\n", name, i, arr[i])
    }
}
BEGIN {
    walk_array(SYMTAB, "SYMTAB")
    walk_array(FUNCTAB, "FUNCTAB")

    print (0.1 + 12.2 == 12.3)
}

$ gawk -v PREC=56 -f tst.awk > o1

$ gawk -M -v PREC=56 -f tst.awk > o2

$ diff o1 o2
17c17
< SYMTAB[PROCINFO][pgrpid] = 11064
---
> SYMTAB[PROCINFO][pgrpid] = 10764
112c112
< SYMTAB[PROCINFO][pid] = 11064
---
> SYMTAB[PROCINFO][pid] = 10764
240c240
< 0
---
> 1

Kenny McCormack

2017/05/23 11:40:22

In article <ofq55a$mmh$1...@dont-email.me>,

Ed Morton <morto...@gmail.com> wrote:
>On 5/19/2017 9:17 AM, Kenny McCormack wrote:
>> I.e., how to tell whether the -M command line option was supplied, and,
>> more generally, that MPFR is up and working.
>>
>> I know there are 4 entries in PROCINFO that pertain to the existence of
>> MPFR (i.e., that both MPFR and GMP were compiled in), but that doesn't tell
>> you whether or not it is currently active.
>>
>> The following test, taken from the GAWK manual, does work, but I was hoping
>> for something more directly testable (like a specific variable to test or a
>> PROCINFO entry to examine).
>>
>> The following will return 0 unless the -M option is used:
>>
>> $ gawk [-M] -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'
>>
>
>Doesn't look like it:

Thanks for checking into this. It does seem there's no good way, other
than the test shown above.

There is an internal (i.e., internal to the C source code) variable called
"do_flags", which contains bit-encoded information on a variety of
subjects, including MPFR. I think it would be helpful in a variety of ways
if "do_flags" were exposed to the AWK programmer.

I made a one-line patch (addition) to main.c:

update_PROCINFO_num("do_flags", do_flags);

See below:

$ ./gawk 'BEGIN { print PROCINFO["do_flags"]}'
0
$ ./gawk -M 'BEGIN { print PROCINFO["do_flags"]}'
16384
$

Note that awk.h defines:

DO_MPFR = 0x4000 /* arbitrary-precision floating-point math */
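
A minimal sketch of how a script could test that bit, assuming the one-line patch above is applied (a stock gawk has no PROCINFO["do_flags"] entry). The function name mpfr_flag_set and the GAWK environment variable are made up for illustration, and 16384 is simply DO_MPFR (0x4000) as quoted from awk.h, so it could differ in other versions:

```sh
# Sketch only: relies on the locally patched gawk described above.
# GAWK selects the binary to run (hypothetical convention), default "gawk".
mpfr_flag_set() {
    "${GAWK:-gawk}" "$@" 'BEGIN {
        # On an unpatched gawk the entry is missing, so the value is 0
        # and the function reports "not set".
        # 16384 == 0x4000 == DO_MPFR, per the awk.h excerpt above.
        exit !and(PROCINFO["do_flags"] + 0, 16384)
    }'
}
```

With the patched binary, "GAWK=./gawk mpfr_flag_set -M" should succeed and "GAWK=./gawk mpfr_flag_set" should fail, matching the two runs shown above.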

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Snicker

Andrew Schorr

2017/05/24 9:29:16

On Friday, May 19, 2017 at 10:17:15 AM UTC-4, Kenny McCormack wrote:
> The following test, taken from the GAWK manual, does work, but I was hoping
> for something more directly testable (like a specific variable to test or a
> PROCINFO entry to examine).
>
> The following will return 0 unless the -M option is used:
>
> $ gawk [-M] -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'

This is not a bad solution, but let me ask the more basic question: why
does your script need to know whether MPFR is enabled?

Regards,
Andy

Kaz Kylheku

2017/05/24 11:31:34

Use your engineering imagination.

This could implement a diagnostic like:

oops! this script requires GNU Awk with MPFR (-M option)

If your script depends on MPFR for correct results, that may be a good
idea, rather than producing wrong results.
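
Such a diagnostic might be sketched as follows, reusing the comparison from the GAWK manual quoted at the start of the thread. The function name require_mpfr is hypothetical; any gawk options are passed through so the guard runs under the same settings as the real script would:

```sh
# Sketch only: require_mpfr is a made-up name.
# Usage: require_mpfr [gawk options...]
require_mpfr() {
    gawk "$@" -v PREC=56 'BEGIN {
        # Same comparison as the test quoted from the GAWK manual;
        # per the thread it yields 0 unless the -M option is in effect.
        if (!(0.1 + 12.2 == 12.3)) {
            print "oops! this script requires GNU Awk with MPFR (-M option)" > "/dev/stderr"
            exit 1
        }
    }'
}
```

Calling "require_mpfr -M || exit 1" before the real work would stop early when arbitrary-precision math is not available.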

Kenny McCormack

2017/05/24 11:40:40

In article <201705240...@kylheku.com>,

Indeed. It is not hard to come up with good reasons for wanting this; I
can think of half a dozen without breaking a sweat.

The interesting question is: Why did Andy ask the question?

On the net, when somebody asks you why you want to do something, it always
(well, at least 99.999% of the time) carries an implication that you
shouldn't want to do it and/or that there is no good reason to want to do it.

--
Watching ConservaLoons playing with statistics and facts is like watching a
newborn play with a computer. Endlessly amusing, but totally unproductive.

Kenny McCormack

2017/05/24 11:53:57

In article <29d6e76b-1f15-45ad...@googlegroups.com>,

Which in turn leads to the question of why you are asking the question...

--

"This ain't my first time at the rodeo"

is a line from the movie, Mommie Dearest, said by Joan Crawford at a board meeting.

Andrew Schorr

2017/05/25 10:48:08

On Wednesday, May 24, 2017 at 11:53:57 AM UTC-4, Kenny McCormack wrote:
> >This is not a bad solution, but let me ask the more basic question: why
> >does your script need to know whether MPFR is enabled?
>
> Which in turn leads to the question of why you are asking the question...

Because the maintainers aren't going to concern themselves with this issue
unless somebody can present a concrete example of why this matters.

You guys say you can think of dozens of reasons, so please give one.
If you have a script where the results depend on a certain level of precision,
then why not simply start the script with a test calculation that will fail
if the floating-point precision is insufficient?

For example, one might use a function like this to test whether MPFR is
enabled:

function is_mpfr_enabled(    save_prec, rc) {
    save_prec = PREC
    PREC = 100
    rc = (1 != (1 + 1.e-25))
    PREC = save_prec
    return rc
}

But it would be better to use a calculation that is relevant to the
calculations being performed by one's particular script.

And Kenny's idea of exposing the numeric "do_flags" value doesn't
seem wise to me: what happens if/when we change the numeric values
of those flags? Does the script break?

Regards,
Andy

Kaz Kylheku

2017/05/25 10:58:55

On 2017-05-25, Andrew Schorr <asc...@telemetry-investments.com> wrote:
> On Wednesday, May 24, 2017 at 11:53:57 AM UTC-4, Kenny McCormack wrote:
>> >This is not a bad solution, but let me ask the more basic question: why
>> >does your script need to know whether MPFR is enabled?
>>
>> Which in turn leads to the question of why you are asking the question...
>
> Because the maintainers aren't going to concern themselves with this issue
> unless somebody can present a concrete example of why this matters.

The maintainers, in the first place, should know better than to
introduce flags that globally change the semantics of calculation, and
make them controllable only via the command line, so that programs
cannot declare which semantics is in effect over their own statements.

> You guys say you can think of dozens of reasons, so please give one.
> If you have a script where the results depend on a certain level of precision,
> then why not simply start the script with a test calculation that will fail
> if the floating-point precision is insufficient?

Because sometimes people other than the author use a script, and
even the author might forget that option.

> For example, one might use a function like this to test whether MPFR is
> enabled:
>
> function is_mpfr_enabled( save_prec, rc) {

From the beginning, Kenny stated that of course calculation tests
can be used, but was asking about a direct way.

Kenny McCormack

2017/05/25 12:28:58

In article <201705250...@kylheku.com>,
Kaz Kylheku <686-67...@kylheku.com> wrote:
...

>Because sometimes people other than the author use a script, and
>even the author might forget that option.

Quite so. One of the main things I build into all my scripts is
sanity-checking/environment-checking. In fact, this code usually is the
first code written; the rest (the actual functionality) comes later.
A program should exit as quickly as possible if the environment isn't
suitable.

>> For example, one might use a function like this to test whether MPFR is
>> enabled:
>>
>> function is_mpfr_enabled( save_prec, rc) {
>
>From the beginning, Kenny stated that of course calculation tests
>can be used, but was asking about a direct way.

Indeed. Ironic that Andy seems to prefer code that depends on a bunch of
funny numbers, like 56, 12.2 and 12.3, over depending on one particular bit
flag number (16384). My experience is that while these bit flag numbers
can, in theory, change from version to version, in practice they rarely do.
I'm a lot more comfortable relying on that than on the results of some
totally arbitrary looking calculation. Besides which, if you really
want/need that sort of future-proofing, you can just write (GAWK) code to
parse awk.h and get the right/current numbers.

Anyway, be honest, if this were any other context, and you saw code like
the originally quoted test (the one that depends on numbers like 56, 12.2
and 12.3), you'd be appalled (and quite vocal) about how hokey-smokey it
looks. Not to beat a dead horse, but I am sure that there is absolutely
nothing in any standards document or in anything resembling common sense
that would prevent a future version of GAWK from getting the right result
without MPFR being active. That fact alone invalidates the test.

Isn't it ironic?

https://www.youtube.com/watch?v=Jne9t8sHpUc

--
It's possible that leasing office space to a Starbucks is a greater liability
in today's GOP than is hitting your mother on the head with a hammer.

Andrew Schorr

2017/05/25 12:47:02

It would be nice if MPFR mode could be activated from the script.
However, the current implementation does not make that possible,
and it's not obvious to me from a quick look at the code that this
change can be made at runtime. If you come up with a patch that can
make this work, please submit it to the mailing list.

But more basically, I still don't understand why an awk script needs
to know whether the program was invoked with -M. The script's job is
presumably to do some calculations, and so therefore its concern should
be whether it will be able to do those calculations with sufficient
precision. So why not test directly for whether the calculation precision
is adequate for the program's needs?

I get that you find it distasteful that there's no way to check whether
MPFR mode is active, and I confess that it does feel a bit icky. But
I still don't see why it matters. Either the computation engine provides
adequate precision for the program's needs, or it doesn't. It's best
to check for that directly. For example, even if MPFR were available, how
would you decide what value is needed for the PREC setting? If you know that
you need precision of n bits in the mantissa, then just test for it.

For example:

# n is the number of bits of precision required in the mantissa.
# presumably, the caller already set PREC to n.
function adequate_math_precision(n) {
    return (1 != (1+(1/(2^(n-1)))))
}

# example:
BEGIN {
    PREC = 80
    if (!adequate_math_precision(PREC)) {
        print "Error: insufficient arithmetic precision available" > "/dev/stderr"
        exit 1
    }
}

Regards,
Andy

Janis Papanagnou

2017/05/26 6:58:42

On 25.05.2017 18:47, Andrew Schorr wrote:
> It would be nice if MPFR mode could be activated from the script.
> However, the current implementation does not make that possible,
> and it's not obvious to me from a quick look at the code that this
> change can be made at runtime. If you come up with a patch that can
> make this work, please submit it to the mailing list.

I agree with this idea, but it's not important whether MPFR is set or
not - "MPFR" doesn't carry the significant information now, and in
future versions native vs. MPFR ranges may change -, so setting PREC
is more likely what we need (PROCINFO["PREC"]=120). GNU awk could then
determine whether the MPFR library functions or the native arithmetic
functions would [internally] have to be activated (the user need not
care; he declares what he wants, or otherwise the native default will
be taken).

>
> But more basically, I still don't understand why an awk script needs
> to know whether the program was invoked with -M. The script's job is
> presumably to do some calculations, and so therefore its concern should
> be whether it will be able to do those calculations with sufficient
> precision. So why not test directly for whether the calculation precision
> is adequate for the program's needs?

Yes, I agree. I took the OP's -M interrogation request as a primitive
substitute for what actually was desired; interrogation of a status
("MPFR") is simpler than [dynamically] changing processing behaviour
(as outlined above, but as seems currently not feasible as you say).

>
> I get that you find it distasteful that there's no way to check whether
> MPFR mode is active, and I confess that it does feel a bit icky. But
> I still don't see why it matters. Either the computation engine provides
> adequate precision for the program's needs, or it doesn't. It's best
> to check for that directly. For example, even if MPFR were available, how
> would you decide what value is needed for the PREC setting? If you know that
> you need precision of n bits in the mantissa, then just test for it.

Yes, but testing by implementing some embedded sample computation is
at least as icky as testing for "MPFR" status and aborting.

Janis

Andrew Schorr

2017/05/26 8:57:39

On Friday, May 26, 2017 at 6:58:42 AM UTC-4, Janis Papanagnou wrote:
> I agree with this idea, but it's not important whether MPFR is set or
> not - "MPFR" doesn't carry the significant information now, and in
> future versions native vs. MPFR ranges may change -, so setting PREC
> is more likely what we need (PROCINFO["PREC"]=120). GNU awk could then
> determine whether the MPFR library functions or the native arithmetic
> functions would [internally] have to be activated (the user need not
> care; he declares what he wants, or otherwise the native default will
> be taken).

That's an interesting idea. Please feel free to submit a proposed patch.

But one should keep in mind that running with MPFR is likely to be much
slower, so it's not so obvious to me that it's a great idea to hide what's
going on beneath the hood.

> Yes, but testing by implementing some embedded sample computation is
> at least as icky as testing for "MPFR" status and aborting.

Well, I'm sorry, I don't think it's "icky" to test for what you
actually care about -- can I do the computation in mind with the
required precision.

By the way everyone, gawk stores all of its command-line arguments
in the PROCINFO["argv"] array, so if you are really worried about this,
just scan the array for the presence of -M or --bignum:

bash-4.2$ gawk -v x=y -M 'BEGIN {for (i = 0; i < length(PROCINFO["argv"]); i++) print i, PROCINFO["argv"][i]}'
0 gawk
1 -v
2 x=y
3 -M
4 BEGIN {for (i = 0; i < length(PROCINFO["argv"]); i++) print i, PROCINFO["argv"][i]}
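
A sketch of that scan, wrapped in a shell function (bignum_requested is a made-up name). It assumes a gawk that populates PROCINFO["argv"]; on a version without that entry it simply reports that the option was not found:

```sh
# Sketch only: bignum_requested is a hypothetical helper.
# Usage: bignum_requested [gawk options...]
bignum_requested() {
    gawk "$@" 'BEGIN {
        found = 0
        # Guard the subarray access: the entry may not exist at all.
        if ("argv" in PROCINFO) {
            for (i in PROCINFO["argv"]) {
                arg = PROCINFO["argv"][i]
                if (arg == "-M" || arg == "--bignum")
                    found = 1
            }
        }
        exit !found
    }'
}
```

Only the exact spellings -M and --bignum are matched here; an option bundled with others, or an abbreviated long option, would need extra parsing.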

Regards,
Andy

Ed Morton

2017/05/26 10:03:57

What gawk version is that?

$ gawk -v x=y -M 'BEGIN {for (i = 0; i < length(PROCINFO["argv"]); i++) print i,
PROCINFO["argv"][i]}'

$
$ gawk --version
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)

Regards,

Ed.

Kenny McCormack

2017/05/26 10:10:24

In article <og9cdt$qak$1...@dont-email.me>,
Ed Morton <morto...@gmail.com> wrote:
...

>What gawk version is that?
>
>$ gawk -v x=y -M 'BEGIN {for (i = 0; i < length(PROCINFO["argv"]); i++) print i,
>PROCINFO["argv"][i]}'

Yes, it is not in 4.1.4. I'm pretty sure Andy always runs the "developer"
version of GAWK, and that he tends to forget that fact from time to time.

Note that having access to the original argv[] is more or less equivalent
to having access to "do_flags" (in the context of this specific issue) -
both are possible solutions to the question of "Did the user put '-M' on
the command line?". Kinda 6 of one, ...

--
Marshall: 10/22/51
Jessica: 4/4/79

Kenny McCormack

2017/05/26 10:56:48

In article <866c5cbd-a0d8-44ae...@googlegroups.com>,
Andrew Schorr <asc...@telemetry-investments.com> wrote:
...

>Well, I'm sorry, I don't think it's "icky" to test for what you
>actually care about -- can I do the computation in mind with the
>required precision.

I'm sure you can see that, in the general case, that is equivalent to
asking whether the result of a calculation is correct - which, in the
general case, is equivalent to "the halting problem".

For what it is worth, I've developed the following test, which although still
hokey-smokey, is (to my eyes) a lot less hokey-smokey than the "12.2" test:

$ gawk4 -M 'BEGIN { print 2**2000 == "114813069527425452423283320117768198402231770208869520047764273682576626139237031385665948631650626991844596463898746277344711896086305533142593135616665318539129989145312280000688779148240044871428926990063486244781615463646388363947317026040466353970904996558162398808944629605623311649536164221970332681344168908984458505602379484807914058900934776500429002716706625830522008132236281291761267883317206598995396418127021779858404042159853183251540889433902091920554957783589672039160081957216630582755380425583726015528348786419432054508915275783882625175435528800822842770817965453762184851149029376" }'
1
$
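
A shorter test in the same spirit, as a sketch: it assumes that -M makes integer arithmetic exact, whereas plain gawk uses IEEE 754 doubles, in which 2^53 + 1 rounds back to 2^53. The function name exact_integers is made up:

```sh
# Sketch only: exact_integers is a hypothetical helper.
# Usage: exact_integers [gawk options...]
exact_integers() {
    gawk "$@" 'BEGIN {
        # With doubles, 2^53 + 1 is not representable and compares
        # equal to 2^53, so the function fails (exit status 1).
        exit !(2^53 + 1 != 2^53)
    }'
}
```

Like the 2**2000 comparison, this checks observable behaviour rather than the -M flag itself.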

>By the way everyone, gawk stores all of its command-line arguments
>in the PROCINFO["argv"] array, so if you are really worried about this,
>just scan the array for the presence of -M or --bignum:

Not present in 4.1.4 (latest released version).

--
In politics and in life, ignorance is not a virtue.
-- Barack Obama --

Janis Papanagnou

2017/05/26 11:01:44

On 26.05.2017 14:57, Andrew Schorr wrote:
> On Friday, May 26, 2017 at 6:58:42 AM UTC-4, Janis Papanagnou wrote:
>> I agree with this idea, but it's not important whether MPFR is set or
>> not - "MPFR" doesn't carry the significant information now, and in
>> future versions native vs. MPFR ranges may change -, so setting PREC
>> is more likely what we need (PROCINFO["PREC"]=120). GNU awk could then
>> determine whether the MPFR library functions or the native arithmetic
>> functions would [internally] have to be activated (the user need not
>> care; he declares what he wants, or otherwise the native default will
>> be taken).
>
> That's an interesting idea. Please feel free to submit a proposed patch.
>
> But one should keep in mind that running with MPFR is likely to be much
> slower, so it's not so obvious to me that it's a great idea to hide what's
> going on beneath the hood.

That's why I proposed that the default is native (i.e. non-MPFR) processing.
In case someone wants high-precision he'd have to *explicitly* state that in
the simple form shown above; say, PROCINFO["PREC"]=120 . That's not more or
less obvious than buying any degradation using a "-M" switch; in both cases
the user should know what he uses and - from the manual - what consequences
there are.

Janis

>> [...]

Janis Papanagnou

2017/05/26 11:42:34

On 26.05.2017 14:57, Andrew Schorr wrote:

> On Friday, May 26, 2017 at 6:58:42 AM UTC-4, Janis Papanagnou wrote:

[...]

>
>> Yes, but testing by implementing some embedded sample computation is
>> at least as icky as testing for "MPFR" status and aborting.
>
> Well, I'm sorry, I don't think it's "icky" to test for what you
> actually care about -- can I do the computation in mind with the
> required precision.

Well, we probably have a different approach here; my paradigm is
rather programming by contract according to a specification not by
adding _runtime_ tests whether the used program P (awk in this case)
uses the _internal_(!) function A or A'. The latter is behavioural
analysis of a program. (While in case of lacking specifications it's
possible that you have to do such things, you normally wouldn't do
that during runtime.)

In the given case where you can't control it from inside of awk I'd
rather externalize the task, i.e. do the discrimination where
I am forced to do that anyway; on the command line (or environment)
where I will have to specify "-M" or not.[*]
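
One way to sketch that kind of external check, assuming the first line of "gawk --version" names the MPFR library when support is compiled in (as in the 4.1.4 banner shown earlier in the thread). Both function names are hypothetical:

```sh
# Sketch only: both helpers are made-up names.

# Succeeds if the gawk version banner mentions GNU MPFR.
gawk_supports_mpfr() {
    gawk --version 2>/dev/null | head -n 1 | grep -q 'GNU MPFR'
}

# Runs gawk with -M, or refuses when MPFR support appears to be missing.
# Usage: run_bignum [gawk options...] program [files...]
run_bignum() {
    if gawk_supports_mpfr; then
        gawk -M "$@"
    else
        echo "error: this gawk was built without MPFR support" >&2
        return 1
    fi
}
```

The awk program itself then needs no precision test; the calling environment decides once, up front.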

Janis

[*] Note that this is not related any more to the original request of
this thread to test "-M" (for which there seems to be a solution
already, and one that is probably available with the next release).

>
> [...]

Andrew Schorr

2017/05/26 13:00:10

On Friday, May 26, 2017 at 10:10:24 AM UTC-4, Kenny McCormack wrote:
> Yes, it is not in 4.1.4. I'm pretty sure Andy always runs the "developer"
> version of GAWK, and that he tends to forget that fact from time to time.

Yeah, sorry, this will be in version 4.2, which I hope may come out soonish.

> Note that having access to the original argv[] is more or less equivalent
> to having access to "do_flags" (in the context of this specific issue) -
> both are possible solutions to the question of "Did the user put '-M' on
> the command line?". Kinda 6 of one, ...

Not equivalent in my view: one will be in distributed gawk, whereas the other
relies on a patch and on some obscure numeric value not changing over time.

Regards,
Andy

Andrew Schorr

2017/05/26 13:01:20

On Friday, May 26, 2017 at 11:01:44 AM UTC-4, Janis Papanagnou wrote:
> That's why I proposed that the default is native (i.e. non-MPFR) processing.
> In case someone wants high-precision he'd have to *explicitly* state that in
> the simple form shown above; say, PROCINFO["PREC"]=120 . That's not more or
> less obvious than buying any degradation using a "-M" switch; in both cases
> the user should know what he uses and - from the manual - what consequences
> there are.

In principle, I agree with your concept. I just don't think it's easily done
with the current codebase. Somebody needs to play with patching the code,
and I'm not convinced that this is a high priority.

Regards,
Andy

Andrew Schorr

2017/05/26 13:04:59

On Friday, May 26, 2017 at 11:42:34 AM UTC-4, Janis Papanagnou wrote:
> Well, we probably have a different approach here; my paradigm is
> rather programming by contract according to a specification not by
> adding _runtime_ tests whether the used program P (awk in this case)
> uses the _internal_(!) function A or A'. The latter is behavioural
> analysis of a program. (While in case of lacking specifications it's
> possible that you have to do such things, you normally wouldn't do
> that during runtime.)

You are totally and completely correct. But we sometimes have to live
with the limitations of what we've got. If the code can be patched to do
it your way, it might be worth adding this feature.

Of course, the notion of specifying a base-2 mantissa precision is already
making some assumptions about how numbers are represented.

> In the given case where you can't control it from inside of awk I'd
> rather externalize the task, i.e. do the discrimination where
> I am forced to do that anyway; on the command line (or environment)
> where I will have to specify "-M" or not.[*]

I don't follow this comment. Are you suggesting a change from current
behavior?

> [*] Note that this is not related any more to the original request of
> this thread to test "-M" (for which there seems to be a solution
> already, and one that is probably available with the next release).

But as you point out, why is this type of behavioral test advisable?
If one wants to test behavior, I still think it is wiser to test actual
calculation precision rather than whether a given flag was supplied
on the command line.

Regards,
Andy

Janis Papanagnou

2017/05/26 13:10:18

Sure. And given that you - i.e. someone familiar with the code base - says so
probably means that I don't even need to try, I guess.

Janis

Janis Papanagnou

2017/05/26 14:19:16

On 26.05.2017 19:04, Andrew Schorr wrote:
> On Friday, May 26, 2017 at 11:42:34 AM UTC-4, Janis Papanagnou wrote:

[...]

>> In the given case where you can't control it from inside of awk I'd
>> rather externalize the task, i.e. do the discrimination where
>> I am forced to do that anyway; on the command line (or environment)
>> where I will have to specify "-M" or not.[*]
>
> I don't follow this comment. Are you suggesting a change from current
> behavior?

No, no. What I meant is that given what we have _I_ (probably not the
OP who may have requirements I don't yet recognize) would put the test
(if necessary) statically in the environment; the calling environment
will require adding -M (if you need it) or keep the default anyway, if
all I want to do is to abort any lacking precision functionality, so I
don't seem to need the test inside awk. (Having the test inside an awk
module might be necessary if you have independent library modules that
can be called from other awk programs that might or might not be called
with -M; for me the procedure to test precision in library modules looks
a bit strange, so I'm not really sure why that would be necessary, but I
cannot deny if someone has good uses for it. But GNU awk's handling is,
IMO, anyway not perfect in this case as [partly] discussed previously.)

>
>> [*] Note that this is not related any more to the original request of
>> this thread to test "-M" (for which there seems to be a solution
>> already, and one that is probably available with the next release).
>
> But as you point out, why is this type of behavioral test advisable?

Behavioural tests are necessary if one is confronted with unknown systems.
Usually you test a specific instance, depending on the type of the system,
at a specific time, evolution stage, or version.

If the system will behave differently when larger numbers are involved,
it's necessary to run tests before you use it. (GNU awk, though, has its
[current] behaviour specified in the manual, AFAIR.) Undesired behaviour
is, for example, a computational overflow that happens silently at some
point, goes unreported, and thereby produces (probably hazardous) results.
Or you may need to find out where the limit actually is, so that you have
to bite the performance-degradation bullet and use MPFR; but with the next
release, or with the next CPU/ALU generation, the limits may have changed
again, so that using the less performant MPFR would no longer be necessary.

Declaring what you need and letting awk decide whether to use native or MPFR
functions would solve that. But then the performance-degradation might also
be a requirement. If awk would decide the library to use you couldn't tell
as well what's used and whether your performance will drop or not. (Arguably,
this can and probably should be done as part of the system regression tests
so I'm not completely convinced.) Slower computation - specifically in the
contexts where awk is used - is usually a lesser issue than silent overflows
or similar.

As explained, I cannot tell for sure in this case - this is a question the
OP might answer (or has he already elaborated on it?).

> If one wants to test behavior, I still think it is wiser to test actual
> calculation precision rather than whether a given flag was supplied
> on the command line.

(Both are not what I would want or need.)

Janis

Andrew Schorr

2017/05/27 11:00:44

On Friday, May 26, 2017 at 1:10:18 PM UTC-4, Janis Papanagnou wrote:
> > with the current codebase. Somebody needs to play with patching the code,
> > and I'm not convinced that this is a high priority.
>
> Sure. And given that you - i.e. someone familiar with the code base - says so
> probably means that I don't even need to try, I guess.

I truly have no idea how difficult this would be. This has never been requested
before. The current code does some different initializations when the -M
flag is supplied, such as plugging in some hooks to the runtime interpreter
to evaluate arithmetic op-codes differently, etc. I'm not saying that a runtime
behavior change is impossible, but it will clearly require some patching. For
example, when in MPFR mode, there are places in the code that assume that
numeric values will be in MPFR or MPZ format. We would now need to add support
for IEEE 754 values as well. It's a nontrivial change. And it would make
MPFR mode even slower than it is now. But even worse, if somebody left MPFR
mode to return to regular calculation mode, we would have to start worrying
about encountering MPFR numbers in normal operation. So that would slow down
the operation of normal mode as well. So it overall seems like a bad idea to
me at the moment.

Regards,
Andy

Janis Papanagnou

2017/05/27 11:37:37

I cannot follow your argument that it would necessarily have the effect
of slowing down "everything", but if you say the current code as
implemented could probably be changed only that way, then I take your word
for it as the expert. I realize that it's probably not worth thinking about
such a feature currently.

Janis

Kenny McCormack

2017/05/27 11:58:17

In article <b6cb2ceb-6d72-464d...@googlegroups.com>,

Andrew Schorr <asc...@telemetry-investments.com> wrote:
...

>> probably means that I don't even need to try, I guess.
>
>I truly have no idea how difficult this would be. This has never been
>requested before. The current code does some different initializations
>when the -M

I want it to be clear that the ability to turn MPFR mode on and off at will
from within GAWK source, although attractive in a theoretical sense, was
never my intent, nor was it the intent of my having started this thread.

I suppose it does kinda follow, in that if one requests the ability to
detect its state, one naturally begins to think in terms of being able to
change that state. I.e., so the argument goes, why ask about being able to
inquire state without it naturally following that one wants to affect said
state. But, as I said, such was never my intent.

That all said, I do think that it should be possible for a GAWK source file
(i.e., one that is directly executable via the "shebang" hack) to specify
that this script requires MPFR. I'm thinking that it would be analogous to
the way that things like @include and @load are parsed - that is, they are
part of the source, but they must somehow be parsed before the regular
language-level parsing occurs - something like how the C pre-processor
works. Note that I've never looked at the code for these so I may just be
making this up.

But, in fact, there is such a construct! As I discovered the last time I
posted about this, all you have to do is change the shebang line from:

#!/path/to/gawk -f

to:

#!/path/to/gawk -Mf

and Bob's your uncle!

--
The only thing Trump's made great again is Saturday Night Live.

Andrew Schorr

2017/06/04 10:47:41

On Saturday, May 27, 2017 at 11:37:37 AM UTC-4, Janis Papanagnou wrote:
> I cannot follow your argument that it would consequently have to have
> the effect of slowing down "everything", but if you say the current code as
> implemented would probably be changeable only that way then I take your word
> as an expert for granted. I realize that it's probably not worth thinking
> about such a feature at present.

For each math function, there are 2 versions. The "normal" version assumes
that numeric values are stored in IEEE 754 floating-point format, and the
enhanced precision version assumes that they are stored in either mpz or mpfr
format. When gawk starts, it decides which set of math routines to install.
If we enable dynamic switching between regular and high-precision modes,
then we would need to do one of two things:
1. Consolidate into a single set of routines that can handle operands in any
   of the 3 numeric formats and produce results in the currently requested
   format; or
2. Keep separate routines but still enhance each one to accept operands
   in any format.
Either way, there will be a performance hit for this added flexibility.
I guess we could consider converting all the numeric values in memory whenever
the selected format changes, but that seems like a performance disaster as
well.

Regards,
Andy

Kenny McCormack

2017/06/04 11:15:16

In article <418f0eb3-4ecb-4a91...@googlegroups.com>,

Yes, I get that. And, just to re-iterate and to be clear, I never asked
for the ability to switch back and forth at will. As I've said, it might
be a nice thing in an abstract, theoretical sense, but I see no practical
utility to it.

One of the ideas that I *have* put forward in this thread (and which was
amplified and clarified by Kaz) is that it would be nice if there was a
source-code-level way to specify that this program needs MPFR.

And, indeed, there is. All you have to do is make sure your "shebang" line
is:

#!/path/to/gawk -Mf

So, as far as I am concerned, this specific aspect of this thread (*) is
solved, done, kaput, over.

(*) But do be aware that this was only a sub-topic of the thread. It was
not and is not the original and primary reason for this thread's existence.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:

http://user.xmission.com/~gazelle/Sigs/LadyChatterley

Andrew Schorr

2017/06/04 20:08:02

On Sunday, June 4, 2017 at 11:15:16 AM UTC-4, Kenny McCormack wrote:
> One of the ideas that I *have* put forward in this thread (and which was
> amplified and clarified by Kaz) is that it would be nice if there was a
> source-code-level way to specify that this program needs MPFR.
>
> And, indeed, there is. All you have to do is make sure your "shebang" line
> is:
>
> #!/path/to/gawk -Mf
>
> So, as far as I am concerned, this specific aspect of this thread (*) is
> solved, done, kaput, over.

Good. There are several command-line options that are not configurable
in the source code, such as -M, --posix, --non-decimal-data, etc.
So that is not just an issue for -M. It's not ideal, but it's not clear how
high a priority it should be to address this.

> (*) But do be aware that this was only a sub-topic of the thread. It was
> not and is not the original and primary reason for this thread's existence.

:-) I think what you want is something in PROCINFO to indicate how math
calculations are being done, and what I am proposing instead is that you
simply use a function to test whether the math calculations are sufficiently
precise for your needs. An explanation of the latter approach has gone into the
manual:

16.6 How To Check If MPFR Is Available
======================================

Occasionally, you might like to be able to check if 'gawk' was invoked
with the '-M' option, enabling arbitrary-precision arithmetic. You can
do so with the following function, contributed by Andrew Schorr:

     # adequate_math_precision --- return true if we have enough bits

     function adequate_math_precision(n)
     {
         return (1 != (1+(1/(2^(n-1)))))
     }

Here is code that invokes the function in order to check if
arbitrary-precision arithmetic is available:

     BEGIN {
         # How many bits of mantissa precision are required
         # for this program to function properly?
         fpbits = 123

         # We hope that we were invoked with MPFR enabled. If so, the
         # following statement should configure calculations to our desired
         # precision.
         PREC = fpbits

         if (! adequate_math_precision(fpbits)) {
             print("Error: insufficient computation precision available.\n" \
                   "Try again with the -M argument?") > "/dev/stderr"
             exit 1
         }
     }

Regards,
Andy

Kenny McCormack

2017/06/04 20:20:46

In article <9247231c-a866-4d8f...@googlegroups.com>,
Andrew Schorr <asc...@telemetry-investments.com> wrote:
...

> # adequate_math_precision --- return true if we have enough bits
>
> function adequate_math_precision(n)
> {
> return (1 != (1+(1/(2^(n-1)))))
> }

I would never use that because I have no idea what it does or what it
means (this is not to be interpreted as a request for an explanation).

If this is what we're stuck with, I'll stick with my 2**2000 test.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:

http://user.xmission.com/~gazelle/Sigs/EternalFlame

Janis Papanagnou

2017/06/04 21:05:05

An obvious approach for such technical functionality would be to use function
pointers; this would mean one indirect function call per (simple or complex)
numerical function call. That cost would certainly be negligible when complex
(MPFR) calculations are enabled, and I can't imagine that it's an issue even
for simple calculations, but, as I said, I don't know the code in question well
enough to judge, so I abstain.

Janis

>
> Regards,
> Andy
>

Andrew Schorr

2017/06/06 12:52:42

On Sunday, June 4, 2017 at 9:05:05 PM UTC-4, Janis Papanagnou wrote:
> An obvious approach for such technical functionality would be to use function
> pointers; this would mean one indirect function call per (simple or complex)
> numerical function call. That cost would certainly be negligible when complex
> (MPFR) calculations are enabled, and I can't imagine that it's an issue even
> for simple calculations, but, as I said, I don't know the code in question well
> enough to judge, so I abstain.

The issue is that any numeric operand could be in any of 3 different formats,
so the code must take care to examine each argument's type, convert it to the
needed type for the calculation to be done in the requested output format, and
then call the function to do the actual work. This adds a lot of checking
versus the usual case where everything is assumed to be in IEEE 754 format.

Regards,
Andy