Using math functions in Awk


Janis Papanagnou

Sep 4, 2021, 10:05:14 AM
I want to use math functions in Awk that are not available by default.
What are the typical (or best) methods to do so? Is there an Awk math
library (awk source code), ideally a standard or a de facto standard?
Should such functions be better accessed from the shell environment
(using standard Unix tools for example)? Any opinions or suggestions?
Specifically I'm looking for the Poisson function.

Janis

Kenny McCormack

Sep 4, 2021, 10:28:54 AM
In article <sgvueo$s73$1...@gioia.aioe.org>,
I don't know what you mean by "Poisson function". It doesn't seem to show
up in this list:

https://en.wikipedia.org/wiki/List_of_things_named_after_Siméon_Denis_Poisson

--
The plural of "anecdote" is _not_ "data".

Kaz Kylheku

Sep 4, 2021, 10:50:32 AM
Likely, the Poisson PDF (probability distribution function) for
modelling processes that conform to the Poisson model.

Poisson models situations in which events occur at a certain average rate,
but in a random way, independent of each other. An event having occurred
or not doesn't affect the probability of a subsequent occurrence.

Rare errors happening in a bit stream are likely Poisson, as are flat
tires, requests hammering on a server, etc.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Janis Papanagnou

Sep 4, 2021, 11:03:00 AM
> https://en.wikipedia.org/wiki/List_of_things_named_after_Siméon_Denis_Poisson
>

I mean the parameterized function describing the Poisson distribution:
https://en.wikipedia.org/wiki/Poisson_distribution
or maybe also this ("probability mass function") for larger values of r
https://en.wikipedia.org/wiki/Negative_binomial_distribution
(As far as I had been told, sometimes also known as Pareto.)
Not a discrete but continuous function; in ASCII graphics something like

   **
  *  *
 *    *
*      *  _____> asymptotically towards 0
^____starting at (0,0)

Although this is the function I want to examine, the questions can be
taken more generally for any function that is not available in Awk.
General Awk solutions/suggestions or GNU Awk specific are both welcome.

Janis

Kenny McCormack

Sep 4, 2021, 12:00:20 PM
In article <sh01r2$frq$1...@gioia.aioe.org>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
...
>I mean the parameterized function describing the Poisson distribution:
>https://en.wikipedia.org/wiki/Poisson_distribution
>or maybe also this ("probability mass function") for larger values of r
>https://en.wikipedia.org/wiki/Negative_binomial_distribution
>(As far as I had been told, sometimes also known as Pareto.)
...
>Although this is the function I want to examine the questions can be
>taken more generally for any function that is not available in Awk.
>General Awk solutions/suggestions or GNU Awk specific are both welcome.

Well, there are really two parts to your question, then:

1) How do I get access to functions that aren't provided directly in GAWK?

2) What exactly is the "Poisson distribution/function" and how do I
convert the mathematical notation that I see for it in Wikipedia
into computer source code in some language?

The first is on-topic and probably something that I and others on this
newsgroup can help you with. The second, not so much.

For the first, there are (at least) 3 routes I could see as reasonable
paths to pursue (presented in no particular order):

1) Find a library of AWK code somewhere that has what you need.
(This seems unlikely to me, but stranger things have happened...)
2) Write your own AWK code to do what you need. I assume this
is what you are trying to avoid.
3) Find a library written in C (or similar) and then figure out how to
link to it via writing a GAWK extension library yourself. This
option is pretty easy once you've done it a few times, but
represents a sizable hurdle for first-timers. As far as I know,
you've not yet done one of these.

Note that if the function you were looking for was something simple, that
is in the C library but not in GAWK - such as acos() - then it would be
straightforward to write an extension lib to get to it (*), but I don't
think your Poisson function is in the standard library - hence the need to
find a (third party) lib that does have it.

Note, BTW, that acos() really is in the C math lib, but not in GAWK.
Purists will point out (I think this is true - haven't verified it
100%) that it is unnecessary, since it can be derived from atan2(), which
*is* in GAWK. Again, I'm not sure about all this, but I think it is true.
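
For what it's worth, the identity acos(x) = atan2(sqrt(1 - x*x), x) holds for -1 <= x <= 1, so the derivation does work with nothing beyond the built-in atan2() and sqrt(). A minimal sketch, wrapped in a shell function for convenience (the names acos_awk and acos are only illustrative, and no range checking is done):

```sh
# acos(x) built from the awk built-ins atan2() and sqrt(); valid for -1 <= x <= 1.
acos_awk() {
    awk -v x="$1" '
        function acos(x) { return atan2(sqrt(1 - x * x), x) }
        BEGIN { printf "%.6f\n", acos(x) }'
}

acos_awk 0.5    # prints 1.047198 (pi/3 radians)
```

The same acos() definition can be pasted into any awk program; only the shell wrapper is specific to this sketch.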

(*) I, of course, would just use my call_any() library and I'd be done, but
that may not be available to you at the present time.

--
This is the GOP's problem. When you're at the beginning of the year
and you've got nine Democrats running for the nomination, maybe one or
two of them are Dennis Kucinich. When you have nine Republicans, seven
or eight of them are Michelle Bachmann.

Ben Bacarisse

Sep 4, 2021, 2:43:50 PM
Janis Papanagnou <janis_pa...@hotmail.com> writes:

> On 04.09.2021 16:28, Kenny McCormack wrote:
>> In article <sgvueo$s73$1...@gioia.aioe.org>,
>> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>>> I want to use math functions in Awk that are not available by default.
>>> What are the typical (or best) methods to do so. Is there an Awk math
>>> library (awk source code), ideally a standard or a de facto standard?
>>> Should such functions be better accessed from the shell environment
>>> (using standard Unix tools for example)? Any opinions or suggestions?
>>> Specifically I'm looking for the Poisson function.
>>
>> I don't know what you mean by "Poisson function". It doesn't seem to show
>> up in this list:
>>
>> https://en.wikipedia.org/wiki/List_of_things_named_after_Siméon_Denis_Poisson
>>
>
> I mean the parameterized function describing the Poisson distribution:
> https://en.wikipedia.org/wiki/Poisson_distribution

This one is easy, provided you don't have extreme values. So much so,
it's probably fastest just to code it yourself. Do you need
Poisson-distributed whole numbers k, or the probability of getting some k?
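
If it is the probability that is wanted, a minimal sketch of the direct calculation P(k; lambda) = lambda^k * exp(-lambda) / k! might look like the following. The names poisson_pmf and poisson are made up for illustration, and it assumes a non-negative integer k and a moderate lambda (for very large lambda, exp(-lambda) underflows to zero):

```sh
# Poisson probability of exactly k events when the mean rate is lambda.
poisson_pmf() {
    awk -v k="$1" -v lambda="$2" '
        # Start from exp(-lambda) and multiply in lambda/i for i = 1..k,
        # which avoids forming lambda^k and k! as separate huge numbers.
        function poisson(k, lambda,    i, p) {
            p = exp(-lambda)
            for (i = 1; i <= k; i++)
                p *= lambda / i
            return p
        }
        BEGIN { printf "%.6f\n", poisson(k, lambda) }'
}

poisson_pmf 2 3    # prints 0.224042
```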

> or maybe also this ("probability mass function") for larger values of r
> https://en.wikipedia.org/wiki/Negative_binomial_distribution

Also quite easy for non-troublesome parameters. I think the usual
method of generating values is just to "do the trials".
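
"Doing the trials" could be sketched roughly as below: run independent trials that each succeed with probability p, and count the failures seen before the r-th success. That is one common convention for the negative binomial; others count total trials instead. The names negbin_sample and negbin and the optional seed argument are assumptions for this example, and p must be greater than 0 or the loop never ends.

```sh
# Draw one negative-binomial value by simulating the trials directly.
# Usage: negbin_sample r p [seed]
negbin_sample() {
    awk -v r="$1" -v p="$2" -v seed="${3:-1}" '
        # Count failures seen before the r-th success; each trial succeeds with probability p.
        function negbin(r, p,    successes, failures) {
            successes = 0
            failures = 0
            while (successes < r) {
                if (rand() < p)
                    successes++
                else
                    failures++
            }
            return failures
        }
        BEGIN { srand(seed); print negbin(r, p) }'
}

negbin_sample 3 0.5 42    # one draw; the value depends on the awk implementation and seed
```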

> (As far as I had been told, sometimes also known as Pareto.)
> Not a discrete but continuous function; in ASCII graphics something
> like

I am not sure how the extension to the reals would be implemented as
I've not used that before.

>    **
>   *  *
>  *    *
> *      *  _____> asymptotically towards 0
> ^____starting at (0,0)
>
> Although this is the function I want to examine the questions can be
> taken more generally for any function that is not available in Awk.
> General Awk solutions/suggestions or GNU Awk specific are both
> welcome.

This part is outside my wheelhouse.

--
Ben.

Bruce Horrocks

Sep 4, 2021, 8:22:16 PM
The Poisson distribution Wikipedia link you give tells you how to
calculate the Poisson distribution in section 6.1, either directly or
using the gamma function for better stability.
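
As a rough illustration of the more stable form, exp(k*log(lambda) - lambda - log(k!)), here is a sketch in plain awk that needs no extension. Since awk has no gamma or lgamma function, it substitutes a direct sum of logs for ln Gamma(k + 1); that substitution and the names poisson_pmf_log, logfact and poisson are assumptions of this example, not anything taken from the MPFR library. It requires lambda > 0 and a non-negative integer k.

```sh
# Poisson probability computed in log space to avoid underflow and overflow.
poisson_pmf_log() {
    awk -v k="$1" -v lambda="$2" '
        # log(k!) by direct summation; stands in for lgamma(k + 1), which awk lacks.
        function logfact(k,    i, s) {
            s = 0
            for (i = 2; i <= k; i++)
                s += log(i)
            return s
        }
        # exp(k*log(lambda) - lambda - log(k!)); requires lambda > 0.
        function poisson(k, lambda) {
            return exp(k * log(lambda) - lambda - logfact(k))
        }
        BEGIN { printf "%.6g\n", poisson(k, lambda) }'
}

poisson_pmf_log 2 3          # prints 0.224042
poisson_pmf_log 1000 1000    # about 0.0126
```

With k = lambda = 1000, a direct product that starts from exp(-1000) would underflow to zero in double precision, while the log-space form still returns a usable value.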

There is no gamma function by default in gawk, of course, but it is
available in the GNU MPFR library <https://www.mpfr.org>

And there is a gawkextlib for MPFR that provides access to the
additional functions, including gamma.
<http://gawkextlib.sourceforge.net/mpfr/gawk-mpfr.html>

So I think your best bet would be to install the mpfr extension library.
<http://gawkextlib.sourceforge.net>
<http://gawkextlib.sourceforge.net/mpfr/mpfr.html>

--
Bruce Horrocks
Surrey, England