
Definitions regarding the eternal 'gawkextlib debate'.

Kenny McCormack

Jul 14, 2016, 7:44:17 AM
Regarding the "XML" debate, I just want to make a couple of definitions
clear. It seems some participants (i.e., Andy) are intent on
misunderstanding/confusing these definitions (specifically, by assuming
that the larger, more inclusive definition is meant when, in fact, it should
be clear that the less inclusive definition was intended).

These definitions come in two pairs - and as you'll see, they closely
parallel each other.

1) "core gawk". This should mean that collection of features that you get
in the gawk executable itself (i.e., that single file that is created by
the final link stage of the basic compile - "gawk.exe" under DOS/Windows or
simply "gawk" under Unix/Linux-like OSes).

The alternative (larger) definition would be everything that you get when
you do the usual "configure/make/make install" routine - this includes the
"core gawk" described above as well as all the "library code" (i.e., all the
"include file"s and "shared libraries" that are included in the gawk
tarball). This includes, incidentally, both the "include file" and the
"shared library" that implement the "in-place editing" functionality (both
of which are invoked via the "-i" command line option).
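
As a rough illustration of the in-place editing mentioned above, here is a
minimal sketch. It assumes gawk 4.1 or later is on the PATH with the bundled
"inplace" include file and shared library installed; the temporary file and
the foo/bar substitution are made up for the example.

```shell
#!/bin/sh
# Sketch only: in-place editing via gawk's "-i inplace".
# Assumes gawk >= 4.1 with the bundled inplace extension installed.
demo_file=$(mktemp)                     # hypothetical scratch file
printf 'foo 1\nfoo 2\n' > "$demo_file"

if command -v gawk >/dev/null 2>&1; then
    # "-i inplace" pulls in the inplace include file, which (as far as I
    # understand) then loads the matching shared library.
    gawk -i inplace '{ gsub(/foo/, "bar"); print }' "$demo_file"
else
    echo "gawk not found; file left unchanged" >&2
fi

result=$(cat "$demo_file")
printf '%s\n' "$result"
rm -f "$demo_file"
```

With a suitable gawk, the file should be rewritten in place so that each
"foo" becomes "bar"; without one, the script simply prints the original two
lines.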

I tend to use the phrase "core gawk distribution" (rather than just "core
gawk") to refer to the latter, larger, more inclusive definition.

2) "gawkextlib". This can refer either to the gawkextlib package itself
(smaller, less inclusive definition) or to that plus all the
gawkextlib-dependent shared libraries that actually implement functionality.
Note that if you go to the gawkextlib site, you will see source code
(tarballs) for all of this - starting with the source code for gawkextlib
itself (smaller, less inclusive definition), followed by the source code
(tarballs) for each of the gawkextlib-dependent shared library extensions.

As noted, gawkextlib itself (the smaller, less inclusive definition)
doesn't implement any end-user-useful functionality, but it is necessary to
have it for any of the others to compile and to work.

I hope these definitions prove useful and that we can now have a useful
discussion without constantly stepping on each other's feet (insulting each
other by assuming one definition was meant when clearly this was not the
intent).

--
The plural of "anecdote" is _not_ "data".

Andrew Schorr

Jul 14, 2016, 9:26:38 PM
On Thursday, July 14, 2016 at 7:44:17 AM UTC-4, Kenny McCormack wrote:
> I tend to use the phrase "core gawk distribution" (rather than just "core
> gawk") to refer to the latter, larger, more inclusive definition.

That is sensible.

> 2) "gawkextlib". This can refer either to the gawkextlib package itself
> (smaller, less inclusive definition) or to that plus all the
> gawkextlib-dependent shared libraries that actually implement functionality.
> Note that if you go to the gawkextlib site, you will see source code
> (tarballs) for all of this - starting with the source code for gawkextlib
> itself (smaller, less inclusive definition), followed by the source code
> (tarballs) for each of the gawkextlib-dependent shared library extensions.
>
> As noted, gawkextlib itself (the smaller, less inclusive definition)
> doesn't implement any end-user-useful functionality, but it is necessary to
> have it for any of the others to compile and to work.

I think the less-inclusive definition is the relevant one, since each of the other tarballs is a separate library. The idea is that users pick and choose which libraries they want. If you want XML support, then you grab only gawk-xml, plus the gawkextlib support library. In my opinion, it is not sensible to think of them as one big blob of libraries that should be installed together. That is not a scalable concept. One doesn't think of installing all perl modules or all python libraries. One just picks those desired for the project at hand.

> I hope these definitions prove useful and that we can now have a useful
> discussion without constantly stepping on each other's feet (insulting each
> other by assuming one definition was meant when clearly this was not the
> intent).

I don't think it's a question of insulting. We merely need to make sure we're talking about the same thing. Gawkextlib can also be thought of as the SourceForge project where a bunch of extension libraries are available. Perhaps we should have given a different name to the support library.

Regards,
Andy

Janis Papanagnou

Jul 15, 2016, 6:03:02 AM
On 15.07.2016 03:26, Andrew Schorr wrote:
[...]
>
> I think the less-inclusive definition is the relevant one, since each of
> the other tarballs is a separate library. The idea is that users pick and
> choose which libraries they want. If you want XML support, then you
> grab only gawk-xml, plus the gawkextlib support library. In my opinion, it
> is not sensible to think of them as one big blob of libraries that should
> be installed together. That is not a scalable concept. One doesn't think of
> installing all perl modules or all python libraries. One just picks those
> desired for the project at hand.

Thanks for clarifying that.

I see a problem with both approaches. From a packaging point of view (and
from the view of a packaging system's user) I don't want to see individual
modules for individual tools, since this would immensely bloat the interface
(of the packaging system); we already have thousands of installable tools,
and if tools had (in the long run) hundreds of modules each, that would get
very messy quite fast. (Usually we see far fewer modules per tool, like
binary/documentation/library/development/debug, or subsets thereof.)
And (as far as I understand) every new module would require an update of
the tool-specific packaging data. And the other way, having all in one big
library, would mean that users pay for something that they don't need.
These problems are what gave me the impression that the OS's packaging
system is not a good place to manage individual tool modules.
Considering that, the perl approach seems quite clever and user friendly
(you'd just do cpanm Module::Name ) and have your specific entity. Though
the (organisational) drawback is that you'd need a separate infrastructure
for the modules (and that someone has to do it).

Janis

[...]

Andrew Schorr

Jul 15, 2016, 9:02:30 AM
On Friday, July 15, 2016 at 6:03:02 AM UTC-4, Janis Papanagnou wrote:
> Thanks for clarifying that.

Maybe it would be clearer to speak of the gawkextlib support library as "libgawkextlib".

> I see a problem with both approaches. From a packaging point of view (and
> from the view of a packaging system's user) I don't want to see individual
> modules for individual tools, since this would immensely bloat the interface
> (of the packaging system); we already have thousands of installable tools,
> and if tools had (in the long run) hundreds of modules each, that would get
> very messy quite fast. (Usually we see far fewer modules per tool, like
> binary/documentation/library/development/debug, or subsets thereof.)
> And (as far as I understand) every new module would require an update of
> the tool-specific packaging data. And the other way, having all in one big
> library, would mean that users pay for something that they don't need.
> These problems are what gave me the impression that the OS's packaging
> system is not a good place to manage individual tool modules.
> Considering that, the perl approach seems quite clever and user friendly
> (you'd just do cpanm Module::Name ) and have your specific entity. Though
> the (organisational) drawback is that you'd need a separate infrastructure
> for the modules (and that someone has to do it).

I have no objections if somebody wants to build such a tool. The challenges are making sure that you use the correct paths for a given distribution, and making sure to pull in all necessary dependencies. For example, if you install gawk-xml, then you also need to install the expat XML parsing library that it uses. If you use the standard O/S package distribution system, these dependencies are handled automatically. I must confess that I don't know how the CPAN stuff works; does it pull in all the dependencies automatically?
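
To make the dependency problem concrete, here is a small sketch of what any such tool would have to work out: an install order in which prerequisites come first. It uses the standard tsort utility; the two dependency edges are taken from this thread (gawk-xml needs the gawkextlib support library and expat), and the names are used as plain labels, not as verified package names for any particular distribution.

```shell
#!/bin/sh
# Sketch only: derive an install order from "prerequisite dependent" pairs.
# Each input line "A B" tells tsort that A must come before B.
# The labels below are illustrative, not real package names.
install_order=$(tsort <<'EOF'
gawkextlib gawk-xml
expat gawk-xml
EOF
)
printf '%s\n' "$install_order"
```

The relative order of gawkextlib and expat is not fixed, but gawk-xml always comes last because both of the others must precede it. An O/S package manager does this resolution (plus downloading and path handling) for you.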

As I mentioned previously, for perl and python at least, these two mechanisms exist side-by-side. One can use cpan to install modules, or one can install the official O/S package versions. For example, on my Fedora system, I see tons of individual perl and python module packages installed:

bash-4.3$ rpm -qa | grep ^perl- | wc -l
341
bash-4.3$ rpm -qa | grep ^python- | wc -l
128

I think using the O/S package mechanism ultimately gives a better result, but I'm open to the other approach if somebody wants to tackle it. Building our own tool feels like reinventing the wheel, but it does have the advantage of possibly working across multiple platforms. It would still, however, require the user to have development tools such as a compiler and make installed, so that's less appealing than just grabbing the binaries using the O/S package mechanism.

Regards,
Andy