On 07.05.2017 23:35, Marc de Bourget wrote:
>> [...]
>
> Yes, you are completely right but I still think
> $1 = $1
> looks like nonsense code written by an idiot :-)
Consider it an awk idiom, established as a _consequence_ of how awk is
defined to work. Don't look at it with the eyes of a <name some favourite
programming language here> programmer. (There's really a lot that looks
like "code written by an idiot", not only in awk but also in other
languages, whether based on C or not.)
> Of course it is not - but only a few AWK gurus will understand the sense.
Awk is a very small and conceptually quite simple language, so it's quite
easy to learn and understand. (There are certainly pitfalls and dark
corners, but that's true for just about any language, and usually the
larger the language, the worse.) This specific construct is comparatively
easy to understand. Certainly people who are coming from another
programming language will be surprised at first, but that doesn't make its
sense per se difficult to grasp. (YMMV.) I think it's necessary in any
programming language to understand its paradigms and concepts. (And in awk
that is very easy.)
>
> I'm beginning to understand why Perl, Ruby etc. renounce automatic input
> parsing and use split and join instead. This makes the code much clearer.
I'm sure you're basically right. (Although I think that Perl's syntax is
quite a cryptic mess; there's much more than in awk that you will have to
get used to.) With the given construct ($1 = $1) you actually make use of
a technical side effect (assigning to a field makes awk rebuild $0 from its
fields, joined by OFS), and side effects are in most cases just bad.
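To make that side effect concrete, here is a minimal sketch. The function
names rebuild and passthru and the choice of "-" as OFS are only
illustrative assumptions, not anything from the thread.

```shell
# Assigning to any field (even to itself) makes awk rebuild $0 from its
# fields, joined by OFS. Names and the "-" separator are illustrative only.
rebuild()  { awk -v OFS='-' '{ $1 = $1; print }'; }

# Without the assignment, $0 is printed exactly as it was read.
passthru() { awk -v OFS='-' '{ print }'; }

printf 'a   b\tc\n' | rebuild     # prints: a-b-c
printf 'a   b\tc\n' | passthru    # prints the line unchanged
```

The only difference between the two functions is the self-assignment, which
is why the idiom looks odd but is not a no-op.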
> Since a while I do everything in the BEGIN section without the main loop.
(I think this is a bad idea, see below, but anyway.)
> This makes programs more readable, better portable and not too specific.
Doing everything in the BEGIN section means that you abandon some of the
advantages that you get with the awk language, including readability (given
the code bloat you get by this habit) or safety (e.g. see Ed's getline
post). (Certainly I understand the motivation to do what seems fitting to
you.)
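As a rough illustration of the bloat and the safety concern, compare the
same task (summing the second field) written both ways. The function names
and the task are hypothetical, and the getline pitfall shown is a commonly
cited one, not necessarily the exact point of Ed's post.

```shell
# Idiomatic: the implicit main loop reads and splits each record for you.
sum_main() { awk '{ s += $2 } END { print s + 0 }' "$@"; }

# Everything in BEGIN: reading and splitting must be spelled out by hand.
# The "> 0" test matters: getline returns -1 on a read error, so a bare
# "while (getline line)" could loop forever in that case.
sum_begin() {
    awk 'BEGIN {
        while ((getline line) > 0) {
            split(line, f)      # manual field splitting on default FS
            s += f[2]
        }
        print s + 0
    }'
}

printf 'x 1\ny 2\nz 3\n' | sum_main     # prints: 6
printf 'x 1\ny 2\nz 3\n' | sum_begin    # prints: 6
```

Both give the same result here; the second merely re-implements by hand
what the main loop already does.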
You obviously assume only a specific kind of "portability" here: the one
that exists (but only to a very restricted degree!) within the same family
of [procedural (maybe even C based?)] languages. Between awk versions the
above idiom should be fairly portable. And WRT other language paradigms,
like Prolog's, Lisp's, or any OO language's, no real comparison or simple
"portability" is possible anyway.
The point of having (and using) different languages and various language
families is that you can often do specific tasks in one language(-family)
better than in another. From the final system's view it should be
irrelevant anyway in what language the components have been written.
Janis