Is there a simple way to initialise arrays in bulk?

bolta...@yahoo.co.uk

Jan 22, 2009, 9:59:44 AM

Hi

I need to set up a large array in BEGIN. Is there an easy way to
initialise it in one go in the same vein as this C example, eg:

char *arr[] = { "hello" , "cruel", "world" };

or would I have to set each array index manually or
write some code to iterate through a delimited string?

Thanks for any help

B2003

Thomas Weidenfeller

Jan 22, 2009, 10:03:11 AM

bolta...@yahoo.co.uk wrote:
> Is there an easy way to
> initialise it in one go in the same vein as this C example, eg:
>
> char *arr[] = { "hello" , "cruel", "world" };
>
> or would I have to just manually set each array index manually or
> write some code to iterate through a delimited string?

The typical hack is to use split().

/Thomas

pk

Jan 22, 2009, 10:12:40 AM

You could do for example

awk -v str="val1,val2,val3,val4" 'BEGIN{split(str,arr,/,/)} { ... }'
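A minimal runnable sketch of the split() approach, assuming a POSIX awk. The function name init_demo and the loop that prints the result are only for illustration and are not part of the suggestion above. split() fills arr[1] through arr[n] and returns the element count n:

```shell
# init_demo is a hypothetical helper name used only for this illustration.
init_demo() {
    awk -v str="$1" 'BEGIN {
        n = split(str, arr, /,/)     # fills arr[1]..arr[n]; n is the element count
        for (i = 1; i <= n; i++)     # walk the array in index order
            printf "arr[%d]=%s\n", i, arr[i]
    }'
}

init_demo "hello,cruel,world"
```

Because the program consists only of a BEGIN block, awk exits without reading any input, so the sketch can be run as-is.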

bolta...@yahoo.co.uk

Jan 22, 2009, 11:51:01 AM

On Jan 22, 3:12 pm, pk <p...@pk.invalid> wrote:

Thanks

B2003

Kenny McCormack

Jan 22, 2009, 12:13:03 PM

Or, more simply, and eliminating the off-topic and extraneous shell syntax:

BEGIN{split("val1,val2,val3,val4",arr,/,/)}
{ ... }

pk

Jan 22, 2009, 12:25:36 PM

...and turning the program into a rigid one that should be manually edited
if we want to use a different set of values:

Kenny McCormack

Jan 22, 2009, 1:52:01 PM

Your shell script wouldn't have to be modified each time?

r.p....@gmail.com

Jan 22, 2009, 2:00:56 PM

On Jan 22, 1:52 pm, gaze...@shell.xmission.com (Kenny McCormack)
wrote:

> In article <glaa52$c6...@news.motzarella.org>, pk <p...@pk.invalid> wrote:
> >On Thursday 22 January 2009 18:13, Kenny McCormack wrote:
>

> >> In article <gla2ba$h2...@news.motzarella.org>, pk <p...@pk.invalid> wrote:

> >>>On Thursday 22 January 2009 15:59, boltar2...@yahoo.co.uk wrote:
>
> >>>> Hi
>
> >>>> I need to set up a large array in BEGIN. Is there an easy way to
> >>>> initialise it in one go in the same vein as this C example, eg:
>
> >>>> char *arr[] = { "hello" , "cruel", "world" };
>
> >>>> or would I have to just manually set each array index manually or
> >>>> write some code to iterate through a delimited string?
>
> >>>You could do for example
>
> >>>awk -v str="val1,val2,val3,val4" 'BEGIN{split(str,arr,/,/)} { ... }'
>
> >> Or, more simply, and eliminating the off-topic and extraneous shell
> >> syntax:
>
> >...and turning the program into a rigid one that should be manually edited
> >if we want to use a different set of values:
>
> >> BEGIN{split("val1,val2,val3,val4",arr,/,/)}
> >> { ... }
>

> Your shell script wouldn't have to be modified each time?

I strongly prefer a block of text that has great readability and is
easily changed:

Usually something like this:

colors[++ncolors] = "red"
colors[++ncolors] = "blue"
colors[++ncolors] = "green"
...

the advantage here is that it is trivial to add or remove a line, and
it is easy to generate
this code using another script if necessary, or just good old vi.
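As a rough sketch of how this style plays out in a complete program (assuming a POSIX awk; the function name list_colors and the printing loop are added here purely for illustration), the counter ncolors starts out uninitialised, so the first ++ncolors yields 1 and the counter ends up holding the element count:

```shell
# list_colors is a hypothetical helper name used only for this illustration.
list_colors() {
    awk 'BEGIN {
        colors[++ncolors] = "red"
        colors[++ncolors] = "blue"
        colors[++ncolors] = "green"
        # ncolors now holds the number of entries, so an ordered loop is possible
        for (i = 1; i <= ncolors; i++)
            print i, colors[i]
    }'
}

list_colors
```

Adding or removing an entry means adding or removing one line; the indices renumber themselves.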

pk

Jan 22, 2009, 2:15:50 PM

On Thursday 22 January 2009 19:52, Kenny McCormack wrote:

>>>>You could do for example
>>>>
>>>>awk -v str="val1,val2,val3,val4" 'BEGIN{split(str,arr,/,/)} { ... }'
>>>
>>> Or, more simply, and eliminating the off-topic and extraneous shell
>>> syntax:
>>
>>...and turning the program into a rigid one that should be manually edited
>>if we want to use a different set of values:
>>
>>> BEGIN{split("val1,val2,val3,val4",arr,/,/)}
>>> { ... }
>>
>
> Your shell script wouldn't have to be modified each time?

I guess it depends on the kind of application you have in mind. I was
thinking of a scenario where you put the awk program in a file, and then
run

awk -v str="blah,blah,blah...." -f program.awk

on the command line several times with different sets of values, or as part
of a shell script where other commands output strings like "blah,blah,blah"
which you can easily pass to awk without the need to modify the awk
program.

But of course, if you're going to have fixed values anyway somewhere in the
code, be it shell code or awk code, then it does not really matter.
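A small sketch of that scenario, under the assumption that mktemp is available. The temporary file stands in for program.awk, and the names prog and run_with are hypothetical. The awk program stays untouched while the values change from call to call:

```shell
# Write the awk program to a file once; "prog" stands in for program.awk.
prog=$(mktemp)
cat > "$prog" <<'EOF'
BEGIN {
    n = split(str, arr, /,/)            # str is supplied with -v on the command line
    print n " values, last is " arr[n]
}
EOF

# run_with is a hypothetical wrapper: same program file, different values.
run_with() { awk -v str="$1" -f "$prog"; }

run_with "a,b,c"
run_with "x,y"
# Remove "$prog" when it is no longer needed.
```

Note that awk processes escape sequences in -v assignments, so values containing backslashes may need extra care.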

Chris F.A. Johnson

Jan 22, 2009, 8:13:02 PM

Of course not. It gets its info on the command line, e.g.:

str=$1 ## or whatever
awk -v str="$str" 'BEGIN{split(str,arr,/,/)} { ... }'

--
Chris F.A. Johnson, author | <http://cfaj.freeshell.org>
Shell Scripting Recipes: | My code in this post, if any,
A Problem-Solution Approach | is released under the
2005, Apress | GNU General Public Licence

Janis Papanagnou

Jan 22, 2009, 10:40:45 PM

I don't think repeating the same ''colors[++ncolors] = '' command many times
adds anything to readability. Introducing superfluous variables is something
I'd rather avoid. Inflating a short awk program with a huge initializer list
doesn't serve readability either. If you happen to have to initialize more
than one array, the scalar initialization is even more error-prone.
But YMMV, of course.

>
> Usually something like this:
>
> colors[++ncolors] = "red"
> colors[++ncolors] = "blue"
> colors[++ncolors] = "green"
> ...
>
> the advantage here is that it is trivial to add or remove a line, and
> it is easy to generate
> this code using another script if necessary, or just good old vi.

Adding a value to Kenny's split proposal is as trivial as (or even more
trivial than) adding another line of code. That's hardly convincing.

Scripting has the advantage that it's not restricted to line-oriented data,
you can generate the split command trivially as well. And the same is true
for changing the data using your favourite editor.

In addition to the initialization possibilities mentioned in this thread you
can, especially in case of a large data set, put that data in its own file
and read it on startup, one of...

NR==FNR { colors[++n] = $0 ; next } # if one data entry per line
{ ...processing of other files... }

NR==FNR { for(i=1;i<=NF;i++) colors[i] = $i ; n = NF ; next } # all data on one line; n holds the count
{ ...processing of other files... }

and call it as...

awk -f yourprogram.awk initfile.data otherfiles.data
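A self-contained sketch of the one-entry-per-line variant, assuming mktemp -d is available. The file contents and the names dir and lookup are made up for the illustration. The first file fills the array; the second file is then processed using it:

```shell
# Hypothetical sample data: init.data fills the array, other.data uses it.
dir=$(mktemp -d)
printf 'red\nblue\ngreen\n' > "$dir/init.data"
printf 'sky 2\ngrass 3\n'   > "$dir/other.data"

# lookup is a hypothetical helper name used only for this illustration.
lookup() {
    awk 'NR == FNR { colors[++n] = $0; next }   # first file: one array entry per line
         { print $1, colors[$2] }               # later files: $2 is an index into colors
        ' "$dir/init.data" "$dir/other.data"
}

lookup
```

One caveat with the NR==FNR test: if the first file is empty, the condition is also true for the lines of the second file, so the data would be loaded from the wrong file.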

Janis

Brian Inglis

Jan 23, 2009, 1:08:08 AM

++wholehog:

del=" " ## or whatever
str="$*" ## or whatever; "$*" joins the arguments into one string
awk -v str="$str" -v del="$del" 'BEGIN{split(str,arr,del)} { ... }'
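A possible way to flesh this out, as a sketch only: the function name split_args and the choice of ":" as delimiter are assumptions for the illustration. Inside a subshell, IFS is set to the delimiter so that "$*" joins the arguments with it, which lets arguments containing spaces survive:

```shell
# split_args is a hypothetical helper: first argument is the delimiter,
# the remaining arguments become the array elements.
split_args() {
    del=$1
    shift
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local.
    str=$(IFS=$del; printf '%s' "$*")
    awk -v str="$str" -v del="$del" 'BEGIN {
        n = split(str, arr, del)
        for (i = 1; i <= n; i++)
            print i ": " arr[i]
    }'
}

split_args ":" "hello" "cruel world" "bye"
```

The delimiter must not occur inside any of the values, otherwise split() will cut that value in two.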

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian....@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

r.p....@gmail.com

Jan 23, 2009, 3:19:58 PM

On Jan 23, 1:08 am, Brian Inglis <Brian.Ing...@SystematicSW.Invalid>
wrote:

> On Fri, 23 Jan 2009 01:13:02 +0000 in comp.lang.awk, "Chris F.A.
> Johnson" <cfajohn...@gmail.com> wrote:
> >On 2009-01-22, Kenny McCormack wrote:

> >> In article <glaa52$c6...@news.motzarella.org>, pk <p...@pk.invalid> wrote:
> >>>On Thursday 22 January 2009 18:13, Kenny McCormack wrote:
>

> >>>> In article <gla2ba$h2...@news.motzarella.org>, pk <p...@pk.invalid> wrote:

> Brian.Ing...@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
> fake address use address above to reply

Janis -- I respectfully disagree with you and am willing to test my
hypothesis of readability empirically!

IMHO it is very important that people think of gawk as better than
perl (even better than ruby perhaps) in terms of readability. Most
people think of gawk as dense, relying on syntactic tricks with
implicit semantics. So again, I strongly prefer to make things as
explicit as possible.

Ed Morton

Jan 23, 2009, 3:39:33 PM

I think you must've replied to the wrong posting because I don't see
anything from Janis in the quoted text above.

> IMHO it is very important that people think of gawk as better than
> perl (even better than ruby perhaps) in terms of readability.

It does seem to be better, assuming it's used reasonably.

> Most people think of gawk as dense, relying on syntactic tricks with
> implicit semantics.

No, they don't.

> So again, I strongly prefer to make things as explicit as possible.

Making things as explicit as possible leads to software that's
relatively difficult to maintain. Conciseness is a much better and
more common goal.

Ed.

Janis Papanagnou

Jan 23, 2009, 10:50:25 PM

r.p....@gmail.com wrote:
> On Jan 23, 1:08 am, Brian Inglis <Brian.Ing...@SystematicSW.Invalid>
> wrote:
>

>>[...]

>
> Janis -- I respectfully disagree with you and am willing to test my
> hypothesis of readability empirically!

If you have something to say WRT my postings you should follow up
to that one and quote the respective parts.

Or do you disagree with Brian's posting?

Frankly, it's hard to respect a subjective opinion about readability
from someone who cannot even make a consistent - readable - posting.

> IMHO it is very important that people think of gawk as better than
> perl (even better than ruby perhaps) in terms of readability.

This statement has no factual content, and no substance to comment on.
You may want to try again, and be more explicit about what you
concretely think to be "better" or worse in awk, perl, whatever.
(Not that I am the least interested in any religious argumentation.)

Actually you can use both, awk and perl, in a concise or verbose way.
You seem to be unfamiliar with the concise way.

How you used awk in your examples in the past does show, though,
that your experience with awk is apparently not mature enough to write
sophisticated and readable awk programs. How you used getline
(or rather, that you used getline at all in the respective context) in
some recent posting shows that you have not even understood the basic
principles of programming in awk. I'd suggest trying to understand awk
first, before suggesting a Pascal-like program structure to others,
which is really not a good idea.

> Most
> people think of gawk as dense, relying on syntactic tricks with
> implicit semantics. So again, I strongly prefer to make things as
> explicit as possible.

There's no such thing as implicit semantics. I suppose you mean awk's
default actions and default conditions and default initializations?
It's well documented and part of the language; it is intended that
those features are used.

What you prefer, strongly or else, is your choice.

Janis

Kenny McCormack

Jan 24, 2009, 11:13:35 AM

In article <gle362$irc$1...@svr7.m-online.net>,

Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>r.p....@gmail.com wrote:
>> On Jan 23, 1:08 am, Brian Inglis <Brian.Ing...@SystematicSW.Invalid>
>> wrote:
>>
>>>[...]
>>
>> Janis -- I respectfully disagree with you and am willing to test my
>> hypothesis of readability empirically!
>
>If you have something to say WRT my postings you should follow up
>to that one and quote the respective parts.
>
>Or do you disagree with Brian's posting?
>
>Frankly, it's hard to respect a subjective opinion about readability
>from someone who cannot even make a consistent - readable - posting.
>
>> IMHO it is very important that people think of gawk as better than
>> perl (even better than ruby perhaps) in terms of readability.
>
>This statement has no factual content, and no substance to comment on.

Discussing the "subjective" side of programming is always tricky.
It is tricky enough in real life; virtually impossible in online fora.

The problem is the simple fact that the first rule of the "social" side
(i.e., the part that has to do with the fact that programmers are human
beings, with all the faults and foibles of humans; they are not
computers - they work with computers, but they are not, themselves,
computers) is this:

Programmers like what they know.

Therefore, if you grew up on pure procedural languages, you will tend to
try to mold each new language that you learn into the mold of the
languages you are comfortable with. I've seen this type of programmer
(like the PP [Previous Poster] - a new acronym made up on the spot [*])
before - they end up converting AWK into C (pure procedural).

So, I suggest that we not try to go any further with this thread/topic online.
Nothing good can come from it...

[*] A historical note: 'twas I, long ago and under some other nym, who
coined the term "OP".

Janis Papanagnou

Jan 24, 2009, 11:28:36 AM

Kenny McCormack wrote:
>
> Programmers like what they know.

Very true.

> So, I suggest that we not try to go any further with this thread/topic online.
> Nothing good can come from it...

Right.

> [*] A historical note: 'twas I, long ago and under some other nym, who
> coined the term "OP".

Oh! Really? - Now let's see whether the "PP" becomes as popular as OT.

Janis

Kenny McCormack

Jan 24, 2009, 1:12:09 PM

In article <glffjl$h4i$1...@svr7.m-online.net>,

Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>Kenny McCormack wrote:
>>
>> Programmers like what they know.
>
>Very true.
>
>> So, I suggest that we not try to go any further with this
>> thread/topic online. Nothing good can come from it...
>
>Right.

I should also add that each programmer has a personal "threshold of pain"
with respect to how much they like their programming language to look
like gibberish. I also think that most "script"-type languages have some
element of gibberish in them. I won't deny that AWK has some "magic" in
it. But I have always thought that AWK struck the best balance between
power and cryptic-ness. Both sed and perl offend me, but, of course,
each of these has its defenders. As, of course, do python, ruby, and
all the rest...

>> [*] A historical note: 'twas I, long ago and under some other nym,
>> who coined the term "OP".
>
>Oh! Really? - Now let's see whether the "PP" becomes as popular as OT.

Indeed. BTW, did you mean to say "OP" (not "OT") ?

Janis Papanagnou

Jan 24, 2009, 7:27:25 PM

Kenny McCormack wrote:
> I should also add that each programmer has a personal "threshold of pain"
> with respect to how much they like their programming language to look
> like gibberish. I also think that most "script"-type languages have some
> element of gibberish in them. I won't deny that AWK has some "magic" in
> it. But I have always thought that AWK struck the best balance between
> power and cryptic-ness.

Yes, it could be seen that way.

> Both sed and perl offend me, but, of course,

> each of these has its defenders. [...]

Well, while I have respect for some very clever features in perl I
have to admit that I don't like that language much. The "parameter
naming" I was referring to in another thread (originally labelled
"awk++ and polymorphism") and which is supported by perl does not
originate from this source; I've seen it already 20 years (or so)
ago in some other context that I yet don't recall, sadly. (Whether
supported by perl or not, I think that is a good feature, anyway.)

>>>[*] A historical note: 'twas I, long ago and under some other nym,
>>>who coined the term "OP".
>>
>>Oh! Really? - Now let's see whether the "PP" becomes as popular as OT.
>
> Indeed. BTW, did you mean to say "OP" (not "OT") ?

Yes, sure. (Just a typo/braino.)

Janis

Manuel Collado

Jan 25, 2009, 4:28:32 AM

Janis Papanagnou escribió:

> [...]
> Well, while I have respect for some very clever features in perl I
> have to admit that I don't like that language much. The "parameter
> naming" I was referring to in another thread (originally labelled
> "awk++ and polymorphism") and which is supported by perl does not
> originate from this source; I've seen it already 20 years (or so)
> ago in some other context that I yet don't recall, sadly. (Whether
> supported by perl or not, I think that is a good feature, anyway.)

Do you refer to the named arguments feature of Ada? It has been there
since the first Ada83 standard.

print_date( year => 2009, month => January, day => 25 );

It is especially useful when combined with default values for omitted
arguments.

--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

Janis Papanagnou

Jan 25, 2009, 4:43:03 AM

Manuel Collado wrote:
> Janis Papanagnou escribió:
>
>> [...]
>> Well, while I have respect for some very clever features in perl I
>> have to admit that I don't like that language much. The "parameter
>> naming" I was referring to in another thread (originally labelled
>> "awk++ and polymorphism") and which is supported by perl does not
>> originate from this source; I've seen it already 20 years (or so)
>> ago in some other context that I yet don't recall, sadly. (Whether
>> supported by perl or not, I think that is a good feature, anyway.)
>
>
> Do you refer to the named arguments feature of Ada? It has been there
> since the first Ada83 standard.

Yes; I'm not sure, but it's possible that it was Ada where I saw it first.
(One of the languages I never programmed myself in practise, therefore
my uncertainty about the origin.)

>
> print_date( year => 2009, month => January, day => 25 );
>
> It is especially useful when combined with default values for omitted
> arguments.

Indeed.

Janis

r.p....@gmail.com

Jan 26, 2009, 5:47:47 PM

> Ed.

Ed, sorry about the reply to the wrong post. My browser presents all
the comments in serial fashion, so it is easy to see prior posts in a
thread.

I agree with you on the first point. I still disagree, respectfully,
on the latter two. Do you really believe that concise programs are
easier to maintain?

r.p....@gmail.com

Jan 26, 2009, 5:59:36 PM

On Jan 23, 10:50 pm, Janis Papanagnou <janis_papanag...@hotmail.com>
wrote:

I was actually trying to be respectful in my disagreement with you,
hence the phrase "respectfully disagree". I actually care quite a bit
about how awk/gawk is perceived and how people use it. As far as I'm
concerned, anyone who reads comp.lang.awk is already someone I admire,
whatever our minor differences of opinion.

By the way, Google Search: Results 1 - 10 of about 5,180 for "implicit
semantics".

Ed Morton

Jan 26, 2009, 6:56:29 PM

"Concise" means "clear and brief". That's much better than "clear but
needlessly lengthy". Clarity is the key word. If you can write clear
code in a couple of lines, that's much easier to modify in future than
if you take 20 lines to get the job done.

Ed.

Ed Morton

Jan 26, 2009, 7:04:56 PM

On Jan 26, 4:47 pm, r.p.l...@gmail.com wrote:

> Ed, sorry about the reply to the wrong post. My browser presents all
> the comments in serial fashion, so it is easy to see prior posts in a
> thread.

Aren't you using google groups? If so just click on "Options" at the
top right of your screen and select "View as tree" rather than
"Standard view".

Ed.