
Array Elements Out of Sync


Elisa Roselli

Dec 10, 1999, 3:00:00 AM
I need a way to update a file containing associations between identifying
numbers and phrases. Here's an example that I'll call NumberList:

01 B
02 C
03 Confir
04 E
05 M
06 Menu
07 Titre
08 W

Here, $1 is the identifier, $2 the phrase value and the Field Separator is a
tab.

Now, say I've added some elements to the source file for this list, so that
the sorted elements to be tagged have become:

B
C
Confir
E
Joe #new element
M
Mary #new element
Menu
Sylvia #new element
Titre
W

I'll call this file NewList. It is very important that the previously
existing phrases in NewList keep their old IDs, so I can't just do something
of the '{print NR $0}' ilk, and overwrite NumberList.

Thus I've tried to write a program that reads down NewList, assigns its
previously existing elements their old IDs, accumulates all the new elements
in an array, and appends the new elements and new IDs in an END statement
once all have been read. The output I desire would be something like this:

01 B
02 C
03 Confir
04 E
05 M
06 Menu
07 Titre
08 W
09 Joe
10 Mary
11 Sylvia

I'm almost there, but there is an off-by-one shift in the array numbering and
printing that I just can't figure out. Here's the program as it now stands.

awk '
FILENAME == ARGV[1] {
    split($0, entry, "\t")
    obj[entry[2]] = entry[1]
    next
}
{
    item = $0
    if (item in obj) {
        $0 = obj[item] "\t" item
        print $0
        if (obj[item] + 0 > max)
            max = obj[item] + 0
    }
    else {
        newitems[z] = item
        z++
    }
}
END {
    for (i = 1; i <= z; i++)
        printf("%02d\t%s\n", max + i, newitems[i])
    print z
} '

Calling this as
program NumberList NewList > NewNumber
I get the following output:

01 B
02 C
03 Confir
04 E
05 M
06 Menu
07 Titre
08 W
09 Mary
10 Sylvia
11
3

As you can see, Mary is where Joe should be, Sylvia is where Mary should be
and there is an empty hole where Sylvia should be. I've tried prefixing
rather than postfixing the incrementation of the array subscript z (++z
rather than z++) and it seems to make no difference. I've also tried
initializing z at 1, and starting the for loop in the END statement at 0
rather than 1. Although the errors vary according to the combination, I
never get quite what I want.

If it's not too much trouble, can anyone spot something obvious I may be
missing? Many thanks,

Elisa Roselli

Dan Mercer

Dec 10, 1999, 3:00:00 AM
What editor did you use to produce this gibberish?
Well, from what I can understand, you are under the misapprehension
that

for(item in array)

will return the items in the order they were stored, which it won't.
That's documented in the man page. Arrays are really hashes, and
the order they are hashed is very much system dependent - though some
awks may actually return them in the order placed. If you want
things returned in a particular order, you need to use an indexed
array.

--
Dan Mercer
dame...@uswest.net


In article <82r8q0$huq$1...@wanadoo.fr>,

Opinions expressed herein are my own and may not represent those of my employer.


Kenny McCormack

Dec 10, 1999, 3:00:00 AM
In article <82rdvm$8pn$1...@magnum.mmm.com>, Dan Mercer <dame...@mmm.com> wrote:
>What editor did you use to produce this gibberish?

Heh - good one!
It really is goofed up, isn't it? Blame it on the two scourges of the
modern world:

1) Microsoft
2) Internationalization

>Well, from what I can understand, you are under the misapprehension
>that
>

> for (item in array)


>
>will return the items in the order they were stored, which it won't.
>That's documented in the man page. Arrays are really hashes, and
>the order they are hashed is very much system dependent - though some
>awks may actually return them in the order placed. If you want
>things returned in a particular order, you need to use an indexed
>array.

Or get hold of a version of AWK where this bug (1) is fixed.
I know of at least two (2).

Elisa, did you ever identify for us what your platform is? Some of the
things you write make me think DOS/Windows; while other things sound like
Unix. It would help a lot to know.

(1) I know I will get flamed for this.
(2) Thompson AWK and my own version of mawk.


Charles Demas

Dec 10, 1999, 3:00:00 AM
In article <82rfjc$r7t$1...@yin.interaccess.com>,

Kenny McCormack <gaz...@interaccess.com> wrote:
>In article <82rdvm$8pn$1...@magnum.mmm.com>,
>Dan Mercer <dame...@mmm.com> wrote:
>
>>What editor did you use to produce this gibberish?
>
>Heh - good one!
>It really is goofed up, isn't it? Blame it on the two scourges of the
>modern world:
>
> 1) Microsoft
> 2) Internationalization

Well if she wants help here, she'd be well placed to get something
that doesn't do that stuff when posting.

If I see it's screwed up like it was, I stop reading, and go on
to the next post.


Chuck Demas
Needham, Mass.

--
Eat Healthy | _ _ | Nothing would be done at all,
Stay Fit | @ @ | If a man waited to do it so well,
Die Anyway | v | That no one could find fault with it.
de...@tiac.net | \___/ | http://www.tiac.net/users/demas

Harlan Grove

Dec 11, 1999, 3:00:00 AM
In article <82rfjc$r7t$1...@yin.interaccess.com>, gaz...@yin.interaccess.com
(Kenny McCormack) writes:

>In article <82rdvm$8pn$1...@magnum.mmm.com>, Dan Mercer <dame...@mmm.com>
>wrote:

>>Well, from what I can understand, you are under the misapprehension
>>that
>>
>> for (item in array)
>>
>>will return the items in the order they were stored, which it won't.

...


>Or get hold of a version of AWK where this bug (1) is fixed.

Not flames, but disagreement. If you consider the indices of associative arrays
to be sets, then there's no guarantee that the set is well- or
partially-ordered. So to paraphrase Dan Mercer, if you want FIFO, index the
array with a well-ordered set like integers. Or use multi level indexing.

To descend into the pedantic,

! ($x in assocarray) { key[++key[0]] = $x; assocarray[$x] = $y }
:
END { for (k = 1; k <= key[0]; ++k) print key[k], assocarray[key[k]] }

should print the assocarray FIFO (first in, first out - input order).
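
As a rough, runnable sketch of the idiom above: the wrapper name fifo_pairs, the arrays order and val, and the counter n are made up for illustration, and the input is assumed to be whitespace-separated with the key in $1 and the value in $2. A separate integer-indexed array records the keys as they arrive, so the END loop never depends on the order in which "for (k in array)" happens to walk the hash.

```sh
# fifo_pairs: read "key value" lines on stdin and print each distinct key
# with its first value, in the order the keys were first seen.
# (Hypothetical helper; names are illustrative, not from the thread.)
fifo_pairs() {
    awk '
    # Record each key once, in arrival order, in an integer-indexed array.
    !($1 in val) { order[++n] = $1; val[$1] = $2 }
    # Walk the integer indices 1..n, which gives input order.
    END { for (k = 1; k <= n; k++) print order[k], val[order[k]] }
    '
}
```

Something like "fifo_pairs < somefile" would then list the pairs in input order. Using a plain counter instead of key[0] is only a matter of taste; both keep the count alongside the ordered keys.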

Elisa Roselli

Dec 13, 1999, 3:00:00 AM
Dan Mercer wrote in message <82rdvm$8pn$1@magnum.mmm.com>...
>What editor did you use to produce this gibberish?

Grievously sorry about the gibberish. It originates in MS Outlook Express,
and I'm trying to find the source of the problem. It has something to do
with the display and language settings. If I select
"View - Language - Universal Alphabet UTF-7", the messages look fine to me.
If the setting is changed, unreadable gibberish results. I never know how
a message that looks perfectly normal at my keyboard will look once it gets
to this forum, nor what language set to select that would suit us all
implicitly.

(This all started because I was trying to configure Outlook to read
Cyrillic. I've got that working fine at home but things are weird here at
the office)

>Well, from what I can understand, you are under the misapprehension
>that
>
> for(item in array)
>
>will return the items in the order they were stored, which it won't.

No, that was not my misapprehension. The "for(item in array)" clause was
just being used to test whether an item on the list containing new words was
not already present and identified on the old list; i.e. whether it was a
new or old item.

However, the simple fact of coming here and attempting to describe my
problems sometimes helps me to spot them myself. In this case, I had a few
clauses that were designed to accumulate new phrases in an array called
newitems:

else
> {newitems[z]=item
> z++}

(Or - in English in case the display is still screwed -
if the item is not in the array obj, put item in another array newitems at
index z, then increment z with z plus plus).

I've now got the program to work fine by incrementing z with "plus plus z"
before assigning item to index z of the array newitems. I guess the
problem was that z was uninitialized, so the first item was being stored
under an empty subscript rather than at index 1, which is where the loop in
the END statement starts. By incrementing it first I get z to be equal to one.
Surprisingly, when I tried to initialize z at one, it wouldn't work.
Probably the incrementation was incorrectly placed in the loop.
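
For anyone landing on this thread later, here is a minimal sketch of the corrected program, not necessarily the exact version that ended up in use. The wrapper name renumber and the names id and count are invented for illustration. It assumes the old list is tab-separated (ID, then phrase) and the new list has one bare phrase per line. Two things differ from the program posted earlier: the counter is pre-incremented, so the first new phrase lands at index 1 where the END loop starts, and max is taken over every ID in the old list rather than only the phrases that still appear, which is a choice made here so that a retired ID is never handed out again.

```sh
# renumber OLD NEW: keep the IDs already assigned in OLD, then append the
# phrases that appear only in NEW with fresh IDs.
# (Hypothetical wrapper; file layout as assumed in the paragraph above.)
renumber() {
    awk '
    BEGIN { FS = "\t" }
    # First file: remember the ID of each known phrase and the highest ID.
    FILENAME == ARGV[1] {
        id[$2] = $1
        if ($1 + 0 > max) max = $1 + 0
        next
    }
    # Second file: known phrases are printed with their old ID.
    ($0 in id) { printf "%s\t%s\n", id[$0], $0; next }
    # Unknown phrases are queued; ++count makes the first index 1, not "".
    { newitems[++count] = $0 }
    END {
        for (i = 1; i <= count; i++)
            printf "%02d\t%s\n", max + i, newitems[i]
    }
    ' "$1" "$2"
}
```

It would be called much as before, for example "renumber NumberList NewList > NewNumber".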

Anyway, thanks again for your patience. Once again I've solved my own
problem, but it helped enormously to talk about it.

Elisa

Kenny McCormack

Dec 13, 1999, 3:00:00 AM
In article <832i7v$41l$1...@wanadoo.fr>,
Elisa Roselli <e.ro...@volusoft.com> wrote:
...

>Anyway, thanks again for your patience. Once again I've solved my own
>problem, but it helped enormously to talk about it.

Such is the genius of psychiatry.

Brian Inglis

Dec 27, 1999, 3:00:00 AM
On Mon, 13 Dec 1999 11:36:07 +0100, "Elisa Roselli"
<e.ro...@volusoft.com> wrote:

>Dan Mercer wrote in message <82rdvm$8pn$1@magnum.mmm.com>...
>>What editor did you use to produce this gibberish?
>
>Grievously sorry about the gibberish. It originates in MS Outlook Express,
>and I'm trying to find the source of the problem. It has something to do
>with the display and language settings. If I select
>"View - Language - Universal Alphabet UTF-7", the messages look fine to me.
>If the setting is changed, unreadable gibberish results. I never know how
>a message that looks perfectly normal at my keyboard will look once it gets
>to this forum, nor what language set to select that would suit us all
>implicitly.
>
>(This all started because I was trying to configure Outlook to read
>Cyrillic. I've got that working fine at home but things are weird here at
>the office)

I've never heard of UTF-7 -- MS only?
UTF-8, Latin-1, US-ASCII are common.
Maybe try reading/posting in ASCII or Latin-1?
I can read posts from many parts, but yours seems to be
non-standard.

>Anyway, thanks again for your patience. Once again I've solved my own
>problem, but it helped enormously to talk about it.
>

>Elisa
>

Thanks. Take care, Brian Inglis Calgary, Alberta, Canada
--
Brian_...@CSi.com (Brian dot Inglis at SystematicSw dot ab dot ca)
use address above to reply
