Suppose I start with the following on one line.....
Path: news.sunsite.dk!dotsrc.org!news.net.uni-c.dk!aotearoa.belnet.be!news.belnet.be!newsfeed.kpn.net!pfeed09.wxs.nl!feeder.eternal-september.org!eternal-september.org!not-for-mail
The data content AND width change AND there isn't the normal space
delimiter one often has with a line of data.
(1) How would I extract the second bit of data ? In this case
"dotsrc.org"
(2) How would I extract the last bit of data ? In this case
eternal-september.org OR maybe eternal-september.org!not-for-mail
I think "not-for-mail" is at the end of all news header "paths" but I
am not sure of this. In any case I want the "eternal-september.org"
part whether the "not-for-mail" follows it, or not.
Can anyone here help with either/both queries please ?
Regards, John.
awk -f'!' { print $2 }
awk -f'!' { print $2; print $NF == "not-for-mail" ? $(NF-1) : $NF }
untested
Grant.
--
http://bugsplatter.id.au
awk -F'!' '{print $2; print $NF == "not-for-mail" ? $(NF-1) : $NF }'
awk -F'!' '{print $2; print ($NF == "not-for-mail" ? $(NF-1) : "") $NF}'
Ed.
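A quick way to check the first corrected one-liner, as a sketch that assumes a POSIX shell with an awk on the PATH (so it does not apply as-is to a Windows command prompt). The function name `second_and_last` is only illustrative; the sample line is the one from the question.

```sh
# second_and_last: read header lines on stdin, split them on "!",
# print field 2, then the last field unless it is "not-for-mail",
# in which case print the field before it.
second_and_last() {
    awk -F'!' '{ print $2; print ($NF == "not-for-mail" ? $(NF-1) : $NF) }'
}

# Sample line from the question, with the trailing "not-for-mail".
printf '%s\n' 'Path: news.sunsite.dk!dotsrc.org!news.net.uni-c.dk!aotearoa.belnet.be!news.belnet.be!newsfeed.kpn.net!pfeed09.wxs.nl!feeder.eternal-september.org!eternal-september.org!not-for-mail' | second_and_last

# Same idea without the trailing "not-for-mail" (shortened, made-up path).
printf '%s\n' 'Path: news.sunsite.dk!dotsrc.org!eternal-september.org' | second_and_last
```

Both calls should print `dotsrc.org` followed by `eternal-september.org`. The second variant above differs in that, when `not-for-mail` is present, it prints the last two fields joined with nothing in between, since awk string concatenation adds no separator.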
>In article <qkccf5pqt3ai6slb3...@4ax.com>,
> John Fitzsimons <DELETEu...@sneakemail.com> wrote:
>> Windows Gawk newbie here.
< snip >
>> Path: news.sunsite.dk!dotsrc.org!news.net.uni-c.dk!aotearoa.belnet.be!news.belnet.be!newsfeed.kpn.net!pfeed09.wxs.nl!feeder.eternal-september.org!eternal-september.org!not-for-mail
< snip >
>> (1) How would I extract the second bit of data ? In this case
>> "dotsrc.org"
>> (2) How would I extract the last bit of data ? In this case
>> eternal-september.org OR maybe eternal-september.org!not-for-mail
< snip >
>awk -f'!' { print $2 }
Thanks, but I cannot get any of the solutions to work. I am obviously
doing something wrong and/or it isn't a simple "cut and paste"
solution. I get...
gawk.exe -f mon.awk path.txt >suntest.txt
gawk: mon.awk:1: awk -f'!' { print $2 }
gawk: mon.awk:1: ^ invalid char ''' in expression
Regards, John.
First of all, I don't think anybody really got what you were really
trying to do - so the responses have been of the "I don't know what you
are really trying to do, but here try this idea that just popped into my
head - it may get you started [towards wherever it is you think you are
trying to get to]" variety. (*)
Second, the example above assumes use of a Unix shell (where the single
quotes work correctly). You seem to be using Windows. The best advice for Windows AWK
users (except when using TAWK where, naturally enough, it all works
correctly) is to forget about doing stuff on the command line and just
put the script in a file.
Third, in the above command line, it needs to be -F (capital F), since
-f means something entirely different.
---------------------------------------------------------------------------
(*) Note: FWIW, IIRC, I *think* what you are looking for is something
like: for (i=1; i<=NF; i++) { split($i,T,"!");print T[1] }
But that's also just a guess.
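The loop in the footnote is a fragment rather than a complete program. Here is a sketch of what it would do when wrapped in an action block, assuming default whitespace field splitting and a POSIX shell. The function name `first_components` and the shortened sample line are illustrative only.

```sh
# first_components: for every whitespace-separated field on each input line,
# split that field on "!" and print only its first piece.
first_components() {
    awk '{ for (i = 1; i <= NF; i++) { split($i, T, "!"); print T[1] } }'
}

# Shortened sample header line (not the full path from the question).
printf '%s\n' 'Path: news.sunsite.dk!dotsrc.org!not-for-mail' | first_components
```

On that input the loop prints `Path:` and then `news.sunsite.dk`, that is, the first `!`-separated piece of each whitespace-separated field. It does not reach the second or last host that the question asks about.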
>In article <qkccf5pqt3ai6slb3...@4ax.com>,
> John Fitzsimons <DELETEu...@sneakemail.com> wrote:
Hi Bob,
>> Windows Gawk newbie here.
>> Suppose I start with the following on one line.....
>> Path: news.sunsite.dk!dotsrc.org!news.net.uni-c.dk!aotearoa.belnet.be!news.belnet.be!newsfeed.kpn.net!pfeed09.wxs.nl!feeder.eternal-september.org!eternal-september.org!not-for-mail
>> The data content AND width change AND there isn't the normal space
>> delimiter one often has with a line of data.
>> (1) How would I extract the second bit of data ? In this case
>> "dotsrc.org"
>> (2) How would I extract the last bit of data ? In this case
>> eternal-september.org OR maybe eternal-september.org!not-for-mail
>> I think "not-for-mail" is at the end of all news header "paths" but I
>> am not sure of this. In any case I want the "eternal-september.org"
>> part whether the "not-for-mail" follows it, or not.
>> Can anyone here help with either/both queries please ?
>awk -f'!' { print $2 }
Okay, as suggested I updated my gawk.
gawk\bin>gawk.exe --version
GNU Awk 3.1.6
Copyright (C) 1989, 1991-2007 Free Software Foundation.
The error I get is..
C:\gawk\bin>gawk.exe -f bob.awk path.txt >suntest.txt
gawk: bob.awk:1: awk -f'!' { print $2 }
gawk: bob.awk:1: ^ invalid char ''' in expression
Does anyone have any suggestions as to how to get gawk 3.1.6 to work
please ?
Regards, John.
>In article <go1ff5l1dku3sgi56...@4ax.com>,
>John Fitzsimons <DELETEu...@sneakemail.com> wrote:
< snip >
>First of all, I don't think anybody really got what you were really
>trying to do
Well, nobody said that they couldn't understand my query. I gave an
example of the source data..
Path: news.sunsite.dk!dotsrc.org!news.net.uni-c. etc. etc.
and said...
How would I extract the second bit of data ? In this case
"dotsrc.org"
I doubt that I could make that any clearer.
> - so the responses have been of the "I don't know what you
>are really trying to do, but here try this idea that just popped into my
>head - it may get you started [towards wherever it is you think you are
>trying to get to]" variety. (*)
>Second, the example above assumes use of a Unix shell (where the single
>quotes work correctly). You seem to be using Windows.
Seem to be ? ! The first line of my post said..
Windows Gawk newbie here.
>The best advice for Windows AWK
>users (except when using TAWK where, naturally enough, it all works
>correctly) is to forget about doing stuff on the command line and just
>put the script in a file.
Have never heard of TAWK. I did put the script in a file.
>Third, in the above command line, it needs to be -F (capital F), since
>-f means something entirely different.
>---------------------------------------------------------------------------
>(*) Note: FWIW, IIRC, I *think* what you are looking for is something
>like: for (i=1; i<=NF; i++) { split($i,T,"!");print T[1] }
>But that's also just a guess.
Okay, here is the result of putting Ed's version in a script file..
\gawk\bin>gawk.exe -F ed.awk path.txt >suntest.txt
gawk: path.txt
gawk: ^ syntax error
The suntest file had the following entry.
errcount: 1
Thank you for your feedback.
Regards, John.
You are confused. You need to distinguish the invocation of awk
from the awk code:
gawk.exe -F'!' '{print $2}' path.txt
This line invokes awk and passes several arguments to it; the main script
is '{print $2}'. I think this might fail for you because it is using
Unix-style quotes. So it might be useful to put the script
in a file foo.awk:
BEGIN{
FS="!" # equivalent to the -F'!'
}
{
print $2
}
Notice that gawk and -F are not included in the foo.awk file;
it contains only awk code. The command line shortcut has also
been transformed into awk code.
Now you can run this script like this:
gawk.exe -f foo.awk path.txt
--
pgas @ SDF Public Access UNIX System - http://sdf.lonestar.org
Oh, maybe use it correctly. These computers are so stuffy, ya know.
Gotta do things always *their* way. Talk about "my way or the highway!"...
Maybe read a manual. I know. Such a pain...
Why should *you* have to do all the work?
Can't it at least meet you half way?
That should be -f, not -F. Try this:
gawk.exe -f ed.awk path.txt >suntest.txt
Ed.
You have an awk call in an awk program? That doesn't work.
I think this has already been mentioned, but it is probably buried
somewhere in this tapeworm of a thread:
awk -f ed.awk path.txt >suntest.txt
where ed.awk contains *only* the _awk code_, which was just
BEGIN {FS="!"}
{print $2; print ($NF == "not-for-mail" ? $(NF-1) : "") $NF}
(You don't need the BEGIN clause if you add the option -F "!"
on the command line. But better to try it as depicted, using BEGIN.)
Janis
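The parenthetical about dropping the BEGIN clause can be sketched as follows, assuming a POSIX shell. The names `ed_nofs.awk`, `run_ed_nofs` and the scratch directory are placeholders, the sample line is shortened, and the script uses Ed's first variant. The thread's `-F "!"` form uses double quotes, which is the style a Windows command prompt expects; that part is not exercised here.

```sh
# Work in a throwaway directory so no existing files are overwritten.
workdir=$(mktemp -d)

# The script file contains awk code only: no "awk", no -F, no shell quoting.
cat > "$workdir/ed_nofs.awk" <<'EOF'
{
    print $2                                        # second "!"-separated field
    print ($NF == "not-for-mail" ? $(NF-1) : $NF)   # last host, skipping the marker
}
EOF

# Shortened sample input (placeholder file name).
printf '%s\n' 'Path: news.sunsite.dk!dotsrc.org!eternal-september.org!not-for-mail' > "$workdir/path.txt"

# -F sets the field separator on the command line; -f names the program file.
run_ed_nofs() {
    awk -F '!' -f "$workdir/ed_nofs.awk" "$1"
}

run_ed_nofs "$workdir/path.txt"
```

With no BEGIN block in the file, the separator comes entirely from `-F`, so running the same file without `-F` would fall back to whitespace splitting.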
> Does anyone have any suggestions as to how to get gawk 3.1.6 to work
> please ?
>
I haven't used it but I suspect it works just fine. What the suggestions so
far have in common is changing the field separator from the default
'run-of-whitespace' to '!' (since that is what separates the fields of
interest to you). No doubt how to do this is adequately explained in the
documentation (typically involving the special variable 'FS'). Then to get
the second field you just use the normal '$2'.
That won't directly solve your second problem, since sometimes you seem to
want the last field and sometimes the last two. But at least you'll be able
to find them easily enough while you decide.
If you want to leave the default field separator alone and do it the hard
way instead, take a look at the index(), rindex() and substr() functions.
- Anton Treuenfels
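A sketch of the "hard way", assuming a POSIX shell and awk. As far as I know, neither POSIX awk nor gawk provides a rindex() function, so this version only scans forward with index() and substr(). The function name `second_and_last_by_index` and the shortened sample line are illustrative.

```sh
# second_and_last_by_index: leave FS alone and walk the line by hand.
second_and_last_by_index() {
    awk '
    {
        rest = $0
        sub(/^Path: */, "", rest)            # drop the header name if present
        n = 0; second = ""; last = ""
        while (rest != "") {
            pos = index(rest, "!")           # position of the next "!", 0 if none
            if (pos == 0) {
                piece = rest                 # final piece: no more separators
                rest = ""
            } else {
                piece = substr(rest, 1, pos - 1)
                rest = substr(rest, pos + 1)
            }
            n++
            if (n == 2) second = piece
            # remember the most recent piece other than the marker
            if (piece != "not-for-mail") last = piece
        }
        print second
        print last
    }'
}

printf '%s\n' 'Path: news.sunsite.dk!dotsrc.org!eternal-september.org!not-for-mail' | second_and_last_by_index
```

This prints `dotsrc.org` and `eternal-september.org` for the sample line. It is noticeably longer than setting the field separator to "!", which is the point of calling it the hard way.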
>On 2009-11-10, John Fitzsimons <DELETEu...@sneakemail.com> wrote:
>> \gawk\bin>gawk.exe -F ed.awk path.txt >suntest.txt
>> gawk: path.txt
>> gawk: ^ syntax error
>You are confused.
Absolutely. :-)
>you need to distinguish the invocation of awk
>from the awk code:
>gawk.exe -F'!' '{print $2}' path.txt
>This line invokes awk and passes several arguments to it; the main script
>is '{print $2}'. I think this might fail for you because it is using
>Unix-style quotes.
Okay.
>So it might be useful to put the script
>in a file foo.awk:
Done.
>BEGIN{
> FS="!" # equivalent to the -F'!'
>}
>{
> print $2
>}
>Notice that gawk and -F are not included in the foo.awk file;
>it contains only awk code. The command line shortcut has also
>been transformed into awk code.
>Now you can run this script like this:
>gawk.exe -f foo.awk path.txt
Thanks Pierre. That is a big step forward. I now no longer get
compile errors ! :-)
I also now get the output of my first query exactly as
wanted/expected. Thank you. Very much appreciated. :-)
Regards, John.
>In article <jq2if5te9adqiq7rl...@4ax.com>,
>John Fitzsimons <DELETEu...@sneakemail.com> wrote:
< snip >
>Maybe read a manual. I know. Such a pain...
< snip >
Actually, as someone who has done computer application tutoring I see
a few problems with that sort of comment. Last time I checked the .awk
manual was over 300 html pages. So..
(1) Newbies might not have the time to fully read 300+ pages.
(2) Even if they did it wouldn't mean that they understood what they
read.
(3) If they instead did a "search" then that would only work if they
were sure of the correct terms to be looking for.
(4) Someone used to coding can sometimes see an error in a couple of
seconds that a newbie might take days to find/work out. Assuming
he/she even got to that stage.
(5) Sometimes one only needs to understand a very small percentage
of a program to produce output that is worthwhile/significant to the
user.
HTH. :-)
Regards, John.
>John Fitzsimons wrote:
><snip>
Now that I have made the changes suggested by Janis, and others,
it seems to work fine. Thank you. :-)
Regards, John.
>John Fitzsimons wrote:
< snip >
Hi Janis,
>> The error I get is..
>> C:\gawk\bin>gawk.exe -f bob.awk path.txt >suntest.txt
>> gawk: bob.awk:1: awk -f'!' { print $2 }
>> gawk: bob.awk:1: ^ invalid char ''' in expression
>You have an awk call in an awk program? That doesn't work.
>I think this has already been mentioned, but it is probably buried
>somewhere in this tapeworm of a thread:
> awk -f ed.awk path.txt >suntest.txt
>where ed.awk contains *only* the _awk code_, which was just
> BEGIN {FS="!"}
> {print $2; print ($NF == "not-for-mail" ? $(NF-1) : "") $NF}
>(You don't need the BEGIN clause if you add the option -F "!"
>on the command line. But better to try it as depicted, using BEGIN.)
Thank you for not only finding the errors, but also for explaining why
certain things didn't work correctly. Very much appreciated. :-)
Regards, John.
You know nothing about using manuals. One doesn't have to read
the entire manual; he simply skims the first part in order
to find out how to run an awk program.
>
> (2) Even if they did it wouldn't mean that they understood what they
> read.
>
> (3) If they instead did a "search" then that would only work if they
> were sure of the correct terms to be looking for.
>
> (4) Someone used to coding can sometimes see an error in a couple of
> seconds that a newbie might take days to find/work out. Assuming
> he/she even got to that stage.
What's a "he/she"? I never encountered that term in the
works of the great authors.
>
> (5) Sometimes one only needs to understand a very small percentage
> of a program to produce output that is worthwhile/significant to the
> user.
>
> HTH. :-)
Are you blissfully unaware that you are the one desperately
in need of help?
>
> Regards, John.
--
A woman's place is in the home. --- Wise old saying.