add file name to output file along with the line extracted


kishorered...@gmail.com

Jan 26, 2017, 10:54:03 AM
Dear group,

I am trying to extract the 10th line from each file in a group of files.
I am able to do it with

find . -name "*.final.out" | parallel "awk 'NR ==10' {} >> result.txt"

However, I need to add the filename to every line in the output file so that I can identify which file the line came from.

Any help is appreciated.

Best,
Kishore

Kaz Kylheku

Jan 26, 2017, 11:20:43 AM
On 2017-01-26, kishorered...@gmail.com <kishorered...@gmail.com> wrote:
> Dear group,
>
> I am trying to extract 10th line from group of files.
> I am able to do it with
>
> find . -name "*.final.out" | parallel "awk 'NR ==10' {} >> result.txt"
>
> However, I need to add a filename in the output file for every line so that I can identify from which file the line came from.

'NR==10 { print FILENAME, $0 }'

You are relying on the ">> result.txt" appends from the parallel jobs
being atomic. This is only the case if each Awk job performs its output
as a single write() system call.
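
For illustration, here is a minimal sketch of the FILENAME idea that avoids the shared append altogether. It swaps in a different technique from the original command: instead of one Awk job per file under parallel, find hands all the files to a single awk, so the test uses FNR (the line number within the current file) rather than NR. The temporary directory and the demo files a.final.out and b.final.out are made up for the example.

```shell
#!/bin/sh
# Hypothetical demo data: two files of 12 lines each in a scratch directory.
demo=$(mktemp -d)
for name in a b; do
    awk -v n="$name" 'BEGIN { for (i = 1; i <= 12; i++) print n "-line-" i }' \
        > "$demo/$name.final.out"
done

# One awk reads every matching file. FNR restarts at 1 for each file
# (NR would keep counting across files), so this prints
# "<filename> <10th line>" once per file. There is a single writer,
# so no output can interleave.
find "$demo" -name '*.final.out' \
    -exec awk 'FNR == 10 { print FILENAME, $0 }' {} + | sort > "$demo/result.txt"

cat "$demo/result.txt"
```

With a very large number of files, find may split them across several awk invocations, but those run one after another, so the output is still written sequentially.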

Kishore Reddy

Jan 26, 2017, 12:24:43 PM
Got it, thanks!

Kaz Kylheku

Jan 26, 2017, 12:37:56 PM
P.S. If that parallel logging issue is a real problem, a non-useless use
of cat can help:

( echo this; awk ' ... { print that }' ) | cat >> file

Even if the command issues multiple writes (as the above example will for sure,
due to different commands being used), cat will accumulate them and issue
a single write when its stdin closes. That holds as long as you don't go over
cat's I/O buffer size, which is almost certainly generous enough for a
small piece of output.
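
Whether cat really coalesces its input into one write can depend on the implementation and on timing, so a more conservative variant is sketched below. It is a different technique from the cat pipeline above: each job writes to its own private file, and one sequential step concatenates them afterwards. The scratch directory, the part.N file names, and the three demo jobs are assumptions made up for the example.

```shell
#!/bin/sh
# Hypothetical demo: three background jobs, each producing output from two
# different commands, as in ( echo this; awk ... ) above.
work=$(mktemp -d)

for job in 1 2 3; do
    (
        echo "job $job: header"
        awk -v j="$job" 'BEGIN { print "job " j ": body" }'
    ) > "$work/part.$job" &    # private file per job, no shared appends
done
wait                           # block until every background job has exited

# A single sequential writer builds the combined log, so the lines of one
# job can never be split by the lines of another.
cat "$work"/part.* > "$work/combined.log"
cat "$work/combined.log"
```

The cost is one temporary file per job and a final pass over them; in exchange, correctness no longer depends on how any one program buffers its writes.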