Insert part of the file name into delimited file

juanp...@gmail.com

Oct 2, 2017, 5:20:08 PM
Hello!

I need to insert part of a file name into the contents of a delimited file..

For instance.. file name: test_data_GA.csv

Data contents:
a,1
b,2
c,3

I need to parse out the state from the file name, in this case GA and insert it in the csv file as the first column..

Output:
GA,a,1
GA,b,2
GA,c,3

Would this be appropriate for awk?

Thanks in advance for your help!

Ben Bacarisse

Oct 2, 2017, 6:17:57 PM
Yes, but it might be better to do it other ways. A shell can turn the
file name into a tag and sed can add a bit of text to every line and can
do it "in place":

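function addtag
{
    local tag=${1##*_}           # drop everything through the last underscore
    tag=${tag%.*}                # drop the extension, leaving e.g. GA
    sed -i -e "s/^/$tag,/" "$1"  # prefix every line with the tag, in place
}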

If you want a new file (rather than an in-place edit) you'd just collect
the sed output into that new file name.
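
A minimal sketch of that new-file variant, assuming the same tag extraction
as above. The function name addtag_to and its second argument are
hypothetical, not something from the original post:

```shell
# Hypothetical variant of addtag: write the tagged lines to a new file
# and leave the input file untouched.
addtag_to() {
    local src=$1 dest=$2
    local tag=${src##*_}    # drop everything through the last underscore
    tag=${tag%.*}           # drop the extension, leaving e.g. GA
    sed -e "s/^/$tag,/" "$src" > "$dest"
}
```

It could be called as: addtag_to test_data_GA.csv tagged_GA.csv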

Just my initial thoughts. You can certainly do it with awk.

--
Ben.

Janis Papanagnou

Oct 2, 2017, 6:58:48 PM
You can do it in awk. It also depends on the system you use and available
tools you have. Ben gave you a solution for the bash shell and sed.

In awk you can do, say,

awk '
!tag { tag=FILENAME; gsub(/^.*_|\..*$/,"",tag) }
{ print tag","$0 }
' test_data_GA.csv > out

which will produce the desired output in a file 'out', and then replace the
original file on shell level by

mv out test_data_GA.csv

or you can do all, including that renaming, inside awk (i.e. awk will call
shell code using system())

awk '
!tag { tag=FILENAME; gsub(/^.*_|\..*$/,"",tag) }
{ print tag","$0 > "out" }
END { system("mv out test_data_GA.csv") }
' test_data_GA.csv

If you happen to have a newer version of GNU awk there's also an "inplace"
option (similar to the "sed -i" inplace option you've seen in Ben's code) to
directly work on the input file.
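
With GNU awk 4.1 or later, that in-place variant might look like the sketch
below. The function name tag_inplace is hypothetical, and FNR == 1 is used
here instead of !tag so that the tag is recomputed for each file given:

```shell
# Sketch: tag each file in place using the gawk "inplace" extension.
# Assumes GNU awk 4.1 or later; other awks do not have this option.
tag_inplace() {
    gawk -i inplace '
        FNR == 1 { tag = FILENAME; gsub(/^.*_|\..*$/, "", tag) }
        { print tag "," $0 }
    ' "$@"
}
```

Usage would be: tag_inplace test_data_GA.csv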

Janis

Ben Bacarisse

Oct 2, 2017, 7:51:31 PM
A word of warning to the OP: these are one-file awk programs. If you
pass a bunch of CSV files to the program you won't get what you want
(and you may lose data). It's perfectly safe as shown here, but I
thought it might be worth a mention.

> If you happen to have a newer version of GNU awk there's also an "inplace"
> option (similar to the "sed -i" inplace option you've seen in Ben's code) to
> directly work on the input file.

GAWK also has BEGINFILE and ENDFILE which can be used to manage the tags
and the output file to make Janis's solution truly generic.
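
As a rough sketch of that idea (gawk only): the function name tag_files and
the ".tmp" suffix are assumptions, there is no error handling, and the
quoting passed to system() is naive, so file names containing double quotes
or shell metacharacters would break it.

```shell
# Sketch: per-file tagging with BEGINFILE/ENDFILE (GNU awk only).
tag_files() {
    gawk '
        BEGINFILE {
            tag = FILENAME
            gsub(/^.*_|\..*$/, "", tag)   # keep only the part between _ and .
            out = FILENAME ".tmp"         # hypothetical temporary file name
            printf "" > out               # create it even if the input is empty
        }
        { print tag "," $0 > out }
        ENDFILE {
            close(out)
            # naive quoting: assumes no double quotes in the file name
            system("mv \"" out "\" \"" FILENAME "\"")
        }
    ' "$@"
}
```

Each file is rewritten with its own tag, for example:
tag_files test_data_GA.csv test_data_FL.csv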

--
Ben.

Ed Morton

Oct 3, 2017, 8:54:49 AM
awk 'FNR==1{s=FILENAME; gsub(/.*_|.csv$/,"",s)} {print s "," $0}' test_data_GA.csv
GA,a,1
GA,b,2
GA,c,3

Ed.

juanp...@gmail.com

Oct 3, 2017, 11:04:27 AM
Thanks Ed, really appreciate it..

This worked exactly as what I was looking for..

Janis Papanagnou

Oct 3, 2017, 11:11:11 AM
Interesting. Ed's code is almost an exact copy of what I had already posted.
What was the problem with the code I posted?

Janis

juanp...@gmail.com

Oct 3, 2017, 11:20:27 AM
Thanks Ben, I ended up using Ed's but this is a good option as well..

Kaz Kylheku

Oct 3, 2017, 1:30:53 PM
On 2017-10-03, Janis Papanagnou <janis_pa...@hotmail.com> wrote:
> Interesting. Ed's code is almost an exact copy of what I had already posted.
> What was the problem with the code I posted?

Being first is often the thing that is wrong.

Otherwise I'd be sitting here using a GUI from Xerox PARC, Bill Joy's Vi
instead of Vim, a kernel written by Ken Thompson rather than Torvalds,
etc.

Ralf Damaschke

Oct 3, 2017, 5:41:52 PM
Janis Papanagnou wrote:
> On 03.10.2017 17:04, juanp...@gmail.com wrote:
>> On Tuesday, October 3, 2017 at 8:54:49 AM UTC-4, Ed Morton wrote:

>>> awk 'FNR==1{s=FILENAME; gsub(/.*_|.csv$/,"",s)} {print s "," $0}' test_data_GA.csv
>>
>> Thanks Ed, really appreciate it..
>>
>> This worked exactly as what I was looking for..
>
> Interesting. Ed's code is almost an exact copy of what I had already posted.
> What was the problem with the code I posted?

Your code does not work as expected when given several files as arguments.
This might not be a crucial but an important issue as it easily allows
to combine several lists distinguished by filenames into one list with
the distinction now expressed by an additional column.
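
To illustrate the multi-file case, here is a small sketch in portable awk,
based on Ed's one-liner but with the dot in ".csv" escaped so it matches only
a literal dot. The function name combine_tagged and the second file name are
hypothetical:

```shell
# Sketch: merge several CSV files into one stream, prefixing each line
# with the tag taken from the name of the file it came from.
combine_tagged() {
    awk '
        FNR == 1 { tag = FILENAME; gsub(/.*_|\.csv$/, "", tag) }   # new file: new tag
        { print tag "," $0 }
    ' "$@"
}
```

For example: combine_tagged test_data_GA.csv test_data_FL.csv > all_states.csv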

Janis Papanagnou

Oct 3, 2017, 6:10:55 PM
On 03.10.2017 23:41, Ralf Damaschke wrote:
> Janis Papanagnou wrote:
>> On 03.10.2017 17:04, juanp...@gmail.com wrote:
>>> On Tuesday, October 3, 2017 at 8:54:49 AM UTC-4, Ed Morton wrote:
>
>>>> awk 'FNR==1{s=FILENAME; gsub(/.*_|.csv$/,"",s)} {print s "," $0}' test_data_GA.csv
>>>
>>> Thanks Ed, really appreciate it..
>>>
>>> This worked exactly as what I was looking for..
>>
>> Interesting. Ed's code is almost an exact copy of what I had already posted.
>> What was the problem with the code I posted?
>
> Your code does not work as expected when given several files as arguments.

The requirement was: "insert part of a file name into the contents of a
delimited file".

> This might not be a crucial but an important issue as it easily allows
> to combine several lists distinguished by filenames into one list with
> the distinction now expressed by an additional column.

Of course, yes.

Janis

Ralf Damaschke

Oct 4, 2017, 5:22:58 PM
And this obviously makes Ed's code superior to yours.

Janis Papanagnou

Oct 4, 2017, 5:34:44 PM
This is not the point. The OP said: "This worked exactly as what I was
looking for..", and, given the stated requirements, it's not obvious why
the previously posted code did not do "exactly" what was requested. Note
also that the original requirement spoke about "insert into file", and
Ed's code does not address that at all (where Ben's and mine did). So a
statement about "superior code" is IMO not as clear as you think it is.

Janis

Ralf Damaschke

Oct 5, 2017, 5:31:31 PM
Janis Papanagnou wrote:

> So a
> statement about "superior code" is IMO not as clear as you think it is.

Sorry, I thought you were looking for an answer to your question
"What was the problem with the code I posted?".

Janis Papanagnou

Oct 5, 2017, 6:21:54 PM
Yes: What was the problem that it would not fulfill the _stated_ requirements.

It was *not*: What _non-stated_ requirements does the posted code not fulfill.

HTH.

Janis
