Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

help converting a date in a CSV file

70 views
Skip to first unread message

laredotornado

unread,
Oct 20, 2010, 5:46:59 PM10/20/10
to
Hi,

I have a file, whose 18th column of data is a data in the format of
(for example), 14-Oct-1960. Every row in the column has a value, but
what I would like is to convert all those values to a different date
format -- 10/14/1960, from the example above.

Does anyone know how I can do this in a shell script? Thanks, - Dave

pk

unread,
Oct 20, 2010, 5:40:12 PM10/20/10
to

Sounds like something like this could work:

awk 'BEGIN{
split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m)
for(i=1;i<=12;i++) {
months[m[i]] = i
}
FS=OFS=","
}
{ split($18, d, /-/)
$18 = months[d[2]] "/" d[1] "/" d[3]
print
}'

Icarus Sparry

unread,
Oct 20, 2010, 6:08:48 PM10/20/10
to


Unfortunately "CSV" does not define a standard, it is fine until you need
to represent things like commas and quotes in it.

At this point you discover that you essentially need to break apart the
lines one character at a time. Suddenly what sounded like a simple
problem has become much more complicated!

If the file doesn't have any commas in the data, then this code is fine.

If the data does have commas then I suggest using something like perl or
python and code that other people have debugged to read and write the
file.

Janis Papanagnou

unread,
Oct 20, 2010, 6:11:04 PM10/20/10
to

I have one and a half additional suggestions;

$18 = sprintf("%04d-%02d-%02d",d[3],months[d[2]],d[1])

Converting a date format to an insane format shouldn't be supported. ;-)

Janis

w_a_x_man

unread,
Oct 20, 2010, 6:48:46 PM10/20/10
to

Using Ruby and assuming no commas in the data:

require "date"
while line = gets
columns = line.split ","
columns[17] = Date.parse( columns[17] ).strftime("%m/%d/%Y")
puts columns.join ","
end


Less prolix:

require "date"
while gets
columns = split ","
columns[17] = Date.parse( columns[17] ).strftime("%m/%d/%Y")
puts columns.join ","
end

Ben Finney

unread,
Oct 20, 2010, 8:07:20 PM10/20/10
to
w_a_x_man <w_a_...@yahoo.com> writes:

> On Oct 20, 4:46 pm, laredotornado <laredotorn...@zipmail.com> wrote:
> > I have a file, whose 18th column of data is a data in the format of
> > (for example), 14-Oct-1960.  Every row in the column has a value, but
> > what I would like is to convert all those values to a different date
> > format -- 10/14/1960, from the example above.
> >
> > Does anyone know how I can do this in a shell script?
>

> Using Ruby and assuming no commas in the data:

Oh, I didn't realise non-shell scripting languages are kosher in
response to a request for a shell script on ‘comp.unix.shell’.

In that case, here's a way that uses the Python standard library:

#! /usr/bin/python

import sys
import csv
import datetime

reader = csv.reader(sys.stdin)
writer = csv.writer(sys.stdout)

for record in reader:
col_18_text = record[17]
col_18_value = datetime.datetime.strptime(col_18_text, "%d-%b-%Y")
col_18_text = col_18_value.strftime("%d/%m/%Y")
record[17] = col_18_text
writer.writerow(record)

That has the advantage of a clever CSV library (no need for “assuming no
commas in the data”) and explicit datetime specifications.

It has the disadvantage of not being a shell script, which the OP asked
for.

--
\ “He who allows oppression, shares the crime.” —Erasmus Darwin, |
`\ grandfather of Charles Darwin |
_o__) |
Ben Finney

w_a_x_man

unread,
Oct 20, 2010, 8:48:35 PM10/20/10
to
On Oct 20, 4:40 pm, pk <p...@pk.invalid> wrote:
> On Wed, 20 Oct 2010 14:46:59 -0700 (PDT)
>

Here's an AWK function that can handle CSV data that
contains commas and double quotes.


function parse_csv( str, array, field,i )
{ split( "", array )
str = str ","
while ( match(str,
/[ \t]*("[^"]*(""[^"]*)*"|[^,]*)[ \t]*,/) )
{ field = substr( str, 1, RLENGTH )
gsub( /^[ \t]*"?|"?[ \t]*,$/, "", field )
gsub( /""/, "\"", field )
array[++i] = field
str = substr( str, RLENGTH + 1 )
}
}

w_a_x_man

unread,
Oct 20, 2010, 8:50:14 PM10/20/10
to
On Oct 20, 7:07 pm, Ben Finney <ben+u...@benfinney.id.au> wrote:

> w_a_x_man <w_a_x_...@yahoo.com> writes:
> > On Oct 20, 4:46 pm, laredotornado <laredotorn...@zipmail.com> wrote:
> > > I have a file, whose 18th column of data is a data in the format of
> > > (for example), 14-Oct-1960.  Every row in the column has a value, but
> > > what I would like is to convert all those values to a different date
> > > format -- 10/14/1960, from the example above.
>
> > > Does anyone know how I can do this in a shell script?
>
> > Using Ruby and assuming no commas in the data:
>
> Oh, I didn't realise non-shell scripting languages are kosher in
> response to a request for a shell script on ‘comp.unix.shell’.

I wonder why Benny the Bitch didn't have a hissy fit about the
only other solution, which was written in the programming language
AWK?

>
> In that case, here's a way that uses the Python standard library:
>
>     #! /usr/bin/python
>
>     import sys
>     import csv
>     import datetime
>
>     reader = csv.reader(sys.stdin)
>     writer = csv.writer(sys.stdout)
>
>     for record in reader:
>         col_18_text = record[17]
>         col_18_value = datetime.datetime.strptime(col_18_text, "%d-%b-%Y")
>         col_18_text = col_18_value.strftime("%d/%m/%Y")
>         record[17] = col_18_text
>         writer.writerow(record)

Despite the inclusion of 3 libraries, the bitch has managed
to make the code as bloated and ugly as herself. Impressive.

>
> That has the advantage of a clever CSV library

Ruby has at least one "clever" CSV library, but parsing
any possible CSV data is so easy that no library is really
needed.


> (no need for “assuming no
> commas in the data”)

Again, the AWK program assumed no commas in the data, but
the bitch neglects to shriek about that.

> and explicit datetime specifications.

The bitch is simply too stupid to realize that such a
specification makes the code more brittle.

>
> It has the disadvantage of not being a shell script, which the OP asked
> for.

The mindless bitch cannot comprehend that posters are under
no moral obligation to provide the OP with exactly what he
requested, since posters are not being paid by the OP.

A bitch will always be bitchy; it can do no other.


To illustrate why no library is needed for CSV parsing:

require "date"

def csv_split string
ary = (string.chomp+",").
scan(/\G"((?:[^"]+|"")*)",|\G([^,"\n]*),/)
ary.map{|a| a[1] || a[0].gsub(/""/,'"') }
end

while gets
columns = csv_split $_
columns[17] = Date.parse(columns[17]).strftime("%m/%d/%Y")
puts columns.join ","
end

Ben Finney

unread,
Oct 20, 2010, 9:20:44 PM10/20/10
to
w_a_x_man <w_a_...@yahoo.com> writes:

> I wonder why Benny the Bitch didn't have a hissy fit […]

Wow. Name-calling, projection, and envy, all in a single post.

*plonk*

--
\ “We have to go forth and crush every world view that doesn't |
`\ believe in tolerance and free speech.” —David Brin |
_o__) |
Ben Finney

Tim Harig

unread,
Oct 20, 2010, 9:39:33 PM10/20/10
to
On 2010-10-21, w_a_x_man <w_a_...@yahoo.com> wrote:
> On Oct 20, 7:07�pm, Ben Finney <ben+u...@benfinney.id.au> wrote:
>> w_a_x_man <w_a_x_...@yahoo.com> writes:
>> > On Oct 20, 4:46�pm, laredotornado <laredotorn...@zipmail.com> wrote:
>> > > I have a file, whose 18th column of data is a data in the format of
>> > > (for example), 14-Oct-1960. �Every row in the column has a value, but
>> > > what I would like is to convert all those values to a different date
>> > > format -- 10/14/1960, from the example above.
>>
>> > > Does anyone know how I can do this in a shell script?
>>
>> > Using Ruby and assuming no commas in the data:
>>
>> Oh, I didn't realise non-shell scripting languages are kosher in
>> response to a request for a shell script on ?comp.unix.shell?.

>
> I wonder why Benny the Bitch didn't have a hissy fit about the
> only other solution, which was written in the programming language
> AWK?

Lots of awk solutions are posted here. While awk isn't exactly a shell, it
is a utility that can, granting a few differences in implementation, be
found on just about any *nix system. Perl, Ruby, Python, etc, are less
commonly available. I might even have difficulty calling a system without
awk; ruby is optional.

>> In that case, here's a way that uses the Python standard library:
>>
>> � � #! /usr/bin/python
>>
>> � � import sys
>> � � import csv
>> � � import datetime
>>
>> � � reader = csv.reader(sys.stdin)
>> � � writer = csv.writer(sys.stdout)
>>
>> � � for record in reader:
>> � � � � col_18_text = record[17]
>> � � � � col_18_value = datetime.datetime.strptime(col_18_text, "%d-%b-%Y")
>> � � � � col_18_text = col_18_value.strftime("%d/%m/%Y")
>> � � � � record[17] = col_18_text
>> � � � � writer.writerow(record)
>
> Despite the inclusion of 3 libraries, the bitch has managed
> to make the code as bloated and ugly as herself. Impressive.

Funny that *you* are the one who had to resort to name calling.

>> That has the advantage of a clever CSV library
>
> Ruby has at least one "clever" CSV library, but parsing
> any possible CSV data is so easy that no library is really
> needed.

Easy as it may be, I have seen numerous errors in code attempting overly
simplified methods to parse CSV style files. In many cases with complex
quoting and excaping, a state machine *is* the simplest and most reliable
way to parse the data; especially if you cannot be 100% sure that data will
be properly formatted.

The real advantage of the Python CSV module is that it allows the
definition of multiple CSV style dialects. One therefore does not need to
reinvent code to handle Byzantine escape rules that may be used in any
given character deliniated data format.

>> (no need for ?assuming no
>> commas in the data?)


>
> Again, the AWK program assumed no commas in the data, but
> the bitch neglects to shriek about that.

His tone didn't shriek. Your's does.

> To illustrate why no library is needed for CSV parsing:
>
> require "date"
>
> def csv_split string
> ary = (string.chomp+",").
> scan(/\G"((?:[^"]+|"")*)",|\G([^,"\n]*),/)
> ary.map{|a| a[1] || a[0].gsub(/""/,'"') }
> end
>
> while gets
> columns = csv_split $_
> columns[17] = Date.parse(columns[17]).strftime("%m/%d/%Y")
> puts columns.join ","
> end

Your code works for one specific style of CSV format. If the format
changes, you will have to rework your code. If you have a program that
has to parse multiple CSV styles, you will have to duplicate it. The
Python CSV module (and probably similar modules in Perl, Ruby, etc)
merely requires adding the specifications of the requested CSV format
to a reader object.

That is not to say that Awk is insufficient for handling CSV files. One
can certainly write a set of reusable state machine functions in awk. That
isn't even to say that simple parsing such as your example above are not
often sufficient to solve a given problem. Awk was built because such a
method is often enough to get the job done; but, don't limit yourself to
using it for every problem. Eventually, you will get burned.

Edgardo Portal

unread,
Oct 21, 2010, 9:55:27 AM10/21/10
to

Maybe you could make use of GNU date in your script? For example:

prompt$ echo "$(date -d '14-Oct-1960' +'%m/%d/%Y')"
10/14/1960

laredotornado

unread,
Oct 21, 2010, 11:21:47 AM10/21/10
to
On Oct 20, 7:07 pm, Ben Finney <ben+u...@benfinney.id.au> wrote:
> w_a_x_man <w_a_x_...@yahoo.com> writes:

Thanks to all for the help and solutions. Ultimately, I went with
Ben's since I had python installed -- worked great. - Dave

0 new messages