Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
convert non-delimited to delimited
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  7 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
RyanL  
View profile  
 More options Aug 27 2007, 1:59 pm
Newsgroups: comp.lang.python
From: RyanL <ryanlaurit...@gmail.com>
Date: Mon, 27 Aug 2007 17:59:23 -0000
Local: Mon, Aug 27 2007 1:59 pm
Subject: convert non-delimited to delimited
I'm a newbie!  I have a non-delimited data file that I'd like to
convert to delimited.

Example...
Line in non-delimited file:
0139725635999992000010100534+42050-102800FM-15+1198KAIA

Should be:
0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA

What is the best way to go about this?  I've looked all over for
examples, help, suggestions, but have not found much.  CSV module
doesn't seem to do exactly what I want.  Maybe I'm just missing
something or not using the correct terminology in my searches.  Any
assistance is greatly appreaciated!  Using Python 2.4


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Mark Elston  
View profile  
 More options Aug 27 2007, 2:35 pm
Newsgroups: comp.lang.python
From: Mark Elston <m.els...@advantest-ard.com>
Date: Mon, 27 Aug 2007 11:35:17 -0700
Local: Mon, Aug 27 2007 2:35 pm
Subject: Re: convert non-delimited to delimited
* RyanL wrote (on 8/27/2007 10:59 AM):

Since you have to know, a priori, how to break the input string I
assume that these fields are of fixed length.  You can use the following
to do what you want:

 >>> a="0139725635999992000010100534+42050-102800FM-15+1198KAIA"
 >>> print "%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s" %
     (a[0:4],a[4:10],a[10:15],a[15:19],a[19:21],a[21:23],a[23:25],
      a[25:27],a[27],a[28:34],a[34:41],a[41:46],a[46:51],a[51:])

which results in the following output:

0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA

Mark


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
mensanator@aol.com  
View profile  
 More options Aug 27 2007, 2:38 pm
Newsgroups: comp.lang.python
From: "mensana...@aol.com" <mensana...@aol.com>
Date: Mon, 27 Aug 2007 11:38:24 -0700
Local: Mon, Aug 27 2007 2:38 pm
Subject: Re: convert non-delimited to delimited
On Aug 27, 1:35 pm, Mark Elston <m.els...@advantest-ard.com> wrote:

Or try this:

import struct
test = '0139725635999992000010100534+42050-102800FM-15+1198KAIA'
template = '4s6s5s4s2s2s2s2s1s6s7s5s5s4s'
the_line = struct.unpack(template,test)
print the_line
print ','.join(the_line)

##  ('0139', '725635', '99999', '2000', '01', '01', '00', '53', '4',
'+42050', '-102800', 'FM-15', '+1198', 'KAIA')
##
##
0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Matimus  
View profile  
 More options Aug 27 2007, 2:40 pm
Newsgroups: comp.lang.python
From: Matimus <mccre...@gmail.com>
Date: Mon, 27 Aug 2007 18:40:06 -0000
Local: Mon, Aug 27 2007 2:40 pm
Subject: Re: convert non-delimited to delimited
On Aug 27, 10:59 am, RyanL <ryanlaurit...@gmail.com> wrote:

> I'm a newbie!  I have a non-delimited data file that I'd like to
> convert to delimited.

> Example...
> Line in non-delimited file:
> 0139725635999992000010100534+42050-102800FM-15+1198KAIA

> Should be:
> 0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA

> What is the best way to go about this?  I've looked all over for
> examples, help, suggestions, but have not found much.  CSV module
> doesn't seem to do exactly what I want.  Maybe I'm just missing
> something or not using the correct terminology in my searches.  Any
> assistance is greatly appreaciated!  Using Python 2.4

I don't think you are going to find anything that will just do this
for you. You are going to have read the file, figure out where to
split the string, and reprint it delimited with commas. As for
suggesting code... I can't tell how you actually want to delimit the
stuff from the above example? Are the fields always a fixed number of
characters? If they aren't then is there some other method for
determining how many characters to group into a field? From the looks
of it you could split that string any way you want and get something
that looks right, but isn't.

Matt


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Michael Bentley  
View profile  
 More options Aug 27 2007, 2:44 pm
Newsgroups: comp.lang.python
From: Michael Bentley <mich...@jedimindworks.com>
Date: Mon, 27 Aug 2007 11:44:11 -0700
Local: Mon, Aug 27 2007 2:44 pm
Subject: Re: convert non-delimited to delimited

On Aug 27, 2007, at 10:59 AM, RyanL wrote:

> I'm a newbie!  I have a non-delimited data file that I'd like to
> convert to delimited.

> Example...
> Line in non-delimited file:
> 0139725635999992000010100534+42050-102800FM-15+1198KAIA

> Should be:
> 0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA

> What is the best way to go about this?  I've looked all over for
> examples, help, suggestions, but have not found much.  CSV module
> doesn't seem to do exactly what I want.  Maybe I'm just missing
> something or not using the correct terminology in my searches.  Any
> assistance is greatly appreaciated!  Using Python 2.4

Is each data element a fixed size?

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Neil Cerutti  
View profile  
 More options Aug 27 2007, 2:44 pm
Newsgroups: comp.lang.python
From: Neil Cerutti <horp...@yahoo.com>
Date: Mon, 27 Aug 2007 18:44:18 GMT
Local: Mon, Aug 27 2007 2:44 pm
Subject: Re: convert non-delimited to delimited
On 2007-08-27, RyanL <ryanlaurit...@gmail.com> wrote:

> I'm a newbie!  I have a non-delimited data file that I'd like to
> convert to delimited.

> Example...
> Line in non-delimited file:
> 0139725635999992000010100534+42050-102800FM-15+1198KAIA

> Should be:
> 0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA

It looks like a fixed format data file, also called flat-record
files, in which fields are of fixed lengths, and records are
separated by newlines.

> What is the best way to go about this?  

Check out chapter 2 of _Text Processing in Python_ for one
solution.

http://gnosis.cx/TPiP/chap2.txt

--
Neil Cerutti
Weight Watchers will meet at 7 p.m. Please use large double door at the side
entrance. --Church Bulletin Blooper


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Paul McGuire  
View profile  
 More options Aug 27 2007, 8:06 pm
Newsgroups: comp.lang.python
From: Paul McGuire <pt...@austin.rr.com>
Date: Mon, 27 Aug 2007 17:06:31 -0700
Local: Mon, Aug 27 2007 8:06 pm
Subject: Re: convert non-delimited to delimited
On Aug 27, 12:59 pm, RyanL <ryanlaurit...@gmail.com> wrote:

> I'm a newbie!  I have a non-delimited data file that I'd like to
> convert to delimited.

> Example...
> Line in non-delimited file:
> 0139725635999992000010100534+42050-102800FM-15+1198KAIA

> Should be:
> 0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA

> What is the best way to go about this?  I've looked all over for
> examples, help, suggestions, but have not found much.  CSV module
> doesn't seem to do exactly what I want.  Maybe I'm just missing
> something or not using the correct terminology in my searches.  Any
> assistance is greatly appreaciated!  Using Python 2.4

I'm guessing that these lines *aren't* fixed-length, especially those
signed integer fields.  I used the patented Paul McGuire CrystalBall
module to come up with this pyparsing rendition.  (OP may adjust to
suit.)

-- Paul

data = "0139725635999992000010100534+42050-102800FM-15+1198KAIA"
"""to be parsed as:
0139,725635,99999,2000,01,01,00,53,4,+42050,-102800,FM-15,+1198,KAIA"""

from pyparsing import *
import time
def convertTimeStamp(t):
    t["date"] = map(int,t.date)
    t["time"] = map(int,t.time)
    return time.strftime("%Y-%m-%dT%H:%M",
                          tuple(t.date)+tuple(t.time)+(0,0,0,0))

yearMonthDay = Word(nums,exact=4) + Word(nums,exact=2) +
Word(nums,exact=2)
hourMinuteSecond = Word(nums,exact=2) + Word(nums,exact=2)
timestamp = ( yearMonthDay("date") + hourMinuteSecond("time") )
timestamp.setParseAction(convertTimeStamp)
signedInteger = Word("+-",nums)

fieldA = Word(nums,exact=4)("A")
fieldB = Word(nums,exact=6)("B")
fieldC = Word(nums,exact=5)("C")
fieldD = timestamp("timestamp")
fieldE = Word(nums)("E")
fieldF = signedInteger("latitude").setParseAction(lambda t : int(t[0])/
1000.0)
fieldG = signedInteger("longitude").setParseAction(lambda t :
int(t[0])/1000.0)
fieldH = Combine(Word(alphas,exact=2) + "-" + Word(nums,exact=2))("H")
fieldI = signedInteger("I")
fieldJ = Word(alphas)("J")
dataFields = fieldA + fieldB + fieldC + fieldD + fieldE + \
    fieldF + fieldG + fieldH + fieldI + fieldJ

res = dataFields.parseString(data)
print res.dump()

prints:

['0139', '725635', '99999', '2000-01-01T00:53', '4',
42.049999999999997, -102.8, 'FM-15', '+1198', 'KAIA']
- A: 0139
- B: 725635
- C: 99999
- E: 4
- H: FM-15
- I: +1198
- J: KAIA
- latitude: 42.05
- longitude: -102.8
- timestamp: 2000-01-01T00:53


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »