Newsgroups: comp.lang.python
From: Jorgen Grahn <grahn+n...@snipabacken.dyndns.org>
Date: 20 Jan 2008 13:53:54 GMT
Local: Sun, Jan 20 2008 8:53 am
Subject: Re: Efficient processing of large nuumeric data file
On Fri, 18 Jan 2008 09:15:58 -0800 (PST), David Sanders <dpsand...@gmail.com> wrote:
...
> Hi,
> I am processing large files of numerical data. Each line is either a
> My question is how to process such files efficiently to obtain a
> The data files are large (~100 million lines), and this code takes a
> long time to run (compared to just doing wc -l, for example).

I don't know if you are in control of the *generation* of data, but
I think it's often better and more convenient to pipe the raw data
through 'gzip -c' (i.e. gzip-compress it before it hits the disk)
than to figure out a smart application-specific compression scheme.

Maybe if you didn't have a homegrown file format, there would have

/Jorgen

--
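
For illustration, here is a minimal sketch of the gzip suggestion done from inside Python with the standard gzip module, instead of piping through 'gzip -c' in the shell. The line format (whitespace-separated numbers on each line) and the helper names write_rows and read_rows are assumptions made for the example, not something taken from the original post.

```python
import gzip


def write_rows(path, rows):
    """Write each row of numbers as one text line, compressed on the way to disk.

    Hypothetical helper: assumes every row is an iterable of numbers.
    """
    with gzip.open(path, "wt") as f:
        for row in rows:
            f.write(" ".join(str(x) for x in row) + "\n")


def read_rows(path):
    """Yield each line of a gzip-compressed text file as a list of floats.

    The file is decompressed as a stream, so the whole data set never has
    to fit in memory at once.
    """
    with gzip.open(path, "rt") as f:
        for line in f:
            yield [float(tok) for tok in line.split()]
```

The processing loop stays the same as for a plain text file; only the open call changes. Whether this is faster overall depends on how disk speed compares with the CPU cost of decompression, so it is worth timing on the real data.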