CSV with no header row

robi...@e2eservices.co.uk

Apr 6, 2017, 5:24:21 AM
to python-etl
Hello,

I have what's hopefully a really simple question, but I can't see anything obvious in the docs for it.

I want to ingest a CSV file that has no header row, and either pre-set the headers for the table or, ideally, tell `etl.fromcsv(...)` what the headers should be called.

At the moment I've only really tried the following:

>>> import petl as etl
>>> table = etl.fromcsv("test.csv")
>>> table
+------------+------------------------------------+-----+-------+-------+
| 11/02/17   | this is a string                   | 1   | 6     | asd   |
+============+====================================+=====+=======+=======+
| '12/02/17' | 'oo another string'                | '2' | '7'   | 'qwe' |
+------------+------------------------------------+-----+-------+-------+
| '24/03/17' | 'one more time lets have a string' | '3' | '5.5' | 'zxc' |
+------------+------------------------------------+-----+-------+-------+


It looks like the fromcsv function would need a new argument to set the header on the underlying table object (much like the fromcolumns function).

Thanks,
Robin

robi...@e2eservices.co.uk

Apr 6, 2017, 6:20:51 AM
to python-etl
I've got the behaviour I wanted, but by using petl.io.json.fromdicts() and csv.DictReader:

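>>> import csv
>>> # fieldnames supplies the column names, so the first row is read as data
>>> fieldnames = ['1','2','3','4','5']
>>> sourcerows = []
>>> with open('test.csv', newline='') as csvfile:
...     reader = csv.DictReader(csvfile, fieldnames=fieldnames)
...     for row in reader:
...         sourcerows.append(row)
...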
>>> table = etl.json.fromdicts(sourcerows)
>>> table
+------------+------------------------------------+-----+-------+-------+
| 1          | 2                                  | 3   | 4     | 5     |
+============+====================================+=====+=======+=======+
| '11/02/17' | 'this is a string'                 | '1' | '6'   | 'asd' |
+------------+------------------------------------+-----+-------+-------+
| '12/02/17' | 'oo another string'                | '2' | '7'   | 'qwe' |
+------------+------------------------------------+-----+-------+-------+
| '24/03/17' | 'one more time lets have a string' | '3' | '5.5' | 'zxc' |
+------------+------------------------------------+-----+-------+-------+
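
A lighter variant of this workaround can be sketched with only the standard library: read the rows with csv.reader and prepend the field names as the first row. The result is a plain list of rows with the header first, which is the row-container shape petl works with. The helper name with_header and the in-memory sample data are hypothetical, used here only for illustration.

```python
import csv
import io


def with_header(lines, fieldnames):
    """Parse headerless CSV lines and return rows with fieldnames prepended."""
    rows = [list(fieldnames)]       # header row goes first
    rows.extend(csv.reader(lines))  # every CSV line is kept as data
    return rows


# In-memory stand-in for a headerless file such as test.csv
sample = io.StringIO(
    "11/02/17,this is a string,1,6,asd\n"
    "12/02/17,oo another string,2,7,qwe\n"
)

table = with_header(sample, ["1", "2", "3", "4", "5"])
for row in table:
    print(row)
```

This avoids building one dict per row, at the cost of keeping all rows in memory.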

Would still like to know if there is a way to do it via fromcsv, or if one could be added.


robi...@e2eservices.co.uk

Apr 6, 2017, 6:38:07 AM
to python-etl
Wasn't too hard to just add the functionality myself

Alistair Miles

Apr 6, 2017, 7:22:52 AM
to pytho...@googlegroups.com
Thanks Robin, that's a nice convenience feature, have merged.

FWIW there is also a pushheader() function which can be used to prepend a header row in any situation where you are without one. E.g., etl.fromcsv('headerless.csv').pushheader(['1', '2', '3', '4', '5']).

Cheers,
Alistair

--
Alistair Miles
Head of Epidemiological Informatics
Centre for Genomics and Global Health <http://cggh.org>
The Wellcome Trust Centre for Human Genetics
Roosevelt Drive
Oxford
OX3 7BN
United Kingdom
Email: alim...@googlemail.com
Web: http://purl.org/net/aliman
Twitter: https://twitter.com/alimanfoo
Tel: +44 (0)1865 287721