info: important dbf module performance hint

Mirek Zvolsky

Nov 4, 2016, 6:41:16 AM11/4/16
to Python dBase
It seems the dbf module is extremely slow in some scenarios.

import dbf

t = dbf.Table(filename)
t.open('read-only')
records = []
for rec in t:
    rec1 = rec   # this is fast (but gives nothing new :) )
    flds = rec['id_person'], rec['name']   # this is fast, and we have the field contents for the next step
    records.append(rec)   # this is extremely slow (same when saving into a dict)!
t.close()

So, be careful.
If you read the file and want to save the records for later work (into a list, a dict, ...),
then save the field values (or copy them into a plain Python dictionary) instead of saving the record objects from the dbf module!

import dbf

flds = dbf.get_fields(filename)
t = dbf.Table(filename)
t.open('read-only')
records = []
for rec in t:
    record = {fld: rec[fld] for fld in flds}
    records.append(record)   # this is fast
t.close()

This is not only an issue with large dbf files.
On my machine, reading 80,000 records from one table
took 30 minutes,
and with the changed code it takes about 10 seconds.
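If you want to measure the difference on your own files, a small stdlib-only timing helper can wrap either loop. The helper below is a sketch: the `copy_rows` function mirrors the dict-copying loop above, and the demo uses synthetic dicts instead of a real dbf table (the dbf calls themselves are only shown in comments):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def copy_rows(rows, fields):
    """Copy each row into a plain dict keyed by field name."""
    return [{fld: row[fld] for fld in fields} for row in rows]

# With a real table you would time the two approaches like:
#   _, slow = timed(lambda: [rec for rec in t])   # keeps dbf record objects
#   _, fast = timed(copy_rows, t, flds)           # plain dicts only
# Self-contained demo with synthetic rows:
rows = [{'id_person': i, 'name': 'person %d' % i} for i in range(1000)]
copies, elapsed = timed(copy_rows, rows, ['id_person', 'name'])
print(len(copies))   # → 1000
```

The copies are ordinary dicts with no reference back to the source rows, so they are cheap to store and keep around.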

So now my code is usable in a web application where a user uploads his 12 dbf files and wants to see them imported quickly.
And of course, before this change the web server load was crazy.
