How to optimize file-reading speed in Python

AndyChung

Jan 2, 2012, 5:55:23 AM

to python-cn-free (Chinese Python mailing list)

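*Environment:*
Python 2.6 + Django 1.2.5 + MySQL + DotCloud
*Background:*
I have several thousand text files.
Each file contains roughly 200-400 records.
(The attachment is a small sample of 4 files:
1.txt has 1 record, 2.txt has 2 records,
3.txt has 3 records, and 4.txt has 4 records.)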
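*Goal:*
Convert every record in each file into a fixed format,
then collect the results into a list, record_query_vector.
(Please don't suggest psyco or PyPy for now; I'd like to start by improving the code itself.)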
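
As an illustration of what the "fixed format" appears to mean, judging from the program later in this post: the first four whitespace-separated fields of a line become the dictionary key, and the remaining fields become a list of integers. The helper name `parse_record` and the sample line are hypothetical; the contents of the attached files are not shown in the post.

```python
def parse_record(line):
    """Split one record into (key, values), mirroring the posted program."""
    fields = line.split()                  # whitespace-separated fields
    key = ' '.join(fields[:4])             # first four fields form the key
    values = [int(n) for n in fields[4:]]  # remaining fields are integers
    return key, values


# Hypothetical sample line (not taken from the attached files).
key, values = parse_record("a b c d 1 2 3")
print("%s -> %s" % (key, values))
```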
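*Problem:*
My current program is shown below. Most of the time goes to reading the files:
reading just a few hundred files (many loop iterations) takes nearly 40 seconds.
How should I change it to make it faster?
Any pointers, or a revised version of the code, would be much appreciated. Thank you.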
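
One way to check where the 40 seconds actually go is to time the raw read and the parsing separately. The sketch below is only an assumption about how that could be measured: the function name `time_read_and_parse` is hypothetical, and the parsing step mirrors the program listed later in this post (skip the first line, first four fields as key, the rest as integers).

```python
import time


def time_read_and_parse(paths):
    """Return (seconds spent reading, seconds spent parsing) over all paths."""
    read_s = 0.0
    parse_s = 0.0
    for path in paths:
        t0 = time.time()
        with open(path) as f:
            text = f.read()                    # raw I/O only
        t1 = time.time()
        for line in text.splitlines()[1:]:     # skip the first line
            fields = line.split()
            key = ' '.join(fields[:4])
            values = [int(n) for n in fields[4:]]
        t2 = time.time()
        read_s += t1 - t0
        parse_s += t2 - t1
    return read_s, parse_s
```

If the read time dominates, the bottleneck is more likely storage or network access than the Python loop, and rewriting the parsing code would help little.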
*Variables:*
List of all files: all_f = [ ]
The code wrapped in try/finally is what converts each file's data into the target format.
for ts in TData.objects.raw('SELECT id, f_url FROM ts_to'):
    all_f.append(ts.f_url)
Link to the current file: f_url
record_query_vector = [ ]
*Program:*
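count = 0  # number of records parsed (not initialized in the original listing)
for f_url in all_f:
    test_dict = {}  # dictionary: key string -> list of integers
    f = open(f_url)  # renamed from `file`, which shadows the built-in
    try:
        next(f)  # skip the first line of the file
        for line in f:  # iterate over the file directly instead of readlines()
            fields = line.split()  # split once rather than twice
            dk = ' '.join(fields[:4])  # string: the dictionary key
            dv = [int(n) for n in fields[4:]]  # list of ints: the dictionary value
            test_dict[dk] = dv
            count += 1
    finally:
        f.close()
    record_query_vector.append(test_dict.values())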
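
A possible restructuring, offered as a sketch rather than a measured fix: wrap the per-file work in a function, split each line only once, and let a `with` block close the file. The names `load_file` and `load_all` are hypothetical, and skipping blank lines is an assumption not present in the original program. Whether this shortens the 40 seconds depends on how much of that time is parsing rather than I/O.

```python
def load_file(path):
    """Parse one file into a dict of key -> list of ints, skipping line 1."""
    records = {}
    with open(path) as f:
        next(f, None)                 # skip the first line; tolerate an empty file
        for line in f:
            fields = line.split()     # split once per line
            if not fields:
                continue              # assumption: ignore blank lines
            records[' '.join(fields[:4])] = [int(n) for n in fields[4:]]
    return records


def load_all(paths):
    """Build one list of value lists per file, like record_query_vector."""
    return [list(load_file(p).values()) for p in paths]
```

With the variables from the post, the call would be `record_query_vector = load_all(all_f)`.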
