1. f_size = os.path.getsize(file_name)
2. fp1 = file(file_name, 'r')
   data = fp1.readlines()
   last_byte = fp1.tell()
I always get the same value when doing 1. or 2. Is there a reason I
should do both? When reading to the end of a file, won't tell() be just
as accurate as os.path.getsize()?
Thanks guys,
Bob
Read the docs. Note the hint that you get what the stdio serves up.
ftell() can only be _guaranteed_ to give you a magic cookie that you
may later use with fseek(magic_cookie) to return to the same place in a
more reliable manner than with Hansel & Gretel's non-magic
bread-crumbs. On 99.99% of modern filesystems, the cookie obtained by
ftell() when positioned at EOF is in fact the size in bytes. But why
chance it? os.path.getsize does as its name suggests; why not use it,
instead of a method with a side-effect? As for doing _both_, why would
you??
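
A minimal sketch of the "no side-effect" point, written for a current Python 3 rather than the Python 2 used elsewhere in this thread. The helper names and the temporary demo file are made up for illustration. Asking the filesystem for the size never opens the file, and os.fstat() answers the same question for a file that is already open without moving its position.

```python
import os
import tempfile

def size_without_reading(path):
    """Size in bytes as reported by the filesystem; the file is never opened."""
    return os.path.getsize(path)

def size_of_open_file(fobj):
    """Size of an already-open file, without moving its read position."""
    return os.fstat(fobj.fileno()).st_size

# Hypothetical demo file: 29 bytes written in binary mode, so no newline translation.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"qwertyuiop\nasdfghjkl\nzxcvbnm\n")
    demo_path = tmp.name

with open(demo_path, "rb") as f:
    pos_before = f.tell()
    fstat_size = size_of_open_file(f)
    pos_after = f.tell()          # unchanged: fstat() does not read or seek

getsize_size = size_without_reading(demo_path)
os.remove(demo_path)
```

Neither call needs to read the data, which is the main practical difference from the readlines()-then-tell() approach.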
You don't always get the same value, even on systems where `tell()`
returns a byte position. You need the rights to read the file in case 2.
>>> import os
>>> os.path.getsize('/etc/shadow')
612L
>>> f = open('/etc/shadow', 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 13] Permission denied: '/etc/shadow'
Ciao,
Marc 'BlackJack' Rintsch
On Windows, those two are not equivalent. Besides the newline conversion
done by reading text files, the solution in 2. will stop as soon as it sees
a ctrl-Z.
If you used 'rb', you'd be much closer.
--
- Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
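
To illustrate the text-mode versus 'rb' point, here is a rough sketch assuming Python 3, where newline translation on reading happens on every platform when newline=None. The file is a throwaway temporary file and the variable names are only for the demo. It compares the amount of data read, not the value of tell().

```python
import os
import tempfile

# Three lines stored with Windows-style "\r\n" endings: 32 bytes on disk.
raw = b"qwertyuiop\r\nasdfghjkl\r\nzxcvbnm\r\n"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    demo_path = tmp.name

disk_size = os.path.getsize(demo_path)

# Text mode: each "\r\n" is handed back as a single "\n".
with open(demo_path, "r", newline=None) as f:
    text_chars = sum(len(line) for line in f.readlines())

# Binary mode: bytes come back exactly as stored.
with open(demo_path, "rb") as f:
    binary_bytes = len(f.read())
    binary_tell = f.tell()

os.remove(demo_path)
```

In binary mode both the byte count and tell() match the size on disk. In text mode the count of characters read is smaller, so adding up what you read is not a substitute for os.path.getsize().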
Doesn't appear to me to go wrong due to newline conversion:
Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on win32
>>> import os.path
>>> txt = 'qwertyuiop\nasdfghjkl\nzxcvbnm\n'
>>> file('bob', 'w').write(txt)
>>> len(txt)
29
>>> os.path.getsize('bob')
32L ##### as expected
>>> f = file('bob', 'r')
>>> lines = f.readlines()
>>> lines
['qwertyuiop\n', 'asdfghjkl\n', 'zxcvbnm\n']
>>> f.tell()
32L ##### as expected
> the solution in 2. will stop as soon as it sees
> a ctrl-Z.
... and the value returned by f.tell() is not the position of the
ctrl-Z but more likely the position of the end of the current block --
which could be thousands/millions of bytes before the physical end of
the file.
Good ol' CP/M.
>
> If you used 'rb', you'd be much closer.
And be much less hassled when that ctrl-Z wasn't meant to mean EOF, it
just happened to appear in an unvalidated data field part way down a
critical file :-(
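
For the stray ctrl-Z case, a small sketch under the assumption of Python 3 and binary mode, where the byte 0x1A is treated like any other byte. The payload and the temporary file are invented for the demo. It only shows the 'rb' side of the argument and does not try to reproduce the old text-mode behaviour on Windows.

```python
import os
import tempfile

CTRL_Z = b"\x1a"
# Hypothetical record with a ctrl-Z sitting in the middle of the data.
payload = b"header\n" + CTRL_Z + b"more data after the ctrl-Z byte\n"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

# Binary mode reads straight through the ctrl-Z to the physical end of file.
with open(demo_path, "rb") as f:
    data = f.read()
    end_pos = f.tell()

reported_size = os.path.getsize(demo_path)
os.remove(demo_path)
```

Read this way, the position at end of file and the size reported by the filesystem agree, and nothing after the ctrl-Z is lost.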