> So if I use ls -la and the file size is reported as 8226 that's not how
> many bytes are in the file? 8225 is the offset of the last byte.
Logically, there are 8226 bytes in the file (with offsets 0 thru
8225, inclusive, for a total of 8226 bytes).
Physically, there may be more than that because disk space is
allocated in blocks and any unused fractional disk block is wasted.
Some file systems use a dual-blocksize scheme: if a file is smaller
than one BigBlock, disk allocated is a multiple of SmallBlock, but
if the file is larger than one BigBlock, disk is allocated in
multiples of BigBlock. Typical sizes for a modern desktop might
be SmallBlock = 2048 bytes and BigBlock = 16384 bytes.
Unless you have to micromanage disk space because you're always
critically short, you really don't care about the physical details.
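
The dual-blocksize rounding described above can be sketched in a few lines of C. This is only an illustration of the arithmetic as stated here, not any particular file system's allocator; the function name allocated_size and its parameters are made up for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: disk space allocated for a file of `size` bytes
 * under the dual-blocksize scheme described in the text.
 * Files larger than one big block are rounded up to a multiple of
 * big_block; smaller files are rounded up to a multiple of small_block.
 */
uint64_t allocated_size(uint64_t size, uint64_t small_block, uint64_t big_block)
{
    uint64_t unit = (size > big_block) ? big_block : small_block;
    uint64_t units = (size + unit - 1) / unit;   /* round up to whole units */
    return units * unit;
}
```

With the sizes quoted above, allocated_size(8226, 2048, 16384) comes to 10240 bytes (five small blocks), and allocated_size(20000, 2048, 16384) comes to 32768 bytes (two big blocks).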
Physically, there may be less than that in a "sparse file" because
disk blocks may not have been allocated for unwritten portions of
a file. (This enables such oddities as "terabyte-long" files on a
1.44MB floppy disk, at least according to "ls -l".)
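
One common way to end up with a sparse file on a POSIX system is to seek far past the start of a new file and write a single byte. A minimal sketch follows; the name make_sparse_file is made up for the example, and whether the skipped range really stays unallocated depends on the file system.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file whose logical length is `length`
 * bytes by writing only its final byte.  On file systems that support
 * sparse files, the skipped range need not be backed by disk blocks.
 * Returns 0 on success, -1 on failure.
 */
int make_sparse_file(const char *path, off_t length)
{
    if (length < 1)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Seek to the last byte and write one NUL there. */
    if (lseek(fd, length - 1, SEEK_SET) == (off_t)-1 ||
        write(fd, "", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Afterwards "ls -l" reports the full length, while "du" on the same file will typically report far less.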
A (larger) file may require "indirect blocks" which keep track of
blocks that are part of the file. Some of these are stored in an
inode. Back in the UNIX V7 days, the block size was 512 bytes, and
you got 10 block numbers in an inode, so if a file was larger than
5,120 bytes, it needed an indirect block.
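
The V7 arithmetic above, written out as a small C sketch. The constants come from the figures quoted here (512-byte blocks, 10 block numbers in the inode); the names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Figures quoted in the text for UNIX V7. */
enum { V7_BLOCK_SIZE = 512, V7_DIRECT_BLOCKS = 10 };

/*
 * Hypothetical helper: true if a file of `size` bytes needs more data
 * blocks than the inode can point to directly, i.e. it needs at least
 * one indirect block.
 */
bool needs_indirect_block(uint64_t size)
{
    uint64_t blocks = (size + V7_BLOCK_SIZE - 1) / V7_BLOCK_SIZE;  /* round up */
    return blocks > V7_DIRECT_BLOCKS;
}
```

A 5,120-byte file fits in exactly 10 blocks, so it needs no indirect block; at 5,121 bytes an 11th block is required, and with it an indirect block.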
The st_blocks value (commonly counted in 512-byte units, whatever the
file system's own block size) times its unit size may not equal the
st_size value rounded up to the next multiple of the block size:
unwritten data blocks are not counted, and indirect blocks may be.
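
To see both numbers for a given file, something along these lines should do on a POSIX system. It is a sketch, not a definitive tool: the helper names are made up, and the 512-byte unit for st_blocks is an assumption that holds on Linux and many other systems but is worth checking on yours.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Logical size in bytes (the number "ls -l" shows), or -1 on error. */
long long logical_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/*
 * Bytes of disk allocated to the file, or -1 on error.
 * ASSUMPTION: st_blocks is counted in 512-byte units.
 */
long long allocated_bytes(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_blocks * 512;
}
```

For an ordinary small file, allocated_bytes() is usually somewhat larger than logical_size(); for a sparse file it can be much smaller.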
It is not meaningful to talk about a "file size in system memory"
since a sufficiently large file will not *FIT* in memory, and even
if it does, the system will balance the needs for this file and
other files in use at the same time, and you can expect that amount
to change without notice. For example, if the program is spending
a long time waiting for console input, its resident memory usage
may go to nearly zero.
The number of characters a C program reads from a text file may
differ from the file size reported by a stat() equivalent on Windows,
because C text-mode streams there translate \r\n line endings to \n
on reading, and do the reverse on writing.
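
The text-mode difference can be observed by counting what fgetc() returns in each mode. A minimal sketch in standard C; the name count_chars is made up for the example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: number of characters read from `path` when it
 * is opened with the given fopen() mode ("r" for text, "rb" for
 * binary), or -1 if the file cannot be opened.
 */
long count_chars(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);
    if (f == NULL)
        return -1;

    long n = 0;
    while (fgetc(f) != EOF)
        n++;

    fclose(f);
    return n;
}
```

On Windows, for a file with \r\n line endings, count_chars(path, "r") comes out smaller than count_chars(path, "rb") by one per line ending. On Unix-like systems the two modes behave identically, so the counts match.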
A File System Accountant can use all sorts of measures to properly
bill you for the file according to policy, which is likely to have
all the convenience and understandability of IRS tax forms. That
might include billing for the size of the inode (Unix) and the size
of the directory entry (which depends on the length of the *name*
of the file), and sharing the cost of the inode between users with
different links to the same file.
> Hum. I
> guess you learn something new every day. I'm using ext4. Will it contain what
> I'm looking for? I want to know how many bytes are composing the file and
*Logically*, the number of bytes in the file is given by "ls -l".
The amount of disk space used to store it may not be the same value.
> how the OS treats the file in RAM. That might be the kernel's memory
> management job rather than sys calls. But don't all system calls call on
> kernel functions?
How much RAM is used for a particular file is subject to constant change,
and if you REALLY, REALLY need to know this EXACTLY, you're in deep,
deep trouble, because you will be wrong.