Pointers and Memory-Mapped Files


Heather Mason

Mar 19, 2012, 2:40:47 PM
Hey everyone,

I'm attempting to work with a large file which multiple processes may
read from and write to. I plan to use memory mapping to point to
specific portions of the file I would like to modify. My question is how
to properly handle an instance in which the file expands from the front,
thereby disrupting the byte offset used in my pointer. For example,

data = mmap((caddr_t)0, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

In this case, the offset is 0, and I wish to modify the beginning of the
file. If another process were to input 4 bytes into the beginning of the
file, when I write to the file, I am suddenly overwriting the 4 bytes of
inserted data and the file becomes, essentially, corrupted for the
purposes of this program.

I suppose one solution would be to use a mutex to lock the file when
performing any operation, but I find this to be less than desirable and
essentially defeating the purpose of the program. Is there an efficient
means to track whether the pointer no longer points to the original data
and, if so, by how much the original data has been offset?
--
comp.lang.c.moderated - moderation address: cl...@plethora.net -- you must
have an appropriate newsgroups line in your header for your mail to be seen,
or the newsgroup name in square brackets in the subject line. Sorry.

George Neuner

Mar 25, 2012, 10:43:54 PM
On Mon, 19 Mar 2012 13:40:47 -0500 (CDT), Heather Mason
<hea...@mason.com> wrote:

>Hey everyone,
>
>I'm attempting to work with a large file which multiple processes may
>read from and write to. I plan to use memory mapping to point to
>specific portions of the file I would like to modify. My question is how
>to properly handle an instance in which the file expands from the front,
>thereby disrupting the byte offset used in my pointer. For example,
>
>data = mmap((caddr_t)0, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>
>In this case, the offset is 0, and I wish to modify the beginning of the
>file. If another process were to input 4 bytes into the beginning of the
>file, when I write to the file, I am suddenly overwriting the 4 bytes of
>inserted data and the file becomes, essentially, corrupted for the
>purposes of this program.

This part is a bit unclear (though I think I know the answer). Is the
first write an insertion that should have made the entire file longer
by 4 bytes, or is it simply a case of 2 processes writing to the same
location with no coordination?

>I suppose one solution would be to use a mutex to lock the file when
>performing any operation, but I find this to be less than desirable and
>essentially defeating the purpose of the program. Is there an efficient
>means to track whether the pointer no longer points to the original data
>and, if so, by how much the original data has been offset?

There is no file pointer. When you use a memory map, you do *not* see
a file, you see a block of memory just as if you had used malloc().
When you share a mapping between processes, you are, in effect, only
sharing the block of memory. The VMM system transparently writes
changes made to the memory block into the file underneath, but there
is no inherent write coordination among the processes sharing it.
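
To make the point concrete, here is a minimal sketch (not from the
original posts). It assumes a POSIX system and uses two MAP_SHARED views
inside one process to stand in for two cooperating processes. The helper
name shared_views_alias and the 4096-byte length are made up for the
illustration. A store through one view is visible through the other
without any write() call, and nothing in the mapping coordinates the two
writers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes ftruncate() under strict -std=c11 */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: map the start of `path` twice
 * and check whether a store through one view shows up in the other.
 * Returns 1 if it does, 0 if it does not, -1 on error. */
int shared_views_alias(const char *path)
{
    const size_t len = 4096;               /* assumed mapping length */
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {  /* file must cover the mapped range */
        close(fd);
        return -1;
    }

    /* Both views are tied to file offset 0, not to any particular data. */
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                             /* mappings stay valid after close */
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
        return -1;
    }

    memcpy(a, "ABCD", 4);                  /* plain memory store, no lock taken */
    int visible = (memcmp(b, "ABCD", 4) == 0);

    munmap(a, len);
    munmap(b, len);
    return visible;
}
```

Calling shared_views_alias("/tmp/some_scratch_file") on a writable path
should return 1. Any ordering or locking between writers has to be added
on top, for example with a process-shared mutex or fcntl() record locks.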

Coordinating writes with a mutex won't prevent corruption if the file
really needs to expand due to an insertion. There isn't a simple way
to handle that even with "normal" files - unless you do something
clever, you need to make room for an insertion by moving - i.e.
rewriting - all of the data that follows the insertion point.
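
As a rough sketch of what "making room" involves (again assuming POSIX,
with a hypothetical helper name insert_bytes and no locking at all), the
file is grown first, then everything from the insertion point onward is
shifted toward the end before the new bytes are copied in. Any pointer or
offset another process holds into the shifted region now refers to
different data, which is the hazard described in the question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes ftruncate() under strict -std=c11 */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: insert `n` bytes from `src`
 * at byte offset `pos` of the file at `path`.
 * Returns 0 on success, -1 on error. Does no locking of any kind. */
int insert_bytes(const char *path, off_t pos, const void *src, size_t n)
{
    assert(path != NULL && (src != NULL || n == 0));
    if (n == 0)
        return 0;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || pos < 0 || pos > st.st_size) {
        close(fd);
        return -1;
    }
    size_t old_len = (size_t)st.st_size;
    size_t new_len = old_len + n;
    size_t at = (size_t)pos;

    /* Grow the file before mapping; touching pages past EOF would fault. */
    if (ftruncate(fd, (off_t)new_len) != 0) {
        close(fd);
        return -1;
    }
    char *p = mmap(NULL, new_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        (void)ftruncate(fd, st.st_size);   /* best effort: undo the growth */
        close(fd);
        return -1;
    }
    close(fd);

    /* Rewrite the tail: every byte from `at` onward moves up by n. */
    memmove(p + at + n, p + at, old_len - at);
    memcpy(p + at, src, n);

    int rc = msync(p, new_len, MS_SYNC);   /* flush changes to the file */
    munmap(p, new_len);
    return rc;
}
```

For a file containing HELLOWORLD, insert_bytes(path, 0, "ABCD", 4) leaves
ABCDHELLOWORLD. The cost is proportional to the amount of data after the
insertion point, so inserting at offset 0 rewrites the whole file.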

I can think of several ways to handle this kind of situation, but any
of them might be overkill for your purpose. Without divulging any
secrets, can you expand on the intended use? If we have a better idea
of what you really need, we might be able to give you some better
suggestions.

George

Dag-Erling Smørgrav

Mar 25, 2012, 10:42:53 PM
Heather Mason <hea...@mason.com> writes:
> In this case, the offset is 0, and I wish to modify the beginning of
> the file. If another process were to input 4 bytes into the beginning
> of the file, when I write to the file, I am suddenly overwriting the 4
> bytes of inserted data and the file becomes, essentially, corrupted
> for the purposes of this program.

There is no way to insert data at the beginning of a file without
rewriting the entire file.

DES
--
Dag-Erling Smørgrav - d...@des.no