> As per my understanding, I can explain the working of mmap() briefly
> like this: when a process (let's call it process1) calls mmap() on a
> regular file, that file is first copied to the page cache. Then the
> region of the page cache which contains the file is mapped into the
> virtual address space of process1 (this memory region is called a
> memory-mapped file). If another process (let's call it process2) calls
> mmap() on the same file, the same page-cache pages that were mapped
> into process1 get mapped into the virtual address space of process2.
> When a process wants to access the file, it simply accesses this
> memory-mapped region, which can be much faster. Also, data modified by
> process1 can be seen by process2.
It *may* be faster. But address-space manipulations, the page faults
taken while populating parts of a process's virtual address space, and
cache and TLB misses are all expensive operations, so it may well not
be.
> I have a query here. Please clarify it:
> When the process1 wants to write some data to the file, it will write
> to this memory mapped file. Then these dirty pages that are private to
> the process1 should be copied to the page cache. When will the kernel
> do this copying to page cache and how frequently?
Not at all. If the mapping is done with MAP_PRIVATE, the process gets
its own copy of each page as soon as it first writes to it
(copy-on-write); those private copies are never written back to the
file. For MAP_SHARED mappings, all processes mapping the same file, as
well as the kernel's page cache, share the same pages, so stores go
straight into the page cache.