mergemem: announce & design issues


Philipp Reisner

Mar 17, 1998

Hi Linux mm hackers,

Linux is able to share anonymous mappings between running processes by
employing copy on write.
But this is hardly used, because most processes are started by a
shell. For example, if two users are each running a netscape, the
processes are not in a parent-child relationship, so their anonymous
mappings are not shared.

Mergemem can merge these anonymous mappings of already running processes.
This is done as follows:
*) Search for equal pages in the anonymous mappings (done in user-land).
*) The user-level process only sees a checksum and the map-count of a
   page (checksum calculation in kernel-land).
*) If two equal pages are found, map one of them into the vm of both
   processes, and free the spare page (done in kernel-land).
*) If one of the processes writes to that page afterwards, the
   normal copy on write takes place.
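
The merge and the later copy on write can be modelled in plain user-space C. This is only a sketch under stated assumptions: struct frame, struct mapping, map_new_page, merge_pages and write_byte are hypothetical names invented for the illustration, and nothing here touches real page tables.

```c
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical stand-in for a physical page with its map count. */
struct frame {
    unsigned char data[PAGE_SIZE];
    int count;                      /* how many mappings point here */
};

/* Hypothetical stand-in for one page table entry of one process. */
struct mapping {
    struct frame *frame;
    int writable;
};

/* Give a mapping its own writable page, filled with one byte value. */
static int map_new_page(struct mapping *m, unsigned char fill)
{
    m->frame = malloc(sizeof *m->frame);
    if (!m->frame)
        return -1;
    memset(m->frame->data, fill, PAGE_SIZE);
    m->frame->count = 1;
    m->writable = 1;
    return 0;
}

/* Merge b's page into a's page if both hold the same bytes.
 * Returns 1 if merged, 0 if nothing was changed. */
static int merge_pages(struct mapping *a, struct mapping *b)
{
    if (a->frame == b->frame)
        return 0;                   /* already shared */
    if (b->frame->count != 1)
        return 0;                   /* spare page is mapped elsewhere too */
    if (memcmp(a->frame->data, b->frame->data, PAGE_SIZE) != 0)
        return 0;                   /* not equal */
    free(b->frame);                 /* free the spare page */
    b->frame = a->frame;
    a->frame->count++;
    a->writable = 0;                /* both mappings become read-only */
    b->writable = 0;
    return 1;
}

/* Write one byte; a write to a shared read-only page copies it first.
 * Returns 0 on success, -1 on a bad offset or allocation failure. */
static int write_byte(struct mapping *m, size_t off, unsigned char value)
{
    if (off >= PAGE_SIZE)
        return -1;
    if (!m->writable) {
        if (m->frame->count > 1) {
            struct frame *copy = malloc(sizeof *copy);
            if (!copy)
                return -1;
            memcpy(copy->data, m->frame->data, PAGE_SIZE);
            copy->count = 1;
            m->frame->count--;
            m->frame = copy;
        }
        m->writable = 1;
    }
    m->frame->data[off] = value;
    return 0;
}
```

Two mappings created with the same fill byte merge into one frame with count 2; the first write_byte through either mapping gives that mapping a private copy again and leaves the other one untouched.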

We are quite satisfied with the gain in free memory, and the CPU cycles
used by the search process are affordable.

Design issues:
*) The user-level / kernel interface is a mess (currently done by writes
   to an entry in the proc-fs). What is the good thing(tm)?
   *) a device and ioctls? (We are looking at this already)
   *) or a syscall?
*) It is currently not SMP safe. Can someone give me a hint on how to do
   it? (A spin_lock(mmlck) at the beginning of mergemem & do_wp_page?)
*) If I include it in a 2.1.xx kernel and make a patch of it, will
   it be included in future kernels? (Linus?)

Status:
mergemem-0.05 is usable. (We are using it; even when my girlfriend
touches the computer, it does not crash, thus it must be bug-free :-) )
(All previous releases had bugs... dying processes, OOPSs...)
We are already working on 0.06, but I want to hear your opinion on
mergemem.

The working package is at http://www.mondoshawan.ml.org,
here is the core part, so the mm hackers can have a first look at it.

static int mergemod(int pid1, unsigned long addr1,
                    int pid2, unsigned long addr2)
{
    struct task_struct *tsk1, *tsk2;
    char *page1, *page2;
    pte_t *pte1, *pte2, pte;

    tsk1 = find_task_by_pid(pid1);
    if (!tsk1)
        return MERGEMEM_NOTASK1;
    tsk2 = find_task_by_pid(pid2);
    if (!tsk2)
        return MERGEMEM_NOTASK2;

    page1 = get_phys_addr(tsk1, addr1, &pte1);
    if (!page1)
        return MERGEMEM_NOPAGE1;
    page2 = get_phys_addr(tsk2, addr2, &pte2);
    if (!page2)
        return MERGEMEM_NOPAGE2;

    if (page1 == page2)
        return MERGEMEM_ALREADYSH;

    /* page2 is the spare page; it may only be mapped once */
    if (atomic_read(&mem_map[MAP_NR(page2)].count) != 1)
        return MERGEMEM_MOREMAP;

    cli(); /* start of critical section */
    if (memcmp(page1, page2, PAGE_SIZE)) {
        sti(); /* possible end of critical section */
        return MERGEMEM_NOTEQUAL;
    }

    /* Do the actual work... */

    /* new ptes are read-only and point to page1 */
    pte = pte_wrprotect(*pte1);

    /* increase the count in the corresponding mem_map structure */
    atomic_inc(&mem_map[MAP_NR(page1)].count);

    /* page1 must leave the swap cache, since it is now mapped more
     * than once
     */
    if (delete_from_swap_cache(&mem_map[MAP_NR(page1)]))
        pte = pte_mkdirty(pte);

    set_pte(pte1, pte);
    set_pte(pte2, pte_mkold(pte));

    sti(); /* possible end of critical section */

    /* free page2 */
    free_page((unsigned long)page2);

    /* decrease the rss counter */
    tsk2->mm->rss--;

    /* No TLB flushing needed, because the processes whose PTEs were
     * altered are not running at the moment.
     */
    return MERGEMEM_SUCCESS;
}

------------------------------------------------------------------------
Want to try something new? Are you a Linux hacker?
Volunteer in testing mergemem!
(Get it from http://www.mondoshawan.ml.org/mergemem)
-----
Philipp Reisner E-Mail mailto:e952...@student.tuwien.ac.at



Regis Duchesne

Mar 17, 1998

> Mergemem can merge this anonymous mappings of already running processes.
This concept is quite interesting

> *) If two equal pages are found, map one of them to the vm of both
> processes, and free the spare page. (done in kernel-land)

What do you mean by "two equal pages"? Do you only compare both checksums?
I hope that when checksums match, you compare both entire pages too, so
that the checksum is used as a hashcode only.

I wouldn't want to see a process crash because the checksums were the same
and the pages weren't.

Regards,

Regis "HPReg" Duchesne - Engineering Student at ***** ******** *****
www http://www.via.ecp.fr/~regis/
(O o) I use Linux & 3Com (1135 KB/s over 10Mb/s ethernet)
--.oOO--(_)--OOo.-----------------------------------------------------------

Philipp Reisner

Mar 17, 1998

Regis Duchesne wrote:
>
> > Mergemem can merge this anonymous mappings of already running processes.
> This concept is quite interesting
>
> > *) If two equal pages are found, map one of them to the vm of both
> > processes, and free the spare page. (done in kernel-land)
> What do you mean by "two equal pages"? Do you only compare both checksums?
> I hope that when checksums match, you compare both entire pages too, so
> that the checksum is used as a hashcode only.

Of course, the module compares the pages with memcmp
(see the function in the original posting; it is taken from the module).
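
The "checksum as hash code, memcmp as the real test" idea can be sketched in a few lines of C. The thread does not say which checksum mergemem computes, so the FNV-1a hash below is purely an assumption for illustration, and page_checksum and pages_equal are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Illustrative checksum only (32-bit FNV-1a); not mergemem's own. */
static uint32_t page_checksum(const unsigned char *page)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    size_t i;

    for (i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* The checksum only selects candidates; the byte-wise compare decides.
 * Returns 1 if the pages are really equal, 0 otherwise. */
static int pages_equal(const unsigned char *p1, uint32_t sum1,
                       const unsigned char *p2, uint32_t sum2)
{
    if (sum1 != sum2)
        return 0;                       /* different sums: pages differ */
    return memcmp(p1, p2, PAGE_SIZE) == 0;
}
```

Passing two different pages together with identical (colliding) checksums still yields 0, which is the case Regis was worried about.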

Yours Philipp.


Philipp Reisner

Mar 17, 1998

Jacques Gelinas wrote:
>
> Do you have some statistics on the saving with some programs you know
> benefit from that ?
>

It's hard to forecast the benefit for a program...
But some examples:
(intel, two instances of the named program)

netscape3           240KB
xemacs (19.14)      520KB
X (XSuSE_NVidia)    168KB  ( I also heard of results of up to 1.5MB )
bash                104KB  ( ! )

From memory:
emacs                     ~0KB
sixtus_prolog on alpha    ~100%

Jacques Gelinas

Mar 17, 1998

On Tue, 17 Mar 1998, Philipp Reisner wrote:

> Jacques Gelinas wrote:
> >
> > Do you have some statistics on the saving with some programs you know
> > benefit from that ?
> >
>
> It's hard to forecast the benefit for a program...
> But some examples:
> (intel, two instances of the named program)
>
> netscape3 240KB
> xemacs(19.14) 520KB
> X (XSuSE_NVidia) 168KB ( I also heard of results of up do 1.5MB )
> bash 104KB ( ! )


Are you telling me that each bash process has around 104k of
unshared stuff that is identical in each process?

What are those anonymous mmaps, btw?



> >From memory:
> emacs ~0KB
> sixtus_prolog on alpha ~100%
>

--------------------------------------------------------
Jacques Gelinas (jac...@solucorp.qc.ca)
Linuxconf: The ultimate administration system for Linux.
see http://www.solucorp.qc.ca/linuxconf
new developments: remote GUI admin, multiple machines admin, wu-ftpd

g...@ffa.se

Mar 17, 1998

I don't know if this is the right place to talk about this, but since I
can't find a better one (and there should be some interest here too),
I'm writing here in the hope of getting an answer.

What is the status of Linux and year 2000 compliance? I have tried
to hunt down information about this but have failed totally.
Is the Linux kernel ready for Y2k, or are there still some "dirty" uses
of time in there? I'm no kernel expert, but I think that there should be
a lot of time/date usage in the kernel, especially when we are dealing
with filesystems.

If things aren't already fixed and set up for the millennium, then it's
about time to start working on getting everything ready.
If the time() function works correctly under Linux, there shouldn't
be any problem until 2038. time() returns the time since 00:00:00 GMT,
January 1, 1970, measured in seconds, and stores this in a 32-bit long
int. So if everything is done right with the filesystems and other places
where dates or time values might be used, everything should be cool.

But if there is some other strange usage of times and dates that does not
use time(), then there might be a problem.

I also think there should be a doc/FAQ or something written about this
matter to show the big dragons in the computer community that Linux
isn't a small toy OS for hackers, but rather a competitor to them, and
that we really care! Almost every software/OS developer with
self-respect has written some kind of report/publication on this matter.
So should the Linux community. Why not start with the kernel and get
everything checked, so that the kernel has no problem when the year 2000
comes? When this is done, be sure to put up some kind of doc/FAQ stating
that the kernel is Y2k compliant and explaining how to write software
for Linux that is Y2k compliant as well.

I hope you get my point and that I'm not totally off topic on this list.
If I am, please point me to where I should go for information.

Thank You for Your time!

Regards Eje Gustafsson
---
(This is my opinion and not necessary the opinion of my employer)
_______________________________________________________________________

Eje Gustafsson mailto:g...@ffa.se
SystemAdministrator http://www.ffa.se
THE AERONAUTICAL RESEARCH INSTITUTE OF SWEDEN Telephone: +46-8-6341297
Ranhammarsv 12-14 Telefax: +46-8-253481
Box 11021 SE-16111 Bromma Sweden Mobile: +46-70-7552331
_______________________________________________________________________

Philipp Reisner

Mar 17, 1998

> > > Do you have some statistics on the saving with some programs you know
> > > benefit from that ?
> > It's hard to forecast the benefit for a program...
> > But some examples:
> > (intel, two instances of the named program)
> >
> > bash 104KB ( ! )
>
> Are you telling me that each bash process have 104k (around that) of
> unshared stuff that is identical in each process ?
>
> What are those anonymous mmap btw ?
>
Last first:
Anonymous mappings are all mappings that were not mapped from a file
with mmap. For example: the stack, memory returned from malloc().

You can see the mappings of a process by typing: cat /proc/xx/maps
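
As a rough sketch, one line of that maps file can be classified in C like this. It assumes the usual "start-end perms offset dev inode [path]" layout and treats an inode of 0 as "anonymous"; is_anonymous_mapping is a hypothetical helper, not part of mergemem, and the bash path and inode used in the usage note are made up.

```c
#include <stdio.h>

/* Classify one line of /proc/<pid>/maps.
 * Returns 1 for an anonymous mapping (inode 0), 0 for a file mapping,
 * and -1 if the line does not look like a maps entry. */
static int is_anonymous_mapping(const char *line)
{
    unsigned long start, end, offset, inode;
    char perms[5];
    char dev[16];

    if (sscanf(line, "%lx-%lx %4s %lx %15s %lu",
               &start, &end, perms, &offset, dev, &inode) != 6)
        return -1;
    return inode == 0;
}
```

For example, "080ad000-080fc000 rwxp 00000000 00:00 0" counts as anonymous, while "08048000-080ad000 r-xp 00000000 03:02 12345 /bin/bash" counts as a file mapping.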

I just checked it again: if the bashes are run by different users,
I get only 20KB.
For 2 bashes of root I get 88KB.
For 2 bashes of philipp running in two kvts (xterm replacement) I
get 240KB.
For 2 bashes of philipp running in one kvt I get 80KB.

Why not try it yourself? The installation of mergemem is
quite easy!

The interesting number is free + buffers:

philipp@phil:/home/philipp/src/uni/mergemem > free
total used free shared buffers cached
Mem: 63244 52720 10524 32672 5508 23228
-/+ buffers: 23984 39260
Swap: 35544 0 35544
philipp@phil:/home/philipp/src/uni/mergemem > bash
philipp@phil:/home/philipp/src/uni/mergemem > free
total used free shared buffers cached
Mem: 63244 53192 10052 33376 5508 23236
-/+ buffers: 24448 38796
Swap: 35544 0 35544
philipp@phil:/home/philipp/src/uni/mergemem > mergemem -n bash
mergemem started
bash is running as 1660,1705
loading maps for process 1705
process has 6 anonymous and 16 file mappings
loading maps for process 1660
process has 6 anonymous and 16 file mappings
comparing mapping 80ad000-80fc000 (79 pages, 316 KB)
pid 1705: 16p merged, 0p shared, 10p misses, 53p not merged
pid 1660: 16p merged, 0p shared, 10p misses, 53p not merged
comparing mapping 40007000-40008000 (1 pages, 4 KB)
pid 1705: 1p merged, 0p shared, 0p misses, 0p not merged
pid 1660: 1p merged, 0p shared, 0p misses, 0p not merged
comparing mapping 4004a000-4004d000 (3 pages, 12 KB)
pid 1705: 1p merged, 0p shared, 0p misses, 2p not merged
pid 1660: 1p merged, 0p shared, 0p misses, 2p not merged
comparing mapping 400d8000-4010a000 (50 pages, 200 KB)
pid 1705: 1p merged, 0p shared, 1p misses, 48p not merged
pid 1660: 1p merged, 0p shared, 1p misses, 48p not merged
comparing mapping 40112000-40113000 (1 pages, 4 KB)
increasing vm_end to 40114000
pid 1705: 1p merged, 0p shared, 0p misses, 1p not merged
pid 1660: 1p merged, 0p shared, 0p misses, 1p not merged
comparing mapping bfffd000-c0000000 (3 pages, 12 KB)
pid 1705: 0p merged, 0p shared, 0p misses, 3p not merged
pid 1660: 0p merged, 0p shared, 0p misses, 3p not merged
==> pid 1705: 20p merged, 0p shared, 11p misses, 107p not merged
==> pid 1660: 20p merged, 0p shared, 11p misses, 107p not merged
** Saved Pages: 20 is 80 KB
** check runs: 6
** sum runs: 0
mergemem stopped
philipp@phil:/home/philipp/src/uni/mergemem > free
total used free shared buffers cached
Mem: 63244 53120 10124 33452 5508 23236
-/+ buffers: 24376 38868
Swap: 35544 0 35544
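
The figures in this transcript can be cross-checked with a little arithmetic. The sketch below assumes 4 KB pages (which the "79 pages, 316 KB" line implies); pages_to_kb and kb_gained are hypothetical helper names.

```c
#define PAGE_KB 4   /* assumed page size in KB, from "79 pages, 316 KB" */

/* Kilobytes covered by a number of pages. */
static long pages_to_kb(long pages)
{
    return pages * PAGE_KB;
}

/* Change of a memory figure (in KB) between two runs of free. */
static long kb_gained(long before_kb, long after_kb)
{
    return after_kb - before_kb;
}
```

pages_to_kb(20) gives the 80 KB that mergemem reports, while kb_gained(38796, 38868) for the free column of the "-/+ buffers" row gives 72 KB. The thread does not explain the difference between the two.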


Alan Cox

Mar 17, 1998

> Is the Linux kernel ready for Y2k or is there some "dirty" uses of time

The main kernel is, and the current libcs are. What happens to things like
the DOSfs I don't know.

> I also think there should be a doc/faq or something written about this
> matter to show the big dragons in the computer community that Linux

http://www.uk.linux.org has a Y2K page which summarises known problem
issues and has pointers to any vendor pages there are around. (Red Hat right
now ..)

Contributions welcome

Marek Habersack

Mar 18, 1998

On Tue, 17 Mar 1998, Eje.Gus...@ffa.se wrote:

> If the time() function is correctly working under Linux there shouldn't
> be any problem until 2038. time() returns the time since 00:00:00 GMT,
> January 1, 1970, measured in seconds and stores this in a 32 bit long
> int. So if everything is done right with the filesystems and other stuff

Not quite so. time(2) returns a time_t value, and time_t is NOT defined to be
a 32-bit integer. The size of time_t depends on the OS. So any code which
expects and uses time_t, and doesn't cast it to a 32-bit (or smaller)
integer, will work just fine. The current Linux kernels use the native
integer size of the underlying architecture as the return type of the
sys_time call, but I think there should be no problem changing it to a
hard-coded 64-bit integer if the need arises.
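
Both points can be checked with standard C. This is a small sketch; time_t_bits and year_of_last_32bit_second are hypothetical helper names, and the result of the first depends on the platform it is compiled for.

```c
#include <limits.h>
#include <stdint.h>
#include <time.h>

/* Width of time_t on the platform this is compiled for. */
static int time_t_bits(void)
{
    return (int)(sizeof(time_t) * CHAR_BIT);
}

/* Calendar year (UTC) of the last second that a signed 32-bit
 * seconds counter can hold. Returns -1 if the conversion fails. */
static int year_of_last_32bit_second(void)
{
    time_t last = (time_t)INT32_MAX;    /* 2^31 - 1 seconds after 1970 */
    struct tm *utc = gmtime(&last);

    return utc ? utc->tm_year + 1900 : -1;
}
```

The second function returns 2038, the limit mentioned above for a 32-bit counter; a platform whose time_t_bits() is 64 does not hit that limit in code that keeps using time_t.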

l8r, marek
---
"Little prigs and three-quarter madmen may have the conceit that the
laws of nature are constantly broken for their sakes."
Friedrich Nietzsche

Philipp Reisner

Mar 18, 1998

Jacques Gelinas wrote:
>
> If I understand correctly, basically, when a program starts, it does
> various initialisation which uses some memory. The logic of mergemem is
> that if you start several instance of a given program, there is a good bet
> that some memory will be initialised exactly the same. So mergement is
> comparing different instance of a program and find all the identical
> pages. Then it puts those page read-only and merge all process to share
> the same physical page. It also puts the page with the copy-on-write
> flag so a process can still continue to modify the page, if needed later.
>
> Am I right ?
>

You are perfectly right!!
Am I allowed to use your text for our README?

> This sound like a very good way to save memory. I am currently developping
> an X terminal installation kit for schools. As I understand, this tool
> would do good things with that kind of environnment where mostly you have
> 10-15 users logged all the time, all running the same desktop components,
> so many waste duplicate pages.
>
> I will check that. Wonder how much can be saved with say 3-4 users running
> kde + netscape.

It will be used at the university in Vienna, in a lab where 30 students
(on X-terminals) are running a prolog interpreter on one alpha machine.


Stephen Williams

Mar 18, 1998

Jacques Gelinas wrote:
>> The logic of mergemem is
> that if you start several instance of a given program, there is a good bet
> that some memory will be initialised exactly the same. So mergement is
> comparing different instance of a program and find all the identical
> pages. Then it puts those page read-only and merge all process to share
> the same physical page. It also puts the page with the copy-on-write
> flag so a process can still continue to modify the page, if needed later.

Just curious, but couldn't one load a process with the data section shared
and copy-on-write the instant a program is loaded? Isn't it true that the
data section can be paged out of the executable file until it is written to?
(Zero-filled pages are initially shared anyhow.)

I'm a little surprised that Linux doesn't do this, or does it? And if not,
would doing it get some of the benefits of mergemem without the runtime
overhead?
--
Steve Williams "The woods are lovely, dark and deep.
st...@icarus.com But I have promises to keep,
st...@picturel.com and lines to code before I sleep,
http://www.picturel.com And lines to code before I sleep."

Jacques Gelinas

Mar 18, 1998

On Wed, 18 Mar 1998, Stephen Williams wrote:

> Jacques Gelinas wrote:
> >> The logic of mergemem is
> > that if you start several instance of a given program, there is a good bet
> > that some memory will be initialised exactly the same. So mergement is
> > comparing different instance of a program and find all the identical
> > pages. Then it puts those page read-only and merge all process to share
> > the same physical page. It also puts the page with the copy-on-write
> > flag so a process can still continue to modify the page, if needed later.
>
> Just curious, but couldn't one load a process with the data section shared
> and copy-on-write the instant a program is loaded? Isn't it true that the
> data section can be paged out of the executable file until it is written to?
> (Zero-filled pages are initially shared anyhow.)
>
> I'm a little surprised that Linux doesn't do this, or does it? And if not,
> would doing it get some of the benefits of mergemem without the runtime
> overhead?

Linux does all this. The mergemem gadget goes much further. The idea is
that most programs start and then initialise various stuff. At that point,
the shared pages are not shared anymore, since the program has written to
them: the data has been loaded from the executable and modified.

The end result is that if you start two instances of the same
program, they will perform basically the same initialisation,
creating a set of duplicate "modified" pages. The OS can't easily
know that. Here is an example.

int main(void)
{
    // Create a special table based on the hostname
    char table[10000];
    int i;

    for (i = 0; i < 10000; i++) {
        ...
    }
    // Then from now on, use that table unmodified
    // as a lookup for example
    return 0;
}

In this example, each instance of the program initialises a 10k table
with exactly the same data, yet that data is computed at run time rather
than fixed in the executable.

The mergemem patch walks the pages allocated to all instances of a program
and finds the ones which are identical. It then removes the duplicates and
points all processes to the same "write-protected with copy-on-write" page.

For example, I assume that if you have a text editor and load a 10 meg
document, then start another copy of the text editor and load the same
document, the mergemem patch will merge most of these 10 megs. Now, the
minute you start editing the document in one editor, the pages will start
to differentiate again.

The end result is about the same as if the original text editor had done a
fork().
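
The fork() comparison can be made concrete with a small sketch. It only demonstrates the visible semantics (the child sees the initialised table, and its later write stays private); whether the pages are physically shared cannot be observed from this code. init_table and fork_shares_until_write are hypothetical names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define TABLE_SIZE 10000

static char table[TABLE_SIZE];

/* Fill the table the same way in every instance of the program. */
static void init_table(void)
{
    int i;

    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = (char)(i % 251);
}

/* Initialise, then fork. The child checks that it sees the parent's
 * data and then overwrites one entry. Returns 1 if the child saw the
 * initialised table and the parent's copy is still unchanged, 0 if
 * not, and -1 on a fork or wait error. */
static int fork_shares_until_write(void)
{
    int status;
    pid_t pid;

    init_table();
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int saw_init = (table[1] == 1);
        table[1] = 99;              /* triggers copy on write in the child */
        _exit(saw_init ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return 0;
    return table[1] == 1;           /* the parent's page was not modified */
}
```

Calling fork_shares_until_write() from a test program should return 1 on a system with a working fork().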

But the idea of mergemem is that most programs start and perform some
initialisation, and all instances of the program share some amount of this
initialisation. It is "this amount" that mergemem can find.

The question is "is it worth it?". From the data I have seen, it sounds
like it might be very useful on a multi-user server. Time will tell.

--------------------------------------------------------
Jacques Gelinas (jac...@solucorp.qc.ca)
Linuxconf: The ultimate administration system for Linux.
see http://www.solucorp.qc.ca/linuxconf
new developments: remote GUI admin, multiple machines admin, wu-ftpd
