High Memory In The Linux Kernel


Straightman

Feb 23, 2004, 7:54:07 PM
To: Kerneltrap.org
Posted by Amit Shah on Saturday, February 21, 2004 - 08:02

As RAM increasingly becomes a commodity, prices drop and computer users are
able to buy more. 32-bit architectures face certain limitations in accessing
these growing amounts of RAM. To better understand the problem and the
various solutions, we begin with an overview of Linux memory management. Once
we understand how basic memory management works, we can define the problem
precisely and then review the various solutions.

This article was written by examining the Linux 2.6 kernel source code for
the x86 architecture types.


------------------------------------------------------------------------------
Overview of Linux memory management

32-bit architectures can address 4 GB of memory (2^32 bytes). Processors
that have an MMU (Memory Management Unit) support the concept of virtual
memory: the kernel sets up page tables which map "virtual addresses" to
"physical addresses". This basically means that each process can access 4 GB
of memory as if it were the only process running on the machine (much like
multi-tasking, in which each process is made to think that it's the only
process executing on a CPU).

The virtual address to physical address mappings are set up by the kernel.
When a new process is fork()ed, the kernel creates a new set of page tables
for it. The addresses referenced within a process in user space are virtual
addresses; they do not necessarily map to the same physical address in
different processes. The virtual address is passed to the MMU, which converts
it to the proper physical address based on the tables set up by the kernel.
Hence, two processes can both refer to memory address 0x08329, yet refer to
two different locations in physical memory.
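
To make this concrete, here is a small user-space sketch (not part of the
original article): after a fork(), parent and child print the same virtual
address for a local variable, yet the child's write is invisible to the
parent, because that address is backed by a different physical page in each
process.

    /* Same virtual address, different physical locations after fork(). */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int value = 1;
        pid_t pid = fork();

        if (pid < 0)
            return 1;

        if (pid == 0) {          /* child */
            value = 42;          /* copy-on-write gives the child its own page */
            printf("child : &value = %p, value = %d\n", (void *)&value, value);
            return 0;
        }

        wait(NULL);              /* parent: same %p is printed, but value is still 1 */
        printf("parent: &value = %p, value = %d\n", (void *)&value, value);
        return 0;
    }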

The Linux kernel splits the 4 GB virtual address space of a process in two
parts: 3 GB and 1 GB. The lower 3 GB of the process virtual address space is
accessible as the user-space virtual addresses and the upper 1 GB space is
reserved for the kernel virtual addresses. This is true for all processes.


+----------+ 4 GB
| |
| |
| |
| Kernel |
| | +----------+
| Virtual | | |
| | | |
| Space | | High |
| | | |
| (1 GB) | | Memory |
| | | |
| | | (unused) |
+----------+ 3 GB +----------+ 1 GB
| | | |
| | | |
| | | |
| | | Kernel |
| | | |
| | | Physical |
| | | |
|User-space| | Space |
| | | |
| Virtual | | |
| | | |
| Space | | |
| | | |
| (3 GB) | +----------+ 0 GB
| |
| | Physical
| | Memory
| |
| |
| |
| |
| |
+----------+ 0 GB

Virtual
Memory
The kernel virtual area (the 3 GB - 4 GB address range) maps directly onto
the first 1 GB of physical RAM. The 3 GB of user-space virtual addresses
available to each process are mapped onto whatever physical RAM is available.
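
Because this is a simple linear mapping, converting between a directly mapped
kernel virtual address and its physical address is plain arithmetic. The
following is a sketch of the idea behind the kernel's __pa()/__va()
conversions on x86 with the default 3G / 1G split (the helper names here are
illustrative, not the kernel's own):

    /* Direct mapping: kernel virtual address = physical address + PAGE_OFFSET.
     * PAGE_OFFSET is 0xC0000000 with the default 3G / 1G split on x86. */
    #define PAGE_OFFSET 0xC0000000UL

    static inline unsigned long lowmem_virt_to_phys(unsigned long vaddr)
    {
        return vaddr - PAGE_OFFSET;   /* valid only for directly mapped low memory */
    }

    static inline unsigned long lowmem_phys_to_virt(unsigned long paddr)
    {
        return paddr + PAGE_OFFSET;   /* valid only below the ~896 MB low-memory limit */
    }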

The Problem
So the basic problem here is that the kernel can only address 1 GB of virtual
address space, which translates to a maximum of 1 GB of physical memory,
because the kernel directly maps all of its available virtual addresses onto
physical memory.

Solutions
There are several solutions which address this problem:

1. 2G / 2G or 1G / 3G split
2. HIGHMEM solution for using up to 4 GB of memory
3. HIGHMEM solution for using up to 64 GB of memory

1. 2G / 2G, 1G / 3G split
Instead of splitting the virtual address space in the traditional 3G / 1G way
(3 GB for user space, 1 GB for kernel space), third-party patches exist to
split it 2G / 2G or 1G / 3G. The 1G / 3G split is a bit extreme: the kernel
can map up to 3 GB of physical memory, but user-space applications cannot
grow beyond 1 GB. That may be fine for simple applications, but anyone with
more than 3 GB of physical RAM is unlikely to be running only simple
applications on it.

The 2G / 2G split is a balanced approach to using more than 1 GB of RAM
without the HIGHMEM patches. However, server applications like databases
always want as much virtual address space as possible, so this approach may
not work in those scenarios.

Andrea Arcangeli has a patch for 2.4.23 that adds a config-time option for
selecting the user / kernel split values; it is available on his kernel page.
It is a simple patch, and making it work on 2.6 should not be too difficult.
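
To illustrate the idea (this is not Andrea Arcangeli's actual patch), the
split follows from the kernel's PAGE_OFFSET constant, i.e. from where the
kernel virtual area begins in the 4 GB virtual address space; a config-time
choice of split boils down to choosing one of these values:

    /* Illustrative values only: the user / kernel split is determined by
     * where the kernel virtual area starts. */
    #define PAGE_OFFSET_3G_1G  0xC0000000UL  /* user 0 - 3 GB, kernel 3 - 4 GB (default) */
    #define PAGE_OFFSET_2G_2G  0x80000000UL  /* user 0 - 2 GB, kernel 2 - 4 GB */
    #define PAGE_OFFSET_1G_3G  0x40000000UL  /* user 0 - 1 GB, kernel 1 - 4 GB */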

Before looking at solutions 2 & 3, let's take a look at some more Linux
Memory Management issues.

Zones
In Linux, available physical memory is organized into "nodes", each node
representing a bank of memory. This classification is mainly useful for NUMA
architectures, but it is also used for UMA architectures, where the number of
nodes is just 1.

Memory in each node is divided into "zones". The zones currently defined are
ZONE_DMA, ZONE_NORMAL and ZONE_HIGHMEM.

ZONE_DMA is used by some devices for data transfer and is mapped in the lower
physical memory range (up to 16 MB).

Memory in the ZONE_NORMAL region is mapped by the kernel in the upper region
of the linear address space. Most operations can only take place in
ZONE_NORMAL; so this is the most performance critical zone. ZONE_NORMAL goes
from 16 MB to 896 MB.

To address memory beyond the directly mapped region, the kernel has to map
pages from high memory into its own virtual address space on demand.

On x86, the top 128 MB of the kernel's 1 GB virtual address space is reserved
for kernel-internal uses such as vmalloc allocations and the mappings of
high-memory pages discussed below. The virtual addresses in this 128 MB
region are therefore not part of the direct mapping of physical memory, which
leaves a maximum of 896 MB for ZONE_NORMAL. So even with 1 GB of physical
RAM, only 896 MB is directly mapped and usable as low memory.
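
As a quick illustration of the boundaries described above (a user-space
sketch, not kernel code), a physical address can be classified into a zone
like this:

    /* Classify a physical address into the x86 zones described in the text. */
    #include <stdio.h>

    enum zone { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM };

    static enum zone classify(unsigned long long phys)
    {
        if (phys < (16ULL << 20))       /* below 16 MB */
            return ZONE_DMA;
        if (phys < (896ULL << 20))      /* 16 MB - 896 MB */
            return ZONE_NORMAL;
        return ZONE_HIGHMEM;            /* above 896 MB: must be mapped via kmap() */
    }

    int main(void)
    {
        const char *names[] = { "ZONE_DMA", "ZONE_NORMAL", "ZONE_HIGHMEM" };
        unsigned long long samples[] = { 1ULL << 20, 512ULL << 20, 2048ULL << 20 };

        for (int i = 0; i < 3; i++)
            printf("%4llu MB -> %s\n", samples[i] >> 20, names[classify(samples[i])]);
        return 0;
    }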

Back to the solutions:

2. HIGHMEM solution for using up to 4 GB of memory
Since the kernel cannot access memory which has not been mapped into its
address space, using memory above the low-memory limit requires mapping the
physical pages into the kernel virtual address space first. This means that
pages in ZONE_HIGHMEM have to be mapped into the kernel's address space
before they can be accessed.

The reserved region discussed earlier (128 MB in the case of x86) contains an
area into which pages from high memory are mapped.

To create a permanent mapping, the "kmap" function is used. Since this
function may sleep, it may not be used in interrupt context. The number of
permanent mappings is limited (if it weren't, we could have directly mapped
all of high memory into the address space), so pages mapped this way should
be "kunmap"ped when they are no longer needed.

Temporary mappings can be created via "kmap_atomic". This function doesn't
block, so it can be used in interrupt context; "kunmap_atomic" un-maps the
page. A temporary mapping is only valid until the next temporary mapping is
created in the same slot on the same CPU. Moreover, since the mapping and
un-mapping functions also disable and re-enable preemption, it is a bug not
to kunmap_atomic a page mapped via kmap_atomic.
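
Here is a sketch of how 2.6-era kernel code might touch a high-memory page
using these functions. It assumes the page came from something like
alloc_page(GFP_HIGHUSER); the helper function names are illustrative, and the
explicit KM_USER0 slot reflects the 2.6-era two-argument kmap_atomic API.

    #include <linux/highmem.h>
    #include <linux/mm.h>
    #include <linux/string.h>

    /* Permanent mapping: kmap() may sleep, so process context only. */
    static void zero_page_sleeping(struct page *page)
    {
            void *vaddr = kmap(page);
            memset(vaddr, 0, PAGE_SIZE);
            kunmap(page);
    }

    /* Temporary mapping: does not sleep, usable in interrupt context.
     * The mapping must be torn down before the per-CPU slot is reused. */
    static void zero_page_atomic(struct page *page)
    {
            void *vaddr = kmap_atomic(page, KM_USER0);
            memset(vaddr, 0, PAGE_SIZE);
            kunmap_atomic(vaddr, KM_USER0);
    }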

3. HIGHMEM solution for using 64 GB of memory
This is enabled via the PAE (Physical Address Extension) extension of the
PentiumPro processors. PAE addresses the 4 GB physical memory limitation and
is seen as Intel's answer to AMD 64-bit and AMD x86-64. PAE allows processors
to access physical memory up to 64 GB (36 bits of address bus). However,
since the virtual address space is just 32 bits wide, each process can't grow
beyond 4 GB. The mechanism used to access memory from 4 GB to 64 GB is
essentially the same as that of accessing the 1 GB - 4 GB RAM via the HIGHMEM
solution discussed above.

Should I enable CONFIG_HIGHMEM for my 1 GB RAM system?
It is not advisable to enable CONFIG_HIGHMEM just to use the extra 128 MB of
a 1 GB RAM system. PCI devices cannot directly address high memory, so bounce
buffers have to be used, and the extra mappings add virtual memory management
and paging overhead. For details on bounce buffers, refer to Mel Gorman's
documentation (link below).

For more information, see:
- Andrea Arcangeli's article on the original HIGHMEM patches in Linux 2.3
- Mel Gorman's VM documentation for the 2.4 Linux memory management subsystem
- the Linux kernel sources
- a recent thread on the Linux kernel mailing list on the effect of enabling
  highmem in kernel 2.6

------------------------------------------------------------------------------
--
Copyright (C) 2004, Amit Shah <amit...@gmx.net>.

