How does gdb write to RO virtual memory and get away with it?

Joel Fernandes

Mar 2, 2010, 2:48:28 PM

Hi,
I was just playing with gdb today and noticed that I was able to write
to the text sections of a loaded process. This is a bit perplexing
given my understanding of the mechanism so far. My questions are:

1. Does gdb use ptrace to achieve writing to arbitrary memory? GDB
being a userspace process itself has to have some way of instructing
the kernel to write to memory, is this how it is done?

2. Doesn't the processor segfault if a write is attempted on a read
only page such as the text section?

3. It is my understanding that portions of the ELF object / executable
are memory mapped into the process's address space. If this is correct,
and if GDB were allowed to write to the segments/sections in memory
like the .text section for example, then wouldn't these writes in turn
go to the memory mapped elf object on disk? Just like how it happens
with any other memory mapped file?

Thanks!
-Joel

David Schwartz

Mar 2, 2010, 3:56:26 PM

On Mar 2, 11:48 am, Joel Fernandes <agnel.j...@gmail.com> wrote:

> 2. Doesn't the processor segfault if a write is attempted on a read
> only page such as the text section?

Self-modifying code and data pages have always been legal on ELF
systems. Text sections have always been modifiable, stopped only by
'soft' protection which can trivially be switched off (see
'mprotect').

> 3. It is my understanding that portions of the ELF object / executable
> are memory mapped into the process's address space. If this is correct,
> and if GDB were allowed to write to the segments/sections in memory
> like the .text section for example, then wouldn't these writes in turn
> go to the memory mapped elf object on disk? Just like how it happens
> with any other memory mapped file?

The page is read-only and may be shared by any number of processes.
When any process (or gdb) attempts to write to it, this causes a
fault. In the fault handler, the kernel allocates a new page, copies
the old data into it, marks the new page writable, and replaces the
shared page with the new page in the faulting process' map.

While the old page was backed by a disk file, the new page is
anonymous and is treated roughly the same way as allocated memory --
if squeezed out of physical memory, it will be written to swap.

This is called "copy on write".

DS
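
A minimal sketch of the 'soft' protection and copy on write described above. It assumes Linux, a system policy that lets a process make its own text writable (some hardened setups refuse this), and a platform where a function pointer is the address of the function's code. The names answer and rewrite_own_text_byte are made up for illustration. It lifts the protection with mprotect, stores the byte that is already there (so the code does not change, but the store still forces a private copy of the page), then restores the protection.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* A function whose machine code lives in the text segment. */
int answer(void) { return 42; }

/* Returns 1 if a byte of our own text could be stored to after mprotect,
 * -1 if the system refused to change the protection. */
int rewrite_own_text_byte(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    uintptr_t addr = (uintptr_t)&answer;
    void *page = (void *)(addr & ~((uintptr_t)pagesz - 1));

    /* Keep execute permission: the running code may share this page. */
    if (mprotect(page, (size_t)pagesz,
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return -1;

    /* Store the same value back. The write faults on the private,
     * file-backed page and the kernel hands us our own copy of it. */
    volatile unsigned char *code = (volatile unsigned char *)addr;
    unsigned char first = code[0];
    code[0] = first;

    /* Put the usual text protection back. */
    if (mprotect(page, (size_t)pagesz, PROT_READ | PROT_EXEC) != 0)
        return -1;

    /* The function still behaves as before. */
    return answer() == 42;
}
```

Where the policy allows it, rewrite_own_text_byte() should return 1. The executable on disk is untouched because the text mapping is private.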

Rainer Weikusat

Mar 2, 2010, 4:12:52 PM

Joel Fernandes <agnel...@gmail.com> writes:
> 1. Does gdb use ptrace to achieve writing to arbitrary memory?

Since this is the available debugging interface, probably. The
debugger must be able to write to the text section because it couldn't
insert breakpoints otherwise.

Alan Curry

Mar 2, 2010, 11:13:26 PM

In article <87635es...@fever.mssgmbh.com>,

Overwriting instructions in the text segment isn't the only way to do
breakpoints. See the hbreak command.

--
Alan Curry

Joel Fernandes

Mar 3, 2010, 2:53:14 AM

Hi David,

Thanks for your message.

> > 3. It is my understanding that portions of the ELF object / executable
> > are memory mapped into the process's address space. If this is correct,
> > and if GDB were allowed to write to the segments/sections in memory
> > like the .text section for example, then wouldn't these writes in turn
> > go to the memory mapped elf object on disk? Just like how it happens
> > with any other memory mapped file?
>
> The page is read-only and may be shared by any number of processes.
> When any process (or gdb) attempts to write to it, this causes a
> fault. In the fault handler, the kernel allocates a new page, copies
> the old data into it, marks the new page writable, and replaces the
> shared page with the new page in the faulting process' map.
>
> While the old page was backed by a disk file, the new page is
> anonymous and is treated roughly the same way as allocated memory --
> if squeezed out of physical memory, it will be written to swap.
>
> This is called "copy on write".

So what's happening is initially the pages of the text and data
segment that are memory mapped are marked as Read Only.
Then as and when the processor seg faults, the kernel does a copy-on-
write if the VMA of the pages has the VM_SHARED flag set. This flag is
set based on the PROT_* flags passed to mprotect or mmap. Is my
analysis correct?

Does this copy-on-write happen for both the text and data sections or
only the text?

Thanks,
-Joel

David Schwartz

Mar 3, 2010, 4:52:48 AM

On Mar 2, 11:53 pm, Joel Fernandes <agnel.j...@gmail.com> wrote:

> So what's happening is initially the pages of the text and data
> segment that are memory mapped are marked as Read Only.

They are not marked shared (see below), they are private, file-backed,
and hard read only. That is, a write will send a SIGSEGV to the
process. As soon as the process calls 'mprotect', they are changed to
soft read only, copy on write. That is, a write will cause them to be
copied.

> Then as and when the processor seg faults, the kernel does a copy-on-
> write if the VMA of the pages has the VM_SHARED flag set. This flag is
> set based on the PROT_* flags passed to mprotect or mmap. Is my
> analysis correct?

Shared is set no matter what, as the pages are file backed and
unmodified.

> Does this copy-on-write happen for both the text and data sections or
> only the text?

Yes, it happens for both. It also happens for file mappings if they're
private. If they're shared, no copy on write is needed.

Shared actually has the opposite of the meaning you might think it
has. A mapping is "shared" if modifications by one process are seen by
all processes. A normal 'mmap' is shared. But the text/data sections
are not shared. The text/data sections are private, that is, a
modification by one process is *not* stored to the file or seen by any
other process.

Check out the man page for 'mmap', particularly MAP_SHARED and
MAP_PRIVATE. MAP_SHARED means *changes* are shared. MAP_PRIVATE means
*changes* are private, and thus copy on write is needed.

DS
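
A small sketch of the MAP_SHARED / MAP_PRIVATE difference, assuming a POSIX system with a writable /tmp. The function name file_byte_after_mapped_write and the scratch file are made up for illustration. It writes through a mapping created with the given flag and then reads the file itself to see whether the change got there.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes 'X' over the first byte of a file-backed mapping created with
 * map_flags (MAP_SHARED or MAP_PRIVATE), then returns the first byte the
 * file itself holds afterwards, or -1 on error. */
int file_byte_after_mapped_write(int map_flags)
{
    char path[] = "/tmp/cow_demo_XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file vanishes on close */

    int result = -1;
    if (write(fd, "abcd", 4) == 4) {
        char *p = mmap(NULL, 4, PROT_READ | PROT_WRITE, map_flags, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'X';                     /* private: triggers a copy */
            /* Flush a shared mapping to the file; a private copy is
             * never written back, so this does nothing for it. */
            msync(p, 4, MS_SYNC);

            char c;
            if (pread(fd, &c, 1, 0) == 1)
                result = (unsigned char)c;
            munmap(p, 4);
        }
    }
    close(fd);
    return result;
}
```

With MAP_SHARED the file ends up starting with 'X'. With MAP_PRIVATE it still starts with 'a', because the write landed in a private copy of the page.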

Rainer Weikusat

Mar 3, 2010, 6:05:27 AM

A pretty useless remark in the given context. Usually, gdb does
'insert breakpoints' by changing the code which is executed. I didn't
intend my three-line text to be a comprehensive summary of the gdb
documentation and your assumption that these three lines contain
everything (non-relevant) I happen to know about gdb seems a little
far-fetched.

Joel Fernandes

Mar 4, 2010, 11:19:29 PM

Hi David,

> Shared actually has the opposite of the meaning you might think it
> has. A mapping is "shared" if modifications by one process are seen by
> all processes. A normal 'mmap' is shared. But the text/data sections
> are not shared. The text/data sections are private, that is, a
> modification by one process is *not* stored to the file or seen by any
> other process.
>
> Check out the man page for 'mmap', particularly MAP_SHARED and
> MAP_PRIVATE. MAP_SHARED means *changes* are shared. MAP_PRIVATE means
> *changes* are private, and thus copy on write is needed.

Thanks for your message. It's always confusing, the difference between
private and shared but thanks to you it is much clearer now.

One last question: do you think new VMAs are created when a copy-on-
write happens? Because now the memory areas that copy on write was
triggered on might not be file backed but anonymous. If yes, doesn't
this create a lot of VMAs if copy-on-writes keep happening? Does the
kernel have some sort of VMA merging algorithm for this?

Thanks,
-Joel

Kaz Kylheku

Mar 5, 2010, 2:37:55 AM

On 2010-03-02, Joel Fernandes <agnel...@gmail.com> wrote:
> Hi,
> I was just playing with gdb today and noticed that I was able to write
> to the text sections of a loaded process.

How would the debugger be able to place software breakpoints
if it could not overwrite machine instructions?

> 1. Does gdb use ptrace to achieve writing to arbitrary memory? GDB
> being a userspace process itself has to have some way of instructing
> the kernel to write to memory, is this how it is done?
>
> 2. Doesn't the processor segfault if a write is attempted on a read
> only page such as the text section?

Ptrace is a kernel interface, which can do whatever is needed to make a
page of the traced process writeable without the changes propagating to
the underlying executable file, and bypassing the protection.

You can study the kernel to see how the PTRACE_PEEKTEXT and
PTRACE_POKETEXT interfaces are implemented.

Note that even without ptrace, a program /can/ write to its own text
segments by requesting permission using mprotect.

> 3. It is my understanding that portions of the ELF object / executable
> are memory mapped into the process's address space. If this is correct,
> and if GDB were allowed to write to the segments/sections in memory
> like the .text section for example, then wouldn't these writes in turn
> go to the memory mapped elf object on disk? Just like how it happens
> with any other memory mapped file?

With mmap you can create a mapping that corresponds to a file,
and is writeable, yet such that the changes are private (do
not go into the file, and thus not visible to any file
descriptor reading the file, nor to any other mappings
of the file). See the mmap man page.

Executable mappings are already MAP_PRIVATE, so changes will not
propagate to the underlying files. Despite being MAP_PRIVATE, the pages
are shared among processes. This is similar to copy-on-write in fork.
When you write to a MAP_PRIVATE page, a private copy is cloned.

The kernel doesn't have to manipulate the protection flags to write to a
user page; it can go through the logical view of the page in the
kernel's address space. I.e. obtain the ``struct page *'' pointer
corresponding to the virtual address, by means of the get_user_pages
function (whose interface has the option to force write access in spite
of the page protection), and then manipulate the page directly. After
writing through the kernel view, it then has to perform some
architecture-specific cache flush, to make sure that the process'
mapping of the page is coherent with the modification with regard to the
instruction and data caches. The copy-on-write is handled inside
get_user_pages; it calls handle_mm_fault explicitly.

That's probably how PTRACE_POKETEXT works.
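
A rough sketch of the effect from userspace, assuming Linux and an environment where ptrace is permitted (some sandboxes and security policies block it). The function name poke_read_only_page is made up for illustration. It uses the ptrace memory-write request PTRACE_POKEDATA, which the Linux man page describes as equivalent to PTRACE_POKETEXT. The parent makes a page read-only, forks, and then pokes a word into that page in the stopped child.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent changed a word in a read-only page of a traced
 * child, 0 if the write did not take effect, -1 on a setup error. */
int poke_read_only_page(void)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    /* Private page: filled while writable, then made read-only. */
    long *page = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    page[0] = 0x1111;
    if (mprotect(page, pagesz, PROT_READ) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: ask to be traced, then stop so the parent can act. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    int status, result = -1;
    if (waitpid(child, &status, 0) == child && WIFSTOPPED(status)) {
        /* The page is read-only in the child, yet the poke succeeds. */
        if (ptrace(PTRACE_POKEDATA, child, page, (void *)0x2222L) == 0) {
            errno = 0;
            long seen = ptrace(PTRACE_PEEKDATA, child, page, NULL);
            /* The child sees the new word; our own copy is unchanged. */
            if (errno == 0)
                result = (seen == 0x2222L && page[0] == 0x1111);
        }
    }
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    munmap(page, pagesz);
    return result;
}
```

A return value of 1 means the child's page now holds the poked word while the parent's copy still holds the old one, which is the copy on write happening on the tracee's side.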

Kaz Kylheku

Mar 5, 2010, 2:39:16 AM

What if the machine has no hardware breakpoints?

What if it lets you set four hardware breakpoints, but
you ask the debugger for forty-two?

Alan Curry

Mar 5, 2010, 4:23:53 AM

In article <201003051...@gmail.com>,

Then you'll be sad, I guess.

I wonder why one method of breakpoint setting is called "hardware
supported" and the other one isn't.

The "normal" kind of breakpoint would be impossible if the hardware
didn't include a trap instruction specifically designed for the purpose
(i.e. "int3" as a single-byte i386 opcode distinct from "int $3" which
would have the unwanted side effect of clobbering the next byte).

I see really 2 kinds of hardware-supported breakpoints. And some
hardware is generous enough to include both.

Worse than being limited to a single kind of breakpoint is not having
hardware support for watchpoints, so gdb falls back to single-stepping.
Slow. I suppose that's what it would do if it had a limited number of
breakpoints available and ran out of them.

--
Alan Curry

Rainer Weikusat

Mar 5, 2010, 4:53:43 AM

pac...@kosh.dhis.org (Alan Curry) writes:

[...]

> I wonder why one method of breakpoint setting is called "hardware
> supported" and the other one isn't.
>
> The "normal" kind of breakpoint would be impossible if the hardware
> didn't include a trap instruction specifically designed for the purpose
> (i.e. "int3" as a single-byte i386 opcode distinct from "int $3" which
> would have the unwanted side effect of clobbering the next byte)

A 'hardware breakpoint' is one where the intended location of the
breakpoint is communicated to 'hardware' which then automatically
monitors the PC/IP for this value and 'does something suitable' when
it is seen. No special kind of 'debugging trap instruction' is needed
for a 'software breakpoint'. Any code which causes a CPU exception
which can be 'caught' in userspace would do.
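
To illustrate the point about an exception being caught in userspace, here is a small sketch, assuming Linux with GCC or Clang. The names on_trap and hit_breakpoint_once are made up for illustration. On x86 it executes the one-byte int3 instruction directly; on other architectures it falls back to raise(SIGTRAP) as a stand-in, since their trap instructions behave differently. A debugger would receive the SIGTRAP through ptrace instead of a handler inside the process.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t traps_seen;

/* Plays the role of the debugger: notes that the trap was hit. */
static void on_trap(int signo)
{
    (void)signo;
    traps_seen++;
}

/* Returns how many traps were caught (1 is expected), -1 on error. */
int hit_breakpoint_once(void)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_trap;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTRAP, &sa, &old) != 0)
        return -1;

    traps_seen = 0;
#if defined(__x86_64__) || defined(__i386__)
    /* The breakpoint opcode; execution resumes just after it. */
    __asm__ volatile("int3");
#else
    /* Stand-in for a trap instruction on other architectures. */
    raise(SIGTRAP);
#endif

    sigaction(SIGTRAP, &old, NULL);   /* restore the previous handler */
    return (int)traps_seen;
}
```

Without the handler installed, the same instruction would simply kill the process with SIGTRAP.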

David Schwartz

Mar 5, 2010, 5:48:08 AM

On Mar 4, 8:19 pm, Joel Fernandes <agnel.j...@gmail.com> wrote:

> One last question: do you think new VMAs are created when a copy-on-
> write happens? Because now the memory areas that copy on write was
> triggered on might not be file backed but anonymous. If yes, doesn't
> this create a lot of VMAs if copy-on-writes keep happening? Does the
> kernel have some sort of VMA merging algorithm for this?

The kernel does have a VMA merging algorithm. When a fault triggers a
copy on write, the kernel checks the previous page and the next page
to see if they have a VMA that the new page can be merged into.

DS