Code Review - Migrate copy to/from user to exception table based ones

6 views
Skip to first unread message

Davide Libenzi

unread,
Oct 31, 2015, 12:18:04 PM10/31/15
to Akaros
I have also added the strcpy to/from user, but we seem to pass the length from userspace AFAICS.
Up to you if you think they will become handy or not.

https://github.com/brho/akaros/compare/staging...dlibenzi:user_memory_access_v2

The following changes since commit 2a9b3cdc47dbde55f9d125fde3e11832ca4c0b91:

  Avoid double declarations, integer overflow, and use branch hints (2015-10-30 16:02:29 -0400)

are available in the git repository at:

  g...@github.com:dlibenzi/akaros user_memory_access_v2

for you to fetch changes up to 52fac7354aa047cc72281ee39cc6291944cd9ca4:

  Tabified file carrying spaces (2015-10-31 08:41:43 -0700)

----------------------------------------------------------------
Davide Libenzi (2):
      Migrated user memory copy APIs to use the new exception table guards
      Tabified file carrying spaces

 kern/arch/x86/uaccess.h    | 134 +++++++++++++++++++---------------
 kern/include/umem.h        |  38 ++++++++--
 kern/src/ktest/pb_ktests.c |  16 +++++
 kern/src/syscall.c         |   6 +-
 kern/src/umem.c            | 176 ++++++++++++++++++---------------------------
 5 files changed, 197 insertions(+), 173 deletions(-)

Barret Rhoden

unread,
Nov 9, 2015, 6:26:22 PM11/9/15
to aka...@googlegroups.com
On 2015-10-31 at 09:18 "'Davide Libenzi' via Akaros"
<aka...@googlegroups.com> wrote:
> I have also added the strcpy to/from user, but we seem to pass the
> length from userspace AFAICS.
> Up to you if you think they will become handy or not.

Overall, I like replacing memcpy_{to,from}_user, but I think the string
ops have some issues. More below:


> From 33ee96c0726fe928e3219a12cff819cac9ad131f Mon Sep 17 00:00:00 2001
> From: Davide Libenzi <dlib...@google.com>
> Date: Fri, 30 Oct 2015 18:31:15 -0700
> Subject: Migrated user memory copy APIs to use the new exception table guards
>
> Migrated user memory copy APIs to use the new exception table guards.
> Added string copy APIs as well.
>
> Signed-off-by: Davide Libenzi <dlib...@google.com>
> ---
> kern/arch/x86/uaccess.h | 98 ++++++++++++++-----------

Minor thing, but anytime we add a arch-specific header that gets used in
arch-independent code (as this one is later), please add a header with
stubs to the other architectures.

> diff --git a/kern/src/umem.c b/kern/src/umem.c
> index 87f4e3a1e332..8038653918fb 100644
> --- a/kern/src/umem.c
> +++ b/kern/src/umem.c
> @@ -7,6 +7,7 @@
> +static int string_copy_from_user(char *dst, const char *src)
> {
> - const void *start, *end;
> - size_t num_pages, i;
> - pte_t pte;
> - uintptr_t perm = PTE_USER_RO;
> - size_t bytes_copied = 0;
> + if (unlikely(!is_user_raddr((void *) src, 1)))
> + return -EFAULT;

Is this unlikely() necessary? We already have the unlikely() inside
is_user_raddr, which is static inlined. I'd like to limit the
unlikely() calls a bit.

More importantly, what happens if the user picks an address close to the
border of the an area where it does not have access, but the string goes
into that area. For instance, in copy_to_user, I think the user could
pick UWLIM - 1, then trick the kernel into writing above that into
memory that is read-only.

> + for (;; dst++, src++) {
> + int error = __get_user(dst, src, 1);
>
> - static_assert(ULIM % PGSIZE == 0 && ULIM != 0); // prevent wrap-around
> + if (unlikely(error))
> + return error;
> + if (unlikely(!*dst))
> + break;

How unlikely is this? It'll happen once per strcpy, right? Is that
going to be a TLB miss or something based on what the compiler does to
make something unlikely? This example gets at the tradeoff of using
likely/unlikely.

> +int memcpy_to_user(struct proc *p, void *dest, const void *src, size_t len)
> +{
> + struct proc *prev = switch_to(p);
> + int error = copy_to_user(dest, src, len);
> +
> + switch_back(p, prev);
> +
> + return error;
> +}

This is great; I'm a huge fan of the memcpy_to_user cleanups.

Barret

Davide Libenzi

unread,
Nov 9, 2015, 6:41:54 PM11/9/15
to Akaros
On Mon, Nov 9, 2015 at 3:26 PM, Barret Rhoden <br...@cs.berkeley.edu> wrote:
>  kern/arch/x86/uaccess.h    |  98 ++++++++++++++-----------

Minor thing, but anytime we add a arch-specific header that gets used in
arch-independent code (as this one is later), please add a header with
stubs to the other architectures.

Are you going to add it this time?


> +     if (unlikely(!is_user_raddr((void *) src, 1)))
> +             return -EFAULT;

Is this unlikely() necessary?  We already have the unlikely() inside
is_user_raddr, which is static inlined.  I'd like to limit the
unlikely() calls a bit.

If you want to bolt in the knowledge that a function is static inlined, then no, it is not.


 

More importantly, what happens if the user picks an address close to the
border of the an area where it does not have access, but the string goes
into that area.  For instance, in copy_to_user, I think the user could
pick UWLIM - 1, then trick the kernel into writing above that into
memory that is read-only.

The code assume that there is one (at least) separation page between user and kernel VMAs.
Should have made it more clear.

 

> +     for (;; dst++, src++) {
> +             int error = __get_user(dst, src, 1);
>
> -     static_assert(ULIM % PGSIZE == 0 && ULIM != 0); // prevent wrap-around
> +             if (unlikely(error))
> +                     return error;
> +             if (unlikely(!*dst))
> +                     break;

How unlikely is this?  It'll happen once per strcpy, right?  Is that
going to be a TLB miss or something based on what the compiler does to
make something unlikely?  This example gets at the tradeoff of using
likely/unlikely.

If AVG string length is, say, 8, then that is more unlikely than likely.
So the question is, in a tight loop like that, out of 8 times, would you prefer to be kicked in the ankle 7 times out of 8, or 1 out of 8? ☺


Davide Libenzi

unread,
Nov 9, 2015, 6:45:35 PM11/9/15
to Akaros
On Mon, Nov 9, 2015 at 3:41 PM, Davide Libenzi <dlib...@google.com> wrote:
More importantly, what happens if the user picks an address close to the
border of the an area where it does not have access, but the string goes
into that area.  For instance, in copy_to_user, I think the user could
pick UWLIM - 1, then trick the kernel into writing above that into
memory that is read-only.

The code assume that there is one (at least) separation page between user and kernel VMAs.
Should have made it more clear.

Never mind, sorry, that needs fixing.

Davide Libenzi

unread,
Nov 9, 2015, 7:53:39 PM11/9/15
to Akaros
Let me know if I can go ahead and fix both the uaccess stub and strcpy code.

Davide Libenzi

unread,
Nov 9, 2015, 8:35:21 PM11/9/15
to Akaros
OK, I fixed the strcpy to/from user code, in the same branch.
For uaccess.h in RISCV, what do we want to do?
I don't know it well enough, I don't have anything to build and test, and the only Linux thing on it, is an off-tree stepchild.

Davide Libenzi

unread,
Nov 10, 2015, 8:26:49 AM11/10/15
to Akaros
Counter never mind ☺ that would have worked, but I changed the code to remove the usual page red zone.

About the unlikely on *src/*dst, it's not even 1/8 or 7/8 actually.
The thing is, in a loop like that, to not pay the branch miss the first time.
The ones after that, while within that loop, at that time, are predicated using the dynamic branch predictor, because even one entry there, overrides the i-know-nothing-about initial static branch hint.



Barret Rhoden

unread,
Nov 10, 2015, 9:42:06 AM11/10/15
to aka...@googlegroups.com
On 2015-11-09 at 17:35 "'Davide Libenzi' via Akaros"
<aka...@googlegroups.com> wrote:
> OK, I fixed the strcpy to/from user code, in the same branch.

Thanks, I'll take a look.

> For uaccess.h in RISCV, what do we want to do?
> I don't know it well enough, I don't have anything to build and test,
> and the only Linux thing on it, is an off-tree stepchild.

Basically I just make a header file with the stubs and put #warnings in
them. It'll make the updating of RISC-V a little easier.

Barret

Barret Rhoden

unread,
Nov 10, 2015, 9:48:33 AM11/10/15
to aka...@googlegroups.com
On 2015-11-09 at 15:41 "'Davide Libenzi' via Akaros"
<aka...@googlegroups.com> wrote:
> > > + for (;; dst++, src++) {
> > > + int error = __get_user(dst, src, 1);
> > >
> > > - static_assert(ULIM % PGSIZE == 0 && ULIM != 0); // prevent
> > wrap-around
> > > + if (unlikely(error))
> > > + return error;
> > > + if (unlikely(!*dst))
> > > + break;
> >
> > How unlikely is this? It'll happen once per strcpy, right? Is that
> > going to be a TLB miss or something based on what the compiler does
> > to make something unlikely? This example gets at the tradeoff of
> > using likely/unlikely.
> >
>
> If AVG string length is, say, 8, then that is more unlikely than
> likely. So the question is, in a tight loop like that, out of 8
> times, would you prefer to be kicked in the ankle 7 times out of 8,
> or 1 out of 8? ☺

It depends how hard the kicks are. If the 7 kicks are light taps, and
the 1 kick is really painful, then I don't mind the 7.

That's my understanding of the tradeoffs with these branch hints: due
to the techniques used by the compiler, it may be that taking the
unlikely branch is disproportionately more painful than not taking the
branch without a hint, such that the inflection point for using
unlikely is not necessarily 51%. Anyway, that's why I don't like the
use of likely/unlikely except in provably better cases (e.g. as in
assert()) or with actual performance numbers.

Barret

Davide Libenzi

unread,
Nov 10, 2015, 11:46:20 AM11/10/15
to Akaros
there is no evil compiler worst case that is triggered by bad prediction.
A branch can be arranged only in two ways. Without hints, it's random, with hints, you can make sure that you are doing better than random (50%).



Barret

--
You received this message because you are subscribed to the Google Groups "Akaros" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akaros+un...@googlegroups.com.
To post to this group, send email to aka...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Barret Rhoden

unread,
Nov 10, 2015, 11:48:27 AM11/10/15
to aka...@googlegroups.com
On 2015-11-10 at 08:46 "'Davide Libenzi' via Akaros"
<aka...@googlegroups.com> wrote:
> there is no evil compiler worst case that is triggered by bad
> prediction. A branch can be arranged only in two ways. Without hints,
> it's random, with hints, you can make sure that you are doing better
> than random (50%).

I thought the compiler could put the unlikely case in a different code
section, which would be on a separate page that might not be in the TLB.

That's the basic guidance I've been given by architects, but I
personally haven't looked into it.

Barret

Davide Libenzi

unread,
Nov 10, 2015, 12:56:50 PM11/10/15
to Akaros
I will not be able to even try if the stub are working though.
Anyway I will do that, unless you stop me because you already did.

Davide Libenzi

unread,
Nov 10, 2015, 1:06:40 PM11/10/15
to Akaros
OK, I did it, same branch.

Barret Rhoden

unread,
Nov 13, 2015, 2:58:03 PM11/13/15
to aka...@googlegroups.com
On 2015-11-10 at 10:06 "'Davide Libenzi' via Akaros"
<aka...@googlegroups.com> wrote:
> OK, I did it, same branch.

i think i got the latest from this branch, but i didn't see the riscv
header. is e18edccd6..cc1afbf2 the right patch set?

barret

Davide Libenzi

unread,
Nov 13, 2015, 5:52:44 PM11/13/15
to Akaros

Barret Rhoden

unread,
Nov 13, 2015, 6:05:36 PM11/13/15
to aka...@googlegroups.com
On 2015-11-13 at 14:52 "'Davide Libenzi' via Akaros"
Ah, I see what the problem was.

These are the patches I see on your branch that differ from master
(basically the (FROM, TO] for user_memory_access_v2):

| * cc1afbf22dcd (dlibenzi/user_memory_access_v2) Tabified file carrying spaces (Davide Libenzi)
| * e18edccd6eb4 Migrated user memory copy APIs to use the new exception table guards (Davide Libenzi)
| * 6c983def2b27 Avoid double declarations, integer overflow, and use branch hints (Davide Libenzi)
| * fa2a6c79d8a5 Added new kernel test case (Davide Libenzi)
| * 6cc759554b02 Plugged the exception handling code (Davide Libenzi)

The bottom 3 from that list (Avoid, Addded, and Plugged) were already
added to master:

* | 2a9b3cdc47db Avoid double declarations, integer overflow, and use branch hints (Davide Libenzi)
* | 40ad9f1f5f48 Added new kernel test case (Davide Libenzi)
* | c7c7f1ceb352 Plugged the exception handling code (Davide Libenzi)

"Plugged" is the one that now has the riscv header.

I'll work some git magic on it. =)

Barret


Davide Libenzi

unread,
Nov 13, 2015, 6:10:18 PM11/13/15
to Akaros
Yes, the changes the were agreed, I made them on the same commits.
This is what I have been told to do.


Davide Libenzi

unread,
Nov 14, 2015, 8:29:50 AM11/14/15
to Akaros
Any news on this one?

Kevin Klues

unread,
Nov 14, 2015, 10:40:27 AM11/14/15
to aka...@googlegroups.com
Barret typically doesn't check his email on weekends, so I wouldn't
expect a response to this until Monday...

On Sat, Nov 14, 2015 at 5:29 AM, 'Davide Libenzi' via Akaros
--
~Kevin

Davide Libenzi

unread,
Nov 14, 2015, 11:41:42 AM11/14/15
to Akaros
That's OK ☺

Barret Rhoden

unread,
Nov 16, 2015, 6:06:32 PM11/16/15
to aka...@googlegroups.com
On 2015-11-14 at 05:29 "'Davide Libenzi' via Akaros"
<aka...@googlegroups.com> wrote:
> Any news on this one?

Thanks, merged to master at bc87a54..bef4b38 (from, to]

You can see the entire diff with 'git diff' or at
https://github.com/brho/akaros/compare/bc87a54...bef4b38

------------

yeah, i've been trying to avoid work email on the weekend. =)

barret

Reply all
Reply to author
Forward
0 new messages