[PATCH 0 of 6] Add pthread_setaffinity_np and pthread_getaffinity_np support

100 views
Skip to first unread message

ren...@renatocunha.com

unread,
Jun 20, 2014, 1:52:54 PM6/20/14
to osv...@googlegroups.com
Hello,

This is a first try in adding affinity support to OSv's pthread implementation.
As previously discussed, this is achieved through pinning. As you can see, this
series makes changes to internal header files and adds new methods to
sched::thread. Having your feedback for this implementation would be very
helpful.

The patches I'm sending were generated by mercurial's "email" extension.
I tried to pass the command lines so they were the most similar to git's as
possible. I hope this doesn't annoy anyone.

bsd/sys/sys/cpuset.h | 26 +++++++++++++-------------
bsd/sys/sys/cpuset.h | 2 ++
include/api/sched.h | 2 ++
core/sched.cc | 10 ++++++++++
include/osv/sched.hh | 2 ++
bsd/include/pthread_np.h | 2 --
include/api/pthread.h | 2 ++
libc/pthread.cc | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
8 files changed, 79 insertions(+), 15 deletions(-)


ren...@renatocunha.com

unread,
Jun 20, 2014, 1:52:57 PM6/20/14
to osv...@googlegroups.com
bsd/sys/sys/cpuset.h | 26 +++++++++++++-------------
1 files changed, 13 insertions(+), 13 deletions(-)


# HG changeset patch
# Date 1403276731 10800
# Fri Jun 20 12:05:31 2014 -0300
# Node ID e140d8d9e055c422d38efb08502a0f8092e3a691
# Parent a5f8d3421625fec6036273dcd2f07e0522234a9f
bsd: Quit using internal __size_t type

The internal bsd code uses the __size_t type, but this is just a typedef from
the regular size_t. To prevent including new files, we change this in cpuset.h,
making it easily importable by other modules.

diff --git a/bsd/sys/sys/cpuset.h b/bsd/sys/sys/cpuset.h
--- a/bsd/sys/sys/cpuset.h
+++ b/bsd/sys/sys/cpuset.h
@@ -42,7 +42,7 @@
* optimize cases when CPU_SETSIZE fits into single machine word.
*/
#define __cpuset_mask(n) \
- ((long)1 << ((_NCPUWORDS == 1) ? (__size_t)(n) : ((n) % _NCPUBITS)))
+ ((long)1 << ((_NCPUWORDS == 1) ? (size_t)(n) : ((n) % _NCPUBITS)))
#define __cpuset_word(n) ((_NCPUWORDS == 1) ? 0 : ((n) / _NCPUBITS))

#define CPU_CLR(n, p) ((p)->__bits[__cpuset_word(n)] &= ~__cpuset_mask(n))
@@ -50,13 +50,13 @@
#define CPU_ISSET(n, p) (((p)->__bits[__cpuset_word(n)] & __cpuset_mask(n)) != 0)
#define CPU_SET(n, p) ((p)->__bits[__cpuset_word(n)] |= __cpuset_mask(n))
#define CPU_ZERO(p) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
(p)->__bits[__i] = 0; \
} while (0)

#define CPU_FILL(p) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
(p)->__bits[__i] = -1; \
} while (0)
@@ -68,7 +68,7 @@

/* Is p empty. */
#define CPU_EMPTY(p) __extension__ ({ \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
if ((p)->__bits[__i]) \
break; \
@@ -77,7 +77,7 @@

/* Is p full set. */
#define CPU_ISFULLSET(p) __extension__ ({ \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
if ((p)->__bits[__i] != (long)-1) \
break; \
@@ -86,7 +86,7 @@

/* Is c a subset of p. */
#define CPU_SUBSET(p, c) __extension__ ({ \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
if (((c)->__bits[__i] & \
(p)->__bits[__i]) != \
@@ -97,7 +97,7 @@

/* Are there any common bits between b & c? */
#define CPU_OVERLAP(p, c) __extension__ ({ \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
if (((c)->__bits[__i] & \
(p)->__bits[__i]) != 0) \
@@ -107,7 +107,7 @@

/* Compare two sets, returns 0 if equal 1 otherwise. */
#define CPU_CMP(p, c) __extension__ ({ \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
if (((c)->__bits[__i] != \
(p)->__bits[__i])) \
@@ -116,19 +116,19 @@
})

#define CPU_OR(d, s) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
(d)->__bits[__i] |= (s)->__bits[__i]; \
} while (0)

#define CPU_AND(d, s) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
(d)->__bits[__i] &= (s)->__bits[__i]; \
} while (0)

#define CPU_NAND(d, s) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
(d)->__bits[__i] &= ~(s)->__bits[__i]; \
} while (0)
@@ -141,14 +141,14 @@

/* Convenience functions catering special cases. */
#define CPU_OR_ATOMIC(d, s) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
atomic_set_long(&(d)->__bits[__i], \
(s)->__bits[__i]); \
} while (0)

#define CPU_COPY_STORE_REL(f, t) do { \
- __size_t __i; \
+ size_t __i; \
for (__i = 0; __i < _NCPUWORDS; __i++) \
atomic_store_rel_long(&(t)->__bits[__i], \
(f)->__bits[__i]); \


ren...@renatocunha.com

unread,
Jun 20, 2014, 1:53:00 PM6/20/14
to osv...@googlegroups.com
bsd/sys/sys/cpuset.h | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)


# HG changeset patch
# Date 1403276731 10800
# Fri Jun 20 12:05:31 2014 -0300
# Node ID 05c3db8fb831eca621b7d2532048e5fd0d7506f6
# Parent e140d8d9e055c422d38efb08502a0f8092e3a691
bsd: Add knowledge of lwpid_t in cpuset.h

cpuset.h references the lwpid_t type in some prototypes it defines. Instead of
including new dependencies, just give visibility of the symbol to the header
file.

diff --git a/bsd/sys/sys/cpuset.h b/bsd/sys/sys/cpuset.h
--- a/bsd/sys/sys/cpuset.h
+++ b/bsd/sys/sys/cpuset.h
@@ -201,6 +201,8 @@
struct setlist cs_children; /* (c) List of children. */
};

+typedef pid_t lwpid_t;
+
#define CPU_SET_ROOT 0x0001 /* Set is a root set. */
#define CPU_SET_RDONLY 0x0002 /* No modification allowed. */



ren...@renatocunha.com

unread,
Jun 20, 2014, 1:53:03 PM6/20/14
to osv...@googlegroups.com
include/api/sched.h | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)


# HG changeset patch
# Date 1403286181 10800
# Fri Jun 20 14:43:01 2014 -0300
# Node ID 525fa25391470ef5d08fbdecdd743e4eeb1aa18f
# Parent 05c3db8fb831eca621b7d2532048e5fd0d7506f6
sched: Make cpu_set_t visible so pthread.h can reference it

diff --git a/include/api/sched.h b/include/api/sched.h
--- a/include/api/sched.h
+++ b/include/api/sched.h
@@ -65,6 +65,8 @@
int setns(int, int);
#endif

+typedef struct cpu_set_t {} cpu_set_t;
+
#ifdef __cplusplus
}
#endif


ren...@renatocunha.com

unread,
Jun 20, 2014, 1:53:06 PM6/20/14
to osv...@googlegroups.com
core/sched.cc | 10 ++++++++++
include/osv/sched.hh | 2 ++
2 files changed, 12 insertions(+), 0 deletions(-)


# HG changeset patch
# Date 1403286187 10800
# Fri Jun 20 14:43:07 2014 -0300
# Node ID ef68ec75e88281011a9b9e39b00de6b57cd8b7a3
# Parent 525fa25391470ef5d08fbdecdd743e4eeb1aa18f
sched: Add public interface for pinning threads to specific CPUs

For supporting pthread_setaffinity_np and pthread_getaffinity_np, we need to
support pinning threads to specific CPUs (we don't support more than pinning to
one CPU), so we need an interface for defining the pinned CPU from other parts
of the code.

diff --git a/core/sched.cc b/core/sched.cc
--- a/core/sched.cc
+++ b/core/sched.cc
@@ -1384,4 +1384,14 @@

}

+void sched::thread::set_pinned_cpu(sched::cpu *cpu)
+{
+ _attr.pin(cpu);
+}
+
+sched::cpu *sched::thread::get_pinned_cpu() const
+{
+ return _attr._pinned_cpu;
+}
+
irq_lock_type irq_lock;
diff --git a/include/osv/sched.hh b/include/osv/sched.hh
--- a/include/osv/sched.hh
+++ b/include/osv/sched.hh
@@ -441,6 +441,8 @@
void set_name(std::string name);
std::string name() const;
std::array<char, 16> name_raw() const { return _attr._name; }
+ void set_pinned_cpu(sched::cpu *);
+ sched::cpu *get_pinned_cpu() const;
/**
* Set thread's priority
*


ren...@renatocunha.com

unread,
Jun 20, 2014, 1:53:09 PM6/20/14
to osv...@googlegroups.com
bsd/include/pthread_np.h | 2 --
include/api/pthread.h | 2 ++
2 files changed, 2 insertions(+), 2 deletions(-)


# HG changeset patch
# Date 1403286193 10800
# Fri Jun 20 14:43:13 2014 -0300
# Node ID c64f3596b1892661abd966926e24313e81fdba5a
# Parent ef68ec75e88281011a9b9e39b00de6b57cd8b7a3
pthread: Move NP implementations to main pthread.h

diff --git a/bsd/include/pthread_np.h b/bsd/include/pthread_np.h
--- a/bsd/include/pthread_np.h
+++ b/bsd/include/pthread_np.h
@@ -45,8 +45,6 @@
__BEGIN_DECLS
int pthread_attr_setcreatesuspend_np(pthread_attr_t *);
int pthread_attr_get_np(pthread_t, pthread_attr_t *);
-int pthread_attr_getaffinity_np(const pthread_attr_t *, size_t, cpuset_t *);
-int pthread_attr_setaffinity_np(pthread_attr_t *, size_t, const cpuset_t *);
int pthread_getaffinity_np(pthread_t, size_t, cpuset_t *);
int pthread_getthreadid_np(void);
int pthread_main_np(void);
diff --git a/include/api/pthread.h b/include/api/pthread.h
--- a/include/api/pthread.h
+++ b/include/api/pthread.h
@@ -212,6 +212,8 @@
#ifdef _GNU_SOURCE
int pthread_getattr_np(pthread_t, pthread_attr_t *);
int pthread_setname_np(pthread_t pthread, const char* name);
+int pthread_attr_getaffinity_np(const pthread_attr_t *, size_t, cpu_set_t *);
+int pthread_attr_setaffinity_np(pthread_attr_t *, size_t, const cpu_set_t *);
#endif

#ifdef __cplusplus


ren...@renatocunha.com

unread,
Jun 20, 2014, 1:53:11 PM6/20/14
to osv...@googlegroups.com
libc/pthread.cc | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 48 insertions(+), 0 deletions(-)


# HG changeset patch
# Date 1403286198 10800
# Fri Jun 20 14:43:18 2014 -0300
# Node ID 97b21d7fbc115c83c48096e6c75c63c0d3ac3614
# Parent c64f3596b1892661abd966926e24313e81fdba5a
pthread: Add support for setting thread affinity through pinning

OpenMP applications (and probably other users) need to set a thread's affinity
to specific processors. Even though we don't support affinity, we do support
pinning, so we can have an implementation that at least pins threads to
specific CPUs.

diff --git a/libc/pthread.cc b/libc/pthread.cc
--- a/libc/pthread.cc
+++ b/libc/pthread.cc
@@ -28,6 +28,10 @@
#include <api/time.h>
#include <osv/spinlock.h>

+#include <sys/param.h>
+#include <porting/netport.h>
+#include <sys/cpuset.h>
+
namespace pthread_private {

const unsigned tsd_nkeys = 100;
@@ -733,3 +737,47 @@
pthread::from_libc(p)->_thread.set_name(name);
return 0;
}
+
+extern "C" int pthread_setaffinity_np(pthread_t thread, size_t cpusetsize,
+ const cpuset_t *cpuset) {
+ if (cpuset == NULL) {
+ return EFAULT;
+ }
+
+ if (cpusetsize > sizeof(sched::max_cpus)) {
+ return EINVAL;
+ }
+ for (unsigned c = 0; c < sched::max_cpus; c++) {
+ unsigned cpuon = CPU_ISSET(c, cpuset);
+ if (cpuon) {
+ if (c > sched::cpus.size()) {
+ return EINVAL;
+ }
+ auto *cpu = sched::cpus[c];
+ pthread::from_libc(thread)->_thread.set_pinned_cpu(cpu);
+ return 0;
+ }
+ }
+ return EINVAL;
+}
+
+extern "C" int pthread_getaffinity_np(pthread_t thread, size_t cpusetsize,
+ cpuset_t *cpuset) {
+ if (cpuset == NULL) {
+ return EFAULT;
+ }
+ if (sizeof(sched::max_cpus) > cpusetsize) {
+ return EINVAL;
+ }
+ CPU_ZERO(cpuset);
+ auto *cpu = pthread::from_libc(thread)->_thread.get_pinned_cpu();
+ if (cpu != nullptr) {
+ for (unsigned int i = 0; i < sched::cpus.size(); i++) {
+ if (cpu == sched::cpus[i]) {
+ CPU_SET(i, cpuset);
+ break;
+ }
+ }
+ }
+ return 0;
+}


Pekka Enberg

unread,
Jun 22, 2014, 6:41:26 AM6/22/14
to ren...@renatocunha.com, Osv Dev, Glauber Costa, Nadav Har'El
On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
> This is a first try in adding affinity support to OSv's pthread implementation.
> As previously discussed, this is achieved through pinning. As you can see, this
> series makes changes to internal header files and adds new methods to
> sched::thread. Having your feedback for this implementation would be very
> helpful.
>
> The patches I'm sending were generated by mercurial's "email" extension.
> I tried to pass the command lines so they were the most similar to git's as
> possible. I hope this doesn't annoy anyone.

Glauber, Nadav, care to take a look at this series?

Nadav Har'El

unread,
Jun 22, 2014, 7:03:01 AM6/22/14
to ren...@renatocunha.com, Osv Dev
This patch in itself seems reasonable, although I have to wonder why BSD used this "__size_t" instead of using "size_t" in the first place. I'm sure they had a reason, but I don't know what it was, or whether it's still relevant.

One thing that bothers me somewhat about this whole patch series is that now we'll start to use bsd/sys/sys/cpuset.h as part of our API, as sched.hh includes that. This seems to me kind of ugly. Another alternative could be to do what Zika L'Etrange did recently, in a patch we didn't end up committing, which was creating our own bits/sched.h with just necessary cpuset.h stuff. Of course the downside of this is that now we'll have two variants of the same thing in the source code. So I don't know which solution is better.

Reviewed-by: Nadav Har'El <n...@cloudius-systems.com>




--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Nadav Har'El

unread,
Jun 22, 2014, 7:08:19 AM6/22/14
to ren...@renatocunha.com, Osv Dev
I'm fine with this, although I think you could have just added  #include <sys/procfs.h> and get this definition - seems safer than repeating it?


Reviewed-by: Nadav Har'El <n...@cloudius-systems.com>
On Fri, Jun 20, 2014 at 1:45 PM, <ren...@renatocunha.com> wrote:
--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nadav Har'El

unread,
Jun 22, 2014, 7:11:27 AM6/22/14
to ren...@renatocunha.com, Osv Dev
On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
I don't understand this definition -
If you just want an *incomplete* definition, that you could use as a pointer, I think you should have written
   typedef struct cpu_set_t cpu_set_t;

The way that you wrote it, doesn't it actually define cpu_set_t to be an empty structure, which it shouldn't be?

 
+
 #ifdef __cplusplus
 }
 #endif



--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Avi Kivity

unread,
Jun 22, 2014, 7:17:44 AM6/22/14
to Nadav Har'El, ren...@renatocunha.com, Osv Dev

On 06/22/2014 02:03 PM, Nadav Har'El wrote:
This patch in itself seems reasonable, although I have to wonder why BSD used this "__size_t" instead of using "size_t" in the first place. I'm sure they had a reason, but I don't know what it was, or whether it's still relevant.


The standard reason is not to pollute the user namespace.  If the user did not include any C89 headers that bring in size_t, then they can expect that symbol not to be defined.

Avi Kivity

unread,
Jun 22, 2014, 7:18:18 AM6/22/14
to Nadav Har'El, ren...@renatocunha.com, Osv Dev

On 06/22/2014 02:03 PM, Nadav Har'El wrote:
> This patch in itself seems reasonable, although I have to wonder why
> BSD used this "__size_t" instead of using "size_t" in the first place.
> I'm sure they had a reason, but I don't know what it was, or whether
> it's still relevant.
>
> One thing that bothers me somewhat about this whole patch series is
> that now we'll start to use bsd/sys/sys/cpuset.h as part of our API,
> as sched.hh includes that. This seems to me kind of ugly. Another
> alternative could be to do what Zika L'Etrange did recently, in a
> patch we didn't end up committing, which was creating our own
> bits/sched.h with just necessary cpuset.h stuff. Of course the
> downside of this is that now we'll have two variants of the same thing
> in the source code. So I don't know which solution is better.

Probably the second. Are we sure bsd cpuset_t is ABI equivalent to linux?

Nadav Har'El

unread,
Jun 22, 2014, 7:28:25 AM6/22/14
to ren...@renatocunha.com, Osv Dev
On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
 core/sched.cc        |  10 ++++++++++
 include/osv/sched.hh |   2 ++
 2 files changed, 12 insertions(+), 0 deletions(-)


# HG changeset patch
# Date 1403286187 10800
#      Fri Jun 20 14:43:07 2014 -0300
# Node ID ef68ec75e88281011a9b9e39b00de6b57cd8b7a3
# Parent  525fa25391470ef5d08fbdecdd743e4eeb1aa18f
sched: Add public interface for pinning threads to specific CPUs

For supporting pthread_setaffinity_np and pthread_getaffinity_np, we need to
support pinning threads to specific CPUs (we don't support more than pinning to
one CPU), so we need an interface for defining the pinned CPU from other parts
of the code.

diff --git a/core/sched.cc b/core/sched.cc
--- a/core/sched.cc
+++ b/core/sched.cc
@@ -1384,4 +1384,14 @@

 }

+void sched::thread::set_pinned_cpu(sched::cpu *cpu)
+{
+    _attr.pin(cpu);
+}

This function will probably not do what you want it to (BTW, you should probably have a comment saying
what you want this function to do).

The pinned_cpu attribute is only consulted when the thread is started, to decide which CPU to start it on.
Additionally, if it is set, the thread is also "pinned", i.e., the load balancer never moves it.
Setting this attribute later will do absolutely nothing.
Can you please write a unit test (tests/tst-*.cc) that verfies that set_pinned_cpu actually does what you want?

One thing you can do later is to call migrate_disable(), to pin the *current* thread to the *current* cpu (whatever that is), but this is not really what you wanted to do here.

cpu::load_balance() contains the code you'll need to "send" a thread to a different CPU (see the code in that function inside the IRQ lock). However, there are addtional complications that will need to be solved. Namely,
1. If a thread tries to pin *itself* to a different CPU (not the one it is running on now), it will need to call the scheduler at the end, because the thread is no longer running on this CPU.
2. If a thread tries to pin a thread which is assigned to a different CPU than the one running now the set_pinned_cpu(), the thread may start, stop, or even migrate in the middle of this function, so some thought needs to go into how to do this correctly.

Another idea which you can consider (I haven't thought this through) is to reuse the load_balance() mechanism for the pinning feature. I.e., on set_pinned_cpu() time, don't actually migrate the thread but rather just mark somewhere that it needs migration. When the CPU running this thread runs its load_balancer (normally 10 times a second), it will notice this command and perform the actual migration.
This will mean that pinning a running thread will always experience a delay (of, e.g., 0.1 seconds) but perhaps this is acceptable.

 
+
+sched::cpu *sched::thread::get_pinned_cpu() const
+{
+    return _attr._pinned_cpu;
+}
+
 irq_lock_type irq_lock;
diff --git a/include/osv/sched.hh b/include/osv/sched.hh
--- a/include/osv/sched.hh
+++ b/include/osv/sched.hh
@@ -441,6 +441,8 @@
     void set_name(std::string name);
     std::string name() const;
     std::array<char, 16> name_raw() const { return _attr._name; }
+    void set_pinned_cpu(sched::cpu *);
+    sched::cpu *get_pinned_cpu() const;
     /**
      * Set thread's priority
      *
--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nadav Har'El

unread,
Jun 22, 2014, 7:34:53 AM6/22/14
to ren...@renatocunha.com, Osv Dev
This patch seems reasonable, although please verify that nobody currently includes bsd/include/pthread_np.h and expects to get these declarations.

BTW, note that the patch title is misleading - these are the functions' declaration, not implementation :-)

Finally, how is this patch relevant to the rest of the series? In the rest of the series I got the impressions you wanted to change the affinity of a running thread, not of an about-to-be-created. Also, where is the implementation of the functions you're declaring here?

Please don't commit this patch without some clarifications.



On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nadav Har'El

unread,
Jun 22, 2014, 7:51:37 AM6/22/14
to Renato Cunha, Osv Dev
Hi, as a general comment which I also made earlier, pinning an already-existing thread is a non-trivial problem. Please add a test for this function, because I don't think your implementation actually works (I mean, get() will find whatever was set by set(), but set() won't really change anything).

I wonder if for your needs, it wouldn't be enough to create a new thread already-pinned? I.e., to get pthread_attr_setaffinity_np() to work, instead of pthread_setaffinity_np() which you're doing here? Doing this should be much easier.

On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
+
+    if (cpusetsize > sizeof(sched::max_cpus)) {
+        return EINVAL;

This doesn't look like the right test to me... if max_cpus is an integer, sizeof(max_cpus) will be 4. Who said this is the limit of cpu_set_t? I see in bsd/sys/sys/_cpuset.h that the limit is CPU_SETSIZE/8, or something similar.
 
+    }
+    for (unsigned c = 0; c < sched::max_cpus; c++) {

I think you should loop until CPU_SETSIZE (or something similar), not max_cpus?
 
+        unsigned cpuon = CPU_ISSET(c, cpuset);
+        if (cpuon) {
+            if (c > sched::cpus.size()) {
+                return EINVAL;
+            }
+            auto *cpu = sched::cpus[c];
+            pthread::from_libc(thread)->_thread.set_pinned_cpu(cpu); 
+            return 0;

This will only make sense if there is a single set CPU in the CPU set.
I think we should probably check if this is the case, and if not warn.
Also, possibly, if all the bits are on, we need to disable the pinning?
 
+        }
+    }
+    return EINVAL;
+}
+
+extern "C" int pthread_getaffinity_np(pthread_t thread, size_t cpusetsize,
+        cpuset_t *cpuset) {
+    if (cpuset == NULL) {
+        return EFAULT;
+    }
+    if (sizeof(sched::max_cpus) > cpusetsize) {

Again I think you should compare CPU_SET_SIZE/8 (but please verify...) not sizeof(integer constant).
 
+        return EINVAL;
+    }
+    CPU_ZERO(cpuset);
+    auto *cpu = pthread::from_libc(thread)->_thread.get_pinned_cpu();
+    if (cpu != nullptr) {
+        for (unsigned int i = 0; i < sched::cpus.size(); i++) {
+            if (cpu == sched::cpus[i]) {
+                CPU_SET(i, cpuset);
+                break;
+            }
+        }

Instead of this loop, you can also  get the CPU index (cpu->id). I think this should work too.

+    }
+    return 0;
+}
--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Pekka Enberg

unread,
Jun 22, 2014, 8:01:19 AM6/22/14
to Renato Cunha, Osv Dev
On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
> The patches I'm sending were generated by mercurial's "email" extension.
> I tried to pass the command lines so they were the most similar to git's as
> possible. I hope this doesn't annoy anyone.

I don't mind that at all but you need to "sign-off" the patches for me
to be able to merge them:

https://github.com/cloudius-systems/osv/blob/master/CONTRIBUTING

Please also make sure name is configured properly. Now it's just your
email address in the "From" field.

- Pekka

Glauber Costa

unread,
Jun 22, 2014, 1:49:21 PM6/22/14
to Pekka Enberg, Nadav Har'El, ren...@renatocunha.com, Osv Dev

I intend to review as soon as I get back home.

Renato Cunha

unread,
Jun 22, 2014, 1:53:52 PM6/22/14
to Avi Kivity, Nadav Har'El, ren...@renatocunha.com, Osv Dev
Hello,
The second is probably better, I just didn't know how to proceed here.

As for the equivalence, cpu_set_t is defined as

typedef struct
{
__cpu_mask __bits[__CPU_SETSIZE / __NCPUBITS];
} cpu_set_t;

Which is similar, but not exactly equal.

But I do agree with you. Defining/importing cpu_set_t for OSv is
probably better.

Best Regards,
Renato Cunha.

Renato Cunha

unread,
Jun 22, 2014, 1:58:03 PM6/22/14
to Nadav Har'El, ren...@renatocunha.com, Osv Dev
On 2014-06-22 08:11, Nadav Har'El wrote:
> On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
>>
>> +typedef struct cpu_set_t {} cpu_set_t;
>
> I don't understand this definition -
>
> If you just want an *incomplete* definition, that you could use as a
> pointer, I think you should have written
>
> typedef struct cpu_set_t cpu_set_t;
>
> The way that you wrote it, doesn't it actually define cpu_set_t to be
> an empty structure, which it shouldn't be?

You're right. My mistake.

Thanks.

--
Renato Cunha


Nadav Har'El

unread,
Jun 22, 2014, 3:04:16 PM6/22/14
to Renato Cunha, Avi Kivity, Renato Cunha, Osv Dev, Zika L'Etrange
On Sun, Jun 22, 2014 at 8:53 PM, Renato Cunha <ren...@cunha.io> wrote:
Probably the second.  Are we sure bsd cpuset_t is ABI equivalent to linux?

The second is probably better, I just didn't know how to proceed here.

You can search the mailing list for a patch from Zika L'Etrange (CC'ed) who extracted the cpu_set_t into OSv's
sched.h (I think he actually created a new bits/sched.h and put it there). I don't know where he took the definitions
he put there - I assume/hope he didn't copy it verbatim from Linux, because the licenses don't fit (Linux is GPL,
OSv is BSD license), but any form of "inspiration" that doesn't amount to verbatim copying is fine (some even
claim that header files specifying just an interface cannot be copyrighted, but I'm not a lawyer).

You can try to use his patch instead of importing cpu_set_t on your own.


As for the equivalence, cpu_set_t is defined as

    typedef struct
    {
      __cpu_mask __bits[__CPU_SETSIZE / __NCPUBITS];
    } cpu_set_t;

Which is similar, but not exactly equal.

The more I look at this, the less I see any justification from using the BSD ones.
Please correct me if I'm wrong, but don't they even have different names? In Linux I see "cpu_set_t", but in BSD, it's "cpuset_t".
 

But I do agree with you. Defining/importing cpu_set_t for OSv is probably better.

Best Regards,
Renato Cunha.

Renato Cunha

unread,
Jun 23, 2014, 2:46:34 PM6/23/14
to Nadav Har'El, Renato Cunha, Osv Dev
Hello,

On 2014-06-22 08:51, Nadav Har'El wrote:
> Hi, as a general comment which I also made earlier, pinning an
> already-existing thread is a non-trivial problem. Please add a test
> for this function, because I don't think your implementation actually
> works (I mean, get() will find whatever was set by set(), but set()
> won't really change anything).

Ok, I see. I'll write a test. But as you said, this might not be enough.
Even
if this doesn't work, interface-wise, and assuming we'll actually figure
a way
to do this, would it be ok to extend sched::thread's interface to
support
pinning?

> I wonder if for your needs, it wouldn't be enough to create a new
> thread already-pinned? I.e., to get pthread_attr_setaffinity_np() to
> work, instead of pthread_setaffinity_np() which you're doing here?
> Doing this should be much easier.

Not really, because these are called by gcc's OpenMP implementation. So,
even
though adding pthread_attr_* might be easier (and probably needed for
completeness), to have libgomp doing anything useful with pinning, we'd
need
these functions implemented as well.

> On Fri, Jun 20, 2014 at 8:45 PM, <ren...@renatocunha.com> wrote:
>
>> +
>> + if (cpusetsize > sizeof(sched::max_cpus)) {
>> + return EINVAL;
>
> This doesn't look like the right test to me... if max_cpus is an
> integer, sizeof(max_cpus) will be 4. Who said this is the limit of
> cpu_set_t? I see in bsd/sys/sys/_cpuset.h that the limit is
> CPU_SETSIZE/8, or something similar.

I see. Good point.

>> + }
>> + for (unsigned c = 0; c < sched::max_cpus; c++) {
>
> I think you should loop until CPU_SETSIZE (or something similar), not
> max_cpus?

Yes. Actually, I think I should loop until the number of vCPUs
available.
Probably the size of sched::cpus.

>> + unsigned cpuon = CPU_ISSET(c, cpuset);
>> + if (cpuon) {
>> + if (c > sched::cpus.size()) {
>> + return EINVAL;
>> + }
>> + auto *cpu = sched::cpus[c];
>> +
>> pthread::from_libc(thread)->_thread.set_pinned_cpu(cpu);
>
>> + return 0;
>
> This will only make sense if there is a single set CPU in the CPU set.
>
> I think we should probably check if this is the case, and if not warn.
> Also, possibly, if all the bits are on, we need to disable the
> pinning?

Wouldn't it be better if we:

1- Warned if more than one CPU was set, and
2- Disabled pinning if the set was empty

?


> Again I think you should compare CPU_SET_SIZE/8 (but please verify...)
> not sizeof(integer constant).

Yes. Agreed.

>> + return EINVAL;
>> + }
>> + CPU_ZERO(cpuset);
>> + auto *cpu =
>> pthread::from_libc(thread)->_thread.get_pinned_cpu();
>> + if (cpu != nullptr) {
>> + for (unsigned int i = 0; i < sched::cpus.size(); i++) {
>> + if (cpu == sched::cpus[i]) {
>> + CPU_SET(i, cpuset);
>> + break;
>> + }
>> + }
>
> Instead of this loop, you can also get the CPU index (cpu->id). I
> think this should work too.

Yes. And would probably be better.

Thank you for your input.

Regards.

--
Renato Cunha

Reply all
Reply to author
Forward
0 new messages