[PATCH] sched.h: increment TASK_COMM

Luben Tuikov

unread,

Jun 23, 2006, 12:50:11 PM6/23/06

to linux-...@vger.kernel.org

Lets use 19+1 chars. This helps display properly
kernel threads (e.g. SATA translation threads) which bear
the address of the STP/SATA bridge where the SATA disk is
connected. Those are 16+1 chars long. Currently (15+1) the last
character is not displayed as it is used by the '\0'.

The array is 4 byte aligned so we add another 4 bytes to it.

Signed-off-by: Luben Tuikov <ltu...@yahoo.com>
---
include/linux/sched.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 18f12cb..3fc11bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -154,7 +154,7 @@ #define set_current_state(state_value)
set_mb(current->state, (state_value))

/* Task command name length */
-#define TASK_COMM_LEN 16
+#define TASK_COMM_LEN 20

/*
* Scheduling policies
--
1.4.0.g470d

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Albert Cahalan

unread,

Jun 23, 2006, 10:27:34 PM6/23/06

to linux-...@vger.kernel.org, ltu...@yahoo.com

Luben Tuikov writes:

> Lets use 19+1 chars. This helps display properly
> kernel threads (e.g. SATA translation threads) which bear
> the address of the STP/SATA bridge where the SATA disk is
> connected. Those are 16+1 chars long. Currently (15+1) the
> last character is not displayed as it is used by the '\0'.

This makes apps crash when they do this:

char buf[16];
prctl(PR_GET_NAME, buf, 42, 42, 42);

Check the last 15 lines of kernel/sys.c to see why.

Though none of my code breaks, I have to wonder about what might
be out there. (my code truncates it)

I tend to think that putting 64 bits of hex in the command name
is an abuse of the command name. Perhaps it is time to make argv
work for built-in kernel tasks. It's also long past time to group
these things by tgid, reducing the horrible clutter seen on
large systems.

As for the size chosen...

Solaris uses 15. (plus 80 for unswappable argv)
Darwin uses 16. (was 10 I believe)
OpenBSD uses 16.
NetBSD uses 16.
We use 15. (you propose 19)
FreeBSD uses 19.

Luben Tuikov

unread,

Jun 30, 2006, 9:06:29 PM6/30/06

to linux-...@vger.kernel.org, Andrew Morton

It is 4 byte aligned anyway. This way we can use
up to 19+1 chars.

Signed-off-by: Luben Tuikov <ltu...@yahoo.com>
---
include/linux/sched.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 18f12cb..3fc11bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -154,7 +154,7 @@ #define set_current_state(state_value)
set_mb(current->state, (state_value))

/* Task command name length */
-#define TASK_COMM_LEN 16
+#define TASK_COMM_LEN 20

/*
* Scheduling policies
--

1.4.1.rc2.g4ce4

Andrew Morton

unread,

Jun 30, 2006, 9:17:01 PM6/30/06

to ltu...@yahoo.com, linux-...@vger.kernel.org

Luben Tuikov <ltu...@yahoo.com> wrote:
>
> It is 4 byte aligned anyway.

That's a 64-bitism. And 32-bit machines are more space-sensitive.

> This way we can use
> up to 19+1 chars.
>
> Signed-off-by: Luben Tuikov <ltu...@yahoo.com>
> ---
> include/linux/sched.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 18f12cb..3fc11bc 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -154,7 +154,7 @@ #define set_current_state(state_value)
> set_mb(current->state, (state_value))
>
> /* Task command name length */
> -#define TASK_COMM_LEN 16
> +#define TASK_COMM_LEN 20

So this is basically "increase size of comm[] by 4 bytes, happens to be
zero-cost on 64-bit machines".

We do occasionally hit task_struct.comm[] truncation, when people use
"too-long-a-name%d" for their kernel thread names. But we seem to manage.

Luben Tuikov

unread,

Jun 30, 2006, 9:29:18 PM6/30/06

to Andrew Morton, linux-...@vger.kernel.org

--- Andrew Morton <ak...@osdl.org> wrote:
> Luben Tuikov <ltu...@yahoo.com> wrote:
> >
> > It is 4 byte aligned anyway.
>
> That's a 64-bitism. And 32-bit machines are more space-sensitive.

I see.

> > This way we can use
> > up to 19+1 chars.
> >
> > Signed-off-by: Luben Tuikov <ltu...@yahoo.com>
> > ---
> > include/linux/sched.h | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 18f12cb..3fc11bc 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -154,7 +154,7 @@ #define set_current_state(state_value)
> > set_mb(current->state, (state_value))
> >
> > /* Task command name length */
> > -#define TASK_COMM_LEN 16
> > +#define TASK_COMM_LEN 20
>
> So this is basically "increase size of comm[] by 4 bytes, happens to be
> zero-cost on 64-bit machines".

Yes.

> We do occasionally hit task_struct.comm[] truncation, when people use
> "too-long-a-name%d" for their kernel thread names. But we seem to manage.

It would be especially helpful if you want to name a task thread
the NAA IEEE Registered name format (16 chars, globally unique), for things
like FC, SAS, etc. This way you can identify the task thread with
the device bearing the NAA IEEE name.

Currently just last character is cut off, since TASK_COMM_LEN is 15+1.

I think incrementing it would be a good thing, plus other things
may want to represent 8 bytes as a character array to be the name
of a task thread.

Luben

Andrew Morton

unread,

Jun 30, 2006, 9:35:09 PM6/30/06

to ltu...@yahoo.com, linux-...@vger.kernel.org

Luben Tuikov <ltu...@yahoo.com> wrote:
>
> > We do occasionally hit task_struct.comm[] truncation, when people use
> > "too-long-a-name%d" for their kernel thread names. But we seem to manage.
>
> It would be especially helpful if you want to name a task thread
> the NAA IEEE Registered name format (16 chars, globally unique), for things
> like FC, SAS, etc. This way you can identify the task thread with
> the device bearing the NAA IEEE name.
>
> Currently just last character is cut off, since TASK_COMM_LEN is 15+1.
>
> I think incrementing it would be a good thing, plus other things
> may want to represent 8 bytes as a character array to be the name
> of a task thread.

OK, that's a reason. Being able to map a kernel thread onto a particular
device is useful.

Luben Tuikov

unread,

Jun 30, 2006, 10:48:57 PM6/30/06

to Andrew Morton, linux-...@vger.kernel.org

--- Andrew Morton <ak...@osdl.org> wrote:

> Luben Tuikov <ltu...@yahoo.com> wrote:
> >
> > > We do occasionally hit task_struct.comm[] truncation, when people use
> > > "too-long-a-name%d" for their kernel thread names. But we seem to manage.
> >
> > It would be especially helpful if you want to name a task thread
> > the NAA IEEE Registered name format (16 chars, globally unique), for things
> > like FC, SAS, etc. This way you can identify the task thread with
> > the device bearing the NAA IEEE name.
> >
> > Currently just last character is cut off, since TASK_COMM_LEN is 15+1.
> >
> > I think incrementing it would be a good thing, plus other things
> > may want to represent 8 bytes as a character array to be the name
> > of a task thread.
>
> OK, that's a reason. Being able to map a kernel thread onto a particular
> device is useful.

OK, great.

Luben

Jeff Garzik

unread,

Jul 1, 2006, 12:47:13 AM7/1/06

to Andrew Morton, ltu...@yahoo.com, linux-...@vger.kernel.org

Andrew Morton wrote:
> Luben Tuikov <ltu...@yahoo.com> wrote:
>>> We do occasionally hit task_struct.comm[] truncation, when people use
>>> "too-long-a-name%d" for their kernel thread names. But we seem to manage.
>> It would be especially helpful if you want to name a task thread
>> the NAA IEEE Registered name format (16 chars, globally unique), for things
>> like FC, SAS, etc. This way you can identify the task thread with
>> the device bearing the NAA IEEE name.
>>
>> Currently just last character is cut off, since TASK_COMM_LEN is 15+1.
>>
>> I think incrementing it would be a good thing, plus other things
>> may want to represent 8 bytes as a character array to be the name
>> of a task thread.
>
> OK, that's a reason. Being able to map a kernel thread onto a particular
> device is useful.

But will it wind up this way, when the does-not-exist-yet-upstream code
appears?

I would think it would make more sense to increase the size of the key
task structure only when there are justified, merged users in the kernel.

Jeff

Andrew Morton

unread,

Jul 1, 2006, 12:54:01 AM7/1/06

to Jeff Garzik, ltu...@yahoo.com, linux-...@vger.kernel.org

On Sat, 01 Jul 2006 00:46:15 -0400
Jeff Garzik <je...@garzik.org> wrote:

> Andrew Morton wrote:
> > Luben Tuikov <ltu...@yahoo.com> wrote:
> >>> We do occasionally hit task_struct.comm[] truncation, when people use
> >>> "too-long-a-name%d" for their kernel thread names. But we seem to manage.
> >> It would be especially helpful if you want to name a task thread
> >> the NAA IEEE Registered name format (16 chars, globally unique), for things
> >> like FC, SAS, etc. This way you can identify the task thread with
> >> the device bearing the NAA IEEE name.
> >>
> >> Currently just last character is cut off, since TASK_COMM_LEN is 15+1.
> >>
> >> I think incrementing it would be a good thing, plus other things
> >> may want to represent 8 bytes as a character array to be the name
> >> of a task thread.
> >
> > OK, that's a reason. Being able to map a kernel thread onto a particular
> > device is useful.
>
> But will it wind up this way, when the does-not-exist-yet-upstream code
> appears?
>
> I would think it would make more sense to increase the size of the key
> task structure only when there are justified, merged users in the kernel.
>

Well yes - I assumed that would be happening fairly soon. Luben?

Jan Engelhardt

unread,

Jul 1, 2006, 5:02:11 AM7/1/06

to Andrew Morton, ltu...@yahoo.com, linux-...@vger.kernel.org

>>
>> I think incrementing it would be a good thing, plus other things
>> may want to represent 8 bytes as a character array to be the name
>> of a task thread.
>
>OK, that's a reason. Being able to map a kernel thread onto a particular
>device is useful.
>

Eh, that is not the case for some ATM.

Do "migration", "ksoftirqd", "watchdog", "events", "kblockd" and "aio" map
to a particular device besides possibly "CPU"?

"xfslogd", "xfsdatad", and "rpciod" also have one thread per CPU rather
than per-device or per-mount.

"pdflush" even has a variable number of threads running, depending on load.
One thread may serve many mounts/devices, or - I have seen this - eight
instances manage two mounts.

Jan Engelhardt
--

Jan Engelhardt

unread,

Jul 3, 2006, 8:08:15 AM7/3/06

to Andrew Morton, ltu...@yahoo.com, linux-...@vger.kernel.org

>
>We do occasionally hit task_struct.comm[] truncation, when people use
>"too-long-a-name%d" for their kernel thread names. But we seem to manage.
>

Maybe this one can help?

Have kthread_create() print a warning message if the command name is
going to be truncated, for ease of development.

diff --fast -dpru linux-2.6.17~/kernel/kthread.c linux-2.6.17+/kernel/kthread.c
--- linux-2.6.17~/kernel/kthread.c 2006-06-06 02:57:02.000000000 +0200
+++ linux-2.6.17+/kernel/kthread.c 2006-07-01 11:08:57.687698000 +0200
@@ -147,8 +147,11 @@ struct task_struct *kthread_create(int (
if (!IS_ERR(create.result)) {
va_list args;
va_start(args, namefmt);
- vsnprintf(create.result->comm, sizeof(create.result->comm),
- namefmt, args);
+ if(vsnprintf(create.result->comm, sizeof(create.result->comm),
+ namefmt, args) != strlen(create.result->comm))
+ printk(KERN_WARNING "kthread_create: command name of "
+ "pid %d truncated to \"%s\"\n", create.result->pid,
+ create.result->comm);
va_end(args);
}

#<<eof>>

Jan Engelhardt
--

Luben Tuikov

unread,

Jul 7, 2006, 1:00:50 PM7/7/06

to Andrew Morton, Jeff Garzik, ltu...@yahoo.com, linux-...@vger.kernel.org

--- Andrew Morton <ak...@osdl.org> wrote:

> On Sat, 01 Jul 2006 00:46:15 -0400
> Jeff Garzik <je...@garzik.org> wrote:
>
> > Andrew Morton wrote:
> > > Luben Tuikov <ltu...@yahoo.com> wrote:
> > >>> We do occasionally hit task_struct.comm[] truncation, when people use
> > >>> "too-long-a-name%d" for their kernel thread names. But we seem to manage.
> > >> It would be especially helpful if you want to name a task thread
> > >> the NAA IEEE Registered name format (16 chars, globally unique), for things
> > >> like FC, SAS, etc. This way you can identify the task thread with
> > >> the device bearing the NAA IEEE name.
> > >>
> > >> Currently just last character is cut off, since TASK_COMM_LEN is 15+1.
> > >>
> > >> I think incrementing it would be a good thing, plus other things
> > >> may want to represent 8 bytes as a character array to be the name
> > >> of a task thread.
> > >
> > > OK, that's a reason. Being able to map a kernel thread onto a particular
> > > device is useful.
> >
> > But will it wind up this way, when the does-not-exist-yet-upstream code
> > appears?
> >
> > I would think it would make more sense to increase the size of the key
> > task structure only when there are justified, merged users in the kernel.
> >
>
> Well yes - I assumed that would be happening fairly soon. Luben?

Andrew,

If this patch is so much affecting Garzik, please drop it.

As to "merged users in the kernel" -- my code is GPL, and at one point
was in the -mm tree as maintained by yourself.

Currently it also implements a SAT-r08a complient SCSI/ATA Translation
Layer for a SAS Stack including SATA capabilities adjustment as advertized
by the protocol, NCQ, passthrough, etc, etc. (Garzik may see this as
objectionable as it is not "libata", but it cannot be due to architectural
hurdles.)

Anyway, kernel threads bear the name of STP/SATA bridge which is representable
in 16+1 ASCII chars (IEEE NAA Registered format identifier, 8 bytes), and thus
the last character (4 bits of the name) are chopped off in a 15+1 char array.

So in effect, a kernel thread can be (WW-) uniquely identified with the device
it services.

In the (near) future, I can see other protocols wanting TASK_COMM_LEN to be larger
to accomodate this (e.g. any protocol which allows for IEEE NAA names).

Thank you for your time,
Luben

Jeff Garzik

unread,

Jul 7, 2006, 1:10:52 PM7/7/06

to ltu...@yahoo.com, Andrew Morton, linux-...@vger.kernel.org

Luben Tuikov wrote:
> Andrew,
>
> If this patch is so much affecting Garzik, please drop it.
>
> As to "merged users in the kernel" -- my code is GPL, and at one point
> was in the -mm tree as maintained by yourself.
>
> Currently it also implements a SAT-r08a complient SCSI/ATA Translation
> Layer for a SAS Stack including SATA capabilities adjustment as advertized
> by the protocol, NCQ, passthrough, etc, etc. (Garzik may see this as
> objectionable as it is not "libata", but it cannot be due to architectural
> hurdles.)
>
> Anyway, kernel threads bear the name of STP/SATA bridge which is representable
> in 16+1 ASCII chars (IEEE NAA Registered format identifier, 8 bytes), and thus
> the last character (4 bits of the name) are chopped off in a 15+1 char array.

Your patch increases the size of a key data structure -- task struct --
for all users on all platforms, even when there are _no_ users currently
in the kernel.

It is thus wasted space, for all users on all platforms.

Linux development doesn't work like this. We don't know the future,
until it happens.

Thus, this patch is appropriate when there are real users in the kernel,
and not before.

Jeff

Luben Tuikov

unread,

Jul 8, 2006, 8:48:38 PM7/8/06

to Jeff Garzik, Andrew Morton, linux-...@vger.kernel.org

--- Jeff Garzik <je...@garzik.org> wrote:
> Your patch increases the size of a key data structure -- task struct --
> for all users on all platforms, even when there are _no_ users currently
> in the kernel.
>
> It is thus wasted space, for all users on all platforms.
>
> Linux development doesn't work like this. We don't know the future,
> until it happens.

:-) I love your logic... Can I quote you on this?

Your thinking: "We don't know the future, until it happens",
leaves you unprepared for that future when it actually happens.

> Thus, this patch is appropriate when there are real users in the kernel,
> and not before.

Impeccable logic, wouldn't you say?

How would you know that there will _not_ be any users unless you
give them the opportunity to use it?

Your logic is not only the chicken-and-the-egg problem but is also
the classical example in political history:
Since there are no current users of "facility X", don't give
it to the people.

How would you know? How? When it is not available in the first place?

I know subsystems' developers go through _great_ strives to keep
the kernel thread names to fit 15+1 chars.* This is why you don't
see any "users".

If TASK_COMM_LEN _were_ 19+1 chars, I bet you users of that facility
will spring up.

Luben

* At least that was the case for me 6 years ago when developing iSCSI
code... Those iSCSI threads can certainly make use of something larger
than 15+1 chars. And it's not only iSCSI. You have to keep an
_open mind_ to the future.

[PATCH] sched.h: increment TASK_COMM_LEN to 20 bytes

Luben Tuikov

Albert Cahalan

Luben Tuikov

Andrew Morton

Luben Tuikov

Andrew Morton

Luben Tuikov

Jeff Garzik

Andrew Morton

Jan Engelhardt

Jan Engelhardt

Luben Tuikov

Jeff Garzik

Luben Tuikov