Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Regression] kdesu broken

308 views
Skip to first unread message

Rafael J. Wysocki

unread,
Jul 23, 2009, 7:45:42 PM7/23/09
to LKML, Andrew Morton
Hi,

A recent kernel change broke kdesu (from KDE 4.2) on my test boxes. ISTR a
discussion about that, but I can't find it right now. Any clues?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Ray Lee

unread,
Jul 23, 2009, 8:22:05 PM7/23/09
to Rafael J. Wysocki, LKML, Andrew Morton
On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<r...@sisk.pl> wrote:
> A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.  ISTR a
> discussion about that, but I can't find it right now.  Any clues?

See the thread starting here: ("possible regression with pty.c commit")
http://lkml.org/lkml/2009/7/11/125

Linus Torvalds

unread,
Jul 24, 2009, 12:35:35 PM7/24/09
to Alan Cox, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Fri, 24 Jul 2009, Alan Cox wrote:

> On Fri, 24 Jul 2009 17:21:45 +0200


> "Rafael J. Wysocki" <r...@sisk.pl> wrote:
>

> > On Friday 24 July 2009, Ray Lee wrote:
> > > On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<r...@sisk.pl> wrote:
> > > > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes. ISTR a
> > > > discussion about that, but I can't find it right now. Any clues?
> > >
> > > See the thread starting here: ("possible regression with pty.c commit")
> > > http://lkml.org/lkml/2009/7/11/125
> >

> > Thanks for the pointer.
> >
> > Well, I thought we were expected to avoid breaking existing user space, even
> > if that were buggy etc.
>
> I don't know where you got that idea from. Avoiding breaking user space
> unneccessarily is good but if its buggy you often can't do anything about
> it.

Alan, he got that idea from me.

We don't do regressions. If user space depended on old behavior, we don't
change behavior.

And regardless of that, I do not think EIO is the right thing to return at
all. If the other side of a pty went away, return 0 and possibly send a
HUP, or whatever. What did we do before?

Linus

Aneesh Kumar K.V

unread,
Jul 24, 2009, 2:26:06 PM7/24/09
to Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton
On Thu, Jul 23, 2009 at 05:21:36PM -0700, Ray Lee wrote:
> On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<r...@sisk.pl> wrote:
> > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes. �ISTR a
> > discussion about that, but I can't find it right now. �Any clues?
>
> See the thread starting here: ("possible regression with pty.c commit")
> http://lkml.org/lkml/2009/7/11/125

I am also facing a similar problem.

http://bugzilla.kernel.org/show_bug.cgi?id=13815

-aneesh

OGAWA Hirofumi

unread,
Jul 25, 2009, 2:26:10 AM7/25/09
to Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Linus Torvalds <torv...@linux-foundation.org> writes:

>> I don't know where you got that idea from. Avoiding breaking user space
>> unneccessarily is good but if its buggy you often can't do anything about
>> it.
>
> Alan, he got that idea from me.
>
> We don't do regressions. If user space depended on old behavior, we don't
> change behavior.
>
> And regardless of that, I do not think EIO is the right thing to return at
> all. If the other side of a pty went away, return 0 and possibly send a
> HUP, or whatever. What did we do before?

I also was seeing this. I hope the attached test code shows the problem.

The problem seems to be complex. And before change, write() seems to
send buffer to ldisc directly. After change, write() seems to send
buffer to tty buffer. With some debug, I'm not sure though, I guess the
following

slave master
write()
write to buffer
tty_flip_buffer_push()
schedule_delayed_work()
close()
set_bit(TTY_OTHER_CLOSED)
read()
input_available_p()
# buffer was not received yet
test_bit(TTY_OTHER_CLOSED)
return -EIO

flush_to_ldisc()
->receive_buf()

master is having the input data in tty->buf, but ->receive_buf() is not
called yet. So, it seems to return -EIO before handling input data in
tty->buf.

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <error.h>
#include <limits.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <unistd.h>

static char pts_name[PATH_MAX];

static int open_pty(void)
{
int master;
char *name;

master = getpt();
if (master < 0)
return -1;

if (grantpt(master) < 0 || unlockpt(master) < 0)
goto close_master;
#if 0
{
int on = 1;
ioctl(master, FIONBIO, &on);
}
#endif
name = ptsname(master);
if (name == NULL)
goto close_master;

strcpy(pts_name, name);

return master;

close_master:
close(master);
return -1;
}

static pid_t child(int master)
{
pid_t pid;
int slave;

pid = fork();
if (pid < 0)
error(1, errno, "%s: fork", __func__);

if (pid == 0) {
slave = open(pts_name, O_RDWR);
if (slave < 0)
error(1, errno, "%s: open", __func__);

close(master);
dup2(slave, 0);
dup2(slave, 1);
dup2(slave, 2);
close(slave);

printf("1-----------------------------------------------\n");
printf("2-----------------------------------------------\n");
printf("3-----------------------------------------------\n");
printf("4-----------------------------------------------\n");
printf("5-----------------------------------------------\n");
printf("6-----------------------------------------------\n");
printf("7-----------------------------------------------\n");
printf("8-----------------------------------------------\n");
printf("9-----------------------------------------------\n");
exit(0);
}

return pid;
}

int main()
{
pid_t pid;
int master;

master = open_pty();
if (master < 0)
error(1, errno, "%s: open_pty", __func__);

pid = child(master);

waitpid(pid, NULL, 0);
while (1) {
char buf[4096];
ssize_t size;

size = read(master, buf, sizeof(buf));
if (size < 0) {
if (errno == EAGAIN) {
printf("EAGAIN\n");
continue;
}
error(1, errno, "%s: read", __func__);
}
if (size == 0)
break;
write(STDOUT_FILENO, buf, size);
}

return 0;

Alan Cox

unread,
Jul 25, 2009, 7:48:57 AM7/25/09
to Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> > I don't know where you got that idea from. Avoiding breaking user space
> > unneccessarily is good but if its buggy you often can't do anything about
> > it.
>
> Alan, he got that idea from me.
>
> We don't do regressions. If user space depended on old behavior, we don't
> change behavior

I think there are two bugs being confused here. The first kdesu bug isn't
the EIO one. It's just busted userspace code that happened to be lucky.
It's also impossible to keep that "luck" going. The other problem is the
EIO - which is a bug.

> And regardless of that, I do not think EIO is the right thing to return at
> all. If the other side of a pty went away, return 0 and possibly send a
> HUP, or whatever. What did we do before?

If the other side of a pty goes away we send SIGHUP and at that point
return -EIO as required by POSIX, SuS and various other things. You
cannot return "EOF" as that means the user caused an EOF event (eg hit
^D).

There is a bug going on here (the one that bites emacs meta-whatever
compile) where you get an -EIO and that seems to be timing related due to
flushing. I'm currently working on that one. I suspect we will need to
make the close on the writer wait for the data to be flushed through the
reader side and I'm testing some fixes for that.

So there are two things here that need not to be muddled up:

1. Case where kdesu from logging what is going on and code
inspection does:

read
not what I wanted
discard the lot
read
oh damn I missed the bit I wanted

If we go back to the old pty approach we break ppp, locking rules and
everything else, and re-introduce DoS attacks (and some worse ones) going
back to 2.0 if not earlier.

2. A case where previously it happened that

write(pty, "Wibble", 6);
close(pty);

would *usually* block on close until the "Wibble" was consumed the
other end. (ie logically speaking the buffering is on the writer
not the reader)

We can implement that (although probably a timeout to avoid
certain deadlocks is needed)

I plan to fix #2 but not #1. There really isn't any way to keep #1 true.
Although the spec has nothing to say about #2 every implementation I've
you get the wibble reliably and we used to, so it needs fixing.

Alan

Alan Cox

unread,
Jul 25, 2009, 8:06:27 AM7/25/09
to Aneesh Kumar K.V, Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton
> > See the thread starting here: ("possible regression with pty.c commit")
> > http://lkml.org/lkml/2009/7/11/125
>
> I am also facing a similar problem.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=13815

Probably something like this fixes it. I'll be working on that Monday/Tuesday
along with various other bugs that need a review.

--


pty: ensure writes hit the reader before close

From: Alan Cox <al...@linux.intel.com>

Logically we move the buffering from one side to the other
---

drivers/char/pty.c | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)


diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..7555890 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -36,6 +36,15 @@ static struct tty_driver *ptm_driver;
static struct tty_driver *pts_driver;
#endif

+static int pty_empty(struct tty_struct *tty)
+{
+ if (tty->buf.memory_used == 0)
+ return 1;
+ if (test_bit(TTY_HUPPED, &tty->flags))
+ return 1;
+ return 0;
+}
+
static void pty_close(struct tty_struct *tty, struct file *filp)
{
BUG_ON(!tty);
@@ -47,9 +56,11 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
}
wake_up_interruptible(&tty->read_wait);
wake_up_interruptible(&tty->write_wait);
+
tty->packet = 0;
if (!tty->link)
return;
+ wait_event_interruptible(tty->write_wait, pty_empty(tty->link));
tty->link->packet = 0;
set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
wake_up_interruptible(&tty->link->read_wait);

Alan Cox

unread,
Jul 25, 2009, 9:31:24 AM7/25/09
to OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Sat, 25 Jul 2009 15:04:06 +0900

> I also was seeing this. I hope the attached test code shows the problem.

It does for me: I've been using the test attached below from the Emacs 21
for Mac OS X web site where MacOS developed the same behaviour

> input_available_p()
> # buffer was not received yet
> test_bit(TTY_OTHER_CLOSED)
> return -EIO
>
> flush_to_ldisc()
> ->receive_buf()
>
> master is having the input data in tty->buf, but ->receive_buf() is not
> called yet. So, it seems to return -EIO before handling input data in
> tty->buf.

Would make sense. Just investigating that now.

-----------

#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
pid_t pid;
int master;
char buf[101];
int n;

pid = forkpty(&master, NULL, NULL, NULL);
if(pid < 0) {
perror("fork error");
exit(-1);
} else if(pid == 0) {
printf("### This is the child process ###\n"); // To be read by parent
fflush(stdout); // Doesn't help.
sleep(1); // Shouldn't be needed, but it makes things work.
return(0);
} else {
while(n = read(master, buf, 100)) {
if(n < 0) {
perror("read error");
exit(-1);
}
buf[n] = 0; // Make a string out of our data.
printf("Read %d bytes: %s", n, buf);
}
}
exit(0);

Alan Cox

unread,
Jul 25, 2009, 10:04:28 AM7/25/09
to OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Actually try this:


commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
Author: Alan Cox <al...@linux.intel.com>
Date: Sat Jul 25 15:00:04 2009 +0100

pty: ensure writes hit the reader before close

This is elegant in all the wrong ways. Put the pty into low latency mode (which
we can do as we always post bytes from user context). The tty_flip_buffer_push
then always calls into the ldisc which means we clear the ldisc buffer before
we set the TTY_OTHER_CLOSED flag.

Means pty has subtle knowledge of tty internals we really don't want it to, but
it fixes the problem for the moment.

Signed-off-by: Alan Cox <al...@linux.intel.com>

diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..87d729b 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -47,10 +47,12 @@ static void pty_close(struct tty_struct *tty, struct file *filp)


}
wake_up_interruptible(&tty->read_wait);
wake_up_interruptible(&tty->write_wait);
+
tty->packet = 0;
if (!tty->link)
return;

tty->link->packet = 0;

+ tty_flip_buffer_push(tty->link);


set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
wake_up_interruptible(&tty->link->read_wait);

wake_up_interruptible(&tty->link->write_wait);
@@ -207,6 +209,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
set_bit(TTY_THROTTLED, &tty->flags);
retval = 0;
+ tty->low_latency = 1;
out:
return retval;

OGAWA Hirofumi

unread,
Jul 25, 2009, 11:16:22 AM7/25/09
to Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

> Actually try this:

Thanks. This patch improved situation. However, if slave writes big data
to buffer, it seems we still have the problem.

> + tty_flip_buffer_push(tty->link);

This is handling the pending buffer, but in flush_to_ldisc(), if
!tty->receive_room, it seems still delay the ->receive_buf().

> set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> wake_up_interruptible(&tty->link->read_wait);
> wake_up_interruptible(&tty->link->write_wait);

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>


#define _GNU_SOURCE
#define BIG_BUF

static char pts_name[PATH_MAX];

strcpy(pts_name, name);

return master;

close_master:
close(master);
return -1;
}

#ifdef BIG_BUF
{
char buf[4096];
size_t size;

memset(buf, '-', sizeof(buf));
size = 0;
while (size < 8192) {
ssize_t r = write(STDOUT_FILENO, buf, sizeof(buf));
if (r < 0)
error(1, errno, "%s: write", __func__);
size += r;
}
}
#else


printf("1-----------------------------------------------\n");
printf("2-----------------------------------------------\n");
printf("3-----------------------------------------------\n");
printf("4-----------------------------------------------\n");
printf("5-----------------------------------------------\n");
printf("6-----------------------------------------------\n");
printf("7-----------------------------------------------\n");
printf("8-----------------------------------------------\n");
printf("9-----------------------------------------------\n");

#endif
exit(0);
}

return pid;
}

int main()
{
pid_t pid;
int master;

master = open_pty();
if (master < 0)
error(1, errno, "%s: open_pty", __func__);

pid = child(master);

waitpid(pid, NULL, 0);
while (1) {
char buf[4096];
ssize_t size;

size = read(master, buf, sizeof(buf));
if (size < 0) {
if (errno == EAGAIN) {
printf("EAGAIN\n");
continue;
}
error(1, errno, "%s: read", __func__);
}
if (size == 0)
break;

#ifdef BIG_BUF
printf("size %zd\n", size);
#else
write(STDOUT_FILENO, buf, size);
#endif
}

return 0;

Alan Cox

unread,
Jul 25, 2009, 11:32:40 AM7/25/09
to OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Sat, 25 Jul 2009 23:55:39 +0900
OGAWA Hirofumi <hiro...@mail.parknet.co.jp> wrote:

> Alan Cox <al...@lxorguk.ukuu.org.uk> writes:
>
> > Actually try this:
>
> Thanks. This patch improved situation. However, if slave writes big data
> to buffer, it seems we still have the problem.
>
> > + tty_flip_buffer_push(tty->link);
>
> This is handling the pending buffer, but in flush_to_ldisc(), if
> !tty->receive_room, it seems still delay the ->receive_buf().

Agreed - we could

wait_event_interruptible(tty->write_wait,
tty->link->receive_room);

or similar.

Good to know the initial fix works. To actually do it cleanly probably
wants a way to pass a logical channel close through the tty layer which
isn't I think too hard

set a new TTY_OTHER_CLOSING in the pty code
set TTY_OTHER_CLOSED in the flip_buffer_push code if we empty the
buffer queue and TTY_OTHER_CLOSING is set.

That would also avoid using tty->low_latency=1 in the pty layer which I
worry may harm PPP gateway performance and the like.

Aneesh Kumar K.V

unread,
Jul 25, 2009, 12:19:04 PM7/25/09
to Alan Cox, Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton


After applying the patch on Fedora 10 the system bootup hangs after modrpobe.
on ubuntu jaunty i have the system booting fine. But trying to recompile
the test prg on emacs gives me the error message

-UUU:%*--F1 *compilation* All (1,0) (Compilation:run Compiling)----<E> -------
A compilation process is running; kill it? (yes or no)

So i guess even though it gets the error information it cause emacs to think that
the compliation process is still running. The process details listed by emacs
Proc Status Buffer Tty Command
---- ------ ------ --- -------
compilation run *compilation* /dev/pts/2 /bin/bash -c cc -g a.c

But a ps -eaf doesn't list the command running. So something more is going on.

-aneesh

Aneesh Kumar K.V

unread,
Jul 25, 2009, 1:06:56 PM7/25/09
to Alan Cox, Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton

ok i have in the process list

kvaneesh 6584 6583 0 22:34 pts/6 00:00:00 [cc]

Rafael J. Wysocki

unread,
Jul 25, 2009, 4:12:27 PM7/25/09
to Alan Cox, OGAWA Hirofumi, Linus Torvalds, Ray Lee, LKML, Andrew Morton
On Saturday 25 July 2009, Alan Cox wrote:
> Actually try this:
>
>
> commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
> Author: Alan Cox <al...@linux.intel.com>
> Date: Sat Jul 25 15:00:04 2009 +0100
>
> pty: ensure writes hit the reader before close
>
> This is elegant in all the wrong ways. Put the pty into low latency mode (which
> we can do as we always post bytes from user context). The tty_flip_buffer_push
> then always calls into the ldisc which means we clear the ldisc buffer before
> we set the TTY_OTHER_CLOSED flag.
>
> Means pty has subtle knowledge of tty internals we really don't want it to, but
> it fixes the problem for the moment.
>
> Signed-off-by: Alan Cox <al...@linux.intel.com>

Works for me, thanks!

Best,
Rafael

Rafael J. Wysocki

unread,
Jul 25, 2009, 4:16:19 PM7/25/09
to Frans Pop, Alan Cox, torv...@linux-foundation.org, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, Sergey Senozhatsky
On Saturday 25 July 2009, Frans Pop wrote:
> Alan Cox wrote:

> > Linus Torvalds wrote:
> >> We don't do regressions. If user space depended on old behavior, we
> >> don't change behavior
> >
> > I think there are two bugs being confused here. The first kdesu bug
> > isn't the EIO one. It's just busted userspace code that happened to be
> > lucky.
> > It's also impossible to keep that "luck" going. The other problem is the
> > EIO - which is a bug.
>
> Just out of curiosity ad for reference.
> Has this issue already been reported to the KDE people? Any links to a bug
> report or discussions there?

Not that I know of. It's probably worth reporting, though.

Best,
Rafael

Alan Cox

unread,
Jul 25, 2009, 5:03:23 PM7/25/09
to Rafael J. Wysocki, Frans Pop, torv...@linux-foundation.org, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, Sergey Senozhatsky
> > Just out of curiosity ad for reference.
> > Has this issue already been reported to the KDE people? Any links to a bug
> > report or discussions there?
>
> Not that I know of. It's probably worth reporting, though.

The EIO one hasn't because it isn't their bug, the other one has.

Alan

OGAWA Hirofumi

unread,
Jul 26, 2009, 8:06:25 AM7/26/09
to Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

> Good to know the initial fix works. To actually do it cleanly probably
> wants a way to pass a logical channel close through the tty layer which
> isn't I think too hard
>
> set a new TTY_OTHER_CLOSING in the pty code
> set TTY_OTHER_CLOSED in the flip_buffer_push code if we empty the
> buffer queue and TTY_OTHER_CLOSING is set.
>
> That would also avoid using tty->low_latency=1 in the pty layer which I
> worry may harm PPP gateway performance and the like.

I see. It sounds like good thing. The attached patch or something?
Anyway, I'm not familiar with the tty stuff obviously, so, I'm not sure
whether this patch is right or not.

If needed, I'll test the new patch.

Thanks.


Signed-off-by: OGAWA Hirofumi <hiro...@mail.parknet.co.jp>
---

drivers/char/pty.c | 19 ++++++++++++++++---
drivers/char/tty_buffer.c | 28 ++++++++++++++++++++++++++--
include/linux/tty.h | 13 +++++++------
3 files changed, 49 insertions(+), 11 deletions(-)

diff -puN drivers/char/pty.c~pty-fixes2 drivers/char/pty.c
--- linux-2.6/drivers/char/pty.c~pty-fixes2 2009-07-26 20:04:17.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/pty.c 2009-07-26 20:06:17.000000000 +0900
@@ -51,11 +51,19 @@ static void pty_close(struct tty_struct

tty->packet = 0;
if (!tty->link)
return;

- tty->link->packet = 0;
+
+ /*
+ * This try to flush pending tty->buf. And after flushed all
+ * pending tty->buf, TTY_OTHER_CLOSED will be set.
+ */
+ set_bit(TTY_OTHER_CLOSING, &tty->link->flags);
tty_flip_buffer_push(tty->link);
+#if 0
+ tty->link->packet = 0;


set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
wake_up_interruptible(&tty->link->read_wait);
wake_up_interruptible(&tty->link->write_wait);

+#endif
if (tty->driver->subtype == PTY_TYPE_MASTER) {
set_bit(TTY_OTHER_CLOSED, &tty->flags);
#ifdef CONFIG_UNIX98_PTYS
@@ -199,17 +207,22 @@ static int pty_open(struct tty_struct *t
goto out;

retval = -EIO;
- if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+ if (test_bit(TTY_OTHER_CLOSING, &tty->flags) ||
+ test_bit(TTY_OTHER_CLOSED, &tty->flags))
goto out;
if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
goto out;
if (tty->link->count != 1)
goto out;

+ spin_lock_irq(&tty->link->buf.lock);
+ clear_bit(TTY_OTHER_CLOSING, &tty->link->flags);
clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+ spin_unlock_irq(&tty->link->buf.lock);
+


set_bit(TTY_THROTTLED, &tty->flags);
retval = 0;

- tty->low_latency = 1;
+// tty->low_latency = 1;
out:
return retval;
}
diff -puN drivers/char/tty_buffer.c~pty-fixes2 drivers/char/tty_buffer.c
--- linux-2.6/drivers/char/tty_buffer.c~pty-fixes2 2009-07-26 20:04:17.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/tty_buffer.c 2009-07-26 20:04:46.000000000 +0900
@@ -74,6 +74,20 @@ static struct tty_buffer *tty_buffer_all
return p;
}

+/* must hold tty->buf.lock */
+static void tty_check_other_closing(struct tty_struct *tty)
+{
+ if (test_bit(TTY_OTHER_CLOSING, &tty->flags)) {
+ printk("%s: tty %p, closed\n", __func__, tty);
+ tty->link->packet = 0;
+ set_bit(TTY_OTHER_CLOSED, &tty->flags);
+ wake_up_interruptible(&tty->read_wait);
+ wake_up_interruptible(&tty->write_wait);
+ /* Clear TTY_OTHER_CLOSING after set TTY_OTHER_CLOSED */
+ clear_bit(TTY_OTHER_CLOSING, &tty->flags);
+ }
+}
+
/**
* tty_buffer_free - free a tty buffer
* @tty: tty owning the buffer
@@ -119,6 +133,7 @@ static void __tty_buffer_flush(struct tt
tty_buffer_free(tty, thead);
}
tty->buf.tail = NULL;
+ tty_check_other_closing(tty);
}

/**
@@ -407,8 +422,12 @@ static void flush_to_ldisc(struct work_s
unsigned char *flag_buf;

disc = tty_ldisc_ref(tty);
- if (disc == NULL) /* !TTY_LDISC */
+ if (disc == NULL) { /* !TTY_LDISC */
+ spin_lock_irqsave(&tty->buf.lock, flags);
+ tty_check_other_closing(tty);
+ spin_unlock_irqrestore(&tty->buf.lock, flags);
return;
+ }

spin_lock_irqsave(&tty->buf.lock, flags);
/* So we know a flush is running */
@@ -419,8 +438,11 @@ static void flush_to_ldisc(struct work_s
for (;;) {
int count = head->commit - head->read;
if (!count) {
- if (head->next == NULL)
+ if (head->next == NULL) {
+ printk("%s: tty %p, next == NULL\n", __func__, tty);
+ tty_check_other_closing(tty);
break;
+ }
tbuf = head;
head = head->next;
tty_buffer_free(tty, tbuf);
@@ -448,9 +470,11 @@ static void flush_to_ldisc(struct work_s
/* Restore the queue head */
tty->buf.head = head;
}
+
/* We may have a deferred request to flush the input buffer,
if so pull the chain under the lock and empty the queue */
if (test_bit(TTY_FLUSHPENDING, &tty->flags)) {
+ printk("%s: tty %p, flushing\n", __func__, tty);
__tty_buffer_flush(tty);
clear_bit(TTY_FLUSHPENDING, &tty->flags);
wake_up(&tty->read_wait);
diff -puN include/linux/tty.h~pty-fixes2 include/linux/tty.h
--- linux-2.6/include/linux/tty.h~pty-fixes2 2009-07-26 20:04:17.000000000 +0900
+++ linux-2.6-hirofumi/include/linux/tty.h 2009-07-26 20:04:17.000000000 +0900
@@ -309,12 +309,13 @@ struct tty_struct {
*/
#define TTY_THROTTLED 0 /* Call unthrottle() at threshold min */
#define TTY_IO_ERROR 1 /* Cause an I/O error (may be no ldisc too) */
-#define TTY_OTHER_CLOSED 2 /* Other side (if any) has closed */
-#define TTY_EXCLUSIVE 3 /* Exclusive open mode */
-#define TTY_DEBUG 4 /* Debugging */
-#define TTY_DO_WRITE_WAKEUP 5 /* Call write_wakeup after queuing new */
-#define TTY_PUSH 6 /* n_tty private */
-#define TTY_CLOSING 7 /* ->close() in progress */
+#define TTY_OTHER_CLOSING 2 /* Other side (if any) is closing */
+#define TTY_OTHER_CLOSED 3 /* Other side (if any) has closed */
+#define TTY_EXCLUSIVE 4 /* Exclusive open mode */
+#define TTY_DEBUG 5 /* Debugging */
+#define TTY_DO_WRITE_WAKEUP 6 /* Call write_wakeup after queuing new */
+#define TTY_PUSH 7 /* n_tty private */
+#define TTY_CLOSING 8 /* ->close() in progress */
#define TTY_LDISC 9 /* Line discipline attached */
#define TTY_LDISC_CHANGING 10 /* Line discipline changing */
#define TTY_LDISC_OPEN 11 /* Line discipline is open */
_
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Sergey Senozhatsky

unread,
Jul 26, 2009, 11:48:58 AM7/26/09
to Frans Pop, Alan Cox, torv...@linux-foundation.org, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, Sergey Senozhatsky
On (07/25/09 16:02), Frans Pop wrote:
> Date: Sat, 25 Jul 2009 16:02:48 +0200
> From: Frans Pop <ele...@planet.nl>
> To: Alan Cox <al...@lxorguk.ukuu.org.uk>
> Cc: torv...@linux-foundation.org, r...@sisk.pl, ray...@madrabbit.org,
> linux-...@vger.kernel.org, ak...@linux-foundation.org,
> Sergey Senozhatsky <sergey.se...@mail.by>
> Subject: Re: [Regression] kdesu broken
> User-Agent: KMail/1.9.9

>
> Alan Cox wrote:
> > Linus Torvalds wrote:
> >> We don't do regressions. If user space depended on old behavior, we
> >> don't change behavior
> >
> > I think there are two bugs being confused here. The first kdesu bug
> > isn't the EIO one. It's just busted userspace code that happened to be
> > lucky.
> > It's also impossible to keep that "luck" going. The other problem is the
> > EIO - which is a bug.
>
> Just out of curiosity ad for reference.
> Has this issue already been reported to the KDE people? Any links to a bug
> report or discussions there?
>

Not yet.

Sergey

signature.asc

Aneesh Kumar K.V

unread,
Jul 26, 2009, 1:41:49 PM7/26/09
to Alan Cox, OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

This one fixes "the compile in emacs bug" for me.

http://bugzilla.kernel.org/show_bug.cgi?id=13815

-aneesh

Alan Cox

unread,
Jul 27, 2009, 6:56:52 AM7/27/09
to OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> I see. It sounds like good thing. The attached patch or something?
> Anyway, I'm not familiar with the tty stuff obviously, so, I'm not sure
> whether this patch is right or not.

It turns out to be a little bit trickier than I thought because of open
v close v flush_to_ldisc races (some of the open/close ones being long
standing).

We now use tty->buf.lock to keep EOF/EOFPENDING/OTHER_CLOSED all in order
together. That has no real cost as we already hold the buf.lock in the hot
path which is flush_to_ldisc.

Anyway this is my current thoughts (not yet given a testing)


diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index ff47907..acae995 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1777,7 +1777,8 @@ do_it_again:
tty->minimum_to_wake = (minimum - (b - buf));

if (!input_available_p(tty, 0)) {


- if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {

+ if (test_bit(TTY_EOF, &tty->flags)) {
+ /* PTY pair closed and all data consumed */
retval = -EIO;
break;
}
diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..de10cc0 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -38,6 +38,9 @@ static struct tty_driver *pts_driver;



static void pty_close(struct tty_struct *tty, struct file *filp)

{
+ struct tty_struct *pair;
+ unsigned long flags;
+
BUG_ON(!tty);


if (tty->driver->subtype == PTY_TYPE_MASTER)

WARN_ON(tty->count > 1);
@@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)


}
wake_up_interruptible(&tty->read_wait);
wake_up_interruptible(&tty->write_wait);
+
tty->packet = 0;

- if (!tty->link)
+ pair = tty->link;
+ if (!pair)


return;
- tty->link->packet = 0;

- set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
- wake_up_interruptible(&tty->link->read_wait);
- wake_up_interruptible(&tty->link->write_wait);
+
+ spin_lock_irqsave(&pair->buf.lock, flags);
+ pair->packet = 0;
+ /* Indicate that the other end is now closed, set the
+ ENDPENDING marker so that the true end can be processed by
+ the line discipline */
+ set_bit(TTY_EOFPENDING, &pair->flags);
+ set_bit(TTY_OTHER_CLOSED, &pair->flags);
+ spin_unlock_irqrestore(&pair->buf.lock, flags);
+ wake_up_interruptible(&pair->read_wait);
+ wake_up_interruptible(&pair->write_wait);


if (tty->driver->subtype == PTY_TYPE_MASTER) {
set_bit(TTY_OTHER_CLOSED, &tty->flags);
#ifdef CONFIG_UNIX98_PTYS

@@ -180,7 +192,6 @@ static void pty_flush_buffer(struct tty_struct *tty)

if (!to)
return;
- /* tty_buffer_flush(to); FIXME */
if (to->packet) {
spin_lock_irqsave(&tty->ctrl_lock, flags);
tty->ctrl_status |= TIOCPKT_FLUSHWRITE;
@@ -191,23 +202,30 @@ static void pty_flush_buffer(struct tty_struct *tty)



static int pty_open(struct tty_struct *tty, struct file *filp)

{
- int retval = -ENODEV;
+ int retval = -EIO;
+ unsigned long flags;
+ struct tty_struct *pair;

- if (!tty || !tty->link)
- goto out;
+ if (tty == NULL || (pair = tty->link) == NULL)
+ return -ENODEV;

- retval = -EIO;
if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+ return -EIO;
+ spin_lock_irqsave(&pair->buf.lock, flags);
+ if (test_bit(TTY_PTY_LOCK, &pair->flags))
goto out;
- if (test_bit(TTY_PTY_LOCK, &tty->link->flags))
- goto out;
- if (tty->link->count != 1)
+ if (pair->count != 1)
goto out;

- clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+ clear_bit(TTY_OTHER_CLOSED, &pair->flags);
+ /* The buf.lock stops this racing a flush_to_ldisc from
+ the other end */
+ clear_bit(TTY_EOFPENDING, &pair->flags);
+ clear_bit(TTY_EOF, &pair->flags);


set_bit(TTY_THROTTLED, &tty->flags);
retval = 0;

out:
+ spin_unlock_irqrestore(&pair->buf.lock, flags);
return retval;
}

diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 810ee25..448cd1d 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -119,6 +119,12 @@ static void __tty_buffer_flush(struct tty_struct *tty)


tty_buffer_free(tty, thead);
}
tty->buf.tail = NULL;

+ /* We can EOF a pty/tty pair with a flush as well as by consuming
+ all the data left over */
+ if (test_bit(TTY_EOFPENDING, &tty->flags)) {
+ set_bit(TTY_EOF, &tty->flags);
+ wake_up(&tty->read_wait);
+ }
}

/**
@@ -455,6 +461,10 @@ static void flush_to_ldisc(struct work_struct *work)


clear_bit(TTY_FLUSHPENDING, &tty->flags);
wake_up(&tty->read_wait);
}

+ if (tty->buf.head == NULL && test_bit(TTY_EOFPENDING, &tty->flags)) {
+ wake_up(&tty->read_wait);
+ set_bit(TTY_EOF, &tty->flags);
+ }
clear_bit(TTY_FLUSHING, &tty->flags);
spin_unlock_irqrestore(&tty->buf.lock, flags);

diff --git a/include/linux/tty.h b/include/linux/tty.h
index 85aa525..427d107 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -321,6 +321,8 @@ struct tty_struct {


#define TTY_LDISC 9 /* Line discipline attached */
#define TTY_LDISC_CHANGING 10 /* Line discipline changing */
#define TTY_LDISC_OPEN 11 /* Line discipline is open */

+#define TTY_EOF 12 /* TTY/PTY pair at EOF */
+#define TTY_EOFPENDING 13 /* TTY/PTY pair EOF when data emptied */
#define TTY_HW_COOK_OUT 14 /* Hardware can do output cooking */
#define TTY_HW_COOK_IN 15 /* Hardware can do input cooking */
#define TTY_PTY_LOCK 16 /* pty private */

Alan Cox

unread,
Jul 27, 2009, 9:22:42 AM7/27/09
to OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> I worried "pair->packet = 0" when I'm thinking this. I guess it would be
> changed more early than before. Is it ok?

I think so, and we can get stuck otherwise.


Tested patch:

commit 70325fd1d4341896c17b6f1f1965370b5258d0b1
Author: Alan Cox <al...@linux.intel.com>
Date: Mon Jul 27 14:18:52 2009 +0100

pty: ensure writes hit the reader before close

Implement TTY_EOF/EOFPENDING flags so that we can propogate the close of the
pty through the buffering correctly. The new flag state is locked but the
tty buffer lock as it relates to buffers, and also because the buffer
lock is already held in the hot path.

Signed-off-by: Alan Cox <al...@linux.intel.com>

index 810ee25..19a7ced 100644


--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -119,6 +119,12 @@ static void __tty_buffer_flush(struct tty_struct *tty)
tty_buffer_free(tty, thead);
}
tty->buf.tail = NULL;
+ /* We can EOF a pty/tty pair with a flush as well as by consuming
+ all the data left over */
+ if (test_bit(TTY_EOFPENDING, &tty->flags)) {
+ set_bit(TTY_EOF, &tty->flags);
+ wake_up(&tty->read_wait);
+ }
}

/**

@@ -405,6 +411,7 @@ static void flush_to_ldisc(struct work_struct *work)
struct tty_buffer *tbuf, *head;
char *char_buf;
unsigned char *flag_buf;
+ int done = 1;

disc = tty_ldisc_ref(tty);


if (disc == NULL) /* !TTY_LDISC */

@@ -433,10 +440,13 @@ static void flush_to_ldisc(struct work_struct *work)
break;
if (!tty->receive_room) {
schedule_delayed_work(&tty->buf.work, 1);
+ done = 0;
break;
}
- if (count > tty->receive_room)
+ if (count > tty->receive_room) {
count = tty->receive_room;
+ done = 0;
+ }
char_buf = head->char_buf_ptr + head->read;
flag_buf = head->flag_buf_ptr + head->read;
head->read += count;
@@ -454,6 +464,10 @@ static void flush_to_ldisc(struct work_struct *work)
__tty_buffer_flush(tty);


clear_bit(TTY_FLUSHPENDING, &tty->flags);
wake_up(&tty->read_wait);

+ } else if (done && test_bit(TTY_EOFPENDING, &tty->flags)) {
+ /* The last bits hit the ldisc so set EOF */


+ wake_up(&tty->read_wait);
+ set_bit(TTY_EOF, &tty->flags);
}

OGAWA Hirofumi

unread,
Jul 27, 2009, 9:50:46 AM7/27/09
to Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

> commit 70325fd1d4341896c17b6f1f1965370b5258d0b1
> Author: Alan Cox <al...@linux.intel.com>
> Date: Mon Jul 27 14:18:52 2009 +0100
>
> pty: ensure writes hit the reader before close
>
> Implement TTY_EOF/EOFPENDING flags so that we can propogate the close of the
> pty through the buffering correctly. The new flag state is locked but the
> tty buffer lock as it relates to buffers, and also because the buffer
> lock is already held in the hot path.
>
> Signed-off-by: Alan Cox <al...@linux.intel.com>

I also tested this patch, and it seems to work in some case. However, in
some case, n_tty_read() didn't return -EIO for read() of master.

> + spin_lock_irqsave(&pair->buf.lock, flags);
> + pair->packet = 0;
> + /* Indicate that the other end is now closed, set the
> + ENDPENDING marker so that the true end can be processed by

This seems typo s/ENDPENDING/EOFPENDING/


> @@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)

[...]

> + set_bit(TTY_EOFPENDING, &pair->flags);
> + set_bit(TTY_OTHER_CLOSED, &pair->flags);
> + spin_unlock_irqrestore(&pair->buf.lock, flags);

tty_flip_buffer_push() or something?

> + wake_up_interruptible(&pair->read_wait);
> + wake_up_interruptible(&pair->write_wait);

It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
already here, anybody doesn't set TTY_EOF for master.

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Alan Cox

unread,
Jul 27, 2009, 9:58:34 AM7/27/09
to OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> > + /* Indicate that the other end is now closed, set the
> > + ENDPENDING marker so that the true end can be processed by
>
> This seems typo s/ENDPENDING/EOFPENDING/

I renamed it but missed that.

> > @@ -47,13 +50,22 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
>
> [...]
>
> > + set_bit(TTY_EOFPENDING, &pair->flags);
> > + set_bit(TTY_OTHER_CLOSED, &pair->flags);
> > + spin_unlock_irqrestore(&pair->buf.lock, flags);
>
> tty_flip_buffer_push() or something?
>
> > + wake_up_interruptible(&pair->read_wait);
> > + wake_up_interruptible(&pair->write_wait);
>
> It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
> already here, anybody doesn't set TTY_EOF for master.

Does putting a tty_flip_buffer_push() at that point fix it. I had thought
I could remove it but you are right that creates a situation where EOF
may never get set.

Aneesh Kumar K.V

unread,
Jul 27, 2009, 9:59:15 AM7/27/09
to Alan Cox, OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

With this patch i still have "the compile in emacs" problem. But it is less frequent.
1 in 5 tries fail now

-aneesh

OGAWA Hirofumi

unread,
Jul 27, 2009, 11:04:53 AM7/27/09
to Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

>> > + set_bit(TTY_EOFPENDING, &pair->flags);
>> > + set_bit(TTY_OTHER_CLOSED, &pair->flags);
>> > + spin_unlock_irqrestore(&pair->buf.lock, flags);
>>
>> tty_flip_buffer_push() or something?
>>
>> > + wake_up_interruptible(&pair->read_wait);
>> > + wake_up_interruptible(&pair->write_wait);
>>
>> It seems, this sets TTY_EOFPENDING, but if flush_to_ldisc() was done
>> already here, anybody doesn't set TTY_EOF for master.
>
> Does putting a tty_flip_buffer_push() at that point fix it. I had thought
> I could remove it but you are right that creates a situation where EOF
> may never get set.

With this patch, test program seems to work.

It seems "done = 0" hunk was needed for correct "done".

Thanks.


Signed-off-by: OGAWA Hirofumi <hiro...@mail.parknet.co.jp>
---

drivers/char/pty.c | 1 +
drivers/char/tty_buffer.c | 4 +---
2 files changed, 2 insertions(+), 3 deletions(-)

diff -puN drivers/char/pty.c~pty-fixes4-fix drivers/char/pty.c
--- linux-2.6/drivers/char/pty.c~pty-fixes4-fix 2009-07-27 23:56:27.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/pty.c 2009-07-27 23:56:40.000000000 +0900
@@ -64,6 +64,7 @@ static void pty_close(struct tty_struct
set_bit(TTY_EOFPENDING, &pair->flags);
set_bit(TTY_OTHER_CLOSED, &pair->flags);
spin_unlock_irqrestore(&pair->buf.lock, flags);
+ tty_flip_buffer_push(pair);
wake_up_interruptible(&pair->read_wait);


wake_up_interruptible(&pair->write_wait);
if (tty->driver->subtype == PTY_TYPE_MASTER) {

diff -puN drivers/char/tty_buffer.c~pty-fixes4-fix drivers/char/tty_buffer.c
--- linux-2.6/drivers/char/tty_buffer.c~pty-fixes4-fix 2009-07-27 23:57:27.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/tty_buffer.c 2009-07-27 23:57:30.000000000 +0900
@@ -443,10 +443,8 @@ static void flush_to_ldisc(struct work_s


done = 0;
break;
}
- if (count > tty->receive_room) {
+ if (count > tty->receive_room)
count = tty->receive_room;

- done = 0;
- }


char_buf = head->char_buf_ptr + head->read;
flag_buf = head->flag_buf_ptr + head->read;
head->read += count;

_

--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Aneesh Kumar K.V

unread,
Jul 27, 2009, 12:14:48 PM7/27/09
to OGAWA Hirofumi, Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton


I still have the "compile in emacs" bug. So this patch along with
http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
bug for me.

-aneesh

Alan Cox

unread,
Jul 27, 2009, 12:42:58 PM7/27/09
to Aneesh Kumar K.V, OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

> > - if (count > tty->receive_room) {
> > + if (count > tty->receive_room)
> > count = tty->receive_room;
> > - done = 0;
> > - }
> > char_buf = head->char_buf_ptr + head->read;
> > flag_buf = head->flag_buf_ptr + head->read;
> > head->read += count;
> > _
>
>
> I still have the "compile in emacs" bug. So this patch along with
> http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
> bug for me.

Can you stick some printk calls in and debug it further then because at
with the fixes Ogawa provided I'm finding it impossible to reproduce even
on an 8 way x86.

What hardware are you using ?

Aneesh Kumar K.V

unread,
Jul 27, 2009, 1:12:38 PM7/27/09
to Alan Cox, OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Mon, Jul 27, 2009 at 05:42:52PM +0100, Alan Cox wrote:
>
> > > - if (count > tty->receive_room) {
> > > + if (count > tty->receive_room)
> > > count = tty->receive_room;
> > > - done = 0;
> > > - }
> > > char_buf = head->char_buf_ptr + head->read;
> > > flag_buf = head->flag_buf_ptr + head->read;
> > > head->read += count;
> > > _
> >
> >
> > I still have the "compile in emacs" bug. So this patch along with
> > http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
> > bug for me.
>
> Can you stick some printk calls in and debug it further then because at
> with the fixes Ogawa provided I'm finding it impossible to reproduce even
> on an 8 way x86.
>
> What hardware are you using ?

My T60p lenovo laptop. If you can send me a debug prink patch i can redo the
test with the patches. I don't know what data i should start collecting so
that it make sense. This is the patch i tried

diff -ru /home/opensource/sources/linux-2.6/drivers/char/n_tty.c /home/opensource/build/linux-2.6-distro/drivers/char/n_tty.c
--- /home/opensource/sources/linux-2.6/drivers/char/n_tty.c 2009-07-17 10:07:40.210991073 +0530
+++ /home/opensource/build/linux-2.6-distro/drivers/char/n_tty.c 2009-07-27 18:56:14.518330502 +0530
@@ -1777,7 +1777,8 @@


tty->minimum_to_wake = (minimum - (b - buf));

if (!input_available_p(tty, 0)) {
- if (test_bit(TTY_OTHER_CLOSED, &tty->flags)) {
+ if (test_bit(TTY_EOF, &tty->flags)) {
+ /* PTY pair closed and all data consumed */
retval = -EIO;
break;
}

diff -ru /home/opensource/sources/linux-2.6/drivers/char/pty.c /home/opensource/build/linux-2.6-distro/drivers/char/pty.c
--- /home/opensource/sources/linux-2.6/drivers/char/pty.c 2009-07-26 22:45:50.354419907 +0530
+++ /home/opensource/build/linux-2.6-distro/drivers/char/pty.c 2009-07-27 21:29:26.217922778 +0530
@@ -38,6 +38,9 @@



static void pty_close(struct tty_struct *tty, struct file *filp)
{
+ struct tty_struct *pair;
+ unsigned long flags;
+
BUG_ON(!tty);

if (tty->driver->subtype == PTY_TYPE_MASTER)

WARN_ON(tty->count > 1);
@@ -47,13 +50,23 @@


}
wake_up_interruptible(&tty->read_wait);
wake_up_interruptible(&tty->write_wait);
+
tty->packet = 0;
- if (!tty->link)
+ pair = tty->link;
+ if (!pair)
return;
- tty->link->packet = 0;
- set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
- wake_up_interruptible(&tty->link->read_wait);
- wake_up_interruptible(&tty->link->write_wait);
+
+ spin_lock_irqsave(&pair->buf.lock, flags);
+ pair->packet = 0;

+ /* Indicate that the other end is now closed, set the
+ ENDPENDING marker so that the true end can be processed by

+ the line discipline */

+ set_bit(TTY_EOFPENDING, &pair->flags);
+ set_bit(TTY_OTHER_CLOSED, &pair->flags);
+ spin_unlock_irqrestore(&pair->buf.lock, flags);

+ tty_flip_buffer_push(pair);


+ wake_up_interruptible(&pair->read_wait);
+ wake_up_interruptible(&pair->write_wait);

if (tty->driver->subtype == PTY_TYPE_MASTER) {

set_bit(TTY_OTHER_CLOSED, &tty->flags);
#ifdef CONFIG_UNIX98_PTYS

@@ -180,7 +193,6 @@



if (!to)
return;
- /* tty_buffer_flush(to); FIXME */
if (to->packet) {
spin_lock_irqsave(&tty->ctrl_lock, flags);
tty->ctrl_status |= TIOCPKT_FLUSHWRITE;

@@ -191,23 +203,30 @@

diff -ru /home/opensource/sources/linux-2.6/drivers/char/tty_buffer.c /home/opensource/build/linux-2.6-distro/drivers/char/tty_buffer.c
--- /home/opensource/sources/linux-2.6/drivers/char/tty_buffer.c 2009-04-28 21:21:06.964692375 +0530
+++ /home/opensource/build/linux-2.6-distro/drivers/char/tty_buffer.c 2009-07-27 21:29:26.229922140 +0530
@@ -119,6 +119,12 @@


tty_buffer_free(tty, thead);
}
tty->buf.tail = NULL;
+ /* We can EOF a pty/tty pair with a flush as well as by consuming
+ all the data left over */
+ if (test_bit(TTY_EOFPENDING, &tty->flags)) {
+ set_bit(TTY_EOF, &tty->flags);
+ wake_up(&tty->read_wait);
+ }
}

/**
@@ -405,6 +411,7 @@

struct tty_buffer *tbuf, *head;
char *char_buf;
unsigned char *flag_buf;
+ int done = 1;

disc = tty_ldisc_ref(tty);
if (disc == NULL) /* !TTY_LDISC */

@@ -433,6 +440,7 @@


break;
if (!tty->receive_room) {
schedule_delayed_work(&tty->buf.work, 1);
+ done = 0;
break;
}

if (count > tty->receive_room)
@@ -454,6 +462,10 @@


__tty_buffer_flush(tty);
clear_bit(TTY_FLUSHPENDING, &tty->flags);
wake_up(&tty->read_wait);
+ } else if (done && test_bit(TTY_EOFPENDING, &tty->flags)) {
+ /* The last bits hit the ldisc so set EOF */
+ wake_up(&tty->read_wait);
+ set_bit(TTY_EOF, &tty->flags);
}
clear_bit(TTY_FLUSHING, &tty->flags);
spin_unlock_irqrestore(&tty->buf.lock, flags);

Andreas Schwab

unread,
Jul 27, 2009, 2:29:02 PM7/27/09
to Aneesh Kumar K.V, OGAWA Hirofumi, Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
"Aneesh Kumar K.V" <aneesh...@linux.vnet.ibm.com> writes:

> I still have the "compile in emacs" bug. So this patch along with
> http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
> bug for me.

The expect testcase still fails, too.

Andreas.

$ cat pty.tcl
#!/usr/bin/expect -f
spawn sh -c "echo foo bar"
expect {
"foo bar" {puts $expect_out(buffer)}
}
$ expect -f pty.tcl
spawn echo foo bar

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

OGAWA Hirofumi

unread,
Jul 27, 2009, 3:51:15 PM7/27/09
to Aneesh Kumar K.V, Alan Cox, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
"Aneesh Kumar K.V" <aneesh...@linux.vnet.ibm.com> writes:

> On Mon, Jul 27, 2009 at 05:42:52PM +0100, Alan Cox wrote:
>>
>> > > - if (count > tty->receive_room) {
>> > > + if (count > tty->receive_room)
>> > > count = tty->receive_room;
>> > > - done = 0;
>> > > - }
>> > > char_buf = head->char_buf_ptr + head->read;
>> > > flag_buf = head->flag_buf_ptr + head->read;
>> > > head->read += count;
>> > > _
>> >
>> >
>> > I still have the "compile in emacs" bug. So this patch along with
>> > http://article.gmane.org/gmane.linux.kernel/869824 doesn't fix the
>> > bug for me.
>>
>> Can you stick some printk calls in and debug it further then because at
>> with the fixes Ogawa provided I'm finding it impossible to reproduce even
>> on an 8 way x86.
>>
>> What hardware are you using ?
>
> My T60p lenovo laptop. If you can send me a debug prink patch i can redo the
> test with the patches. I don't know what data i should start collecting so
> that it make sense. This is the patch i tried

If I read that part of emacs correctly, it seems to be assuming the data
was already sent to master side if the child process was exited.

And if it's right, unfortunately, I guess we can't return -EAGAIN in
this case to preserve behavior.

Aneesh, can you get the log of strace of emacs error case?

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

#define _GNU_SOURCE


#define BIG_BUF
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <error.h>
#include <limits.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <unistd.h>

static char pts_name[PATH_MAX];

static int open_pty(void)
{
int master;
char *name;

master = getpt();
if (master < 0)
return -1;

if (grantpt(master) < 0 || unlockpt(master) < 0)
goto close_master;

{


int on = 1;
ioctl(master, FIONBIO, &on);
}

name = ptsname(master);

strcpy(pts_name, name);

return master;

close_master:
close(master);
return -1;
}

return pid;
}

pid = child(master);

printf("some app doesn't expect EAGAIN\n");
break;


}
error(1, errno, "%s: read", __func__);
}
if (size == 0)
break;
#ifdef BIG_BUF
printf("size %zd\n", size);
#else
write(STDOUT_FILENO, buf, size);
#endif
}

return 0;
}

OGAWA Hirofumi

unread,
Jul 27, 2009, 4:38:36 PM7/27/09
to Linus Torvalds, Aneesh Kumar K.V, Alan Cox, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Linus Torvalds <torv...@linux-foundation.org> writes:

> On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
>>
>> If I read that part of emacs correctly, it seems to be assuming the data
>> was already sent to master side if the child process was exited.
>

> That sounds like a rather obvious assumption.
>
> Aren't pty's flushing the data at flush() time? Which should be happening
> when the child process exits and closes the pty slave.

For tty, I guess yes. However, now, pty is pushing the data to other
side by background, I'm not sure at all though, I guess ppp is requiring it.

> So at what point do we just admit that the commit that caused all this was
> a buggy pile of sh*t and just revert it?

Also, I'm not sure though, I guess it depends on the bug which was fixed
by the commit.

I don't know about ppp problem at all. So, personally I'm ok either way
- we revert it and try to fix this for next merge window, or continue to
fix this for a while.

Alan-san?

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Alan Cox

unread,
Jul 27, 2009, 4:53:01 PM7/27/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Mon, 27 Jul 2009 12:40:11 -0700 (PDT)
Linus Torvalds <torv...@linux-foundation.org> wrote:

>
>
> On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
> >

> > If I read that part of emacs correctly, it seems to be assuming the data
> > was already sent to master side if the child process was exited.
>

> That sounds like a rather obvious assumption.

> Aren't pty's flushing the data at flush() time? Which should be happening
> when the child process exits and closes the pty slave.

When you close the slave the device all the data has been queued to the
master

> So at what point do we just admit that the commit that caused all this was
> a buggy pile of sh*t and just revert it?

If we could "just revert it" and get a sane tree I'd have asked you to do
so quite some time ago and readdressed it later.

We can't "just revert it" or I'd have deferred it for the next release. If
you revert it you

- introduce a DoS attack
- break ppp
- introduce a pile of other tty races including at least one where the
right timings should let you jump through null pointers
- put back all sorts of random obscure hangs caused by all the lock
violations

and you'll note I pointed out this was a late change I was forced to make
and really didn't want to in the original commit.

Nor can we revert several patches because the ppp stuff means going back
to about 2.6.28 or so for the entire tty layer plus some of the DoS and
null pointer races go back to 2.2 or 2.0 8(.

We can use the two line slightly imperfect quickfix which people reported
does fix their problem and I'm tempted to go with that for 2.6.31 because
it works for the real world cases that matter.

Alan

Alan Cox

unread,
Jul 27, 2009, 5:19:54 PM7/27/09
to OGAWA Hirofumi, Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> > My T60p lenovo laptop. If you can send me a debug prink patch i can redo the
> > test with the patches. I don't know what data i should start collecting so
> > that it make sense. This is the patch i tried
>
> If I read that part of emacs correctly, it seems to be assuming the data
> was already sent to master side if the child process was exited.

Oops.. that would be rather buggy.

It is true that the data has left the slave. It is not neccessarily true
the data has arrived on the master before the call to close() completes.

We can fake that but what the hell are the semantics if the process on
the master blocks or if both processes each end fill the queue and then
close so no data can be consumed.

> And if it's right, unfortunately, I guess we can't return -EAGAIN in
> this case to preserve behavior.

To avoid the EAGAIN the close needs to block until all queued I/O has
been processed by the ldisc the other end. That's not standards
guaranteed or even what happens with a non pty port, but it is doable
unless you take signals - we can't block signals or it can all deadlock.

Linus Torvalds

unread,
Jul 27, 2009, 5:23:32 PM7/27/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Mon, 27 Jul 2009, Alan Cox wrote:
>
> Nor can we revert several patches because the ppp stuff means going back
> to about 2.6.28 or so for the entire tty layer plus some of the DoS and
> null pointer races go back to 2.2 or 2.0 8(.

The thing is, breaking ppp and reverting at least to 2.6.30 behavior.
Since it's better than the _current_ breakage.

> We can use the two line slightly imperfect quickfix which people reported
> does fix their problem and I'm tempted to go with that for 2.6.31 because
> it works for the real world cases that matter.

Umm. Which ones? People have reported kdesu and emacs breaking. Last I
saw, the emacs breakage wasn't fixed by any of the patches seen so far.

Linus

Alan Cox

unread,
Jul 27, 2009, 5:41:40 PM7/27/09
to OGAWA Hirofumi, Linus Torvalds, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
I'd favour we go with this which passes t1, expect, emacs and a corrected
t3

(test t3 is buggy as it has no \n and leaves the tty in line by line mode
so its typing a long line at the pty but never hits return to finish the
input)


It's theoretically imperfect in the case where you write a vast amount of
output in one go, the tty is blocked the other end and then you close.
However in practice that doesn't happen because with tty->low_latency = 1
we run the pty received n_tty ldisc code in our context so each write
fires through the entire n_tty ldisc and does flow control synchronously.

It needs re-addressing but its simple which at this point wins over
everything else and its one people tested before we tried to fix the hard
corner cases

commit aaf9da79c95a32fc5286fb851632baf09dc6134b
Author: Alan Cox <al...@linux.intel.com>
Date: Mon Jul 27 22:17:51 2009 +0100

pty: quickfix for the pty ENXIO timing problems

This also makes close stall in the normal case which is apparently needed
to fix emacs

Signed-off-by: Alan Cox <al...@linux.intel.com>

diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 6e6942c..3850a68 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -52,6 +52,7 @@ static void pty_close(struct tty_struct *tty, struct file *filp)
return;


tty->link->packet = 0;

set_bit(TTY_OTHER_CLOSED, &tty->link->flags);
+ tty_flip_buffer_push(tty->link);
wake_up_interruptible(&tty->link->read_wait);
wake_up_interruptible(&tty->link->write_wait);


if (tty->driver->subtype == PTY_TYPE_MASTER) {

@@ -207,6 +208,7 @@ static int pty_open(struct tty_struct *tty, struct file *filp)
clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);


set_bit(TTY_THROTTLED, &tty->flags);
retval = 0;

+ tty->low_latency = 1;
out:
return retval;
}

Alan Cox

unread,
Jul 27, 2009, 5:53:56 PM7/27/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Mon, 27 Jul 2009 14:22:40 -0700 (PDT)
Linus Torvalds <torv...@linux-foundation.org> wrote:

>
>
> On Mon, 27 Jul 2009, Alan Cox wrote:
> >
> > Nor can we revert several patches because the ppp stuff means going back
> > to about 2.6.28 or so for the entire tty layer plus some of the DoS and
> > null pointer races go back to 2.2 or 2.0 8(.
>
> The thing is, breaking ppp and reverting at least to 2.6.30 behavior.
> Since it's better than the _current_ breakage.
>
> > We can use the two line slightly imperfect quickfix which people reported
> > does fix their problem and I'm tempted to go with that for 2.6.31 because
> > it works for the real world cases that matter.
>
> Umm. Which ones? People have reported kdesu and emacs breaking. Last I
> saw, the emacs breakage wasn't fixed by any of the patches seen so far.

Aneesh verified the tty->low_latency patch fixed emacs (Sunday mail in
the thread "Re: [Regression] kdesu broken")

Linus Torvalds

unread,
Jul 27, 2009, 6:05:03 PM7/27/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Mon, 27 Jul 2009, Alan Cox wrote:
>

> It's theoretically imperfect in the case where you write a vast amount of
> output in one go, the tty is blocked the other end and then you close.
> However in practice that doesn't happen because with tty->low_latency = 1
> we run the pty received n_tty ldisc code in our context so each write
> fires through the entire n_tty ldisc and does flow control synchronously.

An alternative might be something like this.

THIS IS TOTALLY UNTESTED.

This is just meant as a "this kind of approach may be a good idea", and
just tries to make sure that any delayed work is flushed by closing the
tty.

No guarantees. It may or may not compile. It may or may not make any
difference. It might do unspeakable things to your pets. It really is
meant to be an example of what a "flush" function could/should do. There
are probably other things/buffers that may need flushing.

Linus

---
drivers/char/tty_io.c | 33 +++++++++++++++++++++++++++++++++
1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index a3afa0c..1be69af 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -143,6 +143,7 @@ ssize_t redirected_tty_write(struct file *, const char __user *,
static unsigned int tty_poll(struct file *, poll_table *);
static int tty_open(struct inode *, struct file *);
static int tty_release(struct inode *, struct file *);
+static int tty_flush(struct file *filp, fl_owner_t id);
long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
#ifdef CONFIG_COMPAT
static long tty_compat_ioctl(struct file *file, unsigned int cmd,
@@ -415,6 +416,7 @@ static const struct file_operations tty_fops = {
.unlocked_ioctl = tty_ioctl,
.compat_ioctl = tty_compat_ioctl,
.open = tty_open,
+ .flush = tty_flush,
.release = tty_release,
.fasync = tty_fasync,
};
@@ -1831,7 +1833,38 @@ static int tty_open(struct inode *inode, struct file *filp)
return ret;
}

+/* This should probably be a generic function */
+static void flush_delayed_work(struct delayed_work *work)
+{
+ if (cancel_delayed_work(work)) {
+ schedule_delayed_work(work, 0);
+ flush_scheduled_work();
+ }
+}

+/**
+ * tty_flush - vfs callback for close
+ * @filp: file pointer for handle to tty
+ * @id: struct files_struct of owner.
+ *
+ * Called for every close(), whether the last or not
+ */
+static int tty_flush(struct file *filp, fl_owner_t id)
+{
+ struct tty_struct *tty, *o_tty;
+ struct inode *inode;
+
+ inode = filp->f_path.dentry->d_inode;
+ tty = (struct tty_struct *)filp->private_data;
+
+ if (tty_paranoia_check(tty, inode, "tty_flush"))
+ return 0;
+
+ o_tty = tty->link;
+ if (o_tty)
+ flush_delayed_work(&o_tty->buf.work);
+ return 0;
+}


/**

Alan Cox

unread,
Jul 27, 2009, 6:40:48 PM7/27/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> No guarantees. It may or may not compile. It may or may not make any
> difference. It might do unspeakable things to your pets. It really is
> meant to be an example of what a "flush" function could/should do. There
> are probably other things/buffers that may need flushing.

Wrong end of the stick. close() already flushes the transmit buffers
properly (and passes the SuS test suite tests on behaviour). Its the
receive the other end you need to sync which isn't an assumption made or
something with any meaning on real hardware.

> +/* This should probably be a generic function */
> +static void flush_delayed_work(struct delayed_work *work)
> +{
> + if (cancel_delayed_work(work)) {
> + schedule_delayed_work(work, 0);
> + flush_scheduled_work();
> + }

If the ldisc is busy the scheduled work will not flush the data to the
other tty. The tty->low_latency fix actually works better than this
because it makes tty_flip_buffer_push() force the work through that can
be done as it calls the ldisc work synchronously.

> + * Called for every close(), whether the last or not

The pty case is "special" and seems to be an accident of implementations
in history. The problem on the pty side is on the input not the output
side.

A normal tty write goes from the ldisc to the device, and in that case
the close behaviour is defined by the close delay and FIFO flush in the
driver. In other words it is already done correctly, only done for the
last close (some apps rely on that as a sneaky way to keep the main
thread from blocking).

The pty race is quite different

A write to the pty is guaranteed to correctly have completed on the
close(). The data may however not have been received by the other end as
the ldisc could be busy or not yet scheduled. If you like the data is
sitting on the virtual wire". Or more exactly its in the "not yet fed to
the ldisc" buffers on the other end.

Adding the EOFPENDING/EOF flag fixes the detection of the EOF by the
receiver but doesn't guarantee the sender blocks. If the sender also
waits for the EOF flag to be set then the sender knows the receiving
ldisc has got all the data. Thats a fairly small mod from the existing
long patch.

Except: The other end could also be in EOFPENDING/EOF wait with both ends
blocked by the ldisc and not accepting data. In which case you just
deadlocked. The close() blocks forever waiting for another end which may
also be close() blocked in the other direction isn't a viable
implementation although one I'm quite keen to try out on a Mac or BSD box
out of curiosity.

OGAWA Hirofumi

unread,
Jul 28, 2009, 1:34:05 AM7/28/09
to Alan Cox, Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

>> And if it's right, unfortunately, I guess we can't return -EAGAIN in
>> this case to preserve behavior.
>
> To avoid the EAGAIN the close needs to block until all queued I/O has
> been processed by the ldisc the other end. That's not standards
> guaranteed or even what happens with a non pty port, but it is doable
> unless you take signals - we can't block signals or it can all deadlock.

Just a quick hack though. Is this wrong/unpreferable way?

n_tty_read() checks the pending buffer and consume it before
input_available_p().

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Signed-off-by: OGAWA Hirofumi <hiro...@mail.parknet.co.jp>
---

drivers/char/n_tty.c | 1 +
drivers/char/pty.c | 1 -
drivers/char/tty_buffer.c | 7 +++++--
include/linux/tty.h | 1 +
4 files changed, 7 insertions(+), 3 deletions(-)

diff -puN drivers/char/pty.c~pty-fixes4-fix2 drivers/char/pty.c
--- linux-2.6/drivers/char/pty.c~pty-fixes4-fix2 2009-07-28 05:00:45.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/pty.c 2009-07-28 05:00:48.000000000 +0900
@@ -64,7 +64,6 @@ static void pty_close(struct tty_struct

set_bit(TTY_EOFPENDING, &pair->flags);
set_bit(TTY_OTHER_CLOSED, &pair->flags);
spin_unlock_irqrestore(&pair->buf.lock, flags);

- tty_flip_buffer_push(pair);
wake_up_interruptible(&pair->read_wait);


wake_up_interruptible(&pair->write_wait);
if (tty->driver->subtype == PTY_TYPE_MASTER) {

diff -puN drivers/char/n_tty.c~pty-fixes4-fix2 drivers/char/n_tty.c
--- linux-2.6/drivers/char/n_tty.c~pty-fixes4-fix2 2009-07-28 05:01:10.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/n_tty.c 2009-07-28 05:07:49.000000000 +0900
@@ -1776,6 +1776,7 @@ do_it_again:
((minimum - (b - buf)) >= 1))


tty->minimum_to_wake = (minimum - (b - buf));

+ tty_flush_to_ldisc(tty);
if (!input_available_p(tty, 0)) {
if (test_bit(TTY_EOF, &tty->flags)) {


/* PTY pair closed and all data consumed */

diff -puN include/linux/tty.h~pty-fixes4-fix2 include/linux/tty.h
--- linux-2.6/include/linux/tty.h~pty-fixes4-fix2 2009-07-28 05:07:07.000000000 +0900
+++ linux-2.6-hirofumi/include/linux/tty.h 2009-07-28 05:07:30.000000000 +0900
@@ -396,6 +396,7 @@ extern void __do_SAK(struct tty_struct *
extern void disassociate_ctty(int priv);
extern void no_tty(void);
extern void tty_flip_buffer_push(struct tty_struct *tty);
+extern void tty_flush_to_ldisc(struct tty_struct *tty);
extern void tty_buffer_free_all(struct tty_struct *tty);
extern void tty_buffer_flush(struct tty_struct *tty);
extern void tty_buffer_init(struct tty_struct *tty);
diff -puN drivers/char/tty_buffer.c~pty-fixes4-fix2 drivers/char/tty_buffer.c
--- linux-2.6/drivers/char/tty_buffer.c~pty-fixes4-fix2 2009-07-28 05:41:12.000000000 +0900
+++ linux-2.6-hirofumi/drivers/char/tty_buffer.c 2009-07-28 14:24:48.000000000 +0900
@@ -388,8 +388,6 @@ int tty_prepare_flip_string_flags(struct
}
EXPORT_SYMBOL_GPL(tty_prepare_flip_string_flags);

-
-
/**
* flush_to_ldisc
* @work: tty structure passed from work queue.
@@ -473,6 +471,11 @@ static void flush_to_ldisc(struct work_s
tty_ldisc_deref(disc);
}

+void tty_flush_to_ldisc(struct tty_struct *tty)
+{
+ flush_to_ldisc(&tty->buf.work.work);
+}
+
/**
* tty_flip_buffer_push - terminal
* @tty: tty to push
_

Alan Cox

unread,
Jul 28, 2009, 6:21:43 AM7/28/09
to OGAWA Hirofumi, Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, 28 Jul 2009 14:33:40 +0900
OGAWA Hirofumi <hiro...@mail.parknet.co.jp> wrote:

> Alan Cox <al...@lxorguk.ukuu.org.uk> writes:
>
> >> And if it's right, unfortunately, I guess we can't return -EAGAIN in
> >> this case to preserve behavior.
> >
> > To avoid the EAGAIN the close needs to block until all queued I/O has
> > been processed by the ldisc the other end. That's not standards
> > guaranteed or even what happens with a non pty port, but it is doable
> > unless you take signals - we can't block signals or it can all deadlock.
>
> Just a quick hack though. Is this wrong/unpreferable way?
>
> n_tty_read() checks the pending buffer and consume it before
> input_available_p().

That won't change the fact that the process could have exited. You can
fix the -ENXIO reporting that way (and it is basically what the
EOFPENDING/EOF patch did), but the only way I can see to fix the
assumption that the process exit means all the data is in the ldisc the
other end ready to use is to actually to make the close() path block -
but with some kind of limits to prevent deadlocks.

Given the assumptions in emacs are wrong, low_latency fixes the real
world cases and we are standards compliant perhaps we are trying too
hard ?

OGAWA Hirofumi

unread,
Jul 28, 2009, 6:42:47 AM7/28/09
to Alan Cox, Aneesh Kumar K.V, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

>> Just a quick hack though. Is this wrong/unpreferable way?
>>
>> n_tty_read() checks the pending buffer and consume it before
>> input_available_p().
>
> That won't change the fact that the process could have exited.

Yes.

> You can fix the -ENXIO reporting that way (and it is basically what
> the EOFPENDING/EOF patch did), but the only way I can see to fix the
> assumption that the process exit means all the data is in the ldisc
> the other end ready to use is to actually to make the close() path
> block - but with some kind of limits to prevent deadlocks.

I thought, to check in n_tty_read() may guarantee that tty->buf (slave
guarantee to sent to tty->buf) is consumed by master side.

I hoped this tty->buf works as the pending data in ldisc.

Thanks.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Linus Torvalds

unread,
Jul 28, 2009, 11:48:27 AM7/28/09
to OGAWA Hirofumi, Alan Cox, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, OGAWA Hirofumi wrote:
>

> Just a quick hack though. Is this wrong/unpreferable way?

That seems to be right regardless of any other issues.

> n_tty_read() checks the pending buffer and consume it before
> input_available_p().

Why not move this _inside_ "input_available_p()"? There are only two
call-sites, and strictly speaking they both want it.

Look at n_tty_poll(), for example:

if (input_available_p(tty, TIME_CHAR(tty) ? 0 : MIN_CHAR(tty)))
mask |= POLLIN | POLLRDNORM;
if (tty->packet && tty->link->ctrl_status)
mask |= POLLPRI | POLLIN | POLLRDNORM;
if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
mask |= POLLHUP;
if (tty_hung_up_p(file))
mask |= POLLHUP;

and notice what happens to somebody who uses select/poll when there might
be pending data that hasn't been handled yet, and the tty has been marked
TTY_OTHER_CLOSED or hung up. It would return only POLLHUP, and as a
result, that side would never even try to read the pending data because
poll implies that there is no data and it's EOF. Which is just wrong.

So _any_ time you check "is there input available?" you should always
check if there are other buffers. No?

Linus

Alan Cox

unread,
Jul 28, 2009, 12:42:47 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, 28 Jul 2009 08:49:29 -0700 (PDT)
Linus Torvalds <torv...@linux-foundation.org> wrote:

>
>
> On Tue, 28 Jul 2009, Alan Cox wrote:
> >
> > Given the assumptions in emacs are wrong, low_latency fixes the real
> > world cases and we are standards compliant perhaps we are trying too
> > hard ?
>

> Alan, I really don't understand why you seem to argue _for_ bad code, and
> not just fixing it. You seem to think that it's ok to leave stuff in
> buffers and ignore it, just because the other end might have exited.
>
> That seems disingenious and stupid. Why argue for crap?

We don't ignore it. What goes in one end comes out the other after tty
processing (ldisc, echo etc). Reliably. Both the EOF fix and the
tty->low_latency fix cure that. [The low latency one also provides the
*exact* same semantics as we had prior to 2.6.31-rc as well]

If I understand Ogawa correctly then what emacs thinks is true is quite
different: Emacs thinks that

write(pty, "data", length)
close(pty)
exit()

will always ensure that the other end has already got the data before
close() completes - or to be more exact before the parent receives SIGCLD.

Unfortunately as both ends could do

write(pty, "enough to fill all the buffers but not block")
close(pty)

at the same time it doesn't strike me as a good idea because it will
deadlock.

Ogawa's latest experiment is actually quite interesting which is to also
run the ldisc processing off the back of the user context. Unfortunately
that breaks assorted assumptions in the kernel because the
close/hangup/ldisc change paths believe that if they kill the work queue
for feeding data into the ldisc then the ldisc will stop receiving
data. It believes that therefore the ldisc will not do things like call
the write method to echo data back to a device which has gone away, which
generally involves jumping through null pointers.

So to be clear: WE DO NOT LOSE DATA.

Linus Torvalds

unread,
Jul 28, 2009, 12:49:47 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Alan Cox wrote:
>
> We don't ignore it.

Yes we do. Look at Ogawa-san's patch. And read my email answer to it.

> What goes in one end comes out the other after tty
> processing (ldisc, echo etc). Reliably. Both the EOF fix and the
> tty->low_latency fix cure that. [The low latency one also provides the
> *exact* same semantics as we had prior to 2.6.31-rc as well]

I agree that the low-latency thing should fix things, and I applied it.
But I think that Ogawa's patch is fundamentally "correct" at a much higher
level. Rather than depend on low-latency being set, it just "Does The
Right Thing(tm)", by making sure that readers never even look at the EOF
bits etc until they have flushed the tty ldisc state.

> If I understand Ogawa correctly then what emacs thinks is true is quite
> different: Emacs thinks that
>
> write(pty, "data", length)
> close(pty)
> exit()
>
> will always ensure that the other end has already got the data before
> close() completes - or to be more exact before the parent receives SIGCLD.

. and depending on what emacs does with signals and it's select() loop,
that may actually be _entirely_reasonable_.

Imagine being in 'select()' (or read, for that matter), and getting EINTR
due to SIGCHLD. What is the correct expectations?

The correct expectation is that the select() (or read()) should have
returned any data that it saw _before_ it returns EINTR.

Linus

Alan Cox

unread,
Jul 28, 2009, 1:06:05 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> Imagine being in 'select()' (or read, for that matter), and getting EINTR
> due to SIGCHLD. What is the correct expectations?

If you are asking that question I don't think you understand the bug
report.

> The correct expectation is that the select() (or read()) should have
> returned any data that it saw _before_ it returns EINTR.

read() handles that correctly, has always done so.

emacs from the traces does this

set O_NDELAY
wait for SIGCLD
read()
EAGAIN
shit_myself();

The EWOULDBLOCK is perfectly correctly reporting that at that precise
instant no data is available.

Correctly written code does this

loop: read()
EAGAIN
poll
goto loop

and in that case our code all works correctly.

If you put the whole thing on a timeline it looks like this (without
lowlatency being enabled but with the EOF thing fixed)


Slave Emacs Kernel

write "errors"
Queue to tty->buf
close
This is all the data
Other end closed
exit
SIGCLD
read pty
EWOULDBLOCK
shit myself
schedule n_tty ldisc
queue bytes to parent

died rather than waited

Had emacs used poll() properly then it ought to have all worked, although
a spot of review of that wouldn't be a bad thing.

Also calling into the n_tty ldisc side processing in the receiver
unfortunately opens up this stuff


** interrupt path **
data received
queue to tty->buf, and wake

** hangup **
closes the ldisc down
waits for the workqueue to complete
physical instance goes away

** a read already in progress **
(new reads will go via tty_hung_up_write())
process n_tty ldisc
(which is already closed)
echo char
write to non-existant device
jump to fishkill (or exploitville or wherever)

Because the hangup code, ldisc and open/close code all work on the basis
that

- a hang up means the caller will not call tty_flip_buffer_push
after calling tty_hangup.
- the only other path (the non low latency path) is the workqueue
which it takes care to kill off

Alan Cox

unread,
Jul 28, 2009, 1:08:24 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> Put another way: our pty code is simply _buggy_ if it returns EINTR when
> there is actually data pending on a pty.

Good job it doesn't do that then - although be careful what "data
pending" means. If the buffer contains "wombat" and you are in ICANON
mode then there is no data pending, and poll() likewise will say there is
no data pending. Only when newline is hit do you have data pending (which
is why test t3 is buggy)

Alan

Linus Torvalds

unread,
Jul 28, 2009, 2:44:41 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Alan Cox wrote:
>

> If you are asking that question I don't think you understand the bug
> report.

I don't think YOU understand what I'm saying.

> > The correct expectation is that the select() (or read()) should have
> > returned any data that it saw _before_ it returns EINTR.
>
> read() handles that correctly, has always done so.
>
> emacs from the traces does this
>
> set O_NDELAY
> wait for SIGCLD
> read()
> EAGAIN
> shit_myself();

Go back and read my email.

Emacs is ENTIRELY PROPER in doing that. If it has gotten the SIGCHLD, then
it damn well should know that the data is buffered already, since the
child sure as hell isn't writing any more. So if it gets EAGAIN due to the
SIGCHLD, it can assume that there isn't going to be any more data.

My point is that a program _should_ be able to depend on simple causality
when it comes to ordering rules. If the child did a write() before
exiting, then we should see the data before SIGCHLD.

It's really that simple.

Linus

Linus Torvalds

unread,
Jul 28, 2009, 2:46:20 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Alan Cox wrote:

> > Put another way: our pty code is simply _buggy_ if it returns EINTR when
> > there is actually data pending on a pty.
>
> Good job it doesn't do that then - although be careful what "data
> pending" means. If the buffer contains "wombat" and you are in ICANON
> mode then there is no data pending, and poll() likewise will say there is
> no data pending. Only when newline is hit do you have data pending (which
> is why test t3 is buggy)

Alan, that's a total red herring. We're not talking t3. We're talking
emacs, and the newline is there.

You claim that emacs sh*ts itself when it gets EAGAIN, and you think
that's an emacs bug. And I think you're full of crap. We should NEVER EVER
get EAGAIN (due to the SIGCHLD, at least) if the app on the other side
wrote data that could be read.

Linus

Alan Cox

unread,
Jul 28, 2009, 2:56:54 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> My point is that a program _should_ be able to depend on simple causality
> when it comes to ordering rules. If the child did a write() before
> exiting, then we should see the data before SIGCHLD.
>
> It's really that simple.

In which case you need the write to be synchronous to the ldisc
processing. Which is what tty->low_latency = 1 did. That provides your
required causality.

So what exactly is the problem. Setting ->low_latency has a small
performance impact which was why I was trying the other stuff, but that
is all.

As I said tty->low_latency = 1 gives you the previous semantics.

Alan

Linus Torvalds

unread,
Jul 28, 2009, 3:09:48 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Alan Cox wrote:
>

> In which case you need the write to be synchronous to the ldisc
> processing. Which is what tty->low_latency = 1 did. That provides your
> required causality.
>
> So what exactly is the problem.

The problem is that you seem to be arguing against the _nicer_ fix, that
also makes more conceptual sense, and that doesn't even depend on the
whole low-latency hackery.

And I don't see why you argue that.

Furthermore, you have been CONTINUALLY arguing that emacs is buggy.
Without any logic to back that up what-so-ever. You argued that just a few
minutes ago.

Why? Why are you making those outlandish claims? What I'm so unhappy about
is that your whole reaction to it - from the beginning - was to blame
anything but the patch that caused it, and that it was bisected down to.

Why? Why blame emacs? Why call user land buggy, when the bug was
introduced by you, and was in the kernel? Why are you fighting it? Why did
it take so long to admit that all the regressions were kernel problems?
Why were you trying to dismiss the regression report as a user-land bug
from the very beginning?

Let me quote one of your first emails as I got cc'd on this thread:

> > See the thread starting here: ("possible regression with pty.c commit")
> > http://lkml.org/lkml/2009/7/11/125
>
> Thanks for the pointer.
>
> Well, I thought we were expected to avoid breaking existing user space, even
> if that were buggy etc.

I don't know where you got that idea from. Avoiding breaking user space
unneccessarily is good but if its buggy you often can't do anything about
it.

That was you talking to Rafael, who tracks regressions, and trying to
argue that it wasn't a kernel bug (both in the double quoted thing and
in the final one).

And no, that was not a fluke. TODAY, you sent out an email saying that
EWOUDLBLOCK/EAGAIN would be "correct". Despite me having told you that it
_clearly_ is not correct, and despite you knowing that it breaks user
space.

WHY?

Quite frankly, I don't understand why I should even have to bring these
issues up. You should have tried to fix the problem immediately, without
arguing against fixing the kernel. Without blaming user space. Without
making idiotic excuses for bad kernel behavior.

The fact is, breaking regular user applications is simply not acceptable.
Trying to blame kernel breakage on the app being "buggy" is not ok. And
arguing for almost a week against fixing it - that's just crazy.

Linus

Alan Cox

unread,
Jul 28, 2009, 3:16:00 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

> Quite frankly, I don't understand why I should even have to bring these
> issues up. You should have tried to fix the problem immediately, without
> arguing against fixing the kernel. Without blaming user space. Without
> making idiotic excuses for bad kernel behavior.
>
> The fact is, breaking regular user applications is simply not acceptable.
> Trying to blame kernel breakage on the app being "buggy" is not ok. And
> arguing for almost a week against fixing it - that's just crazy.

I've been working on fixing it. I have spent a huge amount of time
working on the tty stuff trying to gradually get it sane without breaking
anything and fixing security holes along the way as they came up. I spent
the past two evenings working on the tty regressions.

However I've had enough. If you think that problem is easy to fix you fix
it.

Have fun.

I've zapped the tty merge queue so anyone with patches for the tty layer
can send them to the new maintainer.

--- MAINTAINERS~ 2009-07-23 15:36:41.000000000 +0100
+++ MAINTAINERS 2009-07-28 20:09:32.200685827 +0100
@@ -5815,10 +5815,7 @@
S: Maintained

TTY LAYER
-P: Alan Cox
-M: al...@lxorguk.ukuu.org.uk
-S: Maintained
-T: stgit http://zeniv.linux.org.uk/~alan/ttydev/
+S: Unmaintained
F: drivers/char/tty_*
F: drivers/serial/serial_core.c
F: include/linux/serial_core.h

Greg KH

unread,
Jul 28, 2009, 3:57:54 PM7/28/09
to Alan Cox, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, Jul 28, 2009 at 08:15:53PM +0100, Alan Cox wrote:
>
> > Quite frankly, I don't understand why I should even have to bring these
> > issues up. You should have tried to fix the problem immediately, without
> > arguing against fixing the kernel. Without blaming user space. Without
> > making idiotic excuses for bad kernel behavior.
> >
> > The fact is, breaking regular user applications is simply not acceptable.
> > Trying to blame kernel breakage on the app being "buggy" is not ok. And
> > arguing for almost a week against fixing it - that's just crazy.
>
> I've been working on fixing it. I have spent a huge amount of time
> working on the tty stuff trying to gradually get it sane without breaking
> anything and fixing security holes along the way as they came up. I spent
> the past two evenings working on the tty regressions.
>
> However I've had enough. If you think that problem is easy to fix you fix
> it.
>
> Have fun.
>
> I've zapped the tty merge queue so anyone with patches for the tty layer
> can send them to the new maintainer.

Do you have a pointer to your previous queue before you zapped it so
that others might be able to pick it up?

thanks,

greg k-h

Theodore Tso

unread,
Jul 28, 2009, 4:47:55 PM7/28/09
to Greg KH, Alan Cox, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
>
> Do you have a pointer to your previous queue before you zapped it so
> that others might be able to pick it up?
>

It's also available from linux-next as commit:

024ea7873598686f3e02002ca23c2e8fa3fddd30

- Ted

Greg KH

unread,
Jul 28, 2009, 5:02:23 PM7/28/09
to Theodore Tso, Alan Cox, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, Jul 28, 2009 at 04:47:20PM -0400, Theodore Tso wrote:
> On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> >
> > Do you have a pointer to your previous queue before you zapped it so
> > that others might be able to pick it up?
> >
>
> It's also available from linux-next as commit:
>
> 024ea7873598686f3e02002ca23c2e8fa3fddd30

Hm, that's the wrong end of the tree, I think you mean
fbfa97a856a4ce2fa86058cff84f589eb7f1364e, right?

thanks,

greg k-h

Alan Cox

unread,
Jul 28, 2009, 7:46:31 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> Let me quote one of your first emails as I got cc'd on this thread:

Which is not about the emacs bug but old versions of kdesu relying upon
data boundaries happening to occur in the "right" places.

> And no, that was not a fluke. TODAY, you sent out an email saying that
> EWOUDLBLOCK/EAGAIN would be "correct". Despite me having told you that it
> _clearly_ is not correct, and despite you knowing that it breaks user
> space.

I did not as you seem to think ever advocate not fixing the emacs problem.
Clearly we need to fix the behaviour because apps rely on it and its not
an outright "duh.." on the app side its a reasonable expectation in the
simple case.

Instead I got a whole collection of mail from you most of which indicated
you hadn't even understood the problem.

BTW: The tty->low_latency fix doesn't work, because the ->write method
can be called from an IRQ and that means we can't use ->low_latency=1 as
we take mutexes.

If you don't take the mutexes you revert the fixes that stop high speed
data sessions such as 3G Modem randomly hanging until you disconnect from
your provider and try again.

So the current state of play is
as is breaks emacs
revert pty changes breaks ppp,
tty->low_latency breaks ppp
Ogawa's patch can lock up and races including exploitable jumps
into freed memory
blocking in close deadlocks if both ends get stuck that way

The best suggestion I can come up with for the new maintainer is to
implement a block in close() on the slave side only, and probably with an
interruptible wait. Set TTY_EOFPENDING when you enter close, sleep on
TTY_EOF being set when the workqueue flushing data into the ldisc
finishes shovelling. Providing exit() closes the file handles before
sending the signal that'll then give expected semantics. A pty master
close is "magic" anyway as it triggers a hangup event.

> The fact is, breaking regular user applications is simply not acceptable.
> Trying to blame kernel breakage on the app being "buggy" is not ok. And
> arguing for almost a week against fixing it - that's just crazy.

The security exploits on 2.6.27 don't work on 2.6.30, please fix
the regression so I get root again. Get real, that is what you just
claimed and its donkey poo. There is a point at which you fix stuff and a
point at which you don't.

The thing I advocated not fixing is that in some cases kdesu fails
because some old versions at least do

read(fd, buf, lots);
if (buf starts with something I want) {
while(read more data)
look for pattern
}

but never looks for pattern in the first read data after the start it
wanted - ie it relied upon magic happy chance buffer boundary
preservation. Thats well beyond what is sane to try and preserve unless
it is easy to do which its not.

The emacs assumption is wrong, actually always untrue for some cases
(fork, child inherits pty, parent exits for one) but worth preserving for
the cases it happens to work. I've never argued otherwise.

Alan

Alan Cox

unread,
Jul 28, 2009, 7:49:08 PM7/28/09
to Theodore Tso, Greg KH, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, 28 Jul 2009 18:02:03 -0400
Theodore Tso <ty...@mit.edu> wrote:

> On Tue, Jul 28, 2009 at 02:01:21PM -0700, Greg KH wrote:
> > On Tue, Jul 28, 2009 at 04:47:20PM -0400, Theodore Tso wrote:
> > > On Tue, Jul 28, 2009 at 12:56:48PM -0700, Greg KH wrote:
> > > >
> > > > Do you have a pointer to your previous queue before you zapped it so
> > > > that others might be able to pick it up?
> > > >
> > >
> > > It's also available from linux-next as commit:
> > >
> > > 024ea7873598686f3e02002ca23c2e8fa3fddd30
> >
> > Hm, that's the wrong end of the tree, I think you mean
> > fbfa97a856a4ce2fa86058cff84f589eb7f1364e, right?
>

> It all depends on your point of view, I guess. The "last" commit in
> the branch is 024ea78, so you can see the whole queue if you use the
> command "git log 024ea78"; hence, it's the most useful if you're
> trying to find the patch series via git.
>
> The whole set of tty patches can be found via:
>
> git log e00b95d..024ea78
>
> (where e00b95d is the parent of fbfa978).

I'll fish them out tomorrow. What is in the -next tree is only some of
the bits.

We still have races on the USB side too btw I put a frequency generator
on the carrier line of my pl2303 with an app continually opening the
port and it oopses after about 10 minutes. On the bright side so does
2.6.29, 2.6.27 ....

Alan

Linus Torvalds

unread,
Jul 28, 2009, 8:11:51 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Wed, 29 Jul 2009, Alan Cox wrote:
>
> BTW: The tty->low_latency fix doesn't work, because the ->write method
> can be called from an IRQ and that means we can't use ->low_latency=1 as
> we take mutexes.

Ok. So the end result is that Ogawa-san's fix is the right one. Then we
can revert the low_latency=1 thing for pty's entirely. No?

And this is also the one that _looks_ the sanest - ie we do it on the read
side, where it matters, rather than on the write side or the flush side
(where the proper flushing can _also_ fix the problem, but where it's much
more problematic, and where it's a lot less direct about what we care
about).

This is just Ogawa's patch, redone against current -git, and with commit
3a54297478e6578f96fd54bf4daa1751130aca86 reverted (Ogawa's patch already
undid the non-low_latency part of it).

Now, I wonder if some _other_ line discipline might want to do that same
tty_flush_to_ldisc thing in their "do I have data" logic, but I didn't
look any closer.

So does this work for everyone? I haven't tested it yet myself, but this
is the patch that "looks" right.

Linus

---
drivers/char/n_tty.c | 1 +
drivers/char/pty.c | 2 --
drivers/char/tty_buffer.c | 13 +++++++++++++
include/linux/tty.h | 1 +
4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index ff47907..973be2f 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1583,6 +1583,7 @@ static int n_tty_open(struct tty_struct *tty)

static inline int input_available_p(struct tty_struct *tty, int amt)
{
+ tty_flush_to_ldisc(tty);
if (tty->icanon) {
if (tty->canon_data)
return 1;
diff --git a/drivers/char/pty.c b/drivers/char/pty.c
index 3850a68..6e6942c 100644
--- a/drivers/char/pty.c
+++ b/drivers/char/pty.c
@@ -52,7 +52,6 @@ static void pty_close(struct tty_struct *tty, struct file *filp)


return;
tty->link->packet = 0;
set_bit(TTY_OTHER_CLOSED, &tty->link->flags);

- tty_flip_buffer_push(tty->link);
wake_up_interruptible(&tty->link->read_wait);
wake_up_interruptible(&tty->link->write_wait);


if (tty->driver->subtype == PTY_TYPE_MASTER) {

@@ -208,7 +207,6 @@ static int pty_open(struct tty_struct *tty, struct file *filp)


clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
set_bit(TTY_THROTTLED, &tty->flags);
retval = 0;

- tty->low_latency = 1;
out:
return retval;
}
diff --git a/drivers/char/tty_buffer.c b/drivers/char/tty_buffer.c
index 810ee25..3108991 100644
--- a/drivers/char/tty_buffer.c
+++ b/drivers/char/tty_buffer.c
@@ -462,6 +462,19 @@ static void flush_to_ldisc(struct work_struct *work)
}

/**
+ * tty_flush_to_ldisc
+ * @tty: tty to push
+ *
+ * Push the terminal flip buffers to the line discipline.
+ *
+ * Must not be called from IRQ context.
+ */


+void tty_flush_to_ldisc(struct tty_struct *tty)
+{
+ flush_to_ldisc(&tty->buf.work.work);
+}
+

+/**


* tty_flip_buffer_push - terminal
* @tty: tty to push

*
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 1488d8c..e8c6c91 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -394,6 +394,7 @@ extern void __do_SAK(struct tty_struct *tty);


extern void disassociate_ctty(int priv);
extern void no_tty(void);
extern void tty_flip_buffer_push(struct tty_struct *tty);
+extern void tty_flush_to_ldisc(struct tty_struct *tty);
extern void tty_buffer_free_all(struct tty_struct *tty);
extern void tty_buffer_flush(struct tty_struct *tty);
extern void tty_buffer_init(struct tty_struct *tty);

Greg KH

unread,
Jul 28, 2009, 8:13:52 PM7/28/09
to Alan Cox, Theodore Tso, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

Yeah, that's a scary test :)

I've gotten everything from linux-next right now, if you could be so
kind as to send me the rest of the bits you have pending, I'll be glad
to take them over and shephard them forward over time.

thanks,

greg k-h

Linus Torvalds

unread,
Jul 28, 2009, 8:27:21 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Linus Torvalds wrote:
>
> So does this work for everyone? I haven't tested it yet myself, but this
> is the patch that "looks" right.

This thing (on top of current -git) seems to pass Ogawa's tests at least
on my machine (both on SMP, and when the processes are limited to just a
single CPU).

Of course, since he wrote both the patch _and_ the tests, that doesn't
come as a huge surprise - you'd expect Ogawa's patch to fix the testcase
he has.

I use neither kdesu nor GNU emacs, so people who see the problem should
test this patch, and maybe we can put _this_ particular issue behind us.

Linus

Alan Cox

unread,
Jul 28, 2009, 8:34:20 PM7/28/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> Ok. So the end result is that Ogawa-san's fix is the right one. Then we
> can revert the low_latency=1 thing for pty's entirely. No?

No

> So does this work for everyone? I haven't tested it yet myself, but this
> is the patch that "looks" right.

Looks pretty but breaks hangup events and hotplug.

The rules for hangup and thus hot unplug etc are

- The driver ensures that it will not call
tty_flip_buffer_push/flush_to_ldisc again for this port until re-opened
- The driver calls tty_hangup
- tty_hangup ensures that tty_flip_buffer_push cannot occur again
(by killing the workqueue)
- resources may well then get freed before close()

The same rules apply for an ldisc change via TIOCSETD

Ogawa's patch violates this by calling tty_flush_to_ldisc. If that
bridges a hangup then it will call into the ldisc for the dead port and
that in turn will call the write method of the driver which will in some
cases jump to free memory.

To make that work (and I agree its a very sane looking expression of the
intent) you would need to ensure that your new tty_flush_to_ldisc routine
was interlocked against already being hung up, tty_hangup and against an
ldisc change. I think (haven't verified) you get ldisc_change for free as
you are in the ldisc code so the ldisc ref is held and a set_ldisc will
block.

The hangup is tty_io:do_tty_hangup which calls tty_ldisc:
tty_ldisc_hangup which calls ld->ops->hangup which n_tty doesn't
currently have but which it could grow and which may block.

So my guess is you need to make your new flush to ldisc routine

interlock versus do_tty_hangup
check if the TTY_HUPPED flag is clear
only if so and do_tty_hangup cannot be running actually run the
ldisc code


the hangup code is still evil lock_kernel (and in parts "prayer_lock()"
code)

Linus Torvalds

unread,
Jul 28, 2009, 9:05:25 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Wed, 29 Jul 2009, Alan Cox wrote:
>

> The rules for hangup and thus hot unplug etc are
>
> - The driver ensures that it will not call
> tty_flip_buffer_push/flush_to_ldisc again for this port until re-opened

That's just bogus.

The work is already scheduled to be flushed by a timer. The only thing we
do is to make it happen _immediately_ (rather than later) when we do that
tty_flush_to_ldisc().

IOW, it's not changing what the kernel _does_. It's just moving something
that will be done to a slightly earlier point.

If that is wrong as per hangup code, then the bug is in the hangup
handling, not in the tty_flush_to_ldisc().

> - The driver calls tty_hangup
> - tty_hangup ensures that tty_flip_buffer_push cannot occur again
> (by killing the workqueue)
> - resources may well then get freed before close()

They had better not be, since all the data structures touched are inside
the 'tty_struct' (which we're dereferencing in other ways anyway in that
whole routine).

So the only thing that the hangup code needs to do is to make that the
"tty->buf.work.work" function pointer is a nop. And it does, as far as I
can tell.

> The same rules apply for an ldisc change via TIOCSETD
>
> Ogawa's patch violates this by calling tty_flush_to_ldisc. If that
> bridges a hangup then it will call into the ldisc for the dead port and
> that in turn will call the write method of the driver which will in some
> cases jump to free memory.

What you describe is just crazy talk. If Ogawa's fix really causes
problems, then the hangup code is broken, not Ogawa's trivial patch to
make sure the work is done when trying to read a tty.

So regardless, by now we have moved from "trivial bug that bites people in
real life" to "theoretical bug that looks impossible to trigger".

That's already a big step forward - along with making the code make more
sense. Which is always good in itself.

> The hangup is tty_io:do_tty_hangup which calls tty_ldisc:
> tty_ldisc_hangup which calls ld->ops->hangup which n_tty doesn't
> currently have but which it could grow and which may block.

Well, put this way: the only thing that actually stops the outstanding
timer (for the delayed work) is the tty_ldisc_halt() call in
tty_ldisc_hangup(). If that _isn't_ called, then your argument is
pointless, since the tty_flush_to_ldisc() will be done by a timer later
(and Ogawa's patch thus clearly introduces nothing new).

And when it _is_ called, it also clears TTY_LDISC, so now tty_ldisc_ref()
will return NULL, so then flush_to_ldisc() will be a no-op.

So as far as I can tell, we already protect against this whole case: if we
call flush_to_ldisc() after the tty has been HUP'ed, it just won't _do_
anything (unless the work hasn't been canceled, but in that case the timer
would have done the same thing, so nothing new is introduced).

Linus

Linus Torvalds

unread,
Jul 28, 2009, 9:24:17 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Linus Torvalds wrote:
>

> And when it _is_ called, it also clears TTY_LDISC, so now tty_ldisc_ref()
> will return NULL, so then flush_to_ldisc() will be a no-op.
>
> So as far as I can tell, we already protect against this whole case: if we
> call flush_to_ldisc() after the tty has been HUP'ed, it just won't _do_
> anything (unless the work hasn't been canceled, but in that case the timer
> would have done the same thing, so nothing new is introduced).

Btw, if you worry about the fact that this all could happen concurrently
(ie the hangup is done on one cpu, just as the other cpu is doing that
flush_to_ldisc() thing), then again, Ogawa's patch doesn't actually change
anything. The synchronous flush_to_ldisc() (done by Ogawa's patch) could
equally have been an asynchronous (done by a timer), and the timer may
already have triggered - so 'tty_ldisc_halt()' doing a cancel on the
delayed work is too late.

So I don't think there are any new races wrt concurrency there either,
even though we do not take any locks in the tty_flush_to_ldisc() case.
Because the timer wouldn't have taken any locks either..

Of course, if "tty_ldisc_halt()" (to remove any pending timer) and
"tty_ldisc_wait_idle()" (to make sure nothing else is executing right
then) is not sufficient, then there's a possible problem there if you hit
the timing just right, but again, that would be true _regardless_ of
Ogawa's changes as far as I can tell.

IOW, the whole argument really hinges on the fact that calling
flush_to_ldisc() manually (without any locking) is really not
fundamentally any different from the delayed work doing it from a timer.

And when we _do_ disable the timer, we also make that flush_to_ldisc()
function be a no-op, so the "tty_ldisc_halt()" effectively stops both the
timer and (conceptually) the manual call case too.

So now we have one remaining case, namely the case of the ldisc then being
re-initialized and TTY_LDISC is set again. But at that point, calling
flush_to_ldisc() had better be ok again, wouldn't you agree?

Gene Heskett

unread,
Jul 28, 2009, 10:20:44 PM7/28/09
to Alan Cox, OGAWA Hirofumi, Linus Torvalds, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Saturday 25 July 2009, Alan Cox wrote:
>Actually try this:
>
>
>commit b0e6bdde87725a5d46273ecc4bd00c54bd675848
>Author: Alan Cox <al...@linux.intel.com>
>Date: Sat Jul 25 15:00:04 2009 +0100
>
> pty: ensure writes hit the reader before close
>
> This is elegant in all the wrong ways. Put the pty into low latency mode
> (which we can do as we always post bytes from user context). The
> tty_flip_buffer_push then always calls into the ldisc which means we clear
> the ldisc buffer before we set the TTY_OTHER_CLOSED flag.
>
> Means pty has subtle knowledge of tty internals we really don't want it
> to, but it fixes the problem for the moment.
>
> Signed-off-by: Alan Cox <al...@linux.intel.com>
>
>diff --git a/drivers/char/pty.c b/drivers/char/pty.c
>index 6e6942c..87d729b 100644
>--- a/drivers/char/pty.c
>+++ b/drivers/char/pty.c
>@@ -47,10 +47,12 @@ static void pty_close(struct tty_struct *tty, struct
> file *filp) }
> wake_up_interruptible(&tty->read_wait);
> wake_up_interruptible(&tty->write_wait);
>+
> tty->packet = 0;
> if (!tty->link)

> return;
> tty->link->packet = 0;
>+ tty_flip_buffer_push(tty->link);
> set_bit(TTY_OTHER_CLOSED, &tty->link->flags);

> wake_up_interruptible(&tty->link->read_wait);
> wake_up_interruptible(&tty->link->write_wait);
>@@ -207,6 +209,7 @@ static int pty_open(struct tty_struct *tty, struct file

> *filp) clear_bit(TTY_OTHER_CLOSED, &tty->link->flags);
> set_bit(TTY_THROTTLED, &tty->flags);
> retval = 0;
>+ tty->low_latency = 1;
> out:
> return retval;
> }

Tracing this kdesu thread down per advice from Rafael W. This patch doesn't
seem to do anything about bz 13841. There are more patches in this thread,
I'll give them some testing too.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

"Microsoft is the epitome of innovation and product quality."

-- This testimonial paid for by Microsoft.

Gene Heskett

unread,
Jul 28, 2009, 10:51:15 PM7/28/09
to Linus Torvalds, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tuesday 28 July 2009, Linus Torvalds wrote:
>On Wed, 29 Jul 2009, Alan Cox wrote:
>> BTW: The tty->low_latency fix doesn't work, because the ->write method
>> can be called from an IRQ and that means we can't use ->low_latency=1 as
>> we take mutexes.
>
>Ok. So the end result is that Ogawa-san's fix is the right one. Then we
>can revert the low_latency=1 thing for pty's entirely. No?
>
>And this is also the one that _looks_ the sanest - ie we do it on the read
>side, where it matters, rather than on the write side or the flush side
>(where the proper flushing can _also_ fix the problem, but where it's much
>more problematic, and where it's a lot less direct about what we care
>about).
>
>This is just Ogawa's patch, redone against current -git, and with commit
>3a54297478e6578f96fd54bf4daa1751130aca86 reverted (Ogawa's patch already
>undid the non-low_latency part of it).
>
>Now, I wonder if some _other_ line discipline might want to do that same
>tty_flush_to_ldisc thing in their "do I have data" logic, but I didn't
>look any closer.
>
>So does this work for everyone? I haven't tested it yet myself, but this
>is the patch that "looks" right.
>
> Linus

It doesn't seem to do anything for bz 13841 here. Sorry, I wish it did.

Thanks Linus.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Small animal kamikaze attack on power supplies

Linus Torvalds

unread,
Jul 29, 2009, 12:50:39 AM7/29/09
to Gene Heskett, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Gene Heskett wrote:
>
> It doesn't seem to do anything for bz 13841 here. Sorry, I wish it did.

No, bz 13841 isn't about pty's, it's about usb serial. The patches in this
thread would mainly be pty-related, and would affect other tty drivers
(like your usb one) only purely by chance.

could you bisect the 13841 behavior?

Linus

Linus Torvalds

unread,
Jul 29, 2009, 12:54:38 AM7/29/09
to Gene Heskett, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Tue, 28 Jul 2009, Linus Torvalds wrote:
>
> could you bisect the 13841 behavior?

Oh - and have you checked current -git? There's commit c56d300086, which
may well have fixed some problems with openign a serial USB device.

Gene Heskett

unread,
Jul 29, 2009, 1:01:00 AM7/29/09
to Linus Torvalds, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Wednesday 29 July 2009, Linus Torvalds wrote:
>On Tue, 28 Jul 2009, Gene Heskett wrote:
>> It doesn't seem to do anything for bz 13841 here. Sorry, I wish it did.
>
>No, bz 13841 isn't about pty's, it's about usb serial. The patches in this
>thread would mainly be pty-related, and would affect other tty drivers
>(like your usb one) only purely by chance.
>
>could you bisect the 13841 behavior?
>
> Linus

I may have to bite the bullet and spend a couple days doing that. I don't
know as my f10 installed git is new enough to work with the recent changes in
it, I've rx'd several notices from Junio about new versions that fedora hasn't
offered as updates.

Link to a tutorial plz?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

One picture is worth 128K words.

Gene Heskett

unread,
Jul 29, 2009, 1:04:22 AM7/29/09
to Linus Torvalds, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Wednesday 29 July 2009, Linus Torvalds wrote:
>On Tue, 28 Jul 2009, Linus Torvalds wrote:
>> could you bisect the 13841 behavior?
>
>Oh - and have you checked current -git? There's commit c56d300086, which
>may well have fixed some problems with openign a serial USB device.
>
> Linus

No I haven't Linus, been running on your vanilla releases. I'd say I know
just enough about this to be dangerous (to myself) :)

Syntax to do that if my git is new enough?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

The average Ph.D thesis is nothing but the transference of bones from
one graveyard to another.
-- J. Frank Dobie, "A Texan in England"

Aneesh Kumar K.V

unread,
Jul 29, 2009, 3:02:14 AM7/29/09
to Linus Torvalds, Alan Cox, OGAWA Hirofumi, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Tue, Jul 28, 2009 at 05:26:16PM -0700, Linus Torvalds wrote:
>
>
> On Tue, 28 Jul 2009, Linus Torvalds wrote:
> >
> > So does this work for everyone? I haven't tested it yet myself, but this
> > is the patch that "looks" right.
>
> This thing (on top of current -git) seems to pass Ogawa's tests at least
> on my machine (both on SMP, and when the processes are limited to just a
> single CPU).
>
> Of course, since he wrote both the patch _and_ the tests, that doesn't
> come as a huge surprise - you'd expect Ogawa's patch to fix the testcase
> he has.
>
> I use neither kdesu nor GNU emacs, so people who see the problem should
> test this patch, and maybe we can put _this_ particular issue behind us.
>

The patch also fix the "compile in emacs" bug.
http://bugzilla.kernel.org/show_bug.cgi?id=13815

-aneesh

Gene Heskett

unread,
Jul 29, 2009, 3:46:47 AM7/29/09
to Andrew Morton, Linus Torvalds, Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML
On Wednesday 29 July 2009, Andrew Morton wrote:
>On Wed, 29 Jul 2009 01:00:45 -0400 Gene Heskett <gene.h...@verizon.net>
wrote:

>> Link to a tutorial plz?
>
>http://landley.net/writing/git-quick.html
>
>That used to be at http://www.kernel.org/doc/local/git-quick.html but some
>dope removed it.

The ./configure line fails, on page 2 of that document, so I moved the .config
files from 31-rc4 into that tree, and I'm using my own script to do the build
and install, fiddling with the version numbers to track, calling the first
build 2.6.31-rc3.5. The build seems to be going ok, but I had to pull in the
firmware/radeon tree from my own 31-rc4 in order to get the firmware my script
& the .config expects.

As my script bounces in and out of the src tree, it needs the src tree named
consistently. So I had to restart it a couple of times to fix all those
gotcha's. But now I am ready to reboot, starting with me some zz's. The
build didn't spit any more stuff than usual. 5 section miss-matches I've been
looking at for over a year, and some recent adds of rnonce.data stuff that are
fussing about not being initialized. I'll test for good/bad in the morning.


--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

The new Congressmen say they're going to turn the government around. I
hope I don't get run over again.

Alan Cox

unread,
Jul 29, 2009, 4:59:34 AM7/29/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> > - The driver ensures that it will not call
> > tty_flip_buffer_push/flush_to_ldisc again for this port until re-opened
>
> That's just bogus.

I didn't invent it, thats how it works and its not an area I've touched.

> If that is wrong as per hangup code, then the bug is in the hangup
> handling, not in the tty_flush_to_ldisc().

I wouldn't argue with that - I merely pointed out they need synchronizing


>
> > - The driver calls tty_hangup
> > - tty_hangup ensures that tty_flip_buffer_push cannot occur again
> > (by killing the workqueue)
> > - resources may well then get freed before close()
>
> They had better not be, since all the data structures touched are inside
> the 'tty_struct' (which we're dereferencing in other ways anyway in that
> whole routine).

You are only looking at pty. That code is used for all the real physical
tty devices too. With real devices the underlying physical device and its
structures can get dumped. When you run the n_tty ldisc you call back out
to the drivers for echo etc.

> So the only thing that the hangup code needs to do is to make that the
> "tty->buf.work.work" function pointer is a nop. And it does, as far as I
> can tell.

What happens if the hangup occurs just after you start running the ldisc
on another CPU ?

> So regardless, by now we have moved from "trivial bug that bites people in
> real life" to "theoretical bug that looks impossible to trigger".

Actually all the hangup races turn up for people. Not often but now and
then. Also because you have vhangup() you can cause them in software by
leaving one app spinning in a loop hanging up and opening stuff while you
try and make it break.

> Well, put this way: the only thing that actually stops the outstanding
> timer (for the delayed work) is the tty_ldisc_halt() call in
> tty_ldisc_hangup(). If that _isn't_ called, then your argument is
> pointless, since the tty_flush_to_ldisc() will be done by a timer later
> (and Ogawa's patch thus clearly introduces nothing new).
>
> And when it _is_ called, it also clears TTY_LDISC, so now tty_ldisc_ref()
> will return NULL, so then flush_to_ldisc() will be a no-op.

IFF the hangup doesn't occur while you are entering flush_to_ldisc()

Consider a real tty for a bit

CPU0 CPU1
n_tty methods
flush_to_ldisc
get ldisc ref
INTERRUPT
tty_hangup
do_tty_hangup
ldisc work tty_ldisc_hangup
RESET_TERMIOS is false
tty->ops->hangup()
[usb]serial_hangup()
[usb]serial_do_down()
close physical
driver

tty->ops->write
[usb]serial_write
WARN()


So as I said before you need to fix flush_to_ldisc and the hangup running
against one another. At the very least I think you need a
tty_ldisc_wait_idle(tty); just before

if (tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) {

so that you stall the hangup until n_tty exits the ldisc.


(The other case is calling ld->ops->hangup then calling ld->ops->write
which no ldisc seems to care about)

Alan Cox

unread,
Jul 29, 2009, 7:07:24 AM7/29/09
to Linus Torvalds, Gene Heskett, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton, Alan Stern
On Tue, 28 Jul 2009 21:49:05 -0700 (PDT)
Linus Torvalds <torv...@linux-foundation.org> wrote:

>
>
> On Tue, 28 Jul 2009, Gene Heskett wrote:
> >
> > It doesn't seem to do anything for bz 13841 here. Sorry, I wish it did.
>
> No, bz 13841 isn't about pty's, it's about usb serial. The patches in this
> thread would mainly be pty-related, and would affect other tty drivers
> (like your usb one) only purely by chance.
>
> could you bisect the 13841 behavior?

See the thread in 13821 on linux-usb list which extends from the "why do
we leak ttyUSB numbers" into this as well. Its been pinned down and the in
tree fix is confirmed to do the job for that case. Not perfectly because
it needs a further small fix as

a) it should call drv->close() within the mutex on the error path if the
tty_block_til_ready() fails without which you get a leak on a few drivers
but nothing too harmful. (Noted by Alan Stern inspecting the patch)

b) it fails the 'variable frequency generator on carrier pin' test - but
that isn't a regression as such as all 2.6 kernels I've tried crash
during that test with USB. (Running my 'extreme violence' test setup)

I suspect that doing

retval = tty_port_block_til_ready(&port->port, tty, filp);
if (retval == 0) {
if (!first)
usb_serial_put(serial);
return 0;
}
mutex_lock(&port->mutex);


if (tty_hung_up_p(filp)) {
mutex_unlock(&port->mutex);
return retval;
}
dev->close(port);

...

fixes both. The race basically is

open
drop mutex so we can wait for the port
wait for the port
[hangup]
[serial_do_down]
block_til_ready reports an error
take mutex
duplicate serial_do_down


Most serial drivers don't try and do open clean up in open() instead they
do it in close() which is called by the tty layer on an open failure
(always has been) and which turns out to be a useful design
decision as it means the driver doesn't have to tie itself in knots and
tty_port_close_start() handles all the mechanics. Plus they use
ASYNC_INITIALIZED as flag to avoid double shutdowns on a device.

Simply deleting most of the clean up from serial_open and letting close()
do its job would I think clean that up no end but not in an -rc perhaps.

Alan Cox

unread,
Jul 29, 2009, 7:17:49 AM7/29/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> IOW, the whole argument really hinges on the fact that calling
> flush_to_ldisc() manually (without any locking) is really not
> fundamentally any different from the delayed work doing it from a timer.

Which means the hangup path should be doing a cancel_delayed_work() in
the case where its not resetting the termios as well and since it isn't
you can't actually (fingers crossed) make the problem worse, its merely as
broken as before.

Ok

> And when we _do_ disable the timer, we also make that flush_to_ldisc()
> function be a no-op, so the "tty_ldisc_halt()" effectively stops both the
> timer and (conceptually) the manual call case too.
>
> So now we have one remaining case, namely the case of the ldisc then being
> re-initialized and TTY_LDISC is set again. But at that point, calling
> flush_to_ldisc() had better be ok again, wouldn't you agree?

The ldisc re-init via SETD should be fine as it locks versus hangup
anyway and always has to.

Linus Torvalds

unread,
Jul 29, 2009, 11:50:01 AM7/29/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Wed, 29 Jul 2009, Alan Cox wrote:
> >

> > > - The driver calls tty_hangup
> > > - tty_hangup ensures that tty_flip_buffer_push cannot occur again
> > > (by killing the workqueue)
> > > - resources may well then get freed before close()
> >
> > They had better not be, since all the data structures touched are inside
> > the 'tty_struct' (which we're dereferencing in other ways anyway in that
> > whole routine).
>
> You are only looking at pty. That code is used for all the real physical
> tty devices too. With real devices the underlying physical device and its
> structures can get dumped. When you run the n_tty ldisc you call back out
> to the drivers for echo etc.

I actually only meant the "flush_to_ldisc()" part. We'll never get any
further than that due to the reasons I outlined.

> > So the only thing that the hangup code needs to do is to make that the
> > "tty->buf.work.work" function pointer is a nop. And it does, as far as I
> > can tell.
>
> What happens if the hangup occurs just after you start running the ldisc
> on another CPU ?

But Alan, that was my point: Ogawa's patch in no way changes any existing
behavior. The case you mention ALREADY HAPPENS, regardless of Ogawa's
patch.

If a timer goes off at the same time hangup happens, you have the exact
same situation. You can have one CPU doing hangup processing, and one CPU
having just triggered the timeout and doing flush_to_ldisc.

> Consider a real tty for a bit
>
> CPU0 CPU1
> n_tty methods
> flush_to_ldisc
> get ldisc ref
> INTERRUPT
> tty_hangup
> do_tty_hangup

Yes. Consider exactly that. And notice how it happens with or without
Ogawa's patch. Just instead of "n_tty methods", you have "delayed-work
timer".

> So as I said before you need to fix flush_to_ldisc and the hangup running
> against one another. At the very least I think you need a
> tty_ldisc_wait_idle(tty); just before
>
> if (tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) {
>
> so that you stall the hangup until n_tty exits the ldisc.

The tty_ldisc_hangup() already does tty_ldisc_wait_idle() after disabling
the timer (and clearing the TTY_LDISC bit), so that all seems fine
already. No?

Linus

Alan Cox

unread,
Jul 29, 2009, 11:55:00 AM7/29/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> The tty_ldisc_hangup() already does tty_ldisc_wait_idle() after disabling
> the timer (and clearing the TTY_LDISC bit), so that all seems fine
> already. No?

We only do tty_ldisc_wait_idle if tty->driver->flags & TTY_RESET_TERMIOS
is set that I can see ?

Linus Torvalds

unread,
Jul 29, 2009, 12:06:33 PM7/29/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Wed, 29 Jul 2009, Alan Cox wrote:

> > The tty_ldisc_hangup() already does tty_ldisc_wait_idle() after disabling
> > the timer (and clearing the TTY_LDISC bit), so that all seems fine
> > already. No?
>
> We only do tty_ldisc_wait_idle if tty->driver->flags & TTY_RESET_TERMIOS
> is set that I can see ?

I agree, but that's also the only time we do that 'tty_ldisc_halt()' that
cancels the timer. So if there is something that depends on the
flush_to_ldisc() not happening, the bug was pre-existing since it never
got rid of the flush to begin with, so the flush_to_ldisc() would happen
at some random later time as a result of the delayed-work timer.

That said, I suspect that from a sanity standpoint, I suspect something
like the appended migth be a good idea. But I think it's an independent
issue (and unless somebody can actually point to an actual problem, I'd
only apply something like this during the merge window, not after -rc4).

Linus
---
drivers/char/tty_ldisc.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tty_ldisc.c b/drivers/char/tty_ldisc.c
index acd76b7..640b100 100644
--- a/drivers/char/tty_ldisc.c
+++ b/drivers/char/tty_ldisc.c
@@ -778,6 +778,8 @@ void tty_ldisc_hangup(struct tty_struct *tty)
if (ld->ops->hangup)
ld->ops->hangup(tty);
tty_ldisc_deref(ld);
+ tty_ldisc_halt(tty);
+ tty_ldisc_wait_idle(tty);
}
/*
* FIXME: Once we trust the LDISC code better we can wait here for
@@ -794,8 +796,6 @@ void tty_ldisc_hangup(struct tty_struct *tty)
mutex_lock(&tty->ldisc_mutex);
if (tty->ldisc) { /* Not yet closed */
/* Switch back to N_TTY */
- tty_ldisc_halt(tty);
- tty_ldisc_wait_idle(tty);
tty_ldisc_reinit(tty);
/* At this point we have a closed ldisc and we want to
reopen it. We could defer this to the next open but

Alan Cox

unread,
Jul 29, 2009, 12:39:47 PM7/29/09
to Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> tty_ldisc_deref(ld);
> + tty_ldisc_halt(tty);
> + tty_ldisc_wait_idle(tty);

Needs to be


mutex_lock(&tty->ldisc_mutex);
if (tty->ldisc)

....
}
mutex_unlock(&tty->ldisc_mutex)

We can't wait for the reference to go away if we hold one, and if we
don't hold one the ldisc reference might go away during the hangup if we
close()

Odds of hitting it minimal however. Or maybe we need a smarter
tty_ldisc_wait_self_idle(ld) ?

Gene Heskett

unread,
Jul 29, 2009, 1:41:07 PM7/29/09
to Alan Cox, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton, Alan Stern

I have now done a bisect, but I'm not convinced its correct. Since the
/configure line didn't work in that tutorial, I fell back to using my own
"makeit" script which largely automates the build and installation of a
kernel, all I have to do is make sure its internal $VER matches what is in the
Makefile.

And there by hangs the tail of some mistakes. As I ran the bisect, the
Makefile regressed all the way to 2.6.30 and screwed up my make a few times
because the Makefile version was not stable from bisect run to bisect run.
The final, no choices left reboot was bad cuz it Ooops when that script was
run the last 3 times, but there were no choices left to try.
Here is that 'git bisect log >../bisect-final.log'

git bisect start
# good: [6847e154e3cd74fca6084124c097980a7634285a] Linux 2.6.31-rc3
git bisect good 6847e154e3cd74fca6084124c097980a7634285a
# bad: [4be3bd7849165e7efa6b0b35a23d6a3598d97465] Linux 2.6.31-rc4
git bisect bad 4be3bd7849165e7efa6b0b35a23d6a3598d97465
# bad: [ae42b9e1ca8969d52e51f5e461b2e89e180943dd] Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin
git bisect bad ae42b9e1ca8969d52e51f5e461b2e89e180943dd
# bad: [301d95c4dade09388f94258ee797d2d650dc00b5] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
git bisect bad 301d95c4dade09388f94258ee797d2d650dc00b5
# bad: [301d95c4dade09388f94258ee797d2d650dc00b5] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
git bisect bad 301d95c4dade09388f94258ee797d2d650dc00b5
# good: [e9e961c9a818a2f24711af493b907a8e40a69efc] Merge branch 'i2c-for-2631-
rc3' of git://aeryn.fluff.org.uk/bjdooks/linux
git bisect good e9e961c9a818a2f24711af493b907a8e40a69efc
# good: [63f7a330014ad29b662638caabd8e96fe945b9ed] Merge branch 'timers-fixes-
for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 63f7a330014ad29b662638caabd8e96fe945b9ed
# bad: [b983d0deb0e28f8880cdea79def575d96a27e603] Merge branch 'tracing-fixes-
for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad b983d0deb0e28f8880cdea79def575d96a27e603
# bad: [c20b08e3986c2dbfa6df1e880bf4f7159994d199] sched: Fix rt_rq-
>pushable_tasks initialization in init_rt_rq()
git bisect bad c20b08e3986c2dbfa6df1e880bf4f7159994d199
# bad: [a1ba4d8ba9f06a397e97cbd67a93ee306860b40a] sched_rt: Fix overload bug
on rt group scheduling
git bisect bad a1ba4d8ba9f06a397e97cbd67a93ee306860b40a

What is needed for a mistake free bisect is a 'doesn't matter as long as it
matches' Makefile version that survives all the way through a bisect run.
When I was done, I had to go back to my own 2.6.30 and 2.6.31-rc3 trees and do
a full rebuild and reinstall in order to have a clean boot newer that 2.6.29.

Because of these errors, and the fact that the final 3 2.6.30's it built were
bad, I am not convinced I have a valid bisect, particularly since the final 3
builds were called 2.6.30, and they were all=bad because of an Oopsen at the
scripts invocation from rc.local, not the i/o error that makes it loop forever
using 98+% of all 4 cores in my phenom and thrashing the disk a bit.

If this version thing can be handled in a cleaner manner, I'll be glad to
repeat the bisect from scratch, it turns out not to be as complex as I feared.

As it stands, its a pita because its destroying perfect good fallback boot
setups. I should be able to set the extraversion to -rc3.5 in both the
Makefile and in my makeit script, and run the whole bisect without molesting
the rest of the system, creating one set of bootfiles that are simply
overwritten with the next bisect run.

Thanks for listening to the rant. What do I do next?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Every program has (at least) two purposes:
the one for which it was written and another for which it wasn't.

Frans Pop

unread,
Jul 29, 2009, 2:28:43 PM7/29/09
to Gene Heskett, al...@lxorguk.ukuu.org.uk, torv...@linux-foundation.org, hiro...@mail.parknet.co.jp, aneesh...@linux.vnet.ibm.com, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, st...@rowland.harvard.edu
> What is needed for a mistake free bisect is a 'doesn't matter as long
> as it matches' Makefile version that survives all the way through a
> bisect run.

I have a wrapper script I use for kernel builds that takes care of that
(it also supports cross building and building some out-of-tree modules).
Some snippets from that script below.

BISECTING=
if [ -e .git/BISECT_LOG ]; then
BISECTING=1
fi
[...]
if [ "$BISECTING" ]; then
# The version in the next line may need updating before a bisect
sed -i "s/^SUBLEVEL = .*/SUBLEVEL = 31/" Makefile
sed -i "s/^EXTRAVERSION =.*/EXTRAVERSION = -bisect/" Makefile
fi
[...]
make ...
[...]
if [ "$BISECTING" ]; then
# Revert Makefile to avoid errors on 'git bisect good/bad'
git checkout Makefile
fi

I use the deb-pkg target and also set the .deb package version in the
second hunk:
KERNELDEBREVISION=$(grep "^git[- ]bisect" .git/BISECT_LOG | wc -l)

This way I end up with a nice series of packages whose numbering matches
the steps in .git/BISECT_LOG:
linux-image-2.6.31-bisect_1_amd64.deb
linux-image-2.6.31-bisect_2_amd64.deb
linux-image-2.6.31-bisect_3_amd64.deb
..

Hope that help.

Cheers,
FJP

Gene Heskett

unread,
Jul 29, 2009, 2:43:36 PM7/29/09
to Frans Pop, al...@lxorguk.ukuu.org.uk, torv...@linux-foundation.org, hiro...@mail.parknet.co.jp, aneesh...@linux.vnet.ibm.com, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, st...@rowland.harvard.edu
On Wednesday 29 July 2009, Frans Pop wrote:
>> What is needed for a mistake free bisect is a 'doesn't matter as long
>> as it matches' Makefile version that survives all the way through a
>> bisect run.
>
>I have a wrapper script I use for kernel builds that takes care of that
>(it also supports cross building and building some out-of-tree modules).
>Some snippets from that script below.
>
>BISECTING=
>if [ -e .git/BISECT_LOG ]; then
> BISECTING=1
>fi
>[...]
>if [ "$BISECTING" ]; then
> # The version in the next line may need updating before a bisect
> sed -i "s/^SUBLEVEL = .*/SUBLEVEL = 31/" Makefile
> sed -i "s/^EXTRAVERSION =.*/EXTRAVERSION = -bisect/" Makefile
>fi
>[...]
>make ...
>[...]
>if [ "$BISECTING" ]; then
> # Revert Makefile to avoid errors on 'git bisect good/bad'
> git checkout Makefile
Ahh, I see that now, which I was objecting to below. I'll go quietly. :)

>fi
>
>I use the deb-pkg target and also set the .deb package version in the
>second hunk:
>KERNELDEBREVISION=$(grep "^git[- ]bisect" .git/BISECT_LOG | wc -l)
>
>This way I end up with a nice series of packages whose numbering matches
>the steps in .git/BISECT_LOG:
>linux-image-2.6.31-bisect_1_amd64.deb
>linux-image-2.6.31-bisect_2_amd64.deb
>linux-image-2.6.31-bisect_3_amd64.deb
>...

>
>Hope that help.
>
>Cheers,
>FJP

Yes, some of it will. But thanks to fedora's broken disk partitioner,
something I've been screaming about for a damned decade, my /boot partition
isn't big enough to absorb a whole chain of those, hence the fixed version
request.

This script would appear to need a restore function for the Makefile version
because one of my 'git bisect bad's returned that it couldn't switch branches
because of the handmade Makefile changes I'd done on the first build. How
have you been handling that?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

When you don't know what to do, walk fast and look worried.

Linus Torvalds

unread,
Jul 29, 2009, 3:08:19 PM7/29/09
to Alan Cox, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton

On Wed, 29 Jul 2009, Alan Cox wrote:
>

> Odds of hitting it minimal however. Or maybe we need a smarter

> tty_ldisc_wait_idle(ld) ?

Just adding the ldisc_mutex around the call sounds like the simplest
solution.

That said, looking at the callers of tty_ldisc_wait_idle(), it looks like
we have other similar problems already in tty_ldisc_release(), which also
calls it without holding the lock, both for the "self" case and then
recursively for o_tty.

Moving the mutex_lock() up a bit in tty_ldisc_release() looks like the
trivial solution, although I suspect there are any deadlock issues there
(ie refs that won't go away because we hold the lock and the thing we are
waiting for needs the lock to release the ldisc!).

So making tty_ldisc_wait_idle() more careful adn work without the lock
would definitely be the safer thing to do. Requiring the lock looks
potentially pretty dangerous.

Linus

Gene Heskett

unread,
Jul 29, 2009, 3:10:03 PM7/29/09
to Aneesh Kumar K.V, Ray Lee, Rafael J. Wysocki, LKML, Andrew Morton, torv...@linux-foundation.org, al...@lxorguk.ukuu.org.uk
On Friday 24 July 2009, Aneesh Kumar K.V wrote:
>On Thu, Jul 23, 2009 at 05:21:36PM -0700, Ray Lee wrote:
>> On Thu, Jul 23, 2009 at 4:45 PM, Rafael J. Wysocki<r...@sisk.pl> wrote:
>> > A recent kernel change broke kdesu (from KDE 4.2) on my test boxes.
>> > ISTR a discussion about that, but I can't find it right now. Any
>> > clues?
>>
>> See the thread starting here: ("possible regression with pty.c commit")
>> http://lkml.org/lkml/2009/7/11/125
>
>I am also facing a similar problem.
>
>http://bugzilla.kernel.org/show_bug.cgi?id=13815
>
>-aneesh

I just did a git bisect reset master;git pull and built a new 2.6.31-rc4. The
problem with my bash script has now been resolved. Many thanks.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Nothing cures insomnia like the realization that it's time to get up.

Gene Heskett

unread,
Jul 29, 2009, 3:19:43 PM7/29/09
to Frans Pop, al...@lxorguk.ukuu.org.uk, torv...@linux-foundation.org, hiro...@mail.parknet.co.jp, aneesh...@linux.vnet.ibm.com, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, st...@rowland.harvard.edu
On Wednesday 29 July 2009, Frans Pop wrote:
>On Wednesday 29 July 2009, Gene Heskett wrote:
>> Yes, some of it will. But thanks to fedora's broken disk partitioner,
>> something I've been screaming about for a damned decade, my /boot
>> partition isn't big enough to absorb a whole chain of those, hence the
>> fixed version request.
>
>My /boot isn't big enough for a full series either, but my $HOME is :-)
>
>The fact that the packages I create all have the same package name and
>only the package version differs, means that I do get the complete series
>but only have one version installed in /boot at any time (as installing a
>new version overwrites the version from the previous bisect iteration).
>
>No idea if you can do the same on an RPM-based system easily.

I do not normally use rpms to do that. My makeit script handles all of that,
keeping one old version of everything in /boot and /lib/modules that I can
revert by hand if the new kernel is a showstopper.

Whenever I let rpm get access to my grub.conf, I always have to go back in
with vim & fix the stanza numbers I keep track of the boots with. Grub
doesn't really care, but I do. :) rpm should be so neat...

Anybody that wants this script (actually there are 2 of them), I can attach
them, makeit and buildit26 (which unpacks and applies the patches plus running
a make oldconfig) are about 100 lines each.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

I just got out of the hospital after a speed reading accident.
I hit a bookmark.
-- Steven Wright

Valdis.K...@vt.edu

unread,
Jul 30, 2009, 10:19:25 AM7/30/09
to Gene Heskett, Frans Pop, al...@lxorguk.ukuu.org.uk, torv...@linux-foundation.org, hiro...@mail.parknet.co.jp, aneesh...@linux.vnet.ibm.com, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, st...@rowland.harvard.edu
On Wed, 29 Jul 2009 15:19:19 EDT, Gene Heskett said:

> Whenever I let rpm get access to my grub.conf, I always have to go back in
> with vim & fix the stanza numbers I keep track of the boots with. Grub
> doesn't really care, but I do. :) rpm should be so neat...

Just to be accurate - this isn't rpm's fault, it's a program called 'grubby'.

I understand the reasoning - if you just installed a new kernel, you probably
want it to be the one started at next boot. But it would be nice if grubby
had a configure option for that in /etc/sysconfig/grubby or someplace...

Gene Heskett

unread,
Jul 30, 2009, 11:35:29 AM7/30/09
to Valdis.K...@vt.edu, Frans Pop, al...@lxorguk.ukuu.org.uk, torv...@linux-foundation.org, hiro...@mail.parknet.co.jp, aneesh...@linux.vnet.ibm.com, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, st...@rowland.harvard.edu

Humm, kewl. And the src's for grubby are where? Hopefully in C so thois old
fart can grok...

Thanks.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

The best laid plans of mice and men are held up in the legal department.

Valdis.K...@vt.edu

unread,
Jul 30, 2009, 2:40:30 PM7/30/09
to Gene Heskett, Frans Pop, al...@lxorguk.ukuu.org.uk, torv...@linux-foundation.org, hiro...@mail.parknet.co.jp, aneesh...@linux.vnet.ibm.com, r...@sisk.pl, ray...@madrabbit.org, linux-...@vger.kernel.org, ak...@linux-foundation.org, st...@rowland.harvard.edu
On Thu, 30 Jul 2009 11:35:03 EDT, Gene Heskett said:
> Humm, kewl. And the src's for grubby are where? Hopefully in C so thois old
> fart can grok...

% rpm -qi grubby

should provide pointers for the .src.rpm and the upstream source...

(That trick should work for *every* RPM - if you find one it doesn't work
for, that's probably a reportable bug to RedHat.)

Alan Cox

unread,
Jul 30, 2009, 7:16:30 PM7/30/09
to Alan Cox, Theodore Tso, Greg KH, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
> I'll fish them out tomorrow. What is in the -next tree is only some of
> the bits.
>
> We still have races on the USB side too btw I put a frequency generator
> on the carrier line of my pl2303 with an app continually opening the
> port and it oopses after about 10 minutes. On the bright side so does
> 2.6.29, 2.6.27 ....

Just an update - I've not posted them yet as I've been putting the
various branches into some kind of sane order with the tested stuff first
and the "in progress" serial -> tty_port stuff last.

Greg KH

unread,
Jul 31, 2009, 10:21:39 AM7/31/09
to Alan Cox, Theodore Tso, Linus Torvalds, OGAWA Hirofumi, Aneesh Kumar K.V, Rafael J. Wysocki, Ray Lee, LKML, Andrew Morton
On Fri, Jul 31, 2009 at 02:49:01PM +0100, Alan Cox wrote:
> tar ball at
>
> http://zeniv.linux.org.uk/~alan/ttydev.tar.gz
>
> Series is in order of degree of working. It should all be sane down to
> vt-lockactive. The stuff down to tty-usb-serial-termiosbits has had some
> basic testing.
>
> tty-usb-cleanup-open hangs still on the second open and is a work in
> progress but shows where the tty close side interface was going in order
> to remove all the posix logic from drivers.
>
> beyond that is the last bits of the serial cleanup which are definitely
> "work in progress", and the f81216 helper which needs reworking into the
> drivers proper.

Thanks for providing this. I'll pull these into my tree and then tell
Stephen about it for linux-next.

Thanks again for doing the tty work, it is much appreciated.

greg k-h

0 new messages