Read() on serial port produces "Interrupted system call"

Guillaume Dargaud

Aug 19, 2009, 11:53:38 AM

Hello all,
I'm at a loss here.
I control 7 different custom serial devices from an embedded system.
One of them is causing me hell.
Often when I read from it with read(), I get an error "Interrupted system
call".
But it seems to work fine from a minicom or microcom terminal.

Here's the config:

// open() returns -1 on failure; 0 is a valid descriptor, so test for < 0
if ((Serial->fd = open(Serial->DevicePath,
                       O_RDWR | O_NOCTTY | O_NDELAY)) < 0) {
    SimpleLog_Write(SL_ERROR, __func__,
                    "Opening device %s failed: r=%d, errno=%d, %s",
                    Serial->DevicePath, Serial->fd, errno, strerror(errno));
    return Serial->fd;
}

fcntl(Serial->fd, F_SETFL, 0 /* FNDELAY */ );

struct termios options;
tcgetattr(Serial->fd, &options);

options.c_lflag &= ~ICANON;
options.c_lflag &= ~(ECHO | ECHOCTL | ECHONL);
options.c_cflag |= HUPCL;

options.c_oflag |= ONLCR;
options.c_iflag &= ~ICRNL;

options.c_cc[VMIN] = 0;
options.c_cc[VTIME] = 1; // read timeout, in units of 100 ms

options.c_cflag |= CRTSCTS;
options.c_iflag &= ~(IXON | IXOFF | IXANY);

options.c_cflag &= ~PARENB;
options.c_cflag &= ~CSTOPB;
options.c_cflag &= ~CSIZE;
options.c_cflag |= CS8;

cfsetospeed(&options, B9600);
cfsetispeed(&options, B9600);

tcsetattr(Serial->fd, TCSANOW, &options);

And here's the read sequence. I have to concatenate the replies, otherwise
they arrive in pieces of 3 to 15 characters (if anybody can tell me why, I'll
be happy too). Writes don't seem to have any problems:

do {
    Str[0] = '\0';
    errno = 0;
    n = read(Serial->fd, Str, 256);
    if (n > 0) { Str[n] = '\0'; strcat(Buf, Str); } // Append lines together
    if (n < 0)
        SimpleLog_Write(SL_WARNING, __func__,
                        "port=%d, reply=\"%s\": failed r=%d, errno=%d, %s",
                        Port, Str, n, errno, strerror(errno));
    else
        SimpleLog_Write(SL_DEBUG, __func__, "port=%d, reply=\"%s\"", Port, Str);
} while (n > 0); // Then try to read another line if available

--
Guillaume Dargaud
http://www.gdargaud.net/

Scott Lurndal

Aug 19, 2009, 1:47:49 PM

"Guillaume Dargaud" <use_the_form_on...@www.gdargaud.net> writes:
>Hello all,
>I'm at a loss here.
>I control 7 different custom serial devices from an embedded system.
>One of them is causing me hell.
>Often when I read from it with read(), I get an error "Interrupted system
>call".

EINTR is returned when a read (before transferring any characters) is interrupted
by signal delivery. You should simply retry the read if errno == EINTR. Note
that the only defined error return value from read is '-1'; '< 0' is not the
correct test.

while (1) {
    diag = read(...);
    if (diag == -1) {
        if (errno == EINTR) {
            continue;
        } else {
            log error;
            break;
        }
    } else {
        copy 'diag' bytes to buffer
        if all required bytes read, break;
    }
    // continue to read remaining bytes.
}
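
The loop above can be fleshed out roughly as follows. This is only a sketch: the name read_fully and its return convention are invented for illustration and appear nowhere in the original code. It also stops when read() returns 0, which on a tty configured with VMIN=0 and a VTIME timeout means "nothing arrived in time" rather than end of file.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `want` bytes from fd into buf, retrying
 * whenever read() is interrupted by a signal before transferring data.
 * Returns the number of bytes stored (less than `want` if read() returned 0),
 * or -1 on any error other than EINTR, with errno as set by read(). */
ssize_t read_fully(int fd, char *buf, size_t want)
{
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);

        if (n == -1) {
            if (errno == EINTR)
                continue;      /* interrupted: simply retry */
            return -1;         /* real error: let the caller log errno */
        }
        if (n == 0)
            break;             /* EOF, or a VTIME timeout with no data */
        got += (size_t)n;      /* partial read: keep going for the rest */
    }
    return (ssize_t)got;
}
```

A caller would typically NUL-terminate the buffer itself, using the returned count, before treating it as a string.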

scott

Rainer Weikusat

Aug 19, 2009, 2:07:28 PM

A nice alternative, IMO, that I have come to use regularly is to enclose
the read itself in an inner loop:

do
    diag = read(...);
while (diag == -1 && errno == EINTR);
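
Wrapped in a function, the inner loop might look like the sketch below. The name read_retry is made up here; the body is just the do/while from above with the arguments filled in.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: behaves like read(), except that a call which
 * fails with EINTR is reissued instead of being reported to the caller. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t diag;

    do
        diag = read(fd, buf, count);
    while (diag == -1 && errno == EINTR);

    return diag;   /* -1 here always means an error other than EINTR */
}
```

The outer loop that collects the reply can then call read_retry() wherever it called read() and never sees EINTR at all.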

Marcel Bruinsma

Aug 19, 2009, 2:41:33 PM

Rainer Weikusat wrote:

> do
> diag = read(...);
> while (diag == -1 && errno == EINTR);

In addition, the OP might want to test against EAGAIN too.

IEEE Std 1003.1, section 11.1.7 :
« POSIX.1-2008 does not specify whether the setting of
O_NONBLOCK takes precedence over MIN or TIME settings.
Therefore, if O_NONBLOCK is set, read() may return
immediately, regardless of the setting of MIN or TIME.
Also, if no data is available, read() may either return 0,
or return -1 with errno set to [EAGAIN]. »

--
printf -v email $(echo \ 155 141 162 143 145 154 142 162 165 151 \
156 163 155 141 100 171 141 150 157 157 056 143 157 155|tr \ \\\\)
# Live every life as if it were your last! #

Rainer Weikusat

Aug 19, 2009, 3:16:02 PM

Marcel Bruinsma <we-love-...@gmail.com> writes:
> Rainer Weikusat wrote:
>
>> do
>> diag = read(...);
>> while (diag == -1 && errno == EINTR);
>
> In addition, the OP might want to test against EAGAIN too.

But probably not in a loop like this, since this would result in
executing repeated reads as fast as the computer can handle them until
data to read is actually available.
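
One way to avoid spinning, sketched under the assumption that poll() is available on the target, is to sleep in poll() until the descriptor is readable or a timeout expires, and only then call read(). The function name and the choice to return 0 on timeout are illustrative, not anything from the original code.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read once. Returns the byte count from read(), 0 if the timeout
 * expired with nothing to read (read() itself may also return 0), or -1
 * on error with errno set. */
ssize_t read_with_timeout(int fd, void *buf, size_t count, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        int ready = poll(&pfd, 1, timeout_ms);

        if (ready == -1) {
            if (errno == EINTR)
                continue;      /* simplification: restarts the full timeout */
            return -1;
        }
        if (ready == 0)
            return 0;          /* timed out: nothing was sent */

        ssize_t n = read(fd, buf, count);
        if (n == -1 && (errno == EINTR || errno == EAGAIN))
            continue;          /* go back to sleeping in poll(), no busy loop */
        return n;
    }
}
```

Because the process sleeps inside poll(), an EAGAIN from read() costs one more wait rather than a tight loop of failed reads.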

David Schwartz

Aug 19, 2009, 8:29:05 PM

On Aug 19, 10:47 am, sc...@slp53.sl.home (Scott Lurndal) wrote:

> Note
> that the only defined error return value from read is '-1'; '< 0' is not the
> correct test.

==-1 and <0 are both equally correct tests. The only difference is
their behavior in cases where one's behavior does not matter. A test
is "not correct" if, and only if, it can result in incorrect behavior
in a case where there exists correct behavior. Since correct behavior
does not exist for a return value less than -1, behavior in this case
cannot be correct with any code.

Testing for <0 may be faster on some platforms, testing just the sign
bit. If you know this test is or may be faster, and you also know that
<-1 cannot occur (which the compiler cannot know, so this is not an
optimization the compiler can do), then <0 is a superior test. (Though
==-1 is, of course, also correct)

DS

Guillaume Dargaud

Aug 20, 2009, 6:38:52 AM

> But probably not in a loop like this, since this would result in
> executing repeated reads as fast as the computer can handle them until
> data to read is actually available.

So how does one distinguish between:
- the device is not sending anything (in which case I would like to exit
with an empty string)
- the read() keeps returning with EINTR and no data (does this mean that
chars are incoming but not yet ready for transfer to the calling routine)?
Is it necessary to insert a timeout then?

Rainer Weikusat

Aug 20, 2009, 7:05:23 AM

"Guillaume Dargaud" <use_the_form_on...@www.gdargaud.net>
writes:

>> But probably not in a loop like this, since this would result in
>> executing repeated reads as fast as the computer can handle them until
>> data to read is actually available.
>
> So how does one distinguish between:
> - the device is not sending anything (in which case I would like to exit
> with an empty string)

I assume this is supposed to mean 'no data sent by a device is
presently buffered and awaiting a read'. The answer to this is to
configure the corresponding logical file to operate in non-blocking
mode and do something else whenever a read has failed with errno ==
EAGAIN.
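
A minimal sketch of that approach follows. Both function names are invented for illustration. Since a tty in non-blocking mode may report "no data" either as a return of 0 or as -1 with EAGAIN, the sketch treats the two alike.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the descriptor's status flags,
 * keeping whatever other flags are already set. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: one non-blocking read attempt.
 * Returns 1 with *len set if data was read, 0 if nothing is buffered
 * right now (the caller goes and does something else), -1 on error. */
int try_read(int fd, char *buf, size_t size, size_t *len)
{
    ssize_t n;

    do
        n = read(fd, buf, size);
    while (n == -1 && errno == EINTR);

    *len = 0;
    if (n > 0) {
        *len = (size_t)n;
        return 1;
    }
    if (n == 0 || errno == EAGAIN)
        return 0;              /* no data presently awaiting a read */
    return -1;
}
```

Note that on a pipe or regular file a return of 0 means end of file, so this "0 means try later" convention only suits a terminal device like the one discussed here.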

> - the read() keeps returning with EINTR and no data (does this mean that
> chars are incoming but not yet ready for transfer to the calling
> routine) ?

This means the process exited from the kernel because a signal was
pending and needed to be handled (user defined signal handler). Or
that's at least what it is supposed to mean (eg Linux-poll fails with
EINTR if job-control was used to suspend and continue the
process). Unless you intended to abort a blocking read-call, there is
really nothing more to do than to call read again or use sigaction
with SA_RESTART for this to be done automatically. Should your
application receive signals at a rate fast enough to prevent it from
doing anything except handling signals, a different problem would
exist.
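
For the sigaction route, installing a handler could look like the sketch below. The helper name is invented, and which signal the application actually receives is not known from this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper: install `handler` for `signo` with SA_RESTART, so
 * that system calls interrupted by this signal are restarted by the
 * kernel where possible instead of failing with EINTR. Returns 0 or -1. */
int install_restarting_handler(int signo, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);      /* block no extra signals in the handler */
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}
```

SA_RESTART does not cover every call on every system (on Linux, for instance, poll() still fails with EINTR), so keeping an explicit EINTR retry around read() remains a reasonable safety net.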

> Is it necessary to insert a timeout then?

To accomplish what?

Scott Lurndal

Aug 20, 2009, 2:29:39 PM

David Schwartz <dav...@webmaster.com> writes:
>On Aug 19, 10:47 am, sc...@slp53.sl.home (Scott Lurndal) wrote:
>
>> Note
>> that the only defined error return value from read is '-1'; '< 0' is not the
>> correct test.
>
>==-1 and <0 are both equally correct tests. The only difference is

POSIX defines read to return a ssize_t. ssize_t is defined as the value
-1 or any positive value. While the test for less than zero _may_ work
(depending on the architecture), it's not _correct_.

(I've actually run into issues with code assuming less than zero is the
same as -1 (particularly with lseek and mmap)).

scott

Loïc Domaigné

Aug 20, 2009, 3:12:27 PM

Hi Scott,

> POSIX defines read to return a ssize_t. ssize_t is defined as the value
> -1 or any positive value. While the test for less than zero _may_ work
> (depending on the architecture), it's not _correct_.

POSIX states explicitly (Header spec for <sys/types.h>) that
* blksize_t, pid_t, and ssize_t shall be signed integer types.

> (I've actually run into issues with code assuming less than zero is the
> same as -1 (particularly with lseek and mmap)).

mmap returns a void*, so comparison with <0 makes little sense?

lseek() may return negative values for certain devices. In this case,
it is clear that you cannot rewrite ==-1 as <0.

I personally write error check ==-1. But I believe that David's
comment is correct as long as I am certain that the only negative
value returned is -1.

Or am I missing something?

Cheers,
Loïc
--
My Blog: http://www.domaigne.com/blog

"The future ain't what it used to be" -- Yogi Berra

David Schwartz

Aug 20, 2009, 7:23:18 PM

On Aug 20, 11:29 am, sc...@slp53.sl.home (Scott Lurndal) wrote:

> POSIX defines read to return a ssize_t. ssize_t is defined as the value
> -1 or any positive value. While the test for less than zero _may_ work
> (depending on the architecture), it's not _correct_.

That makes no sense. If a ssize_t is *defined* as the value -1 or any
positive value, then checking for ==-1 is logically equivalent to
checking <0. How can one be right and the other wrong?

> (I've actually run into issues with code assuming less than zero is the
> same as -1 (particularly with lseek and mmap)).

Apples and oranges. Neither lseek nor mmap has the same return
semantics as read (no valid defined return values that can be negative).

DS