fdopen() on unix sockets and getline()

pozz

Mar 25, 2011, 11:57:31 AM

I'm using Unix sockets to implement bidirectional communication
between two processes running on the same processor.

The protocol is very simple: it is based on textual messages ending
with newline character '\n'. So I tried to create a stream (FILE *)
from the socket file descriptor using fdopen.

After understanding that I need two streams (one for reading and one for
writing), I'm now able to send messages over the socket using fprintf.

Now the problem is reception, which I wanted to do with getline() or a
similar function. The big problem is that I don't want it to block the
process if a message has not been received yet.
I'd like to have a getline() function that always returns, either with a
complete message (with its ending newline) or with nothing. I'd like to
avoid having to accumulate the characters of partially arrived messages.

Is it possible?

Nobody

Mar 25, 2011, 2:32:34 PM

On Fri, 25 Mar 2011 08:57:31 -0700, pozz wrote:

> After understanding that I need two streams (one for reading and one for
> writing), I'm now able to send messages over the socket using fprintf.
>
> Now the problem is the reception that I wanted to do with getline()
> function or similar. The big problem is I don't want it to block the
> process if a message has not been received.
> I'd like to have a getline() function that always returns, with a
> complete message (with ending newline) or with nothing. I'd avoid the
> strategy to accumulate characters of partially arrived messages.
>
> Is it possible?

There's not much point in wrapping the socket in a FILE* if you want to do
this. You are going to have to provide your own buffering anyhow, so you
may as well read directly from the socket with read().
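
A minimal sketch of what reading directly from the socket could look like once the descriptor is in non-blocking mode. The helper names set_nonblocking() and try_read() and the return-value convention are assumptions made for illustration, not something from this thread; the caller is assumed to pass cap > 0.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;

    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) ? -1 : 0;
}

/*
 * Read whatever is available right now without blocking.
 * Returns the byte count (> 0), 0 if nothing is available yet,
 * -1 on end-of-file and -2 on any other error (see errno).
 */
ssize_t try_read(int fd, char *dst, size_t cap) {
    ssize_t n;

    do {
        n = read(fd, dst, cap);
    } while (n == -1 && errno == EINTR);

    if (n > 0)
        return n;
    if (n == 0)
        return -1;      /* the peer closed the connection */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;       /* no data yet, try again later */

    return -2;
}
```

The bytes returned by try_read() may hold no newline, one message, or several, so the caller still has to keep its own buffer between calls.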

Jens Thoms Toerring

Mar 25, 2011, 3:00:55 PM

As far as I can see, your question boils down to whether you can stuff
all the data you have already read back into the stream (since, in
order to know if the other side has sent a '\n', you need to read
everything that has been sent, either up to the '\n' or to the end of
what you can get at that moment). And the simple answer is normally no,
since the standard C input functions only guarantee that you can "push
back" (using ungetc()) a single character but not more. The
implementation you use (i.e. your libc) may allow more, but relying on
that will make your code non-portable.

On the other hand, why do you object to storing partial messages? You
need to allocate memory for reading anyway, so what is the problem with
keeping it allocated in your getline() function and appending to it the
next time you call it when no complete message was received? All you
need is a pointer, a variable for the length of the allocated buffer
and an index (for the next free position in the buffer), all declared
as static in your getline() function to organize things properly.
Regards, Jens
--
\ Jens Thoms Toerring ___ j...@toerring.de
\__________________________ http://toerring.de

pozz

Mar 26, 2011, 1:45:48 PM

Il 25/03/2011 20:00, Jens Thoms Toerring ha scritto:
> On the other hand, why do you object to storing partial messages? You
> need to allocate memory for reading anyway, so what is the problem with
> keeping it allocated in your getline() function and appending to it the
> next time you call it when no complete message was received? All you
> need is a pointer, a variable for the length of the allocated buffer
> and an index (for the next free position in the buffer), all declared
> as static in your getline() function to organize things properly.

Actually I'm not using FILE* but a file descriptor and the read/write
functions. They work, but waiting for a message (ending with a newline)
is not a trivial task.

I need a buffer, a pointer and an index for the next free position in
the buffer, of course. But the function must not block and, in
general, it has to correctly handle a received sequence of characters
like:
    foo\nbar
This is a complete message (foo) followed by an incomplete message (bar),
because the latter doesn't end with a newline.

After read()ing all the available characters of the above sequence, I
have to check not only the last character, but all the newly arrived
characters looking for a newline.
I have to write code similar to the following:
    nbytes = read(fd, &buffer[idx], sizeof(buffer) - idx);
    if (nbytes > 0) {
        int i;
        for (i = idx; i < idx + nbytes; i++) {
            if (buffer[i] == '\n') {
                buffer[i] = '\0';
                /* buffer is a null-terminated string
                 * containing the received message */
                manage_message(buffer);
                /* shift the leftover bytes to the front; the regions
                 * may overlap, so use memmove() rather than memcpy() */
                memmove(buffer, &buffer[++i], idx + nbytes - i);
                nbytes = idx + nbytes - i;
                idx = 0;
            }
        }
        idx += nbytes;
    }
And I haven't handled two complete messages received with a single read().
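
One way the case of several complete messages in a single read() could be covered is to look for every newline in the valid part of the buffer before moving the leftover bytes. The following is only a sketch: dispatch_lines() is a hypothetical helper, and the manage_message() body is a stand-in that merely records what it was given.

```c
#include <string.h>

/* Stand-in handler (hypothetical body): remembers the last message. */
static char last_msg[64];
static int msg_count;

static void manage_message(const char *msg) {
    strncpy(last_msg, msg, sizeof last_msg - 1);
    last_msg[sizeof last_msg - 1] = '\0';
    msg_count++;
}

/*
 * Scan buf[0..len) for complete lines, call handler() once per line
 * (with the newline replaced by '\0'), then move the incomplete tail
 * to the front. Returns the number of bytes left in buf, which is the
 * new value for idx.
 */
size_t dispatch_lines(char *buf, size_t len, void (*handler)(const char *)) {
    char *start = buf;
    char *end = buf + len;
    char *nl;

    while ((nl = memchr(start, '\n', (size_t)(end - start))) != NULL) {
        *nl = '\0';
        handler(start);
        start = nl + 1;
    }

    /* The regions can overlap, so memmove() rather than memcpy(). */
    memmove(buf, start, (size_t)(end - start));

    return (size_t)(end - start);
}
```

After a successful read() the call would be idx = dispatch_lines(buffer, idx + nbytes, manage_message). This version rescans the old partial bytes on every call; starting the search at the old idx would avoid that at the cost of a little more bookkeeping.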

Another approach is to receive one byte at a time:
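    while (read(fd, &c, 1) == 1) {
        if (c == '\n') {
            buffer[idx] = '\0';
            manage_message(buffer);
            idx = 0;
        } else if (idx < sizeof(buffer) - 1) {
            buffer[idx++] = c; /* store the byte; overlong lines are truncated */
        }
    }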
But I think this code is somewhat slower and more inefficient, because I
read one byte at a time, even if I already received 100 characters.

I know it's not impossible, but I hoped I could use line-buffered
streams to receive lines of text on a socket, with getline() or similar.

Barry Margolin

Mar 26, 2011, 2:04:26 PM

In article <iml8oa$8tv$1...@nnrp.ngi.it>, pozz <pozz...@gmail.com> wrote:

> I know it's not impossible, but I hoped I could use line-buffered
> streams to receive lines of text on a socket, with getline() or similar.

Couldn't you do the getline()'s in a separate thread, so you don't mind
if it blocks when there's nothing to return?

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

William Ahern

Mar 26, 2011, 3:36:36 PM

pozz <pozz...@gmail.com> wrote:
<snip>

> I know it's not impossible, but I hoped I could use line-buffered
> streams to receive lines of text on a socket, with getline() or similar.

You've spent more time arguing that it's too difficult than actually coding.
You can't use the buffered I/O facilities because they cannot handle EAGAIN;
end of story. (I don't think fmemopen() will do the job, but haven't put
too much effort into it.)
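
The EAGAIN problem can be observed with a small experiment. This is a sketch assuming a POSIX system with getline(); the function name is made up for the illustration. It wraps the read end of an empty, non-blocking pipe in a FILE* and reports the errno that getline() leaves behind. What stdio does with a line that has only partially arrived is implementation-specific and is not shown here.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Call getline() on a stream that wraps an empty, non-blocking pipe.
 * Returns the errno left behind by the failed getline(), 0 if
 * getline() unexpectedly succeeded, or -1 if the setup failed.
 */
int getline_on_empty_nonblocking_pipe(void) {
    int p[2], flags, saved = 0;
    char *line = NULL;
    size_t cap = 0;
    FILE *fp;

    if (pipe(p) == -1)
        return -1;

    flags = fcntl(p[0], F_GETFL);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1
        || !(fp = fdopen(p[0], "r"))) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    errno = 0;
    if (getline(&line, &cap, fp) == -1)
        saved = errno;  /* "no data yet" looks like any other read error */

    free(line);
    fclose(fp);         /* also closes p[0] */
    close(p[1]);

    return saved;
}
```

The stream has no way to say "nothing yet, come back later": the caller gets a failure, the error indicator is set, and the code has to inspect errno and call clearerr() before trying again.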

Here's what I came up with in 20 minutes, then spent a few minutes
debugging. (I'm waiting on lunch to finish cooking.) It's not the most
greatest in the world (and I have my own toolbox of I/O routines I would use
for this sort of thing, including a small FIFO implementation), but it's always
worth the practice.

#include <stddef.h> /* size_t */
#include <stdlib.h> /* realloc(3) */

#include <string.h> /* memmove(3) memcpy(3) */

#include <errno.h> /* ENOMEM EPIPE errno */

#include <sys/types.h> /* ssize_t */
#include <unistd.h> /* read(2) */

struct buffer {
    unsigned char *base;   /* heap storage */
    size_t p, count, size; /* scan position, bytes stored, bytes allocated */
    _Bool eof;
};

int grow(struct buffer *buf) {
    size_t nsize = (buf->size)? (buf->size * 2) : 256;
    void *tmp;

    if (nsize < buf->size)
        return ENOMEM; /* overflow */

    if (!(tmp = realloc(buf->base, nsize)))
        return errno;

    buf->base = tmp;
    buf->size = nsize;

    return 0;
} /* grow() */

int fill(struct buffer *buf, int fd) {
    ssize_t n;
    int error;

    /* make room first if the buffer is completely full */
    if (!(buf->count < buf->size) && (error = grow(buf)))
        return error;

fill:
    n = read(fd, &buf->base[buf->count], buf->size - buf->count);

    if (n > 0) {
        buf->count += n;

        return 0;
    } else if (n == 0) {
        buf->eof = 1; /* end of file: the peer closed its end */

        return EPIPE;
    } else if (errno == EINTR) {
        goto fill;
    } else {
        return errno; /* includes EAGAIN when no data is available yet */
    }
} /* fill() */

void rebase(struct buffer *buf) {
    /* drop the first p bytes and slide the remainder to the front */
    memmove(buf->base, &buf->base[buf->p], buf->count - buf->p);
    buf->count = buf->count - buf->p;
    buf->p = 0;
} /* rebase() */

int copyout(char **lp, size_t *len, struct buffer *buf) {
    size_t llen = buf->p;
    void *tmp;

    if (*len < llen) {
        if (!(tmp = realloc(*lp, llen)))
            return errno;

        *lp = tmp;
    }

    /* the line is NOT null-terminated; *len carries its length */
    memcpy(*lp, buf->base, llen);
    *len = llen;

    rebase(buf);

    return 0;
} /* copyout() */

int getln(char **lp, size_t *len, struct buffer *buf, int fd) {
    int error;

    do {
        /* scan only the bytes not examined by an earlier call */
        while (buf->p < buf->count) {
            if (buf->base[buf->p++] == '\n') {
                if ((error = copyout(lp, len, buf)))
                    buf->p--; /* unget */

                return error;
            }
        }
    } while (!(error = fill(buf, fd)));

    /* at end-of-file, hand back whatever is left as a final line */
    if (error == EPIPE && buf->eof)
        return copyout(lp, len, buf);

    return error;
} /* getln() */

/*
* MAIN
*/

#include <stdio.h> /* FILE popen(3) fileno(3) fwrite(3) fprintf(3) */

#include <string.h> /* strerror(3) */

#include <errno.h> /* errno */

#include <fcntl.h> /* F_GETFL F_SETFL O_NONBLOCK fcntl(2) */

#include <poll.h> /* POLLIN struct pollfd poll(2) */

int waitfd(int fd) {
    struct pollfd pfd = { fd, POLLIN };

    if (-1 == poll(&pfd, 1, 100)) return errno;

    return 0;
} /* waitfd() */

int main(void) {
    char *ln = 0;
    size_t len = 0;
    struct buffer buf = { 0 };
    int flags, error;
    FILE *fp;
    int fd;

    fp = popen("head /etc/services | while read LN; do echo $LN; sleep .1; done;", "r");
    if (!fp) {
        perror("popen");
        return 1;
    }

    fd = fileno(fp);
    flags = fcntl(fd, F_GETFL); /* F_GETFL takes no third argument */
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    do {
        while (!(error = getln(&ln, &len, &buf, fd)) && !buf.eof) {
            fwrite(ln, 1, len, stdout);
        }

        if (error == EAGAIN)
            error = waitfd(fd);

        if (error)
            fprintf(stderr, "%s\n", strerror(error));
    } while (!buf.eof);

    /* at EOF getln() hands back any trailing text that had no newline */
    if (!error && len)
        fwrite(ln, 1, len, stdout);

    pclose(fp);

    return 0;
}

William Ahern

Mar 26, 2011, 4:03:46 PM

William Ahern <wil...@wilbur.25thandclement.com> wrote:
> pozz <pozz...@gmail.com> wrote:
> <snip>
> > I know it's not impossible, but I hoped I could use line-buffered
> > streams to receive lines of text on a socket, with getline() or similar.

> You've spent more time arguing that it's too difficult than actually coding.
> You can't use the buffered I/O facilities because they cannot handle EAGAIN;
> end of story. (I don't think fmemopen() will do the job, but haven't put
> too much effort into it.)

> Here's what I came up with in 20 minutes, then spent a few minutes
> debugging. (I'm waiting on lunch to finish cooking.) It's not the most
> greatest in the world (and I have my own toolbox of I/O routines I would use

"most greatest". *sigh*

Ian Collins

Mar 26, 2011, 11:52:07 PM

On 03/27/11 06:45 AM, pozz wrote:
>
> I know it's not impossible, but I hoped I could use line-buffered
> streams to receive lines of text on a socket, with getline() or similar.

If your coding skills stretch to C++, it is a fairly simple exercise
(that I use in C++ training) to wrap a socket with a streambuf and use
the standard std::getline() to read lines of text from a socket.

--
Ian Collins

William Ahern

Mar 27, 2011, 12:13:16 AM

Can getline() be interrupted? The OP needed non-blocking behavior, and this
is typically the Achilles' heel of most I/O libraries.

Ian Collins

Mar 27, 2011, 12:35:26 AM

Ah yes, so I see...

That would indeed be a problem. I would probably assemble the incoming
message in another thread. Another possibility would be to use a
datagram socket.
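
A sketch of the datagram idea, assuming both ends can be created with socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) and that the platform supports MSG_DONTWAIT (Linux and the BSDs do). The name try_recv_message() is hypothetical. Each send() then arrives as one unit, so a receive yields either a whole message or nothing, and no newline scanning is needed.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Try to receive one whole message from a datagram socket without
 * blocking. Returns the message length, 0 if no message is waiting,
 * or -1 on error. On success dst is null-terminated. Messages are
 * assumed to be non-empty and shorter than cap; longer ones are
 * truncated.
 */
ssize_t try_recv_message(int fd, char *dst, size_t cap) {
    ssize_t n;

    if (cap == 0) {
        errno = EINVAL;
        return -1;
    }

    do {
        n = recv(fd, dst, cap - 1, MSG_DONTWAIT);
    } while (n == -1 && errno == EINTR);

    if (n == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 0;   /* nothing has arrived yet */

        return -1;
    }

    dst[n] = '\0';

    return n;
}
```

The trade-off is that the sender must now write each message with a single send() call, so fprintf() on a stream is no longer a natural fit for the sending side.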

--
Ian Collins

pozz

Mar 27, 2011, 5:42:52 AM

Il 26/03/2011 19:04, Barry Margolin ha scritto:
>> I know it's not impossible, but I hoped I could use line-buffered
>> streams to receive lines of text on a socket, with getline() or similar.
>
> Couldn't you do the getline()'s in a separate thread, so you don't mind
> if it blocks when there's nothing to return?

Do you want to know why? I'm not skilled enough to write code with
separate threads :-(

pozz

Mar 27, 2011, 5:45:31 AM

Il 26/03/2011 20:36, William Ahern ha scritto:
>> I know it's not impossible, but I hoped I could use line-buffered
>> streams to receive lines of text on a socket, with getline() or similar.
>
> You've spent more time arguing that it's too difficult than actually coding.

:-) You're right, but, as I wrote, I hoped I could use getline()
facilities. I'll make some code if it isn't possible, of course.

> Here's what I came up with in 20 minutes, then spent a few minutes
> debugging.

> [...]

Thank you for the code. I'll have a look at it.

Ersek, Laszlo

Mar 27, 2011, 2:36:19 PM

On Fri, 25 Mar 2011, pozz wrote:

> I'm using Unix sockets to implement bidirectional communication
> between two processes running on the same processor.

> I'd like to have a getline() function that always returns, with a

> complete message (with ending newline) or with nothing. I'd avoid the
> strategy to accumulate characters of partially arrived messages.

This is probably overkill, but you could define a context-free grammar,
and assign an action routine to the rule that reduces a sequence of tokens
(single bytes) to the <full-line> non-terminal. Then you could just read
your socket, and feed whatever arrives to the parser. It will buffer
partial messages, execute the action routine for you, and it will never
repeat work already done (i.e. rescan partial messages at append time).

http://www.hwaci.com/sw/lemon/lemon.html

"In Lemon, the tokenizer calls the parser."

I never used Lemon, but it seems very applicable to what you're trying to
do, except that it is probably overkill. OTOH, if you have designed a
line-oriented protocol for ease of implementation, perhaps now you can
lift that restriction (if you find some benefit in doing so). One benefit
might be that you could easily add parenthesized expressions (if they make
any sense to your protocol), and the parser would handle those for you
with the same effort.

lacos

Nobody

Mar 28, 2011, 12:29:45 PM

On Sat, 26 Mar 2011 18:45:48 +0100, pozz wrote:

> But I think this code is somewhat slower and more inefficient, because I
> read one byte at a time, even if I already received 100 characters.

So don't do it. Read as much data at a time as you have space for in the
buffer. If the buffer fills up with an incomplete message, enlarge it.

> I know it's not impossible, but I hoped I could use line-buffered
> streams to receive lines of text on a socket, with getline() or similar.

Buffered I/O isn't compatible with non-blocking I/O.

> And I haven't handled two complete messages received with a single
> read().

1. Check whether there is a newline in the buffer. If there is, remove
the data up to the newline from the buffer and return it. Otherwise ...

2. If the buffer is full, enlarge it by some arbitrary amount.

3. Read as much data as is available and will fit into the buffer.

4. Check whether there is a newline in the buffer. If there is, remove
the data up to the newline from the buffer and return it. Otherwise ...

5. Return NULL.

This is reasonably simple, reasonably efficient, and doesn't grow the
buffer unnecessarily (i.e. you don't bother reading more data until you have
no complete messages waiting in the buffer).
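
The five steps above can be sketched as follows. This is only an illustration under stated assumptions: struct linebuf, take_line() and get_message() are hypothetical names, fd is assumed to be in non-blocking mode already, and the buffer is doubled in step 2 where the text says "some arbitrary amount".

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

struct linebuf {
    char *data;     /* pending bytes */
    size_t used;    /* number of valid bytes in data */
    size_t size;    /* allocated size of data */
};

/* Steps 1 and 4: if a newline is buffered, detach and return that line. */
static char *take_line(struct linebuf *lb) {
    char *nl, *line;
    size_t n;

    if (lb->used == 0 || !(nl = memchr(lb->data, '\n', lb->used)))
        return NULL;

    n = (size_t)(nl - lb->data);
    if (!(line = malloc(n + 1)))
        return NULL;

    memcpy(line, lb->data, n);
    line[n] = '\0';
    lb->used -= n + 1;
    memmove(lb->data, nl + 1, lb->used);

    return line;
}

/*
 * Returns a malloc()ed line without its newline, or NULL if no
 * complete line is available yet (also on EOF or error).
 */
char *get_message(struct linebuf *lb, int fd) {
    char *line, *tmp;
    ssize_t n;

    if ((line = take_line(lb)))                 /* step 1 */
        return line;

    if (lb->used == lb->size) {                 /* step 2 */
        size_t nsize = lb->size ? lb->size * 2 : 256;

        if (!(tmp = realloc(lb->data, nsize)))
            return NULL;
        lb->data = tmp;
        lb->size = nsize;
    }

    /* step 3: one read, as much as fits */
    n = read(fd, lb->data + lb->used, lb->size - lb->used);
    if (n > 0)
        lb->used += (size_t)n;

    return take_line(lb);                       /* steps 4 and 5 */
}
```

A caller would start with struct linebuf lb = { 0 } and free() each returned line. Because NULL covers "nothing yet", EOF and errors alike, real code would probably want a separate status value to tell them apart.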