[lwip-users] http server and pbuf overflow


Bernhard 'Gustl' Bauer

Jan 18, 2010, 6:11:22 AM
to Mailing list for lwIP users
Hi,

I'm using lwip 1.3.0 and sometimes I have a pbuf overflow.

I'm not sure I understand the pbuf concept. I have some data transfers
that exceed 1 tcp packet.

Every time http_recv is called I check if this is the 1st packet. If it
is I set a pointer to this pbuf and do not free this pbuf. If it is not
the 1st packet I store the data and free this pbuf.

Then I check if this is the last packet. If it is I process all data and
then free the pbuf of the 1st packet.

When there is a tcp_err during a transfer, I free the 1st packet too.

Is there anything wrong with this?

TIA

Gustl


_______________________________________________
lwip-users mailing list
lwip-...@nongnu.org
http://lists.nongnu.org/mailman/listinfo/lwip-users

gold...@gmx.de

Jan 18, 2010, 11:41:14 AM
to lwip-...@nongnu.org
Bernhard 'Gustl' Bauer wrote:
> Hi,
>
> I'm using lwip 1.3.0 and sometimes I have a pbuf overflow.
>
First of all, only to make sure what you mean, are you referring to the
PBUF_POOL running out of pbufs?

> I'm not sure I understand the pbuf concept.
You have to call pbuf_free() on every pbuf chain passed to your
application (via the receive callback). If you chain two chains together
using pbuf_cat(), you only have to call pbuf_free() on the head of the
resulting chain.
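
A minimal sketch of that rule, using a simplified stand-in for the pbuf struct rather than the real lwIP headers. The names `mini_pbuf`, `mini_cat` and `mini_free` are hypothetical; they only mirror what pbuf_cat() and pbuf_free() do to `next`, `tot_len` and `ref`. One free on the head releases every pbuf in the chain whose reference count drops to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct pbuf (hypothetical, not the lwIP type). */
struct mini_pbuf {
    struct mini_pbuf *next;
    unsigned tot_len;   /* length of this pbuf plus all following ones */
    unsigned len;       /* length of this pbuf only */
    unsigned ref;       /* reference count; 0 means back in the pool */
};

/* Mirrors pbuf_cat(): append chain t to chain h; h takes over t's reference. */
void mini_cat(struct mini_pbuf *h, struct mini_pbuf *t)
{
    struct mini_pbuf *p;
    assert(h != NULL && t != NULL);
    for (p = h; p->next != NULL; p = p->next) {
        p->tot_len += t->tot_len;
    }
    p->tot_len += t->tot_len;
    p->next = t;
}

/* Mirrors pbuf_free(): drop one reference on the head and keep walking
 * down the chain for as long as a pbuf's count reaches zero.
 * Returns the number of pbufs released. */
unsigned mini_free(struct mini_pbuf *p)
{
    unsigned released = 0;
    while (p != NULL) {
        struct mini_pbuf *q = p->next;
        assert(p->ref > 0);
        if (--p->ref > 0) {
            break;      /* someone else still holds this pbuf and its tail */
        }
        p->next = NULL;
        released++;
        p = q;
    }
    return released;
}

/* Build two single-pbuf "packets", join them, free the head once. */
unsigned demo_cat_then_free(void)
{
    struct mini_pbuf a = { NULL, 100, 100, 1 };
    struct mini_pbuf b = { NULL, 50, 50, 1 };
    mini_cat(&a, &b);
    assert(a.tot_len == 150);
    return mini_free(&a);
}
```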

> I have some data transfers
> that exceed 1 tcp packet.
>
> Every time http_recv is called I check if this is the 1st packet. If it
> is I set a pointer to this pbuf and do not free this pbuf. If it is not
> the 1st packet I store the data and free this pbuf.
>
What do you mean by 'store the data and free this pbuf'?

Simon

Bob Brusa

Jan 18, 2010, 11:42:40 AM
to Mailing list for lwIP users
On 18.01.2010 at 12:11, Bernhard 'Gustl' Bauer
<gu...@quantec.de> wrote:

> Hi,
>
> I'm using lwip 1.3.0 and sometimes I have a pbuf overflow.
>
> I'm not sure I understand the pbuf concept. I have some data transfers
> that exceed 1 tcp packet.
>
> Every time http_recv is called I check if this is the 1st packet. If it
> is I set a pointer to this pbuf and do not free this pbuf. If it is not
> the 1st packet I store the data and free this pbuf.
>
> Then I check if this is the last packet. If it is I process all data and
> then free the pbuf of the 1st packet.
>
> When there is a tcp_err during a transfer, I free the 1st packet too.
>
> Is there anything wrong with this?
>
> TIA
>
> Gustl

Gustl, what if you sometimes miss these last buffers? I would change the
logic:
When you get a first buffer, check whether you still have old data around
and clean it up.
Regards, Bob
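
A sketch of that change, assuming the connection state keeps a pointer to the held first pbuf. `conn_state`, `held_first` and `release()` are hypothetical names; `release()` stands in for pbuf_free() so that the example runs without lwIP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pbuf: only tracks whether it is allocated. */
struct buf {
    int in_use;
};

/* Hypothetical per-connection state, similar in spirit to http_state. */
struct conn_state {
    struct buf *held_first;   /* first packet of a multi-packet request */
};

/* Stand-in for pbuf_free(). */
void release(struct buf *b)
{
    assert(b != NULL && b->in_use);
    b->in_use = 0;
}

/* Called when the first packet of a new request arrives. */
void on_first_packet(struct conn_state *cs, struct buf *p)
{
    if (cs->held_first != NULL) {
        /* The previous request never saw its last packet: clean up now. */
        release(cs->held_first);
    }
    cs->held_first = p;
}

/* Called when the last packet has arrived and the data was processed. */
void on_last_packet(struct conn_state *cs)
{
    if (cs->held_first != NULL) {
        release(cs->held_first);
        cs->held_first = NULL;
    }
}

/* First request is cut short, second one completes.
 * Returns the number of buffers still allocated afterwards. */
int demo_interrupted_request(void)
{
    struct buf a = { 1 }, b = { 1 };
    struct conn_state cs = { NULL };
    on_first_packet(&cs, &a);   /* request 1 starts, last packet never comes */
    on_first_packet(&cs, &b);   /* request 2 starts: a is released here */
    on_last_packet(&cs);        /* request 2 completes: b is released */
    return a.in_use + b.in_use;
}
```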

Bernhard 'Gustl' Bauer

Jan 19, 2010, 1:54:34 AM
to Mailing list for lwIP users
gold...@gmx.de wrote:

> Bernhard 'Gustl' Bauer wrote:
> First of all, only to make sure what you mean, are you referring to the
> PBUF_POOL running out of pbufs?

Yes

>> I'm not sure I understand the pbuf concept.
> You have to call pbuf_free() on every pbuf chain passed to your
> application (via the receive callback). If you chain two chains together
> using pbuf_cat(), you only have to call pbuf_free() on the head of the
> chain.

These chained pbufs are new to me. I treated every call from tcp_recv as
if it contained only 1 pbuf = packet.

>> I have some data transfers
>> that exceed 1 tcp packet.
>>
>> Every time http_recv is called I check if this is the 1st packet. If it
>> is I set a pointer to this pbuf and do not free this pbuf. If it is not
>> the 1st packet I store the data and free this pbuf.
>>
> What do you mean by 'store the data and free this pbuf'?

I copy the payload to a different location and call pbuf_free with the
pointer to the pbuf.

Gustl

Bernhard 'Gustl' Bauer

Jan 19, 2010, 1:59:28 AM
to bob....@gmail.com, Mailing list for lwIP users
Bob Brusa wrote:

> On 18.01.2010 at 12:11, Bernhard 'Gustl' Bauer
> <gu...@quantec.de> wrote:
>>
>> Every time http_recv is called I check if this is the 1st packet. If
>> it is I set a pointer to this pbuf and do not free this pbuf. If it is
>> not the 1st packet I store the data and free this pbuf.
>>
>> Then I check if this is the last packet. If it is I process all data
>> and then free the pbuf of the 1st packet.
>>
>> When there is a tcp_err during a transfer, I free the 1st packet too.
>>
>
> Gustl, what if you miss these last buffers sometimes? I would change the
> logic:
> If you get a first buffer check if you have some old stuff around and
> clean up.

Don't I get a tcp_err in this case? This would free pbuf.

I'll try to verify this.

Simon Goldschmidt

Jan 19, 2010, 2:07:18 AM
to Mailing list for lwIP users

Bernhard 'Gustl' Bauer wrote:
> These chained pbufs are new to me. I treated every call from tcp_recv as
> if it contained only 1 pbuf = packet.

For RX, that depends on your driver and might well be true for simple cases. However, if you enable out-of-sequence queueing (TCP_QUEUE_OOSEQ, enabled by default), you can get multiple (queued) packets with one recv call. Just check whether p->next != NULL. Still, calling pbuf_free() on the first pbuf of this queue should be enough, so your pbuf leak is probably somewhere else.
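
To make the p->next check concrete, here is a small sketch that copies a whole received chain into a flat buffer. It uses a hypothetical stand-in struct (`mini_pbuf`) and a hypothetical helper (`chain_copy`) instead of the lwIP headers. With the real stack, pbuf_copy_partial() should do the same job, but check that your version provides it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct pbuf, reduced to the fields used here. */
struct mini_pbuf {
    struct mini_pbuf *next;
    const void *payload;
    unsigned tot_len;   /* this pbuf plus all following ones */
    unsigned len;       /* this pbuf only */
};

/* Copy up to dst_size bytes of the chain starting at p into dst.
 * Returns the number of bytes copied. */
size_t chain_copy(const struct mini_pbuf *p, char *dst, size_t dst_size)
{
    size_t copied = 0;
    for (; p != NULL && copied < dst_size; p = p->next) {
        size_t n = p->len;
        if (n > dst_size - copied) {
            n = dst_size - copied;   /* truncate instead of overflowing dst */
        }
        memcpy(dst + copied, p->payload, n);
        copied += n;
    }
    return copied;
}

/* Two queued segments delivered in one receive call. */
size_t demo_chain_copy(char *out, size_t out_size)
{
    struct mini_pbuf second = { NULL, "Host: x\r\n", 9, 9 };
    struct mini_pbuf first = { &second, "POST / HTTP/1.0\r\n", 26, 17 };
    assert(first.tot_len == first.len + second.tot_len);
    return chain_copy(&first, out, out_size);
}
```

The copy loop relies on `len` per pbuf and stops at `next == NULL`; treating the head's `len` as the size of the whole delivery would silently drop the queued segments.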

How do you know you have a leak, anyway? Does your device stall? Or is it only temporary? Maybe you just configured the number of pbufs too low?

> Don't I get a tcp_err in this case? This would free pbuf.

That depends on the remote side. If you simply switch off the remote side or unplug its cable, you won't get a tcp_err (or not for some time, at least): tcp_err is called when the remote side sends a RST or your connection times out (and timing out can be a long time for TCP if the remote side simply doesn't answer at all).
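
Because tcp_err may arrive late or not at all, one option is to time out held data yourself, for example from the periodic callback registered with tcp_poll(). The sketch below only models the counting logic. `conn_state`, `MAX_IDLE_POLLS` and the helper names are hypothetical, and the limit of 4 polls is an arbitrary example value.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDLE_POLLS 4   /* arbitrary example limit */

/* Hypothetical per-connection state. */
struct conn_state {
    int holding_data;   /* nonzero while a first packet is being held */
    int idle_polls;     /* poll ticks since the last received packet */
};

/* Call from the receive callback whenever data arrives. */
void note_activity(struct conn_state *cs)
{
    cs->idle_polls = 0;
}

/* Call from the poll callback. Returns 1 when the held data was dropped
 * and the connection should be closed, 0 otherwise. */
int on_poll(struct conn_state *cs)
{
    assert(cs != NULL);
    if (!cs->holding_data) {
        return 0;
    }
    if (++cs->idle_polls < MAX_IDLE_POLLS) {
        return 0;
    }
    /* Here the real code would pbuf_free() the held pbuf. */
    cs->holding_data = 0;
    cs->idle_polls = 0;
    return 1;
}

/* Remote goes silent after the first packet: count polls until cleanup. */
int demo_polls_until_cleanup(void)
{
    struct conn_state cs = { 1, 0 };
    int polls = 0;
    note_activity(&cs);
    do {
        polls++;
    } while (!on_poll(&cs));
    return polls;
}
```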

Simon

Bernhard 'Gustl' Bauer

Jan 19, 2010, 2:20:10 AM
to Mailing list for lwIP users
Simon Goldschmidt wrote:

> How do you know you have a leak, anyway? Does your device stall? Or
> is it only temporary? Maybe you just configured the number of pbufs
> too low?

I let it run for some time. Then I close all browser windows and have a look
at the stats. pbuf.used doesn't drop to 0! If I let it run long enough,
used increases until it reaches avail.

>
>> Don't I get a tcp_err in this case? This would free pbuf.
>
> That depends on the remote side. If you simply switch off the remote
> side or unplug its cable, you won't get a tcp_err (or not for some
> time, at least): tcp_err is called when the remote side sends a RST
> or your connection times out (and timing out can be a long time for
> TCP if the remote side simply doesn't answer at all).

Good to know.

Gustl

Bernhard 'Gustl' Bauer

Jan 19, 2010, 7:47:04 AM
to Mailing list for lwIP users
I searched my wireshark file for the contents of the pbuf that is still
not freed.

It is line no. 1802, but bytes 22h to 31h seem to have switched places. I guess
it has been processed up to a certain point.

Remote sends a POST (line 1786) periodically. If it isn't answered
within 2 sec. it is terminated.

The POST is probably not answered because of a miss of CS8900. Any idea
why 1802 is still buffered?

What causes line 1804? Is it triggered by 1802, or has 1784 been a miss too?

Gustl

debug_19_01_part.pcap

Bernhard 'Gustl' Bauer

Jan 19, 2010, 7:50:52 AM
to Mailing list for lwIP users
Just noticed that the line numbers differ.

Bernhard 'Gustl' Bauer wrote:


> I searched my wireshark file for the contents of the pbuf that is still
> not freed.
>
> It is line no. 1802, but bytes 22h to 31h seem to switch places. I guess
> it has been processed to a certain point.

line no. 23

> Remote sends a POST (line 1786) periodically. If it isn't answered
> within 2 sec. it is terminated.

line no. 7

> The POST is probably not answered because of a miss of CS8900. Any idea
> why 1802 is still buffered?

line no. 23

> What causes line 1804? Is it triggered by 1802, or has 1784 been a miss
> too?

line no. 25, 23 and 6

Bernhard 'Gustl' Bauer

Jan 19, 2010, 9:10:11 AM
to Mailing list for lwIP users
I think I tracked the problem down. I had this situation several times:
(R = remote; L = LWIP)
1: R -> L Syn
2: L -> R Ack, Syn
3: R -> L Ack (probably missed)
4: R -> L POST Data (probably missed)
...
n: R -> L Fin, Ack

After line 1 and 2 pcb->state should be SYN_SENT. I assume line 3 and 4
are missed. The next received packet is line n (Fin, Ack), so
tcp_process does a tcp_rst. But the pbuf that contains line n is never
freed!

I think tcp_process should return an error to free pbuf.

Please correct me if I'm wrong.

Gustl

Bernhard 'Gustl' Bauer

Jan 21, 2010, 1:20:03 AM
to Mailing list for lwIP users
Hi,

I checked the memory where pbuf pool is located. On power up it is zero
except for the ->next pointers. Some time later MEM PBUF_POOL used is at
3 (max=5) even though there is no traffic. So I checked the memory again.
The top 3 pbufs (63, 62, 61) are like this:
->next=0
->tot_len=0
->len=0
->ref=1

pbuf (60) is like this:
->next=&pbuf[58]
->tot_len=0
->len=0
->ref=0

pbuf (59) is like this:
->next=&pbuf[59]
->tot_len=0
->len=0
->ref=0

All pbufs with ref=1 are not freed, all pbufs with ref=0 are freed. Is
this correct?
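
The manual check described above can be sketched as a scan over the pool for nonzero reference counts. It assumes that a pbuf is returned to the pool only when its `ref` reaches 0, which is how pbuf_free() treats the count. The array of reduced structs (`mini_pbuf`) and the helper `find_in_use` are hypothetical; the real pool layout depends on the lwIP version and memory configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced pbuf: only the reference count matters here. */
struct mini_pbuf {
    unsigned ref;
};

/* Write the indices of all pbufs with ref != 0 into out (up to out_max)
 * and return how many are in use in total. */
size_t find_in_use(const struct mini_pbuf *pool, size_t pool_size,
                   size_t *out, size_t out_max)
{
    size_t count = 0;
    size_t i;
    for (i = 0; i < pool_size; i++) {
        if (pool[i].ref != 0) {
            if (count < out_max) {
                out[count] = i;
            }
            count++;
        }
    }
    return count;
}

/* Pool of 64 where 61, 62 and 63 are still referenced, as in the post. */
size_t demo_scan(size_t *out, size_t out_max)
{
    struct mini_pbuf pool[64] = { { 0 } };
    pool[61].ref = 1;
    pool[62].ref = 1;
    pool[63].ref = 1;
    return find_in_use(pool, 64, out, out_max);
}
```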

I crosschecked the pbufs with the attached wireshark file.
pbuf[63] = packet 55
pbuf[62] = packet 128
pbuf[61] = packet 99

In all 3 cases this is a FIN packet from remote after a corrupt
transfer. From the pcap file I can only guess whether ACK (42, 107, 83)
and POST (43, 108, 84) are missed, or passed on to my application.

I checked my http_recv(). I have 3 different exits:
1: pbuf_free(); tcp_abort(); return ERR_ABORT;
2: tcp_recved(); pbuf_free(); tcp_abort(); return ERR_ABORT;
3: tcp_recved(); pbuf_free(); return ERR_OK;
Is there anything wrong with an exit? Do I need tcp_recved() before
tcp_abort(); return ERR_ABORT; ?

Glad for any pointers.

Gustl


debug_20_01_a.pcap
debug_20_01_c.pcap

gold...@gmx.de

Jan 21, 2010, 2:03:16 AM
to lwip-...@nongnu.org
Bernhard 'Gustl' Bauer wrote:
> I checked my http_recv(). I have 3 different exits:
> 1: pbuf_free(); tcp_abort(); return ERR_ABORT;
> 2: tcp_recved(); pbuf_free(); tcp_abort(); return ERR_ABORT;
> 3: tcp_recved(); pbuf_free(); return ERR_OK;
> Is there anything wrong with an exit? Do I need tcp_recved() before
> tcp_abort(); return ERR_ABORT; ?
>
tcp_abort currently shouldn't be used from one of the callback
functions: http://savannah.nongnu.org/bugs/?27871

Can you try replacing that with tcp_close(); and return ERR_OK;? In any
case, calling tcp_recved() won't hurt either. (Although before tcp_abort,
it shouldn't be necessary once the above bug is fixed.)
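
A sketch of what that change could look like, with counters standing in for the lwIP calls so that it runs on its own. `fake_recved`, `fake_free` and `fake_close` are hypothetical placeholders for tcp_recved(), pbuf_free() and tcp_close(), and `MY_ERR_OK` stands in for ERR_OK. The point is only that every exit acknowledges and frees the pbuf exactly once and returns ERR_OK instead of aborting.

```c
#include <assert.h>
#include <stddef.h>

#define MY_ERR_OK 0   /* stand-in for lwIP's ERR_OK */

/* Counters that record what the callback did (hypothetical scaffolding). */
struct call_log {
    int recved;
    int freed;
    int closed;
};

void fake_recved(struct call_log *log) { log->recved++; }  /* tcp_recved() */
void fake_free(struct call_log *log)   { log->freed++; }   /* pbuf_free()  */
void fake_close(struct call_log *log)  { log->closed++; }  /* tcp_close()  */

/* Shape of a receive callback after the change.
 * has_pbuf: 0 models p == NULL; request_ok: 0 models a request
 * that the application rejects. */
int recv_sketch(struct call_log *log, int has_pbuf, int request_ok)
{
    assert(log != NULL);
    if (!has_pbuf) {
        fake_close(log);        /* remote closed: nothing to free */
        return MY_ERR_OK;
    }
    fake_recved(log);           /* data was taken from the stack */
    fake_free(log);             /* always free the pbuf we were given */
    if (!request_ok) {
        fake_close(log);        /* close instead of tcp_abort() + ERR_ABORT */
    }
    return MY_ERR_OK;
}

/* Run the rejected-request path; returns frees minus one (0 = exactly one). */
int demo_bad_request(void)
{
    struct call_log log = { 0, 0, 0 };
    int rc = recv_sketch(&log, 1, 0);
    assert(rc == MY_ERR_OK);
    assert(log.recved == 1 && log.closed == 1);
    return log.freed - 1;
}
```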

Bernhard 'Gustl' Bauer

Jan 21, 2010, 3:43:39 AM
to Mailing list for lwIP users
gold...@gmx.de wrote:

> tcp_abort currently shouldn't be used from one of the callback
> functions: http://savannah.nongnu.org/bugs/?27871
>
> Can you try replacing that with tcp_close(); and return ERR_OK;? In any
> case, calling tcp_recved() won't hurt either. (Although before tcp_abort,
> it shouldn't be necessary - once the above bug is fixed.)

I did this, but the problem still exists. I checked all exit points again
and noticed that http_recv is sometimes called with p == NULL!
This happens when the remote sends a TCP retransmission, or a FIN because of
a corrupt transfer. This is my shortened function:

static err_t
http_recv(void *arg, struct tcp_pcb *pcb, struct pbuf *p, err_t err)
{
  char *data;
  char *data1;
  struct http_state *hs;

  hs = arg;

  if (err == ERR_OK && p != NULL) {

    ...

  }

  if (err == ERR_OK && p == NULL) {
    close_conn(pcb, hs);
  }
  return ERR_OK;
}

If p=NULL was caused by a FIN the pbuf containing this FIN is never
freed! See port number 4784 in attached pcap.

What can I do about this?

Gustl


debug_21_01.pcap

Simon Goldschmidt

Jan 21, 2010, 4:45:05 AM
to Mailing list for lwIP users
p == NULL is perfectly normal and tells your application that the remote side has sent a FIN. However, this should not lead to memory or pbuf leaks. I'll see if I can reproduce that.

Simon


Bernhard 'Gustl' Bauer

Jan 21, 2010, 9:41:16 AM
to Mailing list for lwIP users
Simon Goldschmidt wrote:

> p == NULL is perfectly normal and tells your application that the
> remote side has sent a FIN. However, this should not lead to memory-
> or pbuf leaks... I'll see if I can reproduce that.

In debug_21_01.pcap is an example with p==NULL and FIN. But this is not
the situation when pbuf leaks! The leak is shown in debug_20_01_c.pcap!

There the [SYN, ACK] (106) from LWIP is not ACKed from remote (107, 108:
packet miss). My remote terminates after 2 sec. TCP_SYN_RCVD_TIMEOUT is
20 sec. So [FIN, ACK] (128) is received before TCP_SYN_RCVD_TIMEOUT runs
out. pcb->state should be still SYN_RCVD. Can you explain to me what
will happen if [FIN, ACK] (128) is received? It looks pretty similar
to the missing 107.

Gustl

Bernhard 'Gustl' Bauer

Jan 20, 2010, 7:39:27 AM
to Mailing list for lwIP users
Hi,

I checked the memory where pbuf pool is located. On power up it is zero
except for the ->next pointers. Some time later MEM PBUF_POOL used is at
3 (max=5) even though there is no traffic. So I checked the memory again.
The top 3 pbufs (63, 62, 61) are like this:
->next=0
->tot_len=0
->len=0
->ref=1

pbuf (60) is like this:
->next=&pbuf[58]
->tot_len=0
->len=0
->ref=0

pbuf (59) is like this:
->next=&pbuf[59]
->tot_len=0
->len=0
->ref=0

All pbufs with ref=1 are not freed, all pbufs with ref=0 are freed. Is
this correct?

I crosschecked the pbufs with the attached wireshark file.
pbuf[63] = packet 55

pbuf[62] = packet 1028
pbuf[61] = packet 999

In all 3 cases this is a FIN packet from remote after a corrupt

transfer. From the pcap file I can only guess whether ACK (42, 1007,
983) and POST (43, 1008, 984) are missed, or passed on to my application.

I checked my http_recv(). I have 3 different exits:
1: pbuf_free(); tcp_abort(); return ERR_ABORT;
2: tcp_recved(); pbuf_free(); tcp_abort(); return ERR_ABORT;
3: tcp_recved(); pbuf_free(); return ERR_OK;
Is there anything wrong with an exit? Do I need tcp_recved() before
tcp_abort(); return ERR_ABORT; ?

Glad for any pointers.

Gustl


debug_20_01.pcap

Bill Auerbach

Jan 21, 2010, 4:20:46 PM
to Mailing list for lwIP users
Check to be sure your Ethernet driver isn't increasing pbuf->ref anywhere.
Some drivers do this so that the packet is freed only after a DMA transfer
is complete and not at the time low_level_output returns. If it's freed too
soon the packet can be corrupted if its pbuf is used before the DMA
finishes.

Bill

Bernhard 'Gustl' Bauer

Jan 22, 2010, 4:03:22 AM
to Mailing list for lwIP users
Bill Auerbach wrote:

> Check to be sure your Ethernet driver isn't increasing pbuf->ref anywhere.
> Some drivers do this so that the packet is freed only after a DMA transfer
> is complete and not at the time low_level_output returns. If it's freed too
> soon the packet can be corrupted if its pbuf is used before the DMA
> finishes.

I only use pbuf_alloc() and pbuf_free(). I never use pbuf_ref(), or
access pbuf->ref.

Bernhard 'Gustl' Bauer

Jan 27, 2010, 4:09:02 AM
to Mailing list for lwIP users
Hi,

I switched to 1.3.2 and everything seems to work fine now.

Thanks

Gustl
