Seeing same app data multiple times in dataHandler(TCPStream &)

81 views
Skip to first unread message

sushrut...@gmail.com

unread,
Apr 12, 2014, 8:01:42 AM4/12/14
to lib...@googlegroups.com
Hi,

I am using this TCPStreamFollower::followstreams(...) overload to follow tcp streams. Here are the steps I am following:
  - build a RawPDU
  - build a TCP PDU
  - build a IP PDU
  - tcp.inner_pdu(rawpdu)
  - ip.inner_pdu(tcp)
  - push the ip PDU into a vector<PDU> vecPDU
  - call followstreams(vecPDU.begin,vecPDU.end,dataHandler,endHandler)

The above is as per a discussion in a previous post.

Here is what I see if I browse to a website for e.g. - www.eicar.org/download/eicar.com.txt. My packet source library correctly prints out all packets from TCP handshake start, app data arrival, TCP FIN. I am able to pass all these to followstreams without any issues. However, when dataHandler is called, I see it being called with the same app data multiple times. Specifically, I should expect the following data - http headers + payload of the eicar.com.txt. All I see is multiple invocations of dataHandler with just the http headers and never the payload. endHandler seems to get called fine.

I am more or less thinking that I may not be doing something but am unable to pinpoint what. Would you be able to provide some hints? Please let me know if you have follow up queries.

Rgds,
Sushrut.

sushrut...@gmail.com

unread,
Apr 12, 2014, 9:15:47 AM4/12/14
to lib...@googlegroups.com

Might have found the reason for this - though can you confirm? -- I had taken the latest master as you recommended a week or so back after pushing in the 'clear_state' fix in tcp_stream.h. However, when I checked the tcp_stream.h from the latest master I pulled, I found that the clear_state function was missing. I am going to try to manually add this fix as it doesn't look too big in scope and restricted to one file. Could this be the reason for the behaviour below?

I am working remotely today and have connectivity problems -  will try later to put in this fix, if you can confirm too, that would be great.

Thanks,
Sushrut.

Matias Fontanini

unread,
Apr 12, 2014, 10:54:26 AM4/12/14
to lib...@googlegroups.com
You are right, that was a bug introduced when I updated
TCPStreamFollower to properly handle overlapped fragments. I've just
pushed a fix into the master branch. Let me know if you have any other
problems.

Regards,
Matias

On 12/04/14 10:15, sushrut...@gmail.com wrote:
> Might have found the reason for this - though can you confirm? -- I had
> taken the latest master as you recommended a week or so back after pushing
> in the 'clear_state' fix in tcp_stream.h. However, when I checked the
> tcp_stream.h from the latest master I pulled, I found that the clear_state
> function was missing. I am going to try to manually add this fix as it
> doesn't look too big in scope and restricted to one file. Could this be the
> reason for the behaviour below?
>
> I am working remotely today and have connectivity problems - will try
> later to put in this fix, if you can confirm too, that would be great.
>
> Thanks,
> Sushrut.
>
> On Saturday, April 12, 2014 5:31:42 PM UTC+5:30, sushrut...@gmail.com wrote:
>> Hi,
>>
>> I am using this<http://libtins.github.io/docs/latest/d5/d1a/classTins_1_1TCPStreamFollower.html#ae6c14eea64cd57c67d45f578426fd454>TCPStreamFollower::followstreams(...) overload to follow tcp streams. Here
>> are the steps I am following:
>> - build a RawPDU
>> - build a TCP PDU
>> - build a IP PDU
>> - tcp.inner_pdu(rawpdu)
>> - ip.inner_pdu(tcp)
>> - push the ip PDU into a vector<PDU> vecPDU
>> - call followstreams(vecPDU.begin,vecPDU.end,dataHandler,endHandler)
>>
>> The above is as per a discussion in a previous
>> <https://groups.google.com/d/msg/libtins/GEfjkCAjiLQ/B7jKM0kMycQJ>post.

sushrut...@gmail.com

unread,
Apr 12, 2014, 11:30:14 AM4/12/14
to lib...@googlegroups.com
Ok thanks. Will pull the latest...

Reds,
Sushrut.

sushrut...@gmail.com

unread,
Apr 14, 2014, 2:39:43 AM4/14/14
to lib...@googlegroups.com

Hello,

I tried using the latest built libtins but I am still having problems. Now, I essentially am not seeing any dataHandler or endHandler invocations. I believe I may not have understood the lib usage properly. A detailed psuedocode of what I am doing is attached. Can you help me understand what I am doing wrong? I make sure that the program is running and only then do actions that result in a new tcp session creation as well as tear down. I built this code after taking a look at the unit test code I saw in "/libtins-master/tests/src/tcp_stream.cpp".

  - Is the vector that I am using need to be global in scope? (I tried making it global in scope -  after that the behaviour I saw was exactly similar to what I wrote in the first post in this thread. I basically see the dataHandler being invoked multiple times for the HTTP GET response but post that there is a small text file being downloaded that I never get in any dataHandler invocation. endHandler also gets invoked.)
  - Should libtin's client own the responsibility of clearing out this vector (if it is global in scope)? Note that if I clear out the global vector after invoking the follow_stream function, I see no dataHandler or endHandler invocations.
  - Does follow_stream expect all possible packets of a particular session to be present when it is invoked - I believe that may not be the case? I ask because this is essentially what the unit test code in /tests/tcp_stream.cpp does, afaik.

Rgds,
Sushrut.
psuedocode.txt

Matias Fontanini

unread,
Apr 14, 2014, 7:41:39 AM4/14/14
to lib...@googlegroups.com
Hi,

On 14/04/14 03:39, sushrut...@gmail.com wrote:
> - Is the vector that I am using need to be global in scope? (I tried
> making it global in scope - after that the behaviour I saw was exactly
> similar to what I wrote in the first post in this thread. I basically see
> the dataHandler being invoked multiple times for the HTTP GET response but
> post that there is a small text file being downloaded that I never get in
> any dataHandler invocation. endHandler also gets invoked.)

It's okay that the follower is global.
> - Should libtin's client own the responsibility of clearing out this
> vector (if it is global in scope)? Note that if I clear out the global
> vector after invoking the follow_stream function, I see no dataHandler or
> endHandler invocations.

After follow_streams processes a packet. you can erase it, since the
stream follower will keep any information it needs to reassemble the
PDU. So clearing the vector after processing it is fine.
> - Does follow_stream expect all possible packets of a particular session
> to be present when it is invoked - I believe that may not be the case? I
> ask because this is essentially what the unit test code in
> /tests/tcp_stream.cpp does, afaik.

No, you can give it one packet at a time and that will work as well.

The best you could do would be to generate a pcap file and attach it
here. That way, I could see if there's a problem in the packet
generation or the stream reassembly. The problem may be just packet
generation itself.

You can do that by modifying your code a little bit and use PacketWriter:


// global variable
PacketWriter writer("output.pcap", PacketWriter::ETH2);

callback_fn(....)
{
// same initialization of ip, tcp and rawpdu variables as you showed
EthernetII eth = EthernetII() / ip / tcp / rawpdu;
writer.write(eth);

}

A pcap file "output.pcap" will be generated in your current directory.
If you could upload it, that would be great.

Regards,
Matias

sushrut...@gmail.com

unread,
Apr 14, 2014, 10:13:44 AM4/14/14
to lib...@googlegroups.com
Thanks Matias. Here is the attached pcap file - I added the code exactly as you showed below. Interestingly, when I opened it in Wireshark, I saw all packets appearing as "Fragmented IP protocol". Is this because I am missing something while forming the pdu's?

Rgds,
Sushrut.
output.zip

sushrut...@gmail.com

unread,
Apr 14, 2014, 10:18:02 AM4/14/14
to lib...@googlegroups.com
To add more color - what I am doing is essentially pointing my browser to  "www.eicar.org/download/eicar.com.txt".

Rgds,
Sushrut.

Matias Fontanini

unread,
Apr 14, 2014, 11:12:00 AM4/14/14
to lib...@googlegroups.com
That is indeed a problem. But I don't think it's libtins fault. It looks
like the other library indicates that all IP PDUs have a fragment offset
or 512. Note that that field is encoded as 0x40 in the IP header. This
is a very common value for the "flags" field. Are you sure you are
parsing the IP and TCP header correctly? Why don't you use the
constructors provided by libtins?

IP ip((const uint8_t*)ipHeader, sizeof(*ipHeader));

That ip PDU will contain all the fields in the parsed buffer. If there
is a TCP PDU in the buffer after the IP one, it will be parsed as well,
so the "ip" variable will contain the parsed tcp packet as its inner_pdu.

It also looks like there are other troubles with your other library. I
removed the fragment offset issue and it still looks like there are some
sequence number holes(just by looking at it with wireshark).

I'd advise you to first, successfully parse the packets. Keep dumping
them to a pcap file until you see that the output seems correct. I don't
think libtins is causing these issues.

Regards,
Matias

On 14/04/14 11:13, sushrut...@gmail.com wrote:
> Thanks Matias. Here is the attached pcap file - I added the code exactly as
> you showed below. Interestingly, when I opened it in Wireshark, I saw all
> packets appearing as "Fragmented IP protocol". Is this because I am missing
> something while forming the pdu's?
>
> Rgds,
> Sushrut.
>
> On Monday, April 14, 2014 5:11:39 PM UTC+5:30, Matias Fontanini wrote:
>> Hi,
>>

sushrut...@gmail.com

unread,
Apr 14, 2014, 11:20:54 AM4/14/14
to lib...@googlegroups.com
Appreciate the quick response Matias! Looks like I'll have revalidate all the header parsing code. I do not think the problem is with the packet source library, most likely, my interpretation of its packets. Will let you know how it goes....

Rgds,
Sushrut.

sushrut...@gmail.com

unread,
Apr 15, 2014, 5:14:13 AM4/15/14
to lib...@googlegroups.com
How critical are the following for proper reassembly? I am not currently setting them as I don't get them readily from my packet source lib (I can of course further parse them out if they are required):

for IP --> eol, noop, security, lsrr, ssrr, record_route, stream_identifier.
for TCP --> mss, winscale, sack_permitted, sack, timestamp, altchecksum

Rgds,
Sushrut.

Matias Fontanini

unread,
Apr 15, 2014, 8:39:52 AM4/15/14
to lib...@googlegroups.com
As I mentioned in another post, only sequence numbers and the actual
payload are used to reassemble streams. Of course, if some packet had,
for examplme, IP fragmentation and you didn't set the IP fields
correctly, then that packet would not be parsed.
Reply all
Reply to author
Forward
0 new messages