Reassembly Of Severely Fragmented TCP Data

Sep 1, 2010 at 3:40 AM

Hey everyone,

I'm trying to write a parser for a proprietary TCP protocol that is in use at my company. Basically, we are being fed data from the internet, and the service on our end needs to listen and send an acknowledgment (ACK) message when a full data message has been received (or a NACK upon error). Currently, I am having problems parsing/reassembling the data packets because they are severely fragmented. In some cases, I'm only receiving a single byte in the TCP payload. I have read the whole "From wire to window" tutorial, read most of the NPL help, and used the "Parsing Protocol in TCP Stream" post (http://nmparsers.codeplex.com/Thread/View.aspx?ThreadId=203503) here on CodePlex as a guide for reassembly. All of these have helped, but I'm still not able to successfully parse this data.

I'm dealing with 3 different messages:

ACK Message: MessageType (2 bytes; always 'IM'), Length (2 bytes), Message (3 bytes; always 'ACK'), Checksum (1 byte) // length includes all fields, so length should always be 0x08 for this message

NACK Message: MessageType (2 bytes; always 'IM'), Length (2 bytes), Message (4 bytes; always 'NACK'), Checksum (1 byte) // length includes all fields, so length should always be 0x09 for this message

INFO Message: MessageType (2 bytes; always 'IM'), SenderID (10 bytes), Length (2 bytes), DATA (variable length; at least 29 bytes), Checksum (1 byte) // the minimum length of this message type is 44 bytes
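For reference, the three layouts above could be sketched in Python roughly as follows. This is only an illustration, not the real parser: the function names are made up, the byte order of the Length field is an assumption (big-endian by default here), and the checksum is returned without being verified because its algorithm is not described in this thread.

```python
MAGIC = b"IM"          # MessageType, always 'IM'
INFO_HEADER_LEN = 14   # MessageType (2) + SenderID (10) + Length (2)


def parse_ack_or_nack(buf, byteorder="big"):
    """Parse 'IM', Length (2), 'ACK' or 'NACK', Checksum (1).

    byteorder is an assumption; the thread does not state it.
    Returns (text, checksum_byte). The checksum is not verified.
    """
    if len(buf) < 4 or buf[:2] != MAGIC:
        raise ValueError("not an IM message")
    length = int.from_bytes(buf[2:4], byteorder)
    if length != len(buf):
        # Length covers all fields: 8 for ACK, 9 for NACK.
        raise ValueError("Length field does not match message size")
    text = bytes(buf[4:-1])
    if text not in (b"ACK", b"NACK"):
        raise ValueError("unexpected message text")
    return text.decode("ascii"), buf[-1]


def parse_info_header(buf, byteorder="big"):
    """Read SenderID and Length from the first 14 bytes of an INFO message."""
    if len(buf) < INFO_HEADER_LEN or buf[:2] != MAGIC:
        raise ValueError("need 14 bytes starting with 'IM'")
    sender_id = bytes(buf[2:12])
    length = int.from_bytes(buf[12:14], byteorder)
    return sender_id, length
```

Note that the Length field sits at offset 2 in ACK/NACK but at offset 12 in INFO, so the message kind has to be known (for example, from the traffic direction) before Length can be read.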

 

So, I have no trouble capturing and parsing the ACK and NACK messages, since the capture is running on the server they are being sent from (thus no fragmenting). However, reassembling the INFO messages is not working at all. Basically, the "Parsing Protocol in TCP Stream" post assumes that you have received enough data in the current frame to determine which message type you're dealing with and how long it is. What I'm stumbling over is how I can "buffer" data so that I have enough to even get to the offset that contains the length.

Am I just not doing the reassembly correctly, or is there a good description of how reassembly works that I've missed somewhere?

 

Any help would be appreciated!

Thanks,

-Dave

Sep 1, 2010 at 11:19 AM

Hi Dave,

Thanks for using Netmon and your interest in writing your own parser.

Before I can help, I would like to request some more information:

1. Can you identify an INFO message without looking at the content of the Message field? For example, is it sent from client to server, while the other two are sent from server to client?

2. Are the MessageType/SenderID/Length fields being fragmented, or just the DATA/Checksum fields?

BTW, if you still have the chance to change the message format, I recommend always following the TLV (Type Length Value) sequence when designing the field layout. That makes the message much easier to process and gives better compatibility between versions.

Thanks

Luther

Sep 1, 2010 at 2:19 PM

Thanks for the reply Luther.

 

To answer your first question, you are correct. The INFO messages only originate from the (remote) client to our server, and the ACK/NACK messages are sent from our server to the client. As for the second question, the whole INFO message is fragmented. TCP is ensuring that I am getting the message in the proper order, but it is not reassembling the payload into one properly formatted message. Some of the time, a single frame will only contain 1 byte of data. However, if you go frame to frame, all of the data is there. For example:

Frame 1: Data: I

Frame 2: Data: N

Frame 3: Data: 0

Frame 4: Data: 1

(and so on)

At most, I need to buffer 14 bytes of the incoming client data to be able to get at the length field.
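To make the idea concrete, here is a minimal Python sketch of the kind of buffering I mean. It is not NPL and not something Network Monitor provides; the class name is made up, and it assumes that the Length field is big-endian and that it covers the whole INFO message (as it does for ACK/NACK), neither of which is confirmed.

```python
from typing import List


class InfoReassembler:
    """Accumulates TCP payload bytes and yields complete INFO messages."""

    HEADER_LEN = 14  # MessageType (2) + SenderID (10) + Length (2)
    MIN_LEN = 44     # minimum INFO message size

    def __init__(self, byteorder: str = "big"):
        self._buf = bytearray()
        self._byteorder = byteorder  # assumption: not stated by the protocol owner

    def feed(self, payload: bytes) -> List[bytes]:
        """Add one frame's payload; return any messages completed by it."""
        self._buf += payload
        messages = []
        # Only once 14 bytes are buffered can the Length field be read.
        while len(self._buf) >= self.HEADER_LEN:
            if self._buf[:2] != b"IM":
                raise ValueError("stream does not start with 'IM'")
            total = int.from_bytes(self._buf[12:14], self._byteorder)
            if total < self.MIN_LEN:
                raise ValueError("Length below the 44-byte minimum")
            if len(self._buf) < total:
                break  # wait for more frames
            messages.append(bytes(self._buf[:total]))
            del self._buf[:total]
        return messages
```

Feeding the payloads one byte at a time returns an empty list until the last byte of a message arrives, at which point the full message comes back and the ACK could be sent.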

 

-Dave

Sep 1, 2010 at 2:22 PM

Oh, and in regard to your comment about the field ordering: this is a protocol from a 3rd party, so as much as I also want to change it, I can't ;)

Sep 2, 2010 at 3:44 AM

Luther,

    Just in case I haven't swamped you with enough information yet, I took some more time today to review the NPL documentation and read through the HTTP.npl parser, which helped clear up a number of things in my mind, but I have a few general questions that I'd really appreciate some help with.

 

The reassembly that Network Monitor is able to do via PayloadStart requires that the protocol parser be able to detect the start of a data frame (via IsFirst) and its total length given only the frame it is looking at. There is no way to go back to previous frames, look forward, or glob frames together, correct?

Is the only way to look backward in the frame list to set properties when they are encountered and use them in future frames?

Do I have a fundamental misunderstanding of how these parsers are supposed to reassemble frames?

Thanks (yet again),

-Dave

Sep 2, 2010 at 1:55 PM

Hi Dave,

Firstly let me try to answer your questions:

The reassembly that Network Monitor is able to do via PayloadStart requires that the protocol parser be able to detect the start of a data frame (via IsFirst) and its total length given only the frame it is looking at. There is no way to go back to previous frames, look forward, or glob frames together, correct?

[Luther] In a PayloadStart invocation, you can use either fields inside the same frame or values stored previously. To pass values between frames, you can use Conversation or Global variables. But you cannot "look forward".

Is the only way to look backward in the frame list to set properties when they are encountered and use them in future frames?

[Luther] In general you cannot directly access other frames, but as I said in my answer to the previous question, you can store a value and use it in later frames.

 

Unfortunately, the Netmon reassembly engine/syntax is designed to reassemble frames whose payload is fragmented. There is no easy way to "store the fragments in a buffer". If possible, can you share your capture data with me so I can take a look and see what I can do? You can erase sensitive data by editing the capture; make sure to uncheck the "hex readonly" option in the Edit menu before editing. You can open a bug in the issue tracker and upload the file there.

In the meantime, you may also want to look at other options, including the Netmon API (use C#/C to read the parsed data and continue processing it as you want) and Netmon Experts, http://nmexperts.codeplex.com/ (custom plugins that interact with the Netmon UI), to get more control and flexibility.
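As a rough illustration of what processing the data outside the parser might look like, here is a small Python sketch. It does not use the Netmon API; it assumes the frames have already been exported as (src_ip, src_port, dst_ip, dst_port, payload) tuples in TCP order with no retransmissions, and the function name and the server port are hypothetical. It keeps only the client-to-server direction, since that is where the INFO messages travel, and joins each connection's payloads into one byte stream that can then be split by the Length field.

```python
from collections import defaultdict


def group_client_payloads(frames, server_port):
    """Concatenate client-to-server payloads per TCP connection.

    frames: iterable of (src_ip, src_port, dst_ip, dst_port, payload)
    tuples, assumed to be in order and free of retransmissions.
    """
    streams = defaultdict(bytearray)
    for src_ip, src_port, dst_ip, dst_port, payload in frames:
        if dst_port != server_port:
            continue  # server-to-client traffic (ACK/NACK) is skipped
        streams[(src_ip, src_port, dst_ip, dst_port)] += payload
    return {conn: bytes(data) for conn, data in streams.items()}


# Usage with made-up addresses and a made-up server port of 5000:
frames = [
    ("10.0.0.2", 40000, "10.0.0.1", 5000, b"I"),
    ("10.0.0.2", 40000, "10.0.0.1", 5000, b"M"),
    ("10.0.0.1", 5000, "10.0.0.2", 40000, b"IM\x00\x08ACK\x00"),
    ("10.0.0.2", 40000, "10.0.0.1", 5000, b"0"),
]
streams = group_client_payloads(frames, server_port=5000)
```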