Design of SocketReplay and capture scheme

This chapter details the design of SocketReplay and the capture scheme, which together form a complete solution for traffic capture and replay in large-scale environments.

3.1 Design Goals

The objective of capture is to store only the traffic that is valuable enough to trigger events. In other words, the capture scheme discards part of the payload of packets that have a low probability of triggering events, thereby decreasing the volume of traffic that must be replayed to trigger events.

Traffic replay has three objectives: (1) remain compatible with incomplete TCP connections caused by capture loss or by the capture scheme above; (2) replay statefully at the TCP/IP layer, because most DUTs (devices under test), including NAT devices, proxies, and IPSs (Intrusion Prevention Systems), modify TCP/IP headers; and (3) selectively replay packet traces to reproduce events, which helps designers analyze which sessions or connections trigger an event.

3.2 Low-Storage capture scheme

In this work, the capture scheme applies three thresholds (N, M, P) to the payload of each captured connection. N is the number of payload bytes stored in full for each connection. M is the number of payload bytes stored per packet once the stored data exceeds N. P is the number of packets stored after N is exceeded. N is set because we believe most events can be triggered within the first bytes of a connection. M is set because we believe that, after the first N bytes of a connection, most events can be analyzed from the application header, which usually occupies only a few bytes per packet. P is set because we believe most events are triggered by the first packets of each connection. Figure 3 illustrates an example: if the payload of the first three packets already reaches the threshold of N bytes, only the first M bytes of payload are captured from each of the following P packets, and the remaining data is ignored.

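[Figure 3: packets 1 to 3 carry the first N bytes of payload, stored in full; packets 4 to P+3 each keep M bytes of payload; the rest of their payload and the packets from P+4 onward are ignored data.]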

Figure 3. Ignored data of three thresholds (N, M, P)
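The three thresholds can be read as a small per-connection decision rule. The following is a minimal sketch of that rule, not the capture tool itself. The names `ConnectionCaptureState` and `payload_bytes_to_store` are hypothetical, and two details are assumptions that the text does not specify: the packet that crosses the N-byte budget is truncated at the budget, and packets without payload do not count toward P.

```python
from dataclasses import dataclass


@dataclass
class ConnectionCaptureState:
    """Per-connection counters (hypothetical helper, not from the source)."""
    stored_bytes: int = 0       # payload bytes stored under the N budget
    truncated_packets: int = 0  # packets stored under the M/P rule


def payload_bytes_to_store(state, payload_len, n, m, p):
    """Return how many leading payload bytes of this packet to keep."""
    if payload_len == 0:
        # Assumption: pure ACKs and other empty packets do not count toward P.
        return 0
    if state.stored_bytes < n:
        # Still inside the first N bytes of the connection.
        # Assumption: the packet that crosses N is cut at the budget.
        keep = min(payload_len, n - state.stored_bytes)
        state.stored_bytes += keep
        return keep
    if state.truncated_packets < p:
        # After N bytes: keep only the first M bytes of the next P packets.
        state.truncated_packets += 1
        return min(payload_len, m)
    # After N bytes and P packets: the payload is ignored.
    return 0
```

With N = 300, M = 20, and P = 2, seven packets of 100 bytes each would keep 100, 100, 100, 20, 20, 0, and 0 bytes, which matches the layout of Figure 3 with three full packets.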

3.3 SocketReplay

SocketReplay is a stateful traffic replay tool suited to large-scale environments. It works in several stages. Loss-recovery reconstructs complete streams from the output of the capture scheme. Stateful replay mimics hosts to generate traffic without breaking protocol semantics. After stateful replay triggers events, selective replay narrows down the replayed packet trace to reproduce them.

3.3.1 Loss-Recovery

Loss-recovery is a stage that turns incomplete connections, i.e., broken streams produced by the capture scheme above or by capture loss, into complete streams by inserting dummy bytes, so that SocketReplay can replay each connection with the original length of its stream. The length of ignored data can be calculated from the TCP/IP header of the packet. The payload length of a packet lost during capture can be found by inspecting the sequence and acknowledgment numbers of the surrounding packets.
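The header-based calculation can be illustrated for a single truncated packet. This is a minimal sketch under stated assumptions: the captured bytes start at an IPv4 header (no link-layer header), the packet carries TCP, and both headers were captured in full. The function name `ignored_payload_length` is hypothetical. The original payload length is the IPv4 total length minus the IP header length and the TCP header length; the ignored part is whatever the capture did not keep.

```python
import struct


def ignored_payload_length(captured: bytes) -> int:
    """Number of payload bytes that the capture dropped from one packet.

    Assumes `captured` begins at the IPv4 header and contains the complete
    IPv4 and TCP headers, followed by the payload bytes that were kept.
    """
    ip_header_len = (captured[0] & 0x0F) * 4            # IHL, in 32-bit words
    total_length = struct.unpack_from("!H", captured, 2)[0]
    tcp_header_len = (captured[ip_header_len + 12] >> 4) * 4  # data offset
    original_payload = total_length - ip_header_len - tcp_header_len
    captured_payload = len(captured) - ip_header_len - tcp_header_len
    return original_payload - captured_payload
```

For a packet whose headers announce 100 bytes of payload but whose capture holds only the first 30, the function returns 70, which is the number of dummy bytes needed to restore the original stream length.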

For example, Figure 4 shows an established connection between host A and host B. In the real environment, all six packets are transmitted to their destinations successfully. During capture, the fourth packet is lost. The mechanism proceeds packet by packet as follows. (1) The 1st packet is queued because we are not yet sure whether it reached the destination. (2) The sequence number of the 2nd packet is checked to see whether the two segments overlap. Again, the packet is queued because we are not yet sure whether it reached the destination. (3) The ACK packet from host B confirms that the 1st and 2nd packets were transmitted successfully, so we put their 20 bytes of data into the stream. (4) The 4th packet is missing from the capture. (5) The sequence numbers of the 2nd and 5th packets are not contiguous. This can also happen when the 4th and 5th packets arrive out of order, so we cannot yet tell whether the 4th packet is lost. (6) This ACK confirms that the 4th packet was lost from the capture and that the data of the 5th packet was transmitted successfully. Therefore, we put 20 dummy bytes and the data of the 5th packet into the stream. After these operations, the stream contains 50 bytes.

Packet No.    1         2         3         4         5         6
Hosts         A → B     A → B     B → A     A → B     A → B     B → A
Seq., Ack.    a, b      a+10, b   b, a+10   a+20, b   a+40, b   b, a+50
Data length   10 bytes  10 bytes  0 bytes   20 bytes  10 bytes  0 bytes

Figure 4. An example of loss-recovery for Established TCP connection with one packet capture loss
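The walk-through can be sketched for one direction of a connection (A to B). This is a simplified illustration under stated assumptions, not SocketReplay's implementation: the class `StreamRebuilder` and its methods are hypothetical names, sequence-number wraparound is ignored, and dummy bytes are zero bytes. Data segments are queued until an ACK from the peer covers them; any gap below the acknowledged position is filled with dummy bytes.

```python
DUMMY = b"\x00"  # assumption: the filler byte value is arbitrary


class StreamRebuilder:
    """Rebuilds one direction of a TCP stream from captured packets."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq  # next sequence number the stream expects
        self.pending = {}            # seq -> payload, not yet acknowledged
        self.stream = bytearray()

    def on_data(self, seq, payload):
        """Queue a captured segment until the peer acknowledges it."""
        if payload and seq + len(payload) > self.next_seq:
            self.pending.setdefault(seq, payload)

    def on_ack(self, ack):
        """The peer acknowledged everything below `ack`: commit it."""
        for seq in sorted(self.pending):
            payload = self.pending[seq]
            if seq + len(payload) > ack:
                break  # not fully acknowledged yet, keep it queued
            del self.pending[seq]
            self._fill_gap(seq)
            overlap = self.next_seq - seq  # bytes already in the stream
            if overlap < len(payload):
                self.stream += payload[overlap:]
                self.next_seq = seq + len(payload)
        self._fill_gap(ack)  # acknowledged data missing from the capture

    def _fill_gap(self, upto):
        if upto > self.next_seq:
            self.stream += DUMMY * (upto - self.next_seq)
            self.next_seq = upto
```

Feeding the A-to-B packets of the example with a = 0 (data at 0 and 10, an ACK of 20 as step (3) of the prose implies, data at 40, an ACK of 50) yields a 50-byte stream whose bytes 20 to 39 are dummy bytes standing in for the lost 4th packet.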

3.3.2 Stateful Replay

Because loss-recovery constructs a complete stream, this stage focuses on emulating all TCP and UDP connections. Previous work [8, 9] proposes methods to determine packet order and emulate TCP connections. This work determines packet order from the order in which data was inserted into the stream during the loss-recovery stage. It also follows [5] in using the socket API to emulate TCP and UDP connections.
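Replaying through the socket API can be sketched as two roles that walk the same recovered stream in order: each side sends the chunks it originally sent and waits for the chunks it originally received. This is a minimal sketch, not the tool's actual design. The script format (a list of sender and data pairs) and all function names are assumptions, and `socket.socketpair()` stands in for a real connection that would pass through the DUT.

```python
import socket
import threading


def recv_exact(sock, n):
    """Read exactly n bytes, or fewer if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return bytes(buf)


def replay_role(sock, role, script):
    """Play one side of a recorded exchange and return the bytes received."""
    received = bytearray()
    for sender, data in script:
        if sender == role:
            sock.sendall(data)  # our turn: send the recorded bytes
        else:
            received += recv_exact(sock, len(data))  # wait for the peer
    return bytes(received)


def replay_over_socketpair(script, timeout=5.0):
    """Run both roles locally; a real setup would connect through the DUT."""
    client, server = socket.socketpair()
    results = {}

    def run(role, sock):
        sock.settimeout(timeout)
        with sock:
            results[role] = replay_role(sock, role, script)

    worker = threading.Thread(target=run, args=("server", server))
    worker.start()
    run("client", client)
    worker.join()
    return results
```

Because the operating system's stack generates the TCP/IP headers, the order of the script entries is the only state the replayer has to track, and dummy bytes inserted by loss-recovery are sent like any other payload.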

3.3.3 Selective Replay

After the stateful replay stage, some events have been triggered. To analyze how an event is triggered, i.e., which sessions, connections, or packets trigger it, it is better to replay selectively from the large packet trace. This work achieves selective replay by using the event information together with the replay log from stateful replay.

An event may include time information, connection information, and an error or alert message. In addition, the times at which each connection was established and closed can be obtained from the replay log. This work therefore selects candidate connections and tests whether they reproduce the event; if they do not, more connections are included. Figure 5 shows an example in which the event includes time information and the address of connection 5. The procedure for selecting connections to replay is as follows: (1) SocketReplay replays connection 5 to see whether it reproduces the event. (2) If it does not, SocketReplay includes the connections that have the same IP addresses as connection 5. In this case, it includes connections 1 and 5. (3) If it still does not, SocketReplay includes the connections that are established at time t. In this case, it includes connections 1, 2, and 5. (4) If it still does not, SocketReplay includes the connections most recently closed before time t. In this case, it includes connections 1, 2, 3, and 5. (5) If it still does not, SocketReplay includes more connections that were closed before time t.

Note that if the event does not provide connection information, SocketReplay skips steps 1 and 2.

Figure 5. An example of selective replay
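The escalation in steps (1) to (5) can be sketched as a generator of growing candidate sets, each of which would be replayed until the event is reproduced. This is a minimal sketch under stated assumptions: `Conn` and `candidate_sets` are hypothetical names, "same IP addresses" is read as the same unordered pair of endpoints, and step (5) adds one earlier-closed connection at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conn:
    cid: int      # connection number in the replay log
    src: str
    dst: str
    start: float  # time the connection was established
    end: float    # time the connection was closed


def candidate_sets(conns, t, event_cid=None):
    """Yield growing lists of connection ids to replay for an event at time t."""
    by_id = {c.cid: c for c in conns}
    selected = set()
    if event_cid is not None:
        # Step 1: the connection named in the event.
        target = by_id[event_cid]
        selected.add(event_cid)
        yield sorted(selected)
        # Step 2: connections between the same pair of IP addresses.
        pair = frozenset((target.src, target.dst))
        selected |= {c.cid for c in conns if frozenset((c.src, c.dst)) == pair}
        yield sorted(selected)
    # Step 3: connections that are established (open) at time t.
    selected |= {c.cid for c in conns if c.start <= t <= c.end}
    yield sorted(selected)
    # Steps 4 and 5: connections closed before t, most recently closed first.
    closed = sorted((c for c in conns if c.end < t),
                    key=lambda c: c.end, reverse=True)
    for c in closed:
        if c.cid not in selected:
            selected.add(c.cid)
            yield sorted(selected)
```

The timings in Figure 5 are not available in the text, so any trace used with this sketch is invented. For instance, with connections 1 and 5 sharing endpoints, connections 1, 2, and 5 open at time t, and connection 3 closing after connection 4, the generator yields [5], [1, 5], [1, 2, 5], [1, 2, 3, 5], and then [1, 2, 3, 4, 5]. When the event carries no connection information, the first two sets are skipped.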
