DESIGN OF A TIGHTLY-COUPLED ARCHITECTURE - An Integrated Proxy Server Architecture for Anti-Virus, Anti-Spam, Intrusion Detection, and Content Filtering

3.1 Solution Ideas

As mentioned in Chapter 2, a loosely integrated architecture incurs considerable overhead during packet processing: inter-process communication between different processes, duplicate kernel/user space interactions made by different operations in different applications, and process forking to serve many clients concurrently.

In this chapter, we propose a tightly integrated architecture to reduce these overheads. First, to reduce redundant inter-process communication, we design the new architecture as a standalone proxy server that does not depend on a cooperating server. Second, we integrate applications that deal with the same application protocol into one, which eliminates the duplicate kernel/user space interactions made by their separate operations. Third, we restructure the proxy server as a single process and use one of two techniques to serve many clients concurrently. The first is the select() system call, which examines sets of I/O descriptors to see whether any of them is ready for reading, is ready for writing, or has an exceptional condition pending. With it, we can multiplex I/O over many socket descriptors and thus serve many clients concurrently. The second is multi-threading. Threads are lighter-weight than processes and are therefore expected to serve more clients [24].
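The select()-based multiplexing described above can be sketched as follows. This is only an illustrative sketch in Python, not the code of the proxy itself: the function name poll_once and the upper-casing "work" done on each request are assumptions made for the example.

```python
import select
import socket


def poll_once(clients, listener=None, timeout=1.0):
    """Run one pass of a select()-based loop over many sockets.

    `clients` is a list of connected sockets and is modified in place.
    Returns the number of sockets that were ready in this pass.
    """
    watched = list(clients)
    if listener is not None:
        watched.append(listener)
    if not watched:
        return 0
    # Block until at least one descriptor is readable or the timeout expires.
    readable, _, _ = select.select(watched, [], [], timeout)
    for sock in readable:
        if sock is listener:
            # A new client is waiting: accept it and watch it from now on.
            conn, _addr = listener.accept()
            clients.append(conn)
            continue
        data = sock.recv(4096)
        if data:
            # Stand-in for real proxy work (hypothetical): echo in upper case.
            sock.sendall(data.upper())
        else:
            # An empty read means the peer closed the connection.
            clients.remove(sock)
            sock.close()
    return len(readable)
```

A server would call poll_once() repeatedly in a single process. Because every ready client is served inside the same loop, the work done per request has to stay short, otherwise the other clients wait.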

3.2 The New Architecture

Figure 2 shows our tightly integrated architecture and its traffic flow. We separate the architecture into two parts: HTTP and SMTP. In the HTTP part, we replace DansGuardian with a self-developed Web proxy server, Webfd, which has simple URL filtering and keyword blocking functions. It uses the select() system call to serve many clients concurrently, unlike DansGuardian, which forks a new process to serve each new client. Multiplexing with select() in a single process is more scalable and efficient because it avoids process creation and context switches between processes. To filter Web content better, we also supplement Webfd with the content filtering part extracted from DansGuardian.

FIGURE 2 Our integration architecture and the new packet flows.

On the other hand, to reduce duplicate kernel/user space interactions and to provide intrusion prevention instead of only intrusion detection, we rewrite Snort as a shared library that Webfd calls to detect or prevent intrusions. Snort originally sniffs packets with libpcap, which copies packets passing through the network interface from kernel space to user space. In that mode, Snort can only detect intrusions on the network link. Because Webfd is a proxy, it receives the reassembled data from the TCP/IP protocol stack and then calls the rewritten Snort shared library to detect possible intrusions. If an intrusion is found, the traffic can be blocked. Making Snort a shared library also allows proxies of other protocols to call it to detect and prevent intrusions: any proxy can fill the Packet data structure in the Snort library and start the detection engine. Furthermore, only one copy of Snort resides in memory.

In our architecture, intrusion detection and prevention for a new protocol relies on a proxy for that protocol to extract the content to be inspected. However, it is not worth running a new proxy if we only want to detect intrusions through such a protocol. Snort originally uses libpcap, which offers a simple way to sniff packets and is an easier and more efficient approach for pure intrusion detection. We can hence simply run Snort in sniffing mode if we only want to detect intrusions.

After our integration, the Web traffic flow in a gateway from the client side to the server side becomes as follows:

(1) The kernel passes packets to localhost:880.

(2) The request is received by Webfd through the TCP/IP protocol stack.

(3) Webfd checks whether access to the requested URL is permitted.

(4) Webfd calls the Snort library to check whether the content of the request contains intrusion signatures.

(5) Webfd makes a connection to the Web server if necessary.
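Steps (3) to (5) of the flow above amount to an ordered sequence of checks that ends in a forwarding step. The sketch below illustrates that ordering only. The function handle_request, its parameters, and the 403 replies are hypothetical names chosen for the example and are not taken from Webfd.

```python
def handle_request(url, raw_request, url_allowed, has_intrusion, forward):
    """Apply the checks of steps (3)-(5) in order to one Web request.

    `url_allowed`, `has_intrusion`, and `forward` are callables supplied
    by the caller (hypothetical stand-ins for the URL filter, the
    detection library, and the upstream connection).
    Returns a (status, body) pair.
    """
    # Step (3): reject the request if the URL is not permitted.
    if not url_allowed(url):
        return 403, b"URL blocked"
    # Step (4): reject the request if its content matches a signature.
    if has_intrusion(raw_request):
        return 403, b"intrusion signature found"
    # Step (5): only now is a connection to the Web server made.
    return forward(url, raw_request)
```

Because the checks run before step (5), a blocked request never causes an upstream connection, which is what turns detection into prevention in a proxy.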

In the SMTP part, we modify AMaViS into a standalone mail proxy server. Furthermore, to make it more scalable, we change AMaViS from a multi-process proxy server to a multi-threaded one. We adopt multi-threading instead of the select() system call in the mail proxy server because processing a mail takes much longer than processing a Web request or response. In our benchmarking, the processing times of a Web request and a Web response are at most tens and hundreds of milliseconds, respectively, whereas the processing time of a mail is hundreds to several thousands of milliseconds. In a single-process select() loop, every other ready client would have to wait while one mail is being processed, which would degrade the concurrency of the mail proxy server. The mail proxy server is hence modified into a multi-threaded proxy server. We also run ClamAV in daemon mode to save the time of loading ClamAV and its signatures for each mail, so AMaViS communicates with the ClamAV daemon through a socket. The original SpamAssassin remains a Perl library residing in memory. AMaViS also calls the rewritten Snort shared library to detect possible intrusions.

After the integration, the mail traffic flow in a gateway from the client side to the server side becomes as follows:

(1) The kernel passes packets to localhost:10024.

(2) The mail is received by AMaViS through the TCP/IP protocol stack.

(3) AMaViS calls the Snort library to check whether the content of the mail contains intrusion signatures.

(4) AMaViS calls SpamAssassin to check the mail for spam.

(5) AMaViS sends a message to the ClamAV daemon to scan the attached files.

(6) AMaViS relays the mail to the next mail server.

After the integration, the redundant inter-process communications and the duplicate user/kernel space interactions are eliminated. Compared with the old architecture, what remains is a total of two user/kernel space interactions in step (2) of both parts, one inter-process communication in step (5) of the SMTP part, and one file system access in step (2) of the SMTP part. Furthermore, there are only two server processes. The improvement brought by the integration is shown in the next chapter.

3.3 Implementation

There are three major changes in the new implementation: standalone AMaViS, Snort as a shared library, and Webfd.

3.3.1 Standalone AMaViS

Two important modifications make AMaViS a multi-threaded, standalone server. By standalone, we mean that AMaViS can serve clients without the support of a mail server. First, AMaViS originally uses the Net::Server::PreForkSimple module. Net::Server [25] is an extensible, generic Perl server engine, so we can easily extend Net::Server or modify its subclasses directly and replace Net::Server::PreForkSimple with the result. We use the threads [26] module, which is new in Perl 5.8, to implement the multi-threaded server. Specifically, we modify the subroutine loop() in the Net::Server::Fork module. This subroutine forks a process to serve a new client when a request arrives; we replace that piece of code with one that creates a new thread. Second, AMaViS originally forwards the checked mail to the local mail server in the subroutine mail_via_smtp_single of Amavis::Out. We modify this piece of code to relay the mail to the destination mail server instead.
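The change from forking a process to creating a thread for each new client can be illustrated with the sketch below. It is written in Python rather than Perl and is not the modified Net::Server code. The names spawn_worker, handle_client, and accept_loop are hypothetical, and `process` stands for whatever per-mail work the server performs.

```python
import socket
import threading


def handle_client(conn, process):
    """Serve one client on its own thread: read, process, reply, close."""
    with conn:
        data = conn.recv(65536)
        if data:
            conn.sendall(process(data))


def spawn_worker(conn, process):
    """Create a thread for a new client at the point where a fork used to be."""
    worker = threading.Thread(target=handle_client, args=(conn, process))
    worker.start()
    return worker


def accept_loop(listener, process):
    """Accept clients forever; each one gets its own thread."""
    while True:
        conn, _addr = listener.accept()
        spawn_worker(conn, process)
```

With one thread per client, a mail that takes seconds to scan blocks only its own thread, while the accept loop and the other clients keep running.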

3.3.2 Snort as a Shared Library

To make Snort a shared library, we compile each source file with the option -fPIC and then use the command ld -shared to build the shared library. Next, to perform intrusion detection or prevention with Snort, we call fpInitDetectionEngine() to initialize the detection engine and CreateDefaultRules() to create the default rules when Webfd starts. When a request or response arrives, the Packet data structure is filled with the content of the request or response, and the function fpEvalPacket() then analyzes the Packet. Finally, the detection result is returned to the caller, i.e., Webfd or AMaViS in this case.
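The calling pattern described above (initialize once, load the rules once, then fill a Packet and evaluate it for every request or response) can be mimicked with a small stand-in. The sketch below is not Snort's API. The class DetectionEngine, its methods, the fields of Packet, and the two example rules are all hypothetical and only mirror the roles of fpInitDetectionEngine(), CreateDefaultRules(), and fpEvalPacket().

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Simplified stand-in for the structure a proxy fills in (hypothetical fields)."""
    dst_port: int
    payload: bytes


class DetectionEngine:
    """Toy detection engine with the same life cycle as the library calls."""

    def __init__(self):
        # Plays the role of engine initialization at proxy start-up.
        self.rules = []

    def create_default_rules(self):
        # Each rule: (destination port or None for any, content, message).
        # These two rules are invented for the example.
        self.rules = [
            (80, b"/etc/passwd", "sensitive file access attempt"),
            (None, b"\x90\x90\x90\x90", "possible NOP sled"),
        ]

    def eval_packet(self, packet):
        """Return the message of the first matching rule, or None."""
        for port, content, message in self.rules:
            if port in (None, packet.dst_port) and content in packet.payload:
                return message
        return None
```

A proxy would create the engine and its rules once at start-up and then call eval_packet() for every request or response, blocking the traffic when the result is not None.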

3.3.3 Webfd with the Features of DansGuardian

Part of DansGuardian is compiled as a library called by Webfd. The request inspection of DansGuardian is in the class OptionContainer. The check of whether the URL matches the specified keywords by regular expression is in the function inBannedRegExpURL(). The checks of whether the URL and the site are permitted are in the functions inBannedURLList() and inBannedSiteList(), respectively. Webfd calls these functions sequentially when processing a request. On the other hand, the response inspection of DansGuardian is in the class NaughtyFilter. The check of whether the statistical score of the specified keywords appearing in the response exceeds the threshold is in the function checkme(). Similarly, Webfd calls this function when processing a response.
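The two kinds of inspection described above can be sketched as follows. This is a simplified illustration and not DansGuardian's code. The lists, weights, and threshold are invented for the example, and the scoring rule (weight times number of occurrences) is an assumption about how a statistical keyword score might be computed.

```python
import re
from urllib.parse import urlsplit

# Hypothetical configuration, invented for this example.
BANNED_REGEX = re.compile(r"adult|gambl", re.IGNORECASE)
BANNED_URLS = {"news.example/ads"}
BANNED_SITES = {"blocked.example"}
WEIGHTED_PHRASES = {"casino": 30, "jackpot": 25, "poker": 20}
THRESHOLD = 50


def request_banned(url):
    """Run the three request checks in sequence; return the reason or None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    host_and_path = host + parts.path.rstrip("/")
    if BANNED_REGEX.search(url):
        return "regexp"
    if host_and_path in BANNED_URLS:
        return "url"
    if host in BANNED_SITES:
        return "site"
    return None


def response_score(body):
    """Sum weight * occurrences over all weighted phrases (assumed rule)."""
    text = body.lower()
    return sum(weight * text.count(phrase)
               for phrase, weight in WEIGHTED_PHRASES.items())


def response_banned(body):
    """Block the response when its score exceeds the threshold."""
    return response_score(body) > THRESHOLD
```

The request checks need only the URL, so they can run before any connection to the Web server is made, whereas the score can be computed only once the response body has arrived.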
