Chapter 1 Introduction
1.5 Thesis Organization
In Chapter 2, we discuss the background of some intrusion detection/prevention frameworks such as Network Access Protection [4][5] and the core technology used to build our library. In Chapter 3, we show the system architecture of NSML. In Chapter 4, we explain the implementation details for each component, and how they cooperate.
In Chapter 5, we introduce the programming interface of NSML and evaluate its performance. Finally, in Chapter 6, we conclude and discuss future work.
Chapter 2 Background
2.1. Network Security Management Features
There are many features in the network security area, such as firewalls, bandwidth control, traffic analysis, and spyware detection/removal. In this thesis, we discuss the first three. The last one, spyware detection/removal, is not considered.
2.1.1. Firewall and Application Firewall
A firewall is hardware or software designed to permit or deny data transmission between computers or devices with different trust levels. A simple firewall can only specify whether a connection from a certain host and port to another is legal. An application firewall (software) provides more information than a traditional firewall. It not only monitors the payload from an application view rather than a low-level packet view, but also relates all incoming and outgoing messages to the running applications. Therefore, NSML implements an application firewall.
2.1.2. Bandwidth Control
Though network cost is much lower than before, only large enterprises can afford the cost of unlimited network traffic. For smaller companies, a 12M/2M ADSL line may be enough. However, due to the asymmetric download/upload bandwidth and popular P2P file-sharing software such as eMule and BitTorrent, the upload bandwidth is often entirely consumed by P2P software, which leads to intolerably slow network speed. Therefore, the bandwidth should be controlled. This work was proposed last year by CH Chiu [6] in our lab (DCSLab of CIS NCTU).
2.1.3. Real-Time Traffic Analysis
Most viruses and Trojans intrude into the system through the operating system’s open services. For example, the Blaster worm [7] sends an RPC request containing a buffer overflow and exploit code to TCP port 135 on Microsoft operating systems to open a backdoor. The messages we receive or send may contain malicious data. Hence, monitoring and filtering traffic in real time is necessary. Many real-time traffic analysis algorithms have been published, such as PAYL [8]. However, because the analysis is performed in real time, it inevitably lowers network performance and consumes CPU time. Therefore, there is always a tradeoff between guaranteeing system performance and lowering the false positive and false negative rates.
2.2. Microsoft Network Access Protection, NAP
Microsoft proposed Network Access Protection [4][5] to provide an extensible framework for a secure network environment. The framework periodically performs a series of “Health Checks” on clients. The network administrator can define a policy that decides which clients are considered healthy. Once a client fails one of the health checks and thus violates the policy, the client is isolated from the network until it goes through the remediation process and passes all the checks again. The health state is collected on the client by System Health Agents (SHAs) and validated on the server by System Health Validators (SHVs). The development of both SHAs and SHVs is open to third-party companies, so network administrators can install the validators they need. The following figure shows the state transitions between the different NAP client states.
Figure 2.2.1 NAP States (Health State Validation, Network Access Limitation, Automatic Remediation)
NAP also defines two components: the Quarantine Enforcement Client (QEC) and the Quarantine Enforcement Server (QES). Each QES on the server side corresponds to a QEC on the client side, and each pair is defined for a different type of network access. For example, there is a QES for DHCP configuration and a QES for VPN connections.
The health states gathered from the SHAs are collected by the QEC and transferred to the QES, which dispatches them to the corresponding SHVs. The results gathered from the SHVs are checked against the policy settings to decide whether a client should be isolated. If so, the QES sends a signal to the client’s QEC, and the client is isolated. The following figure, taken from the NAP architecture, shows the relationship between the server and the client. Currently, NAP supports DHCP, IPSec, and VPN connections.
Figure 2.2.2 NAP Server and Client Architecture
NAP is a powerful framework for enhancing network security, but it cannot work alone; it needs the contributions of different validators.
2.3. Microsoft Winsock Service Provider API, Winsock SPI
The Winsock Service Provider Interface, or Winsock SPI, is a specialized interface of Winsock used to create providers. Traditional Winsock APIs have corresponding service provider functions in the SPI. On one hand, network events/messages are passed to the SPI before they are passed to Winsock applications. On the other hand, events/messages sent by the applications are also passed to the SPI before they are transmitted.
Figure 2.3 Winsock SPI (components shown: Winsock Application, Winsock SPI, Network)
That is, we can hook Winsock through the SPI functions. Each Winsock function has a corresponding hook function in the SPI. For example, a call to send() is redirected to WSPSend() in the SPI, where developers can implement logic that decides whether to let the send() call complete or to change its behavior. Because the SPI sits at the application layer, we can view connections at a high level rather than at the packet level. Therefore, we know the application name, process ID, data buffers, etc., which common firewalls do not.
2.4. Related Works
Sygate [9] (acquired by Symantec) is a complete network security solution, including a personal firewall, real-time traffic analysis, secure remote desktop, etc. However, it lacks bandwidth control and programmable policies. It is also commercial and not extensible, and its detailed architecture and mechanisms are not published.
Chapter 3 System Architecture
Figure 3.1 System Architecture
NSML runs in two modes: server mode and standalone mode. In server mode, new rules are pushed from the server side to the client side and applied immediately. The alerts and traffic of each process are logged, and the logs are transferred to the server side. In this mode, the client is not responsible for storing the rules. In standalone mode, however, the rules must be stored and loaded locally.
3.1. NSDLL
A Winsock SPI filter takes the form of a dynamic-link library (DLL). Once the filter is installed, NSDLL is injected into all Winsock applications, so every running Winsock application has an instance of NSDLL. NSDLL works as a filter between the Winsock application and the operating system. Every Winsock function call is passed to the corresponding hook function in the DLL. The DLL applies its logic to decide whether to let the call complete or to change its behavior. The DLL also collects traffic information and produces logs. The logic (rules) and logs are stored in shared tables.
In the worst case, once the DLL finds a malicious operation, it can close the dangerous connection immediately to protect the system.
3.2. NS Software
Network Security Software, or NS Software, refers to applications that use NSML to manage the network, for example, an application firewall with a graphical user interface.
Although the firewall logic is performed by NSML, the action taken after a warning is left to the NS software by means of registered callback functions. The graphical firewall above may, for example, pop up a message box to alert the user to a rule violation. NS software can read structured network information from the shared tables. In server mode, the rules are automatically loaded into the shared tables through CoreService. In standalone mode, however, NS software is responsible for storing the rules locally and loading them itself.
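The callback idea can be illustrated with a minimal in-process sketch. The names Alert, AlertDispatcher, registerCallback, and notify are hypothetical and are not NSML's actual interface; in NSML the notification crosses process boundaries through CoreService, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a rule violation.
struct Alert {
    std::string processName;
    std::string ruleName;
};

// Hypothetical dispatcher: NS software registers handlers, and the library
// invokes every handler when a rule is violated.
class AlertDispatcher {
public:
    using Callback = std::function<void(const Alert&)>;

    void registerCallback(Callback cb) {
        assert(static_cast<bool>(cb));  // reject empty handlers
        callbacks_.push_back(std::move(cb));
    }

    // Invokes all registered handlers and returns how many were called.
    std::size_t notify(const Alert& alert) const {
        for (const Callback& cb : callbacks_) {
            cb(alert);
        }
        return callbacks_.size();
    }

private:
    std::vector<Callback> callbacks_;
};
```

A graphical firewall would register a handler that pops up a message box, while a headless tool might register one that only writes to a file; the filtering logic stays the same in both cases.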
3.3. NSLib
NSLib provides the programming interface for NS software developers.
Developers are able to read and manipulate the connections and rules. NSLib is also responsible for the communication between NS Software and CoreService.
3.4. Shared Tables
Shared tables are an essential part of NSML. The different kinds of rules, the currently running connections, and the logs are stored in the tables at runtime. NSDLLs look up the rules at runtime to perform their logic, so rules take effect as soon as they are added to the tables. Logs are generated by NSDLLs and stored here temporarily until CoreService handles them. NS software is able to read the logs, too.
3.5. CoreService
CoreService is a Windows service that acts as a “message router” between client and server and between service and process. In server mode, when the client boots up and the service is running, it tries to connect to the server. Once it succeeds, the server pushes the latest rules to CoreService, which adds them to the shared tables. After the rules are applied, CoreService periodically queries the shared tables for logs and sends the logs to the server side. The service is also responsible for notifying the NS software that has registered callbacks when the corresponding rules are violated.
3.6. NSServer
NSServer is responsible for storing rules in a database and pushing rules to the clients in server mode. A tool is provided to manage the server and the clients.
Chapter 4 Implementation Details
In this chapter, we discuss the implementation details of NSML. We choose C++ as our programming language. First, we discuss the design of the different types of shared tables. Then, we introduce four kinds of rules: LegalConnRule, IllegalConnRule, BindingRule, and SignatureRule. Next, we present two kinds of logs: AlertLog and TrafficLog. We then describe the client-server interaction in server mode. After that, we address the performance issue caused by the large volume of traffic logs and propose a solution. Next, we discuss the implementation of callbacks across process spaces. Finally, we introduce the server-side database schema.
4.1. Shared Tables
Because NSML hooks applications in user mode, our system needs a method to share information between processes. Thus, shared memory is used to share rules and logs among NSDLLs, NSLib, and CoreService. There are two types of shared table in our system: the common shared table (for rules) and the special shared table (for logs). They are alike in most ways, with slightly different policies for arranging elements.
4.1.1. Shared Tables for Rules
Due to the large size of the rules and logs, we choose the “Named File-Mapping Object” on the Windows platform. Each shared table is assigned a unique ID. When a process wants to access a table, it takes the following steps to bind the shared memory:
1. Try to open the file-mapping object with the unique ID. If we succeed, go to step 3.
2. Because the object does not exist, create a new file-mapping object with the unique ID.
3. Map the object to acquire a memory reference.
Because the tables might be accessed by different processes concurrently, a unique mutex is opened for each shared table to avoid race conditions.
4. Try to open the mutex object with the unique ID. If we fail, go to step 5.
5. Create the mutex object with the unique ID.
Currently, the shared tables are implemented as fixed-length arrays. Each entry of the array is called a “slot”. The slot size can differ between tables. The shared memory we get from the system has logically contiguous addresses. Therefore, we can treat a table like a real array, and the entry type can be any structure we define. Each slot has a flag that specifies whether the slot is in use. To remove a rule, we simply turn the slot flag off. To add a new rule, we find a slot whose flag is “off”. In this version, the array size is limited, and the system needs to traverse the array to find a certain element, so finding an element takes O(n) in the worst case.
To implement a new shared table, define a new class that inherits from the CSharedTable class. CSharedTable handles the shared memory and mutex issues for the derived class.
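The slot scheme described in this section can be sketched as follows. This is a minimal, portable illustration that uses an ordinary in-process array in place of the mapped shared memory and omits the mutex; the names SlotTable, addEntry, removeEntry, and find are hypothetical and are not the members of CSharedTable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-length slot table. In NSML the storage would be a view
// of a named file-mapping object; here it is a plain array for illustration.
template <typename Entry, std::size_t N>
class SlotTable {
public:
    // Stores the entry in the first free slot.
    // Returns the slot index, or -1 if every slot is in use.
    int addEntry(const Entry& entry) {
        for (std::size_t i = 0; i < N; ++i) {
            if (!slots_[i].used) {
                slots_[i].used = true;
                slots_[i].value = entry;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Removing an entry only turns the slot flag off.
    void removeEntry(int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < N);
        slots_[static_cast<std::size_t>(index)].used = false;
    }

    // Linear search over the used slots: O(n) in the worst case.
    // Returns the index of the first match, or -1 if none is found.
    template <typename Pred>
    int find(Pred pred) const {
        for (std::size_t i = 0; i < N; ++i) {
            if (slots_[i].used && pred(slots_[i].value)) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    struct Slot {
        bool used = false;
        Entry value{};
    };
    std::array<Slot, N> slots_{};
};
```

Because removal only clears a flag, freed slots are reused by later additions, which is why entries in such a table are not necessarily contiguous.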
4.1.2. Shared Tables for Logs
In order to reduce the load on the server side and improve the performance on the client side, we define another class, CLogTable, which uses the same shared memory technology as CSharedTable. The difference between CSharedTable and CLogTable is that entries in CLogTable must be contiguous, while CSharedTable has no such restriction. The reason is the efficiency of reading logs. Unlike rules, logs are generated constantly and should be ordered by time. If we applied the same policy as CSharedTable to CLogTable, the logs would be out of order and hard to analyze.
Therefore, we arrange the slots to form a circular queue. In order to operate the circular queue, two indexes are used:
BeginIndex: points to the start of the contiguous elements in the table.
EndIndex: points to the next available slot in the table.
Figure 4.1.2 The memory arrangement of CLogTable (Slot 1 to Slot N, with BeginIndex and EndIndex marking the occupied range); shaded slots stand for non-empty slots
Elements are pushed into the slot pointed to by EndIndex and popped from the slot pointed to by BeginIndex.
If (EndIndex + 1) mod N = BeginIndex, where N is the number of slots, the table is full. Also, because elements are always pushed at EndIndex and popped from BeginIndex, pushing or popping a log takes O(1) in the worst case.
CLogTable supports two protected functions:
void *pushLog(): return a pointer to next available slot
void *popLogs(int *size): return a pointer to a contiguous block of logs and its size
To create a new log class, define a new class that extends CLogTable and pass the table name, table size, and entry size to the constructor of CLogTable. Also, define type-specific pop and push functions, which invoke the original pushLog() and popLogs() internally and cast the pointer to the specific type. Because the logs are stored in a circular queue, and for performance reasons, the logs to be returned may be divided into two blocks: one block ranges from BeginIndex to the physical end of the table, and the other ranges from the head of the table to EndIndex.
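The circular queue and the two-block pop can be sketched as follows. This is an illustrative, in-process version under stated assumptions: LogEntry, LogBlock, and LogQueue are hypothetical names, the storage is a std::vector rather than shared memory, locking is omitted, and popLogs() returns a small struct instead of using the void * signature listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical log entry; the real log layouts are not shown in this chapter.
struct LogEntry {
    int processId;
    int eventCode;
};

// A contiguous run of entries inside the table.
struct LogBlock {
    const LogEntry* data;
    std::size_t size;
};

class LogQueue {
public:
    explicit LogQueue(std::size_t slotCount)
        : slots_(slotCount), begin_(0), end_(0) {
        assert(slotCount >= 2);  // one slot always stays empty
    }

    bool empty() const { return begin_ == end_; }
    bool full() const { return (end_ + 1) % slots_.size() == begin_; }

    // Returns the next available slot and advances EndIndex,
    // or nullptr if the table is full. O(1).
    LogEntry* pushLog() {
        if (full()) return nullptr;
        LogEntry* slot = &slots_[end_];
        end_ = (end_ + 1) % slots_.size();
        return slot;
    }

    // Pops one contiguous block starting at BeginIndex. If the stored logs
    // wrap around the physical end of the array, this call returns the part
    // up to the end, and a second call returns the part from the head.
    LogBlock popLogs() {
        const std::size_t stop = (end_ >= begin_) ? end_ : slots_.size();
        LogBlock block = { &slots_[0] + begin_, stop - begin_ };
        begin_ = stop % slots_.size();
        return block;
    }

private:
    std::vector<LogEntry> slots_;
    std::size_t begin_;  // BeginIndex
    std::size_t end_;    // EndIndex
};
```

Returning pointers into the table avoids copying the logs, at the cost that a wrapped queue must be drained in two calls. In a shared-memory setting the caller would have to hold the table's lock until it has finished reading the returned block.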
4.1.3. Avoid Race Condition
To avoid race conditions on shared tables, we create a unique mutex for each table. We lock the table before reading or writing, and unlock it after the action is done. To avoid overhead, locking is not recommended when, according to the usage, a race condition can never happen.
4.2. Rules
We define four kinds of rules for different purposes: LegalConnRule, IllegalConnRule, BindingRule, and SignatureRule. Because we perform the rule logic at the application layer, we know the process ID, process name, etc., of the related Winsock application. This lets us write more specific rules for different applications. For example, the browser IE should not connect to a destination port other than 80. If it does, it might have been compromised.
4.2.1. LegalConnRule and IllegalConnRule
LegalConnRule specifies which connections are legal and safe. It may be a rule for incoming or outgoing messages. IllegalConnRule, on the other hand, specifies which connections are illegal. It also tells whether the illegal connections should be blocked and whether we would like to get a warning. LegalConnRule is a “white list”, while IllegalConnRule is a “black list”.
To enhance safety, we choose not to run the risk of intrusion. Therefore, the policy for LegalConnRule and IllegalConnRule is as follows:
1. Does the connection appear in the “black list”? If it does, the connection is blocked or a warning is raised, according to the rule. Both blocking and warning produce AlertLogs. If it does not, go to step 2.
2. Does the connection appear in the “white list”? If it does, it is regarded as a safe connection. If it does not, we still let the connection pass the filter, and an AlertLog is produced to indicate it.
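The two-step policy above can be sketched as a single decision function. The structures below are simplified assumptions: Connection, ConnRule, Verdict, and evaluate are hypothetical names, and matching on process name and remote port only stands in for whatever fields the real rules compare.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified connection description.
struct Connection {
    std::string processName;
    int remotePort;
};

// Hypothetical rule layout shared by both lists.
struct ConnRule {
    std::string processName;
    int remotePort;
    bool block;  // only meaningful for black-list rules
};

enum class Verdict { Allow, AllowWithAlert, WarnWithAlert, BlockWithAlert };

inline bool matches(const ConnRule& rule, const Connection& conn) {
    return rule.processName == conn.processName &&
           rule.remotePort == conn.remotePort;
}

// Step 1: the black list is consulted first and always yields an alert.
// Step 2: a white-list hit is silently allowed; anything else is allowed
// but still reported with an alert.
inline Verdict evaluate(const Connection& conn,
                        const std::vector<ConnRule>& blackList,
                        const std::vector<ConnRule>& whiteList) {
    assert(conn.remotePort >= 0 && conn.remotePort <= 65535);
    for (const ConnRule& rule : blackList) {
        if (matches(rule, conn)) {
            return rule.block ? Verdict::BlockWithAlert : Verdict::WarnWithAlert;
        }
    }
    for (const ConnRule& rule : whiteList) {
        if (matches(rule, conn)) {
            return Verdict::Allow;
        }
    }
    return Verdict::AllowWithAlert;
}
```

Checking the black list first means that a connection listed in both tables is treated as illegal, which matches the stated preference for safety.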
4.2.2. BindingRule
BindingRule indicates which applications should not listen on certain ports. For example, the telnet.exe program should not bind to any port; if it does, we know it is infected. This is helpful in some cases, especially for IE. IE supports Browser Helper Objects, which let third-party developers add functions such as toolbars.
However, users usually cannot tell whether a helper object is safe. They just install the modules they are interested in without knowing whether a backdoor is opened. If we set up a rule for IE so that it cannot bind to any port, the problem is solved.
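A binding check of this kind could look like the sketch below. The BindingRule fields shown here (a process name and a forbidden port range) and the function violatesBindingRule are assumptions made for illustration; the actual rule layout is not given in this section. Process names are compared case-insensitively, as is usual for Windows executable names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical rule: the named process must not bind to any port
// in the inclusive range [firstPort, lastPort].
struct BindingRule {
    std::string processName;
    int firstPort;
    int lastPort;
};

inline std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char ch) {
        return static_cast<char>(std::tolower(ch));
    });
    return s;
}

// Returns true if a bind() by this process on this port violates a rule.
inline bool violatesBindingRule(const std::vector<BindingRule>& rules,
                                const std::string& processName,
                                int port) {
    assert(port >= 0 && port <= 65535);
    const std::string name = toLower(processName);
    for (const BindingRule& rule : rules) {
        if (toLower(rule.processName) == name &&
            port >= rule.firstPort && port <= rule.lastPort) {
            return true;
        }
    }
    return false;
}
```

A rule covering ports 0 to 65535 expresses "must not bind to any port", which is the telnet.exe case described above.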
4.2.3. SignatureRule
SignatureRule specifies a pattern that might be malicious. Performing complex analysis on traffic in real time usually takes much CPU time and lowers network performance. Therefore, we perform only simple analysis on the traffic to guarantee performance. Every SignatureRule defines two important parameters: pattern and maxOffset. Once a connection is established, NSDLL scans the SignatureRule table for rules matching the source and destination. If matching rules are found, two traffic monitors are added, one to the incoming monitor list and one to the outgoing monitor list. Then the connection begins to transfer data. Both the data received and the data sent are checked for the pattern. There is also an offset counter that is initialized to zero and increases during the check. Once the counter exceeds the maxOffset of the rule, the monitors are removed from the incoming and outgoing monitor lists.
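The behaviour of a single monitor can be sketched as below. Two points are assumptions rather than statements from the text: the counter is taken to count the bytes inspected so far, and each buffer is searched independently, so a pattern split across two send or receive calls is not detected. TrafficMonitor, ScanResult, and inspect are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

enum class ScanResult { Clean, Malicious, Expired };

// Hypothetical monitor attached to one direction of one connection.
class TrafficMonitor {
public:
    TrafficMonitor(std::string pattern, std::size_t maxOffset)
        : pattern_(std::move(pattern)), maxOffset_(maxOffset), counter_(0) {
        assert(!pattern_.empty());
    }

    // Checks one buffer passed to send() or recv().
    // Malicious: the pattern was found in this buffer.
    // Expired:   the counter has exceeded maxOffset, so the caller should
    //            remove this monitor from its list.
    // Clean:     nothing found yet; keep monitoring.
    ScanResult inspect(const std::string& buffer) {
        if (counter_ > maxOffset_) {
            return ScanResult::Expired;
        }
        if (buffer.find(pattern_) != std::string::npos) {
            return ScanResult::Malicious;
        }
        counter_ += buffer.size();  // assumed: counter measures bytes seen
        return counter_ > maxOffset_ ? ScanResult::Expired : ScanResult::Clean;
    }

private:
    std::string pattern_;
    std::size_t maxOffset_;
    std::size_t counter_;
};
```

Bounding the scan by maxOffset keeps the per-connection cost small: once the first maxOffset bytes have passed without a match, the connection is no longer inspected for that rule.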
Figure 4.2.3.1 Add monitors to connections (flowchart: on Connect or Accept, scan the SignatureRules; if no rule matches, the connection is safe; if a rule matches, add monitors to the incoming and outgoing lists and initialize the counter to zero)
Figure 4.2.3.2 Filter traffic according to the monitors (flowchart: applied on each Recv and Send)
4.3. AlertLogs and TrafficLogs
There are two types of pre-defined logs in NSML: AlertLogs and TrafficLogs. Both are generated by the NSDLL of the Winsock application.
AlertLogs are generated when rules are violated. For example, suppose we have a BindingRule specifying that the browser IEXPLORER.EXE cannot bind to any port. Once IEXPLORER.EXE actually binds to a port, an AlertLog is produced to record this abnormal behavior.
(Figure 4.2.3.2 flowchart steps: if the pattern matches, the connection is malicious; otherwise, increase the counter. If Counter > maxOffset, remove the monitors from the lists; otherwise, wait for the next Send/Recv.)
TrafficLogs are produced when the application invokes one of the following Winsock functions: socket(), closesocket(), accept(), connect(), bind(), send(), recv(), sendto(), recvfrom(). TrafficLogs record the detailed actions of Winsock applications in time order. If the system is intruded, we have clues for analyzing the system’s behavior.
4.4. Client-Server Interaction in Server Mode
Figure 4.4 Client-Server Interaction
In server mode, the server is responsible for pushing and updating rules on each client. When the client computer boots up and CoreService is running, CoreService tries to connect to NSServer. Once it succeeds, NSServer sends a RULE_ADD message and starts pushing rules to the client. The rules received by CoreService are added to the shared rule tables and applied by NSDLLs immediately. After all the rules are sent, NSServer sends a RULE_END message to tell the client that the transfer is over. If an existing rule is modified, NSServer sends a RULE_UPDATE message to the client, followed by the updated rule and then a RULE_END message. NSServer maintains a list of online computers with NSML installed. The network administrator can use the tool to monitor the network status of the clients.
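The client side of this exchange can be sketched as a small state machine. RULE_ADD, RULE_UPDATE, and RULE_END are the message names used above; everything else is an assumption made for illustration, including the RULE_DATA message type that carries a rule, the Message layout, the integer rule IDs, and the RuleReceiver class. The real wire format is not described in this section.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// RULE_DATA is a hypothetical type standing for "one rule payload".
enum class MsgType { RULE_ADD, RULE_UPDATE, RULE_END, RULE_DATA };

// Hypothetical message layout.
struct Message {
    MsgType type;
    int ruleId;
    std::string rule;
};

// Hypothetical receiver, playing the role CoreService has on the client.
class RuleReceiver {
public:
    RuleReceiver() : mode_(Mode::Idle) {}

    // Returns false if the message is not valid in the current state.
    bool handle(const Message& msg) {
        switch (msg.type) {
        case MsgType::RULE_ADD:
            if (mode_ != Mode::Idle) return false;
            mode_ = Mode::Adding;
            return true;
        case MsgType::RULE_UPDATE:
            if (mode_ != Mode::Idle) return false;
            mode_ = Mode::Updating;
            return true;
        case MsgType::RULE_END:
            if (mode_ == Mode::Idle) return false;
            mode_ = Mode::Idle;
            return true;
        case MsgType::RULE_DATA:
            if (mode_ == Mode::Adding) {
                // A new rule takes effect as soon as it is stored.
                return rules_.insert(std::make_pair(msg.ruleId, msg.rule)).second;
            }
            if (mode_ == Mode::Updating) {
                std::map<int, std::string>::iterator it = rules_.find(msg.ruleId);
                if (it == rules_.end()) return false;
                it->second = msg.rule;
                return true;
            }
            return false;
        }
        assert(false && "unknown message type");
        return false;
    }

    const std::map<int, std::string>& rules() const { return rules_; }

private:
    enum class Mode { Idle, Adding, Updating };
    Mode mode_;
    std::map<int, std::string> rules_;  // stands in for the shared rule tables
};
```

Rejecting rule payloads outside a RULE_ADD or RULE_UPDATE bracket keeps a lost or reordered message from silently changing the rule set.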
4.5. The Performance Issue
As mentioned in Section 3.1, Inject DLL is responsible for producing traffic logs and alert logs to record the behavior of a Winsock Application. Those logs are