1 Introduction
Advances in networking technology, computing resources, and data storage have become increasingly important in our information society. Enterprises have started to expand their storage and computing centers in order to speed up the processing of huge amounts of data. Since 2004, Google has published several papers on cloud and distributed technologies, and cloud computing has become increasingly popular and common. Many companies have launched new cloud services on the Internet.
Traditionally, security software runs on the local computer, so that computer must be capable of executing it. Cloud computing changes this model: users can rely on the cloud to run security software that inspects their computers, and the cloud then returns the result to the user. This reduces the capability required of the local computer and makes the approach suitable for mobile devices.
Although similar cloud-based security software exists today, it does not address privacy. Users may leak secret information to the service provider, even when the provider is well known and reputable. To address this problem, we propose an approach to the privacy issues in the cloud that does not interfere with the execution of the cloud service. In other words, our approach not only provides a security service but also protects the privacy of users in the cloud.
In this paper we propose an application of cloud computing to intrusion detection. The user sends suspected data to the cloud server, which detects whether the data contains a malicious signature. For security, we would like the user to keep the data private. That is, we want the cloud server to detect whether a malicious signature is inside the data without learning what the data is. Finally, we implement a system based on our approach and conduct experiments to evaluate it.
Three Service Models
1. Cloud Software as a Service (SaaS)
2. Cloud Platform as a Service (PaaS)
3. Cloud Infrastructure as a Service (IaaS)
Four Deployment Models
1. Private Cloud
2. Community Cloud
3. Public Cloud
4. Hybrid Cloud
Table 1: The NIST Definition of Cloud Computing
Source: "The NIST Definition of Cloud Computing," NIST, 2009.
1.1 Cloud Computing
Cloud computing [1] has become increasingly popular and common in recent years. More and more companies, such as Amazon, Microsoft and Google, allocate substantial resources to researching and developing cloud applications and services. These companies establish large-scale data centers to provide computing power and storage for users through services such as Amazon EC2, Microsoft Live Mesh and Google Gmail. These convenient and useful services quickly came into wide use on the network.
Since 2004, Google has published several papers about cloud computing. Google File System (GFS) [6], MapReduce [4] and BigTable [3] are Google’s three core technologies for distributed and cloud-computing systems. GFS is a scalable, distributed and fault-tolerant file system. MapReduce is a programming model for processing huge amounts of data on large clusters of distributed computers. BigTable manages large distributed data storage; it is a compressed, high-performance, highly scalable database system. Building on these three core technologies, cloud applications have sprung up like bamboo shoots after a spring rain.
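To make the MapReduce programming model concrete, the following is a minimal single-machine sketch of its three steps (map, shuffle, reduce) applied to word counting. It is an illustration only: it uses no real MapReduce or Hadoop API, and the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # On a real cluster each map and reduce call could run on a different
    # machine; here everything runs sequentially in one process.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["cloud computing", "cloud storage"])
```

Because each map call depends only on its own document and each reduce call only on its own key, the work can in principle be spread across many computers, which is the property the model relies on.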
According to the National Institute of Standards and Technology (NIST) [9], cloud computing is composed of five essential characteristics, three service models, and four deployment models. Generally speaking, cloud computing is a model for providing on-demand services to users, who do not need to care how the computers are managed or where the storage is located. Users treat it as a utility and pay for the amount they use. Cloud computing is revolutionizing the IT industry.
One use of cloud computing is to let the cloud servers perform massive computation for users. There are two benefits. The first is to relieve the client side of heavy computation and storage needs, so that it demands less hardware. The second is to reduce bandwidth use on the network, since only the computation result is sent back to the client side. As a result, users can access these cloud services with any device, anytime and anywhere.
1.2 Intrusion Detection System
With advances in network technology, more and more network applications and new websites have appeared. Web surfing has become an everyday recreation. However, computers connected to the Internet may be threatened by attacks such as viruses, worms, or Trojans. Hence, detecting malicious software and behavior has become an increasingly challenging problem.
Figure 1: The Units of Intrusion Detection System
There are many tools that can help us resist network attacks. An intrusion detection system (IDS) [8] plays an important role in network security. It is a useful tool for inspecting network traffic, monitoring system behavior and detecting malicious attacks. An intrusion detection system is divided into three parts: the detector, the rule dataset and the detection engine. The detector, such as a packet sniffer or log recorder, collects system information. The rule dataset contains the signatures of predefined known attacks. The detection engine determines whether an attack is present according to the information collected by the detector and the rule dataset.
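The three units described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the design of any particular IDS: the rule names and byte patterns in `RULES` are hypothetical, and the detector simply receives payloads that are already in memory instead of sniffing packets or reading logs.

```python
def detector(raw_events):
    """Detector: collect observations (here, payloads given as bytes)."""
    return [bytes(event) for event in raw_events]

# Rule dataset: signatures of predefined known attacks (hypothetical examples).
RULES = {
    "sig-passwd": b"/etc/passwd",
    "sig-nop-sled": b"\x90\x90\x90\x90",
}

def detection_engine(observations, rules):
    """Detection engine: report (observation index, rule name) for each match."""
    alerts = []
    for index, payload in enumerate(observations):
        for name, pattern in rules.items():
            if pattern in payload:  # plain substring matching
                alerts.append((index, name))
    return alerts

observations = detector([b"GET /index.html", b"GET /../../etc/passwd"])
alerts = detection_engine(observations, RULES)
```

In this sketch only the second payload raises an alert, because it is the only one that contains a pattern from the rule dataset.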
In general, there are two types of intrusion detection approaches: anomaly-based IDS and signature-based IDS. An anomaly-based IDS defines a set of normal activities beforehand and assumes that malicious attacks differ from normal activities. When the network traffic deviates from the threshold of normal activity, the anomaly-based IDS notifies the system administrator. The advantage of an anomaly-based IDS is that it can detect unknown attacks. However, it may misjudge normal activity as an attack; such an error is called a false positive. The other type is the signature-based IDS. It requires known attacks to be defined in advance; these definitions are called signatures. It detects malicious attacks by matching signatures against the monitored data. This is an effective way to detect known attacks, but it cannot discover unknown attacks.
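As a hedged illustration of the threshold idea behind anomaly-based detection, the sketch below builds a profile of normal activity from sample measurements and flags any value that lies too far from it. The choice of statistic (mean and standard deviation of a traffic rate), the multiplier `k`, and the sample numbers are all assumptions made for this example.

```python
from statistics import mean, stdev

def build_profile(normal_rates):
    """Learn a profile of normal activity from sample traffic rates."""
    return mean(normal_rates), stdev(normal_rates)

def is_anomalous(rate, profile, k=3.0):
    """Flag a rate that deviates from the mean by more than k standard deviations."""
    mu, sigma = profile
    return abs(rate - mu) > k * sigma

# Hypothetical requests-per-second samples observed during normal operation.
profile = build_profile([100, 110, 95, 105, 90])

normal_case = is_anomalous(104, profile)   # close to the profile
attack_case = is_anomalous(400, profile)   # far outside the profile
```

A legitimate but unusual burst of traffic would also exceed the threshold, which is how false positives arise in this kind of detector.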
Figure 2: Rate the Benefits of the Cloud
Source: IDC Enterprise Panel
Although these two approaches can resist malicious attacks, many problems remain. First, an intrusion detection system must be updated often to withstand malicious attacks, because such attacks evolve rapidly. Second, an intrusion detection system must perform well, so that it can effectively detect and ward off malicious attacks. Finally, an intrusion detection system must be quickly deployable to each computer, so that administrators can easily manage and monitor each computer.
1.3 Privacy Issue in Cloud
The cloud brings many benefits to our lives. According to IDC market intelligence, enterprises consider "Pay only for what you use" and "Fast to deploy" to be the main advantages of the cloud. However, IDC market intelligence also reports some problems for users, of which security is the most important. That is, most customers still do not believe that the cloud provides a sufficiently secure environment. Therefore, most of the information stored in the cloud is not important.
Figure 3: Rate the Challenges of the Cloud
Source: IDC Enterprise Panel
Privacy is a fundamental human right, and the privacy issue is a big challenge for cloud services. It is hard to design a cloud service that reduces privacy risk. The need to process and transfer sensitive information limits the usage of cloud services. Many cloud storage services have appeared in recent years, such as Dropbox, ASUS Web Storage and Windows Live SkyDrive. These cloud services do not consider the privacy problem. The data on the storage servers may be leaked or misused by the service providers. For example, the recent case of Apple recording user locations through the iPhone raised serious concern about users' location privacy.
1.4 Motivation
We observe that cloud computing has become more popular in recent years. Many large enterprises launch new cloud services and provide a charging mechanism to earn revenue. "Pay-as-you-use" is the biggest advantage for these enterprises. However, they do not notice that their cloud services may invade privacy. Users take advantage of these cloud services, but they must transfer some information to the cloud provider. The information may be misused or leaked by the cloud provider. Hence, we provide an approach to solve the privacy problem in cloud services.
An intrusion detection system is a security tool that protects users' computers against malicious attacks from the Internet. However, most intrusion detection systems must download up-to-date signatures, which is a tedious task. Furthermore, an intrusion detection system can degrade system performance, and it usually runs on a personal computer. That is, it is not suitable for cheap, lightweight devices.
A cloud intrusion detection system is a good solution to the performance problems on the user side. However, an intrusion detection system needs to inspect system information, which may leak the privacy of users. Hence, we adopt our approach to achieve privacy preservation in the cloud intrusion detection system. In this paper, we call it a privacy-preservation cloud-based intrusion detection system.
1.5 Our Contribution
In this paper we propose an application of cloud computing to intrusion detection. The (virus) signatures are stored on the cloud side. The user sends suspected data W to the cloud server, which detects whether W contains a malicious signature. For security, we would like the user to keep W private. That is, we want the cloud server to detect whether a malicious signature is inside W without learning what W is. We call this a privacy-preservation cloud-based intrusion detection system. This system is suitable for users of mobile platforms, such as smartphones, where communication bandwidth and storage size are limited. The user does not need to download up-to-date virus signatures. After receiving the encrypted suspected data C from a user, the cloud server compares it to the signature database and finds a set L of possibly matching signatures.
Then, L is sent to the user, who filters L to determine whether W contains a signature in the server’s signature database. Since the final confirmation that a signature is present is done on the client side, the server does not know exactly what W is. Thus, the privacy of W is preserved to some extent.
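The message flow described above (W is encoded into C, the server returns a candidate set L, and the client filters L) can be sketched as follows. The encoding used here, truncated hashes of fixed-length windows, is a stand-in chosen only for illustration; it is not the hidden keyword search technique of this paper, and it comes with no security analysis. The window length `K`, the tag width `BITS`, the function names and the sample signatures are all assumptions.

```python
import hashlib

K = 4       # window length in bytes (assumed; signatures must be at least K long)
BITS = 12   # number of hash bits revealed to the server (assumed)

def tag(window: bytes) -> int:
    """Map a K-byte window to a short tag; many different windows share a tag."""
    digest = hashlib.sha256(window).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - BITS)

def client_encode(w: bytes) -> set:
    """Client: turn suspected data W into C, the set of tags of its windows."""
    return {tag(w[i:i + K]) for i in range(len(w) - K + 1)}

def server_match(c: set, signatures: list) -> list:
    """Server: return the candidate set L of signatures whose prefix tag is in C."""
    return [s for s in signatures if tag(s[:K]) in c]

def client_filter(w: bytes, candidates: list) -> list:
    """Client: keep only the candidates that really occur in W."""
    return [s for s in candidates if s in w]

signatures = [b"/etc/passwd", b"cmd.exe", b"<script>alert("]  # hypothetical database
w = b"GET /cgi-bin/../../etc/passwd HTTP/1.0"

c = client_encode(w)                  # sent to the server in place of W
l = server_match(c, signatures)       # may contain false candidates
confirmed = client_filter(w, l)       # final confirmation on the client side
```

Because the tags are short, L can contain signatures that do not occur in W; the client-side filter removes them, and the server never learns which candidates were confirmed.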
We have implemented our design as follows. We use Hadoop as the cloud platform and a Linux operating system as the client. We convert Snort rules into signatures that are stored in Hadoop. We design an encryption method that encrypts W into C using a hidden keyword search technique. Our results show that the system design meets the requirements of privacy and efficiency.
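As a rough illustration of what converting a Snort rule into signatures could involve, the sketch below extracts the `content` patterns of a rule and decodes them into byte strings. It is an assumption about one possible conversion, not the procedure used in our implementation. The sample rule is hypothetical, and the sketch ignores negated contents, escaped characters and all other rule options.

```python
import re

# Matches content:"..." options; negated contents (content:!"...") are skipped.
CONTENT_RE = re.compile(r'content:\s*"([^"]*)"')

def decode_content(text: str) -> bytes:
    """Decode a Snort content string; bytes between |...| are written in hex."""
    out = bytearray()
    for index, part in enumerate(text.split("|")):
        if index % 2 == 1:
            out += bytes.fromhex(part.replace(" ", ""))  # hex section
        else:
            out += part.encode("latin-1")                # literal section
    return bytes(out)

def rule_to_signatures(rule: str) -> list:
    """Return the byte patterns found in the content options of one rule."""
    return [decode_content(match) for match in CONTENT_RE.findall(rule)]

# Hypothetical rule written in Snort's rule syntax.
rule = ('alert tcp any any -> any 80 (msg:"example rule"; '
        'content:"/etc/passwd"; content:"|90 90|AB"; sid:1000001;)')
sigs = rule_to_signatures(rule)
```

Each resulting byte string could then be loaded into the server-side signature database as one signature.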