Flexible and Robust Service Management in a Smart Home
Definition 6. (Robust Pervasive Service) A Pervasive Service ps is robust if and only if the following statement holds:
3. Persistent Manager Nodes assumption (A3): Manager Nodes do not experience failures. We make this assumption because the robustness of Manager Nodes is outside the current scope. In practice, a simple yet effective way to address the reliability of Manager Nodes is to handle it at the implementation level. For example, one can develop a program (or use the watchdog services provided by the underlying OS) to detect Manager Node failures. Many popular mission-critical enterprise systems adopt this approach. Oracle WebLogic Cluster, for example, uses a similar design, in which a "Domain" is a logical division of an application that contains a cluster of "Managed Servers". Each Domain is administered by an "Admin Server", which is responsible for detecting the failures of Managed Servers in that Domain. Actual services are provided by Managed Servers, but Managed Servers belonging to the same Domain are not necessarily located on the same host. In other words, the Managed Servers on one host may belong to different Domains. On each host there is a "Node Manager", which is responsible for monitoring and recovering all Managed Servers on that host. This design does not itself address the reliability of Admin Servers and Node Managers, which is instead guaranteed by the watchdog services of the underlying OS. The Nanny Servers of IBM WebSphere use a similar approach. To sum up, the rationale behind the hierarchical (Manager-Worker) architecture is that, in practice, Manager Nodes are much less likely to fail than Worker Nodes, since 1) heavily loaded user tasks are handled by Worker Nodes; 2) there are fewer Manager Nodes than Worker Nodes; and 3) Manager Node failures can be detected and recovered using mechanisms provided by the underlying OS/platform. Consequently, we make this assumption at the theoretical level; it can be relaxed by employing either consensus protocols or implementation-level techniques. We are currently designing consensus-based protocols that make Manager Node failures detectable and recoverable without centralized coordinators [78]. Until such protocols are available, the watchdog services of the underlying platform can be used to detect and recover failed Manager Nodes.
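As an illustration of the implementation-level approach described in A3, the following is a minimal sketch of a watchdog that polls Manager Nodes and restarts those found dead. It is not part of PerSAM/PSMP: the `Watchdog` class, the `is_alive` and `restart` callbacks, and the simulated `state` table are all hypothetical stand-ins for whatever the underlying OS or platform provides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Watchdog:
    """Polls registered Manager Nodes and restarts any that are down.

    `is_alive` and `restart` are hypothetical callbacks supplied by the
    platform; they are assumptions made for this sketch.
    """
    is_alive: Callable[[str], bool]
    restart: Callable[[str], None]
    managers: List[str] = field(default_factory=list)

    def poll_once(self) -> List[str]:
        """Check every manager once; return the names that were restarted."""
        restarted = []
        for name in self.managers:
            if not self.is_alive(name):
                self.restart(name)
                restarted.append(name)
        return restarted


# Simulated platform state: True means the manager process is running.
state: Dict[str, bool] = {"PSM": True, "PHM": False}


def _restart(name: str) -> None:
    """Simulated restart: mark the manager as running again."""
    state[name] = True


watchdog = Watchdog(is_alive=lambda n: state[n],
                    restart=_restart,
                    managers=["PSM", "PHM"])
restarted = watchdog.poll_once()
print(restarted)
```

In a real deployment the poll would run periodically, and the watchdog itself would be supervised by the OS, which is the division of responsibility that A3 relies on.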
4. Composable service assumption (A4): All services are composable. In other words, for each PS and for each type specified in the Service Template of the PS, at least one node of that type exists in the system. If this assumption does not hold, it is impossible to recover the PS.
Failure detection in PSMP is shown in Figure 3.10. The behaviors of the PSM, PHM, and Worker Node are formally defined as follows.
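The condition stated in assumption A4 can be checked mechanically. The sketch below assumes, purely for illustration, that a Service Template is a list of required type names and that the system is a mapping from node identifiers to their types; the type names and the functions `missing_types` and `is_composable` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def missing_types(template: Iterable[str], nodes: Dict[str, str]) -> List[str]:
    """Return the types required by `template` that no node in `nodes` provides."""
    available: Set[str] = set(nodes.values())
    return sorted(t for t in set(template) if t not in available)


def is_composable(template: Iterable[str], nodes: Dict[str, str]) -> bool:
    """A4 holds for this template if every required type has at least one node."""
    return not missing_types(template, nodes)


# Hypothetical system: node identifier -> node type.
nodes = {"w1": "Display", "w2": "Speaker", "w3": "Display"}

print(is_composable(["Display", "Speaker"], nodes))
print(missing_types(["Display", "Microphone"], nodes))
```

When `missing_types` is non-empty, no recomposition can succeed, which is why recovery is only claimed under A4.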
Protocol 6. (Worker Node Heartbeat) A Worker Node performs heartbeats by emitting PA periodically. The Worker Node attribute hbp is the pre-defined interval between consecutive heartbeats.
$$HB_W[w] \triangleq sleep(w.hbp) \rightarrow PA[w];\ HB_W[w] \tag{3.16}$$
Protocol 7 shows how the PSM emits a suspecting message. Two processes run in parallel, one for refreshing $T_{ps}$ and the other for timeout eviction ($EV$).
In (3.17), $\parallel$ is used to combine two parallel processes. In Protocol 7, Refresh, Timeout, and Remove are operations of the PSM (see Table 3.2).
Protocol 7. (PSM Node Suspecting) The PSM Node Suspecting protocol checks whether any affiliated node has stopped performing heartbeats. If the PSM does not receive a heartbeat from a node for longer than a pre-defined interval, it sends a suspecting message indicating a possible node failure.
$$\text{then } Remove(w) \rightarrow \hat{m}!psmp_{suspect}[w] \rightarrow EV_{PSM}[ps] \tag{3.18}$$
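To make the interplay between heartbeats and timeout eviction concrete, the following sketch simulates one PSM and two Worker Nodes with an explicit clock. It is an assumption-laden illustration rather than the formal protocol: the class `SuspectingPSM`, the `timeout` parameter, and the `suspect:<node>` message format are hypothetical, and the methods merely mirror the Refresh, Timeout, and Remove operations by name.

```python
from typing import Dict, List


class SuspectingPSM:
    """Tracks when each affiliated Worker Node was last heard from."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout                  # longest silence tolerated (s)
        self.last_seen: Dict[str, float] = {}   # node id -> time of last heartbeat

    def refresh(self, w: str, now: float) -> None:
        """Refresh: record that a heartbeat from w arrived at time `now`."""
        self.last_seen[w] = now

    def timed_out(self, w: str, now: float) -> bool:
        """Timeout: True if w has been silent for longer than the limit."""
        return now - self.last_seen[w] > self.timeout

    def remove(self, w: str) -> None:
        """Remove: drop w from the table."""
        del self.last_seen[w]

    def evict(self, now: float) -> List[str]:
        """One eviction pass: remove silent nodes and return suspect messages."""
        suspects = [w for w in sorted(self.last_seen) if self.timed_out(w, now)]
        for w in suspects:
            self.remove(w)
        return [f"suspect:{w}" for w in suspects]


psm = SuspectingPSM(timeout=3.0)
hbp = 1.0  # heartbeat period of the simulated workers, in seconds

# w1 sends a heartbeat every hbp seconds; w2 falls silent after t = 1.
for k in range(6):
    t = k * hbp
    psm.refresh("w1", t)
    if t <= 1.0:
        psm.refresh("w2", t)

messages = psm.evict(now=5.0)
print(messages)
```

The timeout is chosen larger than `hbp` so that a single delayed heartbeat does not immediately cause a node to be suspected.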
After a node is suspected, the PHM stops the node and then sends a leave announcement on its behalf (Figure 3.10, step 5). These operations are described as follows.
Protocol 8. (PHM Shutdown Suspects) The PHM Shutdown Suspects protocol stops suspected nodes according to the incoming suspect messages. The PHM also emits an LA on behalf of each suspected node.
$$SSU_{PHM}[ph] \triangleq \hat{m}?psmp_{suspect}[w]$$
After a failure is detected, the PSM knows that the service is no longer alive, since ServiceAlive returns false. According to Protocol 2, a new service composition procedure is then triggered to recover the PS. Finally, we define PSMP by composing the above protocols. The robustness of PSMP is validated in Section 3.3.1.
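The chain from a suspect message to a failed liveness check can be sketched as follows. This is only an illustration under stated assumptions: the `HostManager` class, the `LA:<node>` message format, and the rule that a service is alive when all of its member nodes are still known are hypothetical simplifications, and the recomposition step itself is not modeled.

```python
from typing import List, Set


class HostManager:
    """Hypothetical PHM: stops suspected nodes and announces their leave."""

    def __init__(self, running: Set[str]) -> None:
        self.running = set(running)   # nodes currently running on this host
        self.outbox: List[str] = []   # announcements emitted to the network

    def on_suspect(self, w: str) -> None:
        """Handle a suspect message for node w."""
        if w not in self.running:
            return                        # w is not hosted here; nothing to do
        self.running.discard(w)           # stop the suspected node
        self.outbox.append(f"LA:{w}")     # leave announcement on its behalf


def service_alive(members: Set[str], known: Set[str]) -> bool:
    """Assumed liveness rule: every member node must still be known."""
    return members <= known


phm = HostManager(running={"w1", "w2"})
known = {"w1", "w2"}      # nodes the PSM currently believes are present
service = {"w1", "w2"}    # member nodes of one composed service

phm.on_suspect("w2")
for msg in phm.outbox:
    known.discard(msg.split(":", 1)[1])   # the PSM processes each LA

alive = service_alive(service, known)
print(alive)
```

Once the liveness check fails, a new composition would be attempted, which succeeds only if a replacement node of the required type exists.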
Protocol 9. (Pervasive Service Management Protocol, PSMP) PSMP is a composite protocol that describes interactions between PSM, PHM, and Worker Nodes to realize reliable Pervasive Services.
$$PSM[ps] \triangleq PA[ps];\ (SC_{PSM}[ps] \parallel NSU_{PSM}[ps]) \tag{3.20}$$
$$PHM[ph] \triangleq PA[ph];\ (SC_{PHM}[ph] \parallel SSU_{PHM}[ph]) \tag{3.21}$$
$$W[w] \triangleq PA[w];\ (SC_W[w] \parallel HB[w] \parallel LM[w]) \tag{3.22}$$
3.2.4 Security
This sub-section presents the mechanisms used to address several security issues in PerSAM/PSMP. Employing security mechanisms has two costs: 1) the efficiency of services is degraded, and 2) setting up security policies, authentication, and authorization is labor intensive and may inconvenience users. These mechanisms are independent of the original design and are therefore optional.
Confidentiality
Since PSMP is based on HTTP, it can ensure data confidentiality by means of SSL/TLS [49] and WS-Security [9]. In fact, the UPnP security profile [54] adopts this approach. However, the devices (nodes) in Smart Homes typically have limited resources such as network bandwidth, CPU, and memory. As a result, Symmetric-Key Encryption mechanisms such as DES/Triple DES [124, 25] or AES [45] are considered more feasible. For example, ZigBee [16] uses AES encryption with a 128-bit key length. The major challenge of using Symmetric-Key Encryption is how to transmit the secret key over an unsecured network. In the residential mode, ZigBee chooses to ignore this potential vulnerability.
One possible solution is to distribute the secret key using Asymmetric-Key Encryption. Accordingly, the following key exchange procedure is proposed for ensuring data confidentiality in PSMP:
1. A new Manager Node called the Security Manager has to be developed and deployed. It is responsible for keeping track of the public keys and the security policies of PerNodes.
2. PerNodes have to be configured so that each of them has an embedded private key and a corresponding public key. The key pairs are set up in a Security Console [54] (identical to the Security Manager in this thesis) and can be re-configured by users. The user also has to set up a secret key for symmetric encryption through the Security Console.
3. When performing PA, a node sends its public key, unencrypted, to the multicast address.
4. When the Security Manager receives a public key embedded in a PA message, it encrypts the secret key with the received public key and then sends the encrypted secret key back to the newly joined node.
5. After the node receives the encrypted secret key, it decrypts the key with its private key. The node is then able to send and receive encrypted data using a Symmetric-Key Encryption mechanism such as AES with the secret key.
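The message flow of steps 3 to 5 can be sketched with textbook RSA built from Python integers. This is a toy under loud assumptions and is not secure: the modulus is only about 150 bits, no padding is used, every node shares the same fixed key pair, and the names `SecurityManager`, `PerNode`, `on_presence_announcement`, and `on_wrapped_key` are hypothetical. A real implementation would rely on a vetted cryptographic library and a padded scheme.

```python
import secrets

# Toy RSA parameters (NOT secure): two Mersenne primes give a modulus of
# about 150 bits, just large enough to wrap a 128-bit key as one integer.
P, Q = 2**61 - 1, 2**89 - 1
E = 65537


def make_key_pair():
    """Return ((n, e), (n, d)); the same pair every time in this toy."""
    n = P * Q
    d = pow(E, -1, (P - 1) * (Q - 1))   # modular inverse, Python 3.8+
    return (n, E), (n, d)


class SecurityManager:
    """Keeps node public keys and hands out the shared secret key."""

    def __init__(self, secret_key: bytes) -> None:
        self.secret_key = secret_key
        self.public_keys = {}

    def on_presence_announcement(self, node_id: str, public_key) -> int:
        """Step 4: register the public key and wrap the secret key with it."""
        self.public_keys[node_id] = public_key
        n, e = public_key
        return pow(int.from_bytes(self.secret_key, "big"), e, n)


class PerNode:
    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.public_key, self._private_key = make_key_pair()   # step 2
        self.secret_key = None

    def announce(self):
        """Step 3: the PA carries the public key in the clear."""
        return self.node_id, self.public_key

    def on_wrapped_key(self, wrapped: int) -> None:
        """Step 5: unwrap the 16-byte secret key with the private key."""
        n, d = self._private_key
        self.secret_key = pow(wrapped, d, n).to_bytes(16, "big")


manager = SecurityManager(secret_key=secrets.token_bytes(16))  # 128-bit key
node = PerNode("w1")
wrapped = manager.on_presence_announcement(*node.announce())
node.on_wrapped_key(wrapped)
print(node.secret_key == manager.secret_key)
```

After the exchange, both sides hold the same 128-bit key, which could then be used with a symmetric cipher such as AES.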
Figure 3.11 depicts the overall process of registering the public key and acquiring the secret key in PSMP.
Figure 3.11: Registering the public key and acquiring the secret key in PSMP
Figure 3.12: Sending and receiving data in PSMP
Integrity and Non-repudiation
Data integrity refers to the mechanisms that prevent transmitted data from being corrupted or modified, whereas non-repudiation refers to the guarantee that the sender of a message is actually the one claimed in that message. Integrity and non-repudiation are realized by using message digests and digital signatures. To ensure data integrity and non-repudiation in PSMP, the sending node first obtains a message digest by using a hash algorithm such as SHA (Secure Hash Algorithm) [51]. The digital signature is generated by encrypting the message digest with the private key of the sending node, and it is placed in the header of the message before the message is sent. After the receiving node receives the message, it computes a message digest from the decrypted message and compares it with the one obtained by decrypting the digital signature. If the two message digests are identical, the receiving node can conclude that the message was sent by the claimed sender and was not modified in transit. Figure 3.12 depicts how PSMP ensures integrity and non-repudiation when sending and receiving encrypted data.
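The digest-then-sign procedure can be sketched as follows. The example is a toy and is not secure: it uses unpadded textbook RSA with a modulus of about 150 bits, chooses SHA-256 as one member of the SHA family, and reduces the digest modulo `N` so the small key can sign it. The function names and the sample messages are hypothetical.

```python
import hashlib

# Toy RSA key (NOT secure): Mersenne primes, textbook RSA without padding.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, Python 3.8+


def digest(message: bytes) -> int:
    """SHA-256 digest, reduced modulo N so the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the message digest with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver: recompute the digest and compare it with the decrypted signature."""
    return pow(signature, E, N) == digest(message)


message = b"turn on the living-room light"
signature = sign(message)      # would travel in the message header

ok = verify(message, signature)
tampered = verify(b"turn on the bedroom light", signature)
print(ok, tampered)
```

Verification of the altered message fails because its digest no longer matches the value recovered from the signature.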
Authentication and Authorization
To support authentication and authorization, each PerNode has to be enhanced according to the UPnP Security Ceremonies [54]. Specifically, every node has an additional DeviceSecurity Service, which supports authentication and authorization functionalities. The security policies are also pre-configured by users in the Security Console.
3.3 Evaluation
This section reports the results of evaluating PerSAM and PSMP. The following sub-sections explain the evaluations with respect to robustness, recovery capability, performance, cost, and limitations.
3.3.1 Robustness
The purpose of this sub-section is to show that services in PerSAM/PSMP are robust, that is, to validate that (3.15) holds, which is stated as follows: