CHAPTER 5 EXPERIMENTAL RESULTS

5.5 SPACE AND TIME REQUIREMENT

Table 5.3 shows the relation between the number of online sessions and the required log buffer size. The required buffer size grows with the number of sessions because we store unhandled data (described in Chapter 4) in the log buffer, and the amount of unhandled data increases as the load on Squid increases. Even so, only about 132 Kbytes of memory is required to log the state of one session while the proxy is serving 30 sessions.

Number of Sessions             5      10     15     20     25      30
Required Log Buffer (Kbytes)   2.78   28.97  31.06  80.55  127.22  131.94

Table 5.3: Relation between Number of Sessions and Required Log Buffer Size

Table 5.4 shows the relation between the number of sessions that need to be recovered and the total recovery time. We intentionally terminate the Squid process while a given number of sessions (e.g., 5, 10, or 15) are being served. Each client session requests a 25 MB web page.

As expected, recovering more sessions takes more time. The recovery latency remains acceptable: recovering 50 sessions takes about 1.75 seconds.

Number of Sessions    5        10       15       20       30       50
Recovery Time (ms)    102.814  147.626  445.338  525.354  839.004  1751.086

Table 5.4: Relation between Number of Sessions and Recovery Time
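As a reading aid for Table 5.4, the short Python sketch below divides each total recovery time by the number of recovered sessions. The per-session average is a derived figure that is not part of the reported measurements, and the variable names are chosen only for illustration.

```python
# Measurements copied from Table 5.4.
sessions = [5, 10, 15, 20, 30, 50]
recovery_ms = [102.814, 147.626, 445.338, 525.354, 839.004, 1751.086]

# Average recovery time per session: a derived figure, not a reported result.
per_session_ms = [t / n for n, t in zip(sessions, recovery_ms)]

for n, t, avg in zip(sessions, recovery_ms, per_session_ms):
    print(f"{n:>3} sessions: {t:9.3f} ms total, {avg:6.2f} ms per session")
```

Under this reading, the average cost stays roughly between 15 and 35 ms per session across the measured range.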

CHAPTER 6

CONCLUSION AND FUTURE WORK

6.1 CONCLUSION

We propose a subsystem that makes a proxy service fault tolerant. Transient faults in a proxy application (whether caused by hardware, the administrator, or the proxy application itself) can be recovered in a way that is transparent to the clients, the servers, and the proxy application.

The subsystem can recover the ongoing requests that were being processed when the failure occurred.

In addition, this transparency enables the fault-tolerant proxy system to function without any support from the clients or the servers. The subsystem is implemented as a Linux kernel module. TCP traffic and system calls are intercepted at the kernel level to log the state.

The experimental results show low overhead for state logging and acceptable recovery latency.

6.2 FUTURE WORK

Currently, we focus only on errors that occur in a proxy application. In the future, we will address operating system failures. Operating system crashes such as kernel panics could be detected by another machine, and the state could also be logged on that machine.

Therefore, we can use a logger machine to monitor the proxy. When the proxy fails, its state would be migrated to a backup server. Instead of using an additional machine, system crashes can also be detected by an intelligent network interface card [3].

Fast system restart techniques such as LOBOS [18], which store the state in a safe memory area that survives a system restart, would then allow state recovery to be performed.
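The logger-machine idea described above can be illustrated with a timeout-based failure detector. The following is only a minimal sketch under stated assumptions: the class name HeartbeatMonitor, the host names, and the 3-second timeout are hypothetical, and the thesis does not specify how crash detection would actually be implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Timeout-based failure detector for a hypothetical logger machine."""
    timeout_s: float  # silence longer than this is treated as a crash
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, host: str, now: float) -> None:
        # Remember when each monitored proxy host was last heard from.
        self.last_seen[host] = now

    def failed_hosts(self, now: float) -> List[str]:
        # A host is suspected to have crashed once its heartbeat is overdue.
        return sorted(
            host for host, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


# Hypothetical scenario: proxy-b stops sending heartbeats after t = 0.
monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.record_heartbeat("proxy-a", now=0.0)
monitor.record_heartbeat("proxy-b", now=0.0)
monitor.record_heartbeat("proxy-a", now=4.0)
suspects = monitor.failed_hosts(now=5.0)
print(suspects)
```

In such a design, a host reported by failed_hosts would be the trigger for migrating the logged state to a backup server. Timestamps are passed in explicitly so that the detector logic can be exercised without real network traffic.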

In addition, we plan to extend our fault-tolerance mechanisms to other network services. In network services such as Web Services [11] and peer-to-peer networks [21], each host may act as a client and a server at the same time, which is similar to the proxy service. Therefore, we will evaluate whether our approach can be extended to provide fault tolerance for those services.

REFERENCES

[1] Navid Aghdaie, Yuval Tamir, “Client-Transparent Fault-Tolerant Web Service,” 20th IEEE International Performance, Computing, and Communications Conference, Phoenix, AZ, pp. 209-216, Apr. 2001.

[2] Navid Aghdaie, Yuval Tamir, “Fast Transparent Failover for Reliable Web Service,” In Proceedings of the International Conference on Parallel and Distributed Computing and Systems, Marina del Rey, California, pp. 757-762, Nov. 2003.

[3] Alacritech, “Alacritech quad port server accelerator,” available at http://www.alacritech.com/html/100x4.html

[4] L. Alvisi, T. C. Bressoud, A. El-Khashab, K. Marzullo, D. Zagorodnov, “Wrapping Server-side TCP to Mask Connection Failures,” In Proceedings of the IEEE INFOCOM, Anchorage, Alaska, pp. 329-337, Apr. 2001.

[5] Apache Software Foundation, “The Apache Web Server,” available at http://www.apache.org/.

[6] T. Brisco, “DNS Support for Load Balancing,” IETF RFC 1794, Apr. 1995.

[7] A. Brown, D. A. Patterson, “To Err is Human,” In Proceedings of the 2001 Workshop on Evaluating and Architecting System dependabilitY, Göteborg, Sweden, July 2001.

[8] V. Castelli, R. E. Harper, P. Heidelberger, S. W. Hunter, K. S. Trivedi, K. Vaidyanathan, W. P. Zeggert, “Proactive Management of Software Aging,” IBM JRD, Vol. 45, No. 2, Mar. 2001.

[9] Cisco Systems Inc., “Cisco DistributedDirector,” available at http://www.cisco.com/univercd/cc/td/doc/product/iaabu/distrdir/dd2501/ovr.htm

[10] Cisco Systems Inc., “Web Cache Communication Protocol,” available at http://www.cisco.com/en/US/tech/tk122/tk717/tech_protocol_family_home.html

[11] Harvey M. Deitel, “Web Services: A Technical Introduction,” Prentice Hall, Aug. 2002.

[12] P. Enriquez, A. Brown, D. A. Patterson, “Lessons from the PSTN for Dependable Computing,” In Proceedings of the 2002 Workshop on Self-Healing, Adaptive and self-MANaged systems (SHAMAN), New York, June 2001.

[13] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, Jun. 1999.

[14] S. Horman, “Creating Redundant Linux Servers,” In Proceedings of the 4th Annual Linux Expos, Durham NC, May 1998.

[15] S. Iyer, A. Rowstron, P. Druschel, “Squirrel: A Decentralized, Peer-to-Peer Web Cache,” 21st ACM Symposium on Principles of Distributed Computing (PODC), Monterey, California, Jul. 2002.

[16] Van Jacobson, Craig Leres, Steve McCanne, “tcpdump,” available at http://www.tcpdump.org/.

[17] Mindcraft Inc., “WebStone: the Benchmark for Web Servers,” available at http://www.mindcraft.com/benchmarks/webstone/.

[18] Ron Minnich, “LOBOS: (Linux OS Boots OS) Booting a Kernel in 32-bit Mode,” The Fourth Annual Linux Showcase and Conference, Atlanta GA, Oct. 2000.

[19] Netscape, “Navigator Proxy Auto-Config File Format,” available at http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html, Mar. 1996.

[20] D. Oppenheimer, D. A. Patterson, “Why do Internet services fail, and what can be done about it?” In Proceedings of the 10th ACM SIGOPS European Workshop, Saint-Emilion, France, Sep. 2002.

[21] Andy Oram, “Peer-to-Peer: Harnessing the Benefits of a Disruptive Technology,” O’Reilly, Mar. 2001.

[22] Alex C. Snoeren, Hari Balakrishnan, “An End-to-End Approach to Host Mobility,” In Proceedings of the 6th Annual ACM/IEEE International Conference on Mobile Computing and Networking, pp. 155-166, Boston, Massachusetts, Aug. 2000.

[23] Alex C. Snoeren, David G. Andersen, Hari Balakrishnan, “Fine-Grained Failover Using Connection Migration,” In Proceedings of the 3rd USENIX Symposium on Internet Technologies and Systems (USITS '01), Mar. 2001.

[24] K. Srinivasan, “M-TCP: Transport Layer Support for Highly Available Network Services,” Technical Report DCS-TR459, Rutgers University, Oct. 2001.

[25] D. Wessels, “Squid Web Proxy Cache,” available at http://www.squid-cache.org/.

[26] C. S. Yang, M. Y. Luo, “Realizing Fault Resilience in Web-Server Cluster,” In Proceedings of the 2000 ACM/IEEE Conf. on Supercomputing (CDROM), p. 21-es, Nov. 2000.

[27] C. S. Yang, M. Y. Luo, “Constructing Zero-Loss Web Services,” In Proceedings of IEEE INFOCOM 2001, pp. 1781-1790, Apr. 2001.

[28] Dmitrii Zagorodnov, Keith Marzullo, Lorenzo Alvisi, Thomas C. Bressoud, “Engineering fault-tolerant TCP/IP servers using FT-TCP,” In Proceedings of IEEE Intl. Conf. on Dependable Systems and Networks (DSN), pp. 22-26, Apr. 2003.
