
Structures

A distributed system is a collection of processors that do not share memory or a clock. Instead, each processor has its own local memory. The processors communicate with one another through various communication networks, such as high-speed buses or telephone lines. In this chapter, we discuss the general structure of distributed systems and the networks that interconnect them.

We contrast the main differences in operating-system design between these types of systems and the centralized systems with which we were concerned previously. Detailed discussions are given in Chapters 17 and 18.

Exercises

16.1 What is the difference between computation migration and process migration? Which is easier to implement, and why?

Answer: Process migration is an extreme form of computation migration. In computation migration, an RPC might be sent to a remote processor in order to execute a computation that could be more efficiently executed on the remote node. In process migration, the entire process is transported to the remote node, where the process continues its execution. Since process migration is an extension of computation migration, more issues need to be considered for implementing process migration. In particular, it is always challenging to migrate all of the necessary state to execute the process, and it is sometimes difficult to transport state regarding open files and open devices. Such a high degree of transparency and completeness is not required for computation migration, where it is clear to the programmer that only a certain section of the code is to be executed remotely. Computation migration is therefore the easier of the two to implement.

16.2 Contrast the various network topologies in terms of the following attributes:

a. Reliability

b. Available bandwidth for concurrent communications

c. Installation cost

d. Load balance in routing responsibilities

Answer: A fully connected network provides the most reliable topology: if any of the links goes down, it is likely that another path exists to route the message. A partially connected network may suffer from the problem that if a specific link goes down, another path to route a message may not exist. Of the partially connected topologies, various levels of reliability exist. In a tree-structured topology, if any of the links goes down, there is no guarantee that messages can be routed. A ring topology requires two links to fail for this situation to occur. If a link fails in a star network, the node connected to that link becomes disconnected from the remainder of the network. However, if the central node fails, the entire network becomes unusable.

Regarding available bandwidth for concurrent communications, the fully connected network provides the maximum utility, followed by partially connected networks. Tree-structured networks, rings, and star networks have a linear number of network links and therefore have limited capability with regard to performing high-bandwidth concurrent communications. Installation costs follow a similar trend, with fully connected networks requiring a huge investment, while trees, rings, and stars require the least investment.

Fully connected networks and ring networks enjoy symmetry in the structure and do not suffer from hot spots. Given random communication patterns, the routing responsibilities are balanced across the different nodes. Trees and stars suffer from hot spots; the central node in the star and the nodes in the upper levels of the tree carry much more traffic than the other nodes in the system and therefore suffer from load imbalances in routing responsibilities.
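The installation-cost and bandwidth comparison above rests on how the number of links grows with the number of nodes. The following is a minimal sketch of that arithmetic; the function name link_count and the topology labels are illustrative choices, and the star count assumes the central node is one of the n nodes.

```python
def link_count(topology: str, n: int) -> int:
    """Number of point-to-point links needed to connect n nodes."""
    if n < 2:
        return 0
    if topology == "fully_connected":
        return n * (n - 1) // 2       # one link per pair of nodes
    if topology == "ring":
        return n if n > 2 else 1      # each node links to its successor
    if topology in ("star", "tree"):
        return n - 1                  # the minimum for a connected network
    raise ValueError(f"unknown topology: {topology}")


# Quadratic growth for the fully connected case, linear for the others.
for name in ("fully_connected", "ring", "star", "tree"):
    print(name, [link_count(name, n) for n in (4, 16, 64)])
```

The quadratic link count of the fully connected network is what buys its reliability and concurrent bandwidth, and also what makes it expensive to install.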

16.3 Even though the ISO model of networking specifies seven layers of functionality, most computer systems use fewer layers to implement a network. Why do they use fewer layers? What problems could the use of fewer layers cause?

Answer: A layered network protocol may achieve the same functionality as the ISO model in fewer layers by using one layer to implement functionality provided in two (or possibly more) layers in the ISO model. Other models may decide there is no need for certain layers in the ISO model. For example, the presentation and session layers are absent in the TCP/IP protocol. Another reason may be that certain layers specified in the ISO model do not apply to a certain implementation.

Let’s use TCP/IP again as an example, where no data-link or physical layer is specified by the model. The thinking behind TCP/IP is that the functionality behind the data-link and physical layers is not pertinent to TCP/IP; it merely assumes some network connection is provided, whether it be Ethernet, wireless, token ring, etc.

A potential problem with implementing fewer layers is that functionality specified in the omitted layers may not be provided.

16.4 Explain why doubling the speed of the systems on an Ethernet segment may result in decreased network performance. What changes could help solve this problem?

Answer: Faster systems may be able to send more packets in a shorter amount of time. The network would then have more packets traveling on it, resulting in more collisions, and therefore less throughput relative to the number of packets being sent. More networks can be used, with fewer systems per network, to reduce the number of collisions.
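The effect described above can be illustrated with a deliberately simplified model. The sketch below assumes a slotted channel in which each station independently transmits in a slot with probability p, and a slot is useful only when exactly one station transmits. This is not how Ethernet's CSMA/CD actually behaves (real stations sense the carrier and back off), and the station counts and probabilities are made-up values chosen for illustration.

```python
def slot_success_probability(stations: int, p: float) -> float:
    """Probability that exactly one of `stations` transmits in a slot."""
    return stations * p * (1.0 - p) ** (stations - 1)


# Ten stations on one segment, each sending in 10% of the slots.
base = slot_success_probability(10, 0.1)

# The same ten stations, now fast enough to send twice as often.
faster = slot_success_probability(10, 0.2)

# The faster stations split across two segments of five stations each.
split = slot_success_probability(5, 0.2)

print(f"one segment, original speed: {base:.3f}")
print(f"one segment, doubled speed:  {faster:.3f}")
print(f"per segment after splitting: {split:.3f}")
```

Under these assumptions the fraction of useful slots falls when every station sends twice as often, and rises again once the stations are divided among more segments.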

16.5 What are the advantages of using dedicated hardware devices for routers and gateways? What are the disadvantages of using these devices compared with using general-purpose computers?

Answer: The advantage is that dedicated hardware devices for routers and gateways are very fast, as all their logic is provided in hardware (firmware). Using a general-purpose computer for a router or gateway means that routing functionality is provided in software, which is not as fast as providing the functionality directly in hardware.

A disadvantage is that routers or gateways as dedicated devices may be more costly than using off-the-shelf components that comprise a modern personal computer.

16.6 In what ways is using a name server better than using static host tables?

What problems or complications are associated with name servers?

What methods could you use to decrease the amount of traffic name servers generate to satisfy translation requests?

Answer: Name servers require their own protocol, so they add complication to the system. Also, if a name server is down, host information may become unavailable. Backup name servers are required to avoid this problem. Caches can be used to store frequently requested host information to cut down on network traffic.
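The caching idea can be sketched as a small wrapper around whatever function actually contacts the name server. Everything named here (CachingResolver, query_name_server, ttl_seconds, queries_sent) is hypothetical and introduced only for illustration; the expiry time is an assumption that keeps stale entries from living forever.

```python
import time
from typing import Callable, Dict, Tuple


class CachingResolver:
    """Remembers recent answers so repeated lookups skip the name server."""

    def __init__(self,
                 query_name_server: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._query = query_name_server   # hypothetical: performs the real lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # host -> (address, expiry)
        self.queries_sent = 0             # number of requests that reached the server

    def resolve(self, host: str) -> str:
        now = self._clock()
        entry = self._cache.get(host)
        if entry is not None and entry[1] > now:
            return entry[0]               # cache hit: no network traffic
        self.queries_sent += 1            # cache miss or expired entry
        address = self._query(host)
        self._cache[host] = (address, now + self._ttl)
        return address
```

With a resolver built this way, a burst of lookups for the same host costs one request to the name server rather than one per lookup, at the price of possibly serving an outdated address until the entry expires.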

16.7 Name servers are organized in a hierarchical manner. What is the purpose of using a hierarchical organization?

Answer: Hierarchical structures are easier to maintain since any changes in the identity of name servers require an update only at the next level name server in the hierarchy. Changes are therefore localized.

The downside of this approach, however, is that the name servers at the top level of the hierarchy are likely to suffer from high loads. This problem can be alleviated by replicating the services of the top-level name servers.

16.8 Consider a network layer that senses collisions and retransmits immediately on detection of a collision. What problems could arise with this strategy? How could they be rectified?

Answer: Delegating the retransmission decisions to the network layer might be appropriate in many settings. In a congested system, immediate retransmissions might increase the congestion in the system, resulting in more collisions and lower throughput. Instead, the decision of when to retransmit could be left to the upper layers, which could delay the retransmission by a period of time that is proportional to the current congestion in the system. An exponential backoff strategy is the most commonly used strategy to avoid over-congesting a system.
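A binary exponential backoff can be sketched in a few lines. The function name backoff_slots is illustrative, the delay is expressed in abstract slot times, and the cap on the exponent (10, as in classic Ethernet's truncated backoff) is an assumption made so that the wait cannot grow without bound.

```python
import random


def backoff_slots(collisions: int, rng: random.Random,
                  max_exponent: int = 10) -> int:
    """Pick a random wait, in slot times, after `collisions` consecutive collisions."""
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    exponent = min(collisions, max_exponent)  # cap the growth (assumed limit)
    return rng.randrange(2 ** exponent)       # uniform over 0 .. 2**exponent - 1


rng = random.Random()
for attempt in range(1, 6):
    # The range of possible waits doubles with every further collision.
    print(f"after collision {attempt}: wait {backoff_slots(attempt, rng)} slots")
```

Because each sender draws its wait independently from a window that doubles after every collision, repeated collisions spread the retransmissions out instead of piling them onto an already congested medium.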

16.9 The lower layers of the ISO network model provide datagram service, with no delivery guarantees for messages. A transport-layer protocol such as TCP is used to provide reliability. Discuss the advantages and disadvantages of supporting reliable message delivery at the lowest possible layer.

Answer: Many applications might not require reliable message delivery. For instance, a coded video stream could recover from packet losses by performing interpolations to derive lost data. In fact, in such applications, retransmitted data is of little use since it would arrive much later than the optimal time and not conform to real-time guarantees. For such applications, reliable message delivery at the lowest level is an unnecessary feature and might result in increased message traffic, most of which is useless, thereby resulting in performance degradation.

In general, the lowest levels of the networking stack need to support the minimal amount of functionality required by all applications and leave extra functionality to be implemented at the upper layers.

16.10 What are the implications of using a dynamic routing strategy on application behavior? For what type of applications is it beneficial to use virtual routing instead of dynamic routing?

Answer: Dynamic routing might route different packets through different paths. Consecutive packets might therefore incur different latencies, and there could be substantial jitter in the received packets. Also, many protocols, such as TCP, that assume that reordered packets imply dropped packets would have to be modified to take into account that reordering is a natural phenomenon in the system and does not imply packet losses. Real-time applications such as audio and video transmissions might benefit more from virtual routing since it minimizes jitter and packet reorderings.

16.11 Run the program shown in Figure 16.5 and determine the IP addresses of the following host names:

• www.wiley.com

• www.cs.yale.edu

• www.javasoft.com

• www.westminstercollege.edu

• www.ietf.org

Answer: As of October 2003, the corresponding IP addresses are

• www.wiley.com - 208.215.179.146

• www.cs.yale.edu - 128.36.229.30

• www.javasoft.com - 192.18.97.39

• www.westminstercollege.edu - 146.86.1.2

• www.ietf.org - 132.151.6.21
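Figure 16.5 is not reproduced in this manual. As a rough stand-in, the lookup can be sketched with the Python standard library; this is an illustrative equivalent rather than the book's program, and the helper names resolve and report are made up for the example.

```python
import socket
from typing import Iterable, Optional


def resolve(host: str) -> Optional[str]:
    """Return one IPv4 address for `host`, or None if the lookup fails."""
    try:
        return socket.gethostbyname(host)
    except OSError:            # socket.gaierror is a subclass of OSError
        return None


def report(hosts: Iterable[str]) -> None:
    """Print each host name next to the address the resolver returns."""
    for host in hosts:
        address = resolve(host)
        print(f"{host} - {address if address else 'lookup failed'}")
```

Calling report(["www.wiley.com", "www.ietf.org"]) on a machine with network access prints whatever the name servers return at that moment. Addresses are reassigned over time, so the output should not be expected to match the October 2003 values listed above, and some of the older host names may no longer resolve at all.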

16.12 Consider a distributed system with two sites, A and B. Consider whether site A can distinguish among the following:

a. B goes down.

b. The link between A and B goes down.

c. B is extremely overloaded and its response time is 100 times longer than normal.

What implications does your answer have for recovery in distributed systems?

Answer: One technique would be for B to periodically send an I-am-up message to A indicating it is still alive. If A does not receive an I-am-up message, it can assume that either B or the network link is down.

Note that an I-am-up message does not allow A to distinguish between these two types of failure. One technique that allows A to better determine if the network is down is to send an Are-you-up message to B using an alternate route. If it receives a reply, it can determine that indeed the network link is down and that B is up.

If we assume that A knows B is up and is reachable (via the I-am-up mechanism) and that A has some value N which indicates a normal response time, then A could monitor the response time from B and compare values to N, allowing A to determine whether B is overloaded.

The implications of both of these techniques are that A could choose another host (say, C) in the system if B is either down, unreachable, or overloaded.
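The decision logic that A applies can be written down as a small pure function. The names Diagnosis and diagnose are hypothetical, and slow_factor is an assumed threshold for how far above the normal response time N a reply must be before B counts as overloaded.

```python
from enum import Enum
from typing import Optional


class Diagnosis(Enum):
    HEALTHY = "B is up and responsive"
    OVERLOADED = "B is up but overloaded"
    LINK_DOWN = "direct link is down, B is up"
    UNREACHABLE = "B is down or unreachable"


def diagnose(heartbeat_seen: bool,
             alternate_route_reply: bool,
             response_time: Optional[float],
             normal_time: float,
             slow_factor: float = 10.0) -> Diagnosis:
    """Classify B's state from the evidence that A can observe."""
    if not heartbeat_seen:
        if alternate_route_reply:
            # B answered Are-you-up over another path, so only the link failed.
            return Diagnosis.LINK_DOWN
        # A cannot tell a crashed B from a B that is cut off on every path.
        return Diagnosis.UNREACHABLE
    if response_time is not None and response_time > slow_factor * normal_time:
        return Diagnosis.OVERLOADED
    return Diagnosis.HEALTHY
```

The UNREACHABLE branch is the important one for recovery: when neither the I-am-up message nor the alternate-route probe gets through, A has no way to separate the remaining cases and has to fall back on another host regardless of which failure actually occurred.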

16.13 The original HTTP protocol used TCP/IP as the underlying network protocol. For each page, graphic, or applet, a separate TCP session was constructed, used, and torn down. Because of the overhead of building and destroying TCP/IP connections, performance problems resulted from this implementation method. Would using UDP rather than TCP be a good alternative? What other changes could you make to improve HTTP performance?

Answer: Despite the connection-less nature of UDP, it is not a serious alternative to TCP for HTTP. The problem with UDP is that it is unreliable, whereas documents delivered via the web must be delivered reliably. (This is easy to illustrate: a single packet missing from an image downloaded from the web makes the image unreadable.)

One possibility is to modify how TCP connections are used. Rather than setting up and breaking down a TCP connection for every web resource, allow persistent connections, where a single TCP connection stays open and is used to deliver multiple web resources.
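The saving from persistent connections can be estimated with a back-of-the-envelope count of network round trips. The sketch below assumes one round trip for each TCP handshake and one for each request and response, and it ignores connection teardown, transmission time, and any requests issued in parallel; the function name round_trips is illustrative.

```python
def round_trips(resources: int, persistent: bool) -> int:
    """Round trips needed to fetch `resources` items over HTTP."""
    if resources < 0:
        raise ValueError("resources must be non-negative")
    if resources == 0:
        return 0
    # One handshake in total when the connection is reused,
    # otherwise one handshake per resource.
    handshakes = 1 if persistent else resources
    return handshakes + resources     # plus one request/response each


page_items = 10   # a page with ten embedded resources (assumed figure)
print("separate connections:", round_trips(page_items, persistent=False))
print("persistent connection:", round_trips(page_items, persistent=True))
```

Under these assumptions the per-resource connection setup nearly doubles the number of round trips, which is the overhead the persistent connection removes.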

16.14 Of what use is an address-resolution protocol? Why is it better to use such a protocol than to make each host read each packet to determine that packet’s destination? Does a token-passing network need such a protocol? Explain your answer.

Answer: An ARP translates general-purpose addresses into hardware interface numbers so the interface can know which packets are for it.

Software need not get involved. It is more efficient than passing each packet to the higher layers. Yes, for the same reason.

16.15 What are the advantages and the disadvantages of making the computer network transparent to the user?

Answer: The advantage is that all files are accessed in the same manner.

The disadvantage is that the operating system becomes more complex.

17

C H A P T E R

Distributed