The End-to-End Argument - The Underlying Architecture of the Internet

II. Network Neutrality and the Evolving Internet

2.2 The Underlying Architecture of the Internet

2.2.2 The End-to-End Argument

The “end-to-end” argument is a design principle that serves as a guide for deciding how to allocate functions among the layers. At its core, it argues for distributing functionality within a network so that the “intelligence” in the network is implemented at the “ends” of the network, where the higher layers of the network are. In the context of the Internet, this is the Application Layer at the end host. Conversely, it calls for the lower-layer communications protocols themselves to be as “simple and general” as possible, in order to maximize their utility for all applications.¹⁸

18 See Mark A. Lemley and Lawrence Lessig, The End of End-to-End: Preserving the Architecture of the Internet in the Broadband Era, 48 UCLA L. REV. 925 (Oct. 1, 2000).

19 Jerome H. Saltzer, David P. Reed, and David D. Clark, End-to-End Arguments in System Design, 2 ACM TRANSACTIONS ON COMPUTER SYSTEMS 277-288 (Nov. 1984). An earlier version appeared in the Second International Conference on Distributed Computing Systems 509-512 (Apr. 1981).

The end-to-end principle has implicitly guided the development of the Internet since its inception, but it was not explicitly recognized as a design principle until the early 1980s, in a paper entitled End-to-End Arguments in System Design¹⁹, by Professors Jerome Saltzer, David Reed, and David Clark. In various subsequent papers, the same authors have, jointly and independently, sought to refine and clarify the principle and what it means for the underlying architecture of the Internet. One general depiction of the principle is as follows:

End to end arguments have … two complementary goals: (1) Higher-level layers, more specific to an application, are free to (and thus expected to) organize lower level network resources to achieve application-specific design goals efficiently (application autonomy); (2) lower-level layers, which support many independent applications, should provide only resources of broad utility across applications, while providing to applications useable means for effective sharing of resources and resolution of resource conflicts (network transparency).²⁰

The principle, however, is not without its own uncertainties. For one, it relies on the ability to distinguish clearly between application-specific and non-application-specific functions. At a more technical level, it is ambiguous about how to allocate potentially application-specific functions that could be “completely and correctly” implemented at multiple layers. As a result, it is not always clear what exactly the end-to-end principle entails in specific cases, which leads to the awkward situation in which both proponents and opponents of a technical implementation invoke the end-to-end principle to back up their views.²¹

20 Jerome H. Saltzer, David P. Reed, and David D. Clark, Active Networking and End-To-End Arguments, 12 IEEE NETWORK 66, 70 (May 1998).

21 See VAN SCHEWICK, supra note 13, at 81.

22 Id.

23 See id. at 58.

In her book, Internet Architecture and Innovation²², Professor Barbara van Schewick is one of the first academics to undertake a critical analysis of the inconsistencies in the different interpretations of the end-to-end principle. She finds that even in the writings of Saltzer, Reed, and Clark, there exist “two versions of the end-to-end arguments that represent different rules for architectural design.”²³ The first version (the “narrow” version) states that “a function should only be implemented in a lower layer, if it can be completely and correctly implemented at that layer. Sometimes an incomplete implementation of the function at the lower layer may be useful as a performance enhancement.”²⁴ The second version (the “broad” version) states that “a function or service should be carried out within a network layer only if it is needed by all clients of that layer, and it can be completely implemented in that layer.”²⁵ Van Schewick notes that technical discussions tend to focus on the narrow version, whereas policy texts and descriptions of the Internet’s architecture tend to focus on the broad version. Generally speaking, most of the literature that refers to the end-to-end arguments

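| Can be completely and correctly implemented at the higher layer | Can be completely and correctly implemented at the lower layer | Needed by all clients of the lower layer | Narrow version: may be implemented at | Broad version: may be implemented at |
| YES | YES | YES | Both layers | Both layers |
| YES | YES | NO | Both layers | Higher layer only |
| YES | NO | YES or NO | Higher layer (an incomplete lower-layer implementation is allowed for performance) | Higher layer only |

Table 1: Differences between narrow and broad versions of the end-to-end argument
Source: Adapted from Barbara van Schewick, Internet Architecture and Innovation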

24 See Saltzer, Reed, and Clark 1984, supra note 19, at 278.

25 See Saltzer, Reed, and Clark 1998, supra note 20, at 69.

From Table 1, we can see that in practice, adopting the narrow or the broad version of the principle can lead to different network architectures in two sets of circumstances:

(1) When a function can be implemented in both layers, but is not needed by all clients of the lower layer, the narrow version allows implementation at both layers, while the broad version allows implementation at the higher layer only.

(2) When a function can only be completely and correctly implemented at the higher layer, the narrow version allows for additional incomplete implementations at the lower layer for performance considerations.
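The two rules and the two diverging circumstances can be restated as a small decision function. The following is only an illustrative sketch of our reading of the two versions, not a formalization taken from van Schewick or from Saltzer, Reed, and Clark. The function names and parameters are hypothetical, and the sketch assumes that the function in question can always be completely and correctly implemented at the higher layer.

```python
from typing import Set


def allowed_layers(version: str, complete_at_lower: bool, needed_by_all: bool) -> Set[str]:
    """Layers at which a complete implementation of a function is permitted.

    Assumption: the function can always be completely and correctly
    implemented at the higher layer, so "higher" is always in the result.
    """
    if version == "narrow":
        # Narrow version: the lower layer is acceptable whenever the function
        # can be completely and correctly implemented there.
        lower_ok = complete_at_lower
    elif version == "broad":
        # Broad version: the function must additionally be needed by all
        # clients of the lower layer.
        lower_ok = complete_at_lower and needed_by_all
    else:
        raise ValueError(f"unknown version: {version!r}")
    return {"higher", "lower"} if lower_ok else {"higher"}


def partial_lower_layer_allowed(version: str) -> bool:
    """Whether an incomplete lower-layer implementation is tolerated as a
    performance enhancement (our reading of circumstance (2))."""
    return version == "narrow"
```

Under this sketch, circumstance (1) corresponds to `allowed_layers("narrow", True, False)` returning both layers while `allowed_layers("broad", True, False)` returns the higher layer only.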

The differences between the two versions may seem trivial, but the distinction is important for reasons that will later become apparent. The narrow version focuses on an end-to-end system where the sole emphasis is placed on “correctness”²⁶: functions can be implemented at any layer where they may be “completely and correctly” implemented, whether at the ends or at the core. The broad version, on the other hand, goes beyond the concept of correctness and insists that any non-general, application-specific function be placed at the higher layers, which by extension means placing it at the end points, away from the core of the network.²⁷ Van Schewick lays out several key advantages to the latter approach, in terms of the network and the applications:

A. Network Evolvability

As a general purpose network, the Internet needs to have the flexibility to be able to support any kind of application. Since each type of application may have a different set of requirements, implementing functions at a lower layer to increase the performance of a certain type of application may increase the overhead for another, or even render the network unusable for some other type of application.

26 See VAN SCHEWICK, supra note 13, at 79.

27 See id. at 76.

Van Schewick cites a classic example of network optimizations that ended up presenting unintended obstacles for subsequent application innovations: the use of load coils in traditional public switched telephone networks to boost the transmission of high frequency voice communications.²⁸ A side effect of the use of load coils was that frequencies above 3.4 kHz would be cut off. Since voice telephony did not use frequencies over 3.4 kHz at the time, network designers did not see this as a problem. However, this limitation later posed serious problems for the introduction of Digital Subscriber Line (DSL) services over the same lines, as DSL used frequencies of 25 kHz and above, which were effectively cut off by the load coils. The moral of the story is that optimizations that appear benign in the context of one type of application (voice telephony) could become catastrophic in the context of another application (DSL). By placing application-specific functionality in a higher-layer protocol at the end hosts, we avoid the possibility of lower layers becoming a bottleneck for future innovation. The network itself remains free to evolve, as the lower layer continues to accommodate any kind of innovation that may come along.

B. Application Autonomy

It is an indisputable fact that applications will always know their own needs better than the network. Van Schewick notes that it is virtually impossible for lower-layer designers to guess in advance all the features that applications at the higher layers will potentially need, especially in the case of applications that have yet to materialize. However many features lower-layer designers attempt to cram into the network, applications will most likely end up having to implement application-specific services themselves anyway. Furthermore, additional features may not be suitable for all uses of the application, and may create extra overhead, ending up being more harmful than helpful in certain cases. Placing application-specific functions at the higher layers ensures that applications have the freedom to determine their own actions, and most importantly, the consequences thereof.²⁹

28 See id. at 69.

C. Reliability

By implementing functionality specific to certain applications at the lower layer, we may introduce additional points of failure for the applications that rely on those functions. Since these network functions are not under the direct control of the designer or user of the application, the designer or user has no means to correct problems when they arise. By restricting application-specific functions to the higher layers, we can ensure that all potential points of failure for an application can be addressed by the designer or user of the application, without needing intervention from the lower layers. At the same time, this approach reduces the complexity of the software that needs to be implemented on the hardware at the lower layers, which makes the network easier to design and maintain, and less prone to malfunction. Together, these effects make both the applications and the network more reliable.³⁰

D. Lack of Application Awareness in the Core

This is not so much a “feature” of the broad version of the end-to-end principle, as it is a consequence of the architectural limitations set by the principle. Since all application-specific functionality is systematically removed from the lower layers, this inevitably results in a network core that lacks any sort of application awareness. With all the “intelligence” placed at the ends, the network itself becomes a “stupid” network, responsible only for the transmission of raw data packets, without regard as to the nature of the content in the packets. This effectively places control of how to use the network in the hands of the applications and the users of the end hosts.³¹

29 See id. at 71.

30 See id. at 72.

2.3 The Emergence of Application Awareness in the Core

The layering principle in conjunction with an adherence to a broad interpretation of the end-to-end principle necessarily results in an architecture where the core of the network is not able to distinguish between applications. However, for most of the Internet’s history, the choice of whether to adhere to the narrow or broad version of the end-to-end principle did not make all that much of a difference. Computing resources were scarce, and any kind of application-specific functionality would have added immense overhead to the routers in a manner that would have greatly impacted throughput. The performance trade-off meant that regardless of which version of the principle one chose to follow, it was generally more rational to leave such functionality at the higher layers on the end points, where computing resources were far more abundant, even if it would have been acceptable in principle to implement those functions at the lower layers.

As computing power continued to grow at an exponential rate, this began to change. At the turn of the century, technological capabilities had advanced enough that equipment vendors began to introduce new network hardware that could inspect data packets as they passed through the network. Later, new hardware would appear that not only allowed network providers to know exactly what was passing through their systems, but also gave them the capability to assign priorities to packets, and most importantly, to change the way the network handled them. This was a fundamental departure from the original Internet’s lack of application awareness.

31 See id.

2.4 Packet Inspection and the Growing Threat of Discrimination

Up until then, the only way network providers had been able to control the end user’s use of the network was through contractual usage restrictions and acceptable use policies. Now, through packet-identifying technologies such as Deep Packet Inspection (DPI), they had the capability to directly control how users made use of the network.³² Suddenly, the choice between adhering to a narrow or broad version of end-to-end made a world of difference, and the prospect of network providers controlling the flow of information on their networks became far more realistic. This development understandably had many stakeholders and policy makers worried. It signaled a return to the centralized architecture of the telephone system, where the network provider could act as a gatekeeper, and decide who would get what kind of treatment. In an article ominously titled Deep Packet Inspection, one commentator remarked on the implications of the technology:

Operators can tag packets for fast-lane or slow-lane treatment – or block the packets altogether – based on what they contain or which application sent them…When a network provider chooses to install DPI equipment, that provider knowingly arms itself with the capacity to monitor and monetize the Internet in ways that threaten to destroy Net Neutrality and the essential open nature of the Internet.³³

The ability to discriminate on the basis of content or application was definitely a legitimate concern. However, despite the long list of potentially unsavory uses, not all types of discrimination were inherently bad. In fact, there were just as many ways in which discrimination could be used for the benefit of the user. In the following section, we take a look at the mechanics of network discrimination from a technical perspective, and at how they may be used for both beneficial and problematic purposes.

32 See Jon M. Peha, The Benefits and Risks of Mandating Network Neutrality, and the Quest for a Balanced Policy, 1 INT’L J. COMM. 644, 648-50 (2007).

33 M. Chris Riley and Ben Scott, Deep Packet Inspection (Mar. 2009), available at http://www.wired.com/images_blogs/threatlevel/files/dpi.pdf.

2.4.1 The Mechanics of Network Discrimination

To understand how network discrimination works in practice, we must first understand how data traverses the network. Picture a scenario where end hosts A and B are located on separate networks X and Y respectively, which are connected through network Z. When host A sends a packet to B, the data is transferred from network X, through Z, to network Y, via a series of routers and switches along the way. Whenever a router receives a packet, it must first determine which outgoing link to send it on. If the link is available, the packet is sent on its way. If the link is busy, the packet is queued in a buffer, and waits its turn to use the link. If the buffer is full, which happens when the network is overloaded, the packet may be dropped.³⁴

In the original application-agnostic Internet, all packets were transferred on a first-come-first-serve basis. In an application-aware network, the system has far more choices when it comes to deciding what to do with the packet. In the paper Nuts and Bolts of Network Neutrality³⁵, Edward Felten describes some of the different approaches network owners may take, which we adapt here.

A. Best Efforts or Absolute Non-Discrimination

Absolute non-discrimination is where the network does not discriminate at all between the single bits that pass through it. Every individual packet transmitted through the system is treated in exactly the same way, on a first-come-first-serve basis, regardless of its properties.

34 According to the TCP/IP protocol, a dropped packet signals to the sending end host that the link is congested, and a well behaved host will then back off and reduce the rate of transmission until the link returns to an uncongested state.

35 Edward W. Felten, Nuts and Bolts of Network Neutrality (Aug. 2006), http://itpolicy.princeton.edu/pub/neutrality.pdf.

This was referred to as a “best-efforts” service, whereby the network would attempt to deliver any packet based on its best guess and best effort as to how to get it to its destination. When a link buffer is full and a new packet comes in, the router has several choices: (1) it can drop the new incoming packet, or (2) it can allow it into the queue by dropping another packet in the queue, likely the oldest packet in the queue, if not some other packet at random. In such a scenario, any packet has an equal chance of being dropped.
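The drop choices available to a best-efforts router can be sketched in a few lines of Python. This is a simplified model under our own assumptions, not code from Felten's paper or from any real router: the class name `BestEffortQueue` and the policy labels `"drop_new"`, `"drop_oldest"`, and `"drop_random"` are hypothetical, and packets are represented by arbitrary values whose contents are never inspected.

```python
import random
from collections import deque


class BestEffortQueue:
    """Toy FIFO link buffer that never looks at what a packet contains."""

    def __init__(self, capacity, policy="drop_new", seed=None):
        if policy not in ("drop_new", "drop_oldest", "drop_random"):
            raise ValueError(f"unknown policy: {policy!r}")
        self.capacity = capacity
        self.policy = policy
        self.buffer = deque()
        self.rng = random.Random(seed)  # used only by the random policy

    def enqueue(self, packet):
        """Queue a packet; return the packet that was dropped, or None."""
        if len(self.buffer) < self.capacity:
            self.buffer.append(packet)
            return None
        if self.policy == "drop_new":
            return packet  # choice (1): discard the new arrival
        if self.policy == "drop_oldest":
            dropped = self.buffer.popleft()  # choice (2): evict the oldest
        else:
            index = self.rng.randrange(len(self.buffer))
            dropped = self.buffer[index]  # choice (2): evict a random packet
            del self.buffer[index]
        self.buffer.append(packet)
        return dropped

    def dequeue(self):
        """Send the packet at the head of the queue (first come, first served)."""
        return self.buffer.popleft() if self.buffer else None
```

None of the three policies consults the packet itself, which is what makes this treatment non-discriminatory in the sense used here.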

B. Minimal Discrimination

There are, however, no rules requiring the router to drop packets in a certain way. In fact, a router can discard packets in any way it pleases. Minimal discrimination is a scenario whereby the network assigns priorities to packets in the queue. When necessary, rather than dropping packets at random, or based on their order of arrival, the router will drop packets with the lowest priority first. For example, whenever the buffer is full, the router may decide to drop P2P packets first. Felten calls this “minimal” discrimination³⁶, because it only discriminates against certain types of packets when the network is congested and therefore cannot serve all packets at once. Most of the time, when the network is not congested, there is no difference between treatment of higher and lower priority packets.
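A minimal sketch of this behavior follows, assuming that each packet carries a numeric priority assigned by the network (larger meaning more important). The `Packet` and `PriorityDropQueue` names and the priority values are hypothetical. Forwarding stays first come, first served, and priority matters only when the buffer is full.

```python
from collections import deque
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    name: str
    priority: int  # assumption: larger numbers mean higher priority


class PriorityDropQueue:
    """FIFO buffer that, when full, drops the lowest-priority packet first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer = deque()

    def enqueue(self, packet: Packet) -> Optional[Packet]:
        """Queue a packet; return the packet that was dropped, or None."""
        if len(self.buffer) < self.capacity:
            # No congestion: every packet is accepted regardless of priority.
            self.buffer.append(packet)
            return None
        lowest = min(self.buffer, key=lambda p: p.priority)
        if packet.priority <= lowest.priority:
            return packet  # the new arrival is the least important
        self.buffer.remove(lowest)  # evict the least important queued packet
        self.buffer.append(packet)
        return lowest

    def dequeue(self) -> Optional[Packet]:
        """Forward packets in arrival order; priority is not used here."""
        return self.buffer.popleft() if self.buffer else None
```

For example, with a buffer of two holding a voice packet (priority 2) and a P2P packet (priority 0), an arriving web packet (priority 1) displaces the P2P packet, while an uncongested buffer would have accepted all three.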

C. Non-Minimal Discrimination

There is another type of implementation, however, in which the routers may selectively discard low priority packets even if there is enough capacity on the network to deliver them.

36 See id. at 2.

For example, the router may be set to reserve 50% of the network’s capacity for high priority packets. Once lower priority packets fill the other 50%, any further low priority packets may be dropped, even if the reserved 50% stays idle. Felten calls this kind of discrimination “non-minimal,” because it artificially restricts certain packets to an arbitrary percentage of capacity.³⁷
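The reservation scheme can be illustrated with a toy model of a single time slot on a link. The sketch below is our own simplification: the function `schedule_slot`, its parameters, and the idea of measuring capacity as a number of packets per slot are all assumptions made for illustration.

```python
def schedule_slot(packets, link_capacity, reserved_share=0.5):
    """Decide which packets are sent in one time slot.

    packets: list of (name, is_high_priority) tuples in arrival order.
    link_capacity: number of packets the link can carry in the slot.
    reserved_share: fraction of capacity that low-priority packets may not use.
    Returns a (sent, dropped) pair of lists of packet names.
    """
    low_limit = int(link_capacity * (1 - reserved_share))
    sent, dropped = [], []
    low_sent = 0
    for name, is_high in packets:
        if len(sent) >= link_capacity:
            dropped.append(name)  # the link is genuinely full
        elif is_high:
            sent.append(name)
        elif low_sent < low_limit:
            sent.append(name)
            low_sent += 1
        else:
            # Low-priority cap reached: dropped although capacity remains.
            dropped.append(name)
    return sent, dropped
```

With a capacity of ten packets and eight low priority arrivals, this model sends five packets and drops three, leaving half the link idle. Under minimal discrimination all eight would have been delivered.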

D. Delay Discrimination

Another type of discrimination possible is delay discrimination. This type of discrimination can happen in conjunction with minimal and non-minimal discrimination.

Unlike the previous two types of discrimination, which are executed through the dropping of packets, this type of discrimination works through the reordering of packets. Just as the Internet Protocol does not specify in what order packets should be dropped, it likewise does not specify the order in which they should be sent. While routers generally route packets on a first-come-first-serve basis, it is equally acceptable to send packets in a different order.

For example, a router could allow high priority packets to always cut to the front of the line, or to advance through the queue at a faster pace. Low priority packets therefore experience an extra delay when passing through the router, much like humans do when people cut in line. This delay is known as “latency.” Another consequence of delay discrimination is that packets may be sent out of order, or experience different delays. This variation in delay is known as “jitter.”³⁸
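To make the effect of reordering concrete, the sketch below compares first-come-first-serve forwarding with a strict-priority scheduler. It is a toy model under our own assumptions: all packets are already waiting at time zero, the link sends one packet per tick, and the function name `departure_ticks` is hypothetical. It uses Python's standard `heapq` module as the priority queue.

```python
import heapq
from itertools import count


def departure_ticks(arrivals, prioritized=True):
    """Return {packet name: tick at which the packet leaves the router}.

    arrivals: list of (name, is_high_priority) tuples in arrival order.
    prioritized: if False, packets leave strictly in arrival order.
    """
    arrival_number = count()
    heap = []
    for name, is_high in arrivals:
        # Rank 0 leaves before rank 1; ties are broken by arrival order.
        rank = 0 if (prioritized and is_high) else 1
        heapq.heappush(heap, (rank, next(arrival_number), name))
    ticks = {}
    tick = 0
    while heap:
        _, _, name = heapq.heappop(heap)
        ticks[name] = tick
        tick += 1
    return ticks


arrivals = [("p2p-1", False), ("voip-1", True), ("p2p-2", False), ("voip-2", True)]
fifo = departure_ticks(arrivals, prioritized=False)
prio = departure_ticks(arrivals, prioritized=True)
# Extra delay that prioritization imposes on each low-priority packet.
extra_delay = {name: prio[name] - fifo[name] for name in ("p2p-1", "p2p-2")}
```

In this example the first P2P packet is delayed by two ticks and the second by one. The added waiting time corresponds to the extra latency described above, and the fact that the two packets are delayed by different amounts is a simple instance of jitter.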

E. Absolute Discrimination

This is the most extreme type of discrimination, and in practice this is synonymous with blocking. What happens is that certain types of packets are categorically blocked when they pass through the router, regardless of whether or not there is a link available, or if there is a
