
The design and implementation of the NCTUns 1.0 network simulator

S.Y. Wang *, C.L. Chou, C.H. Huang, C.C. Hwang, Z.M. Yang, C.C. Chiou, C.C. Lin

Department of Computer Science and Information Engineering, National Chiao Tung University, 1001 Ta Hsueh Road, 30050 Hsinchu, Taiwan

Received 2 May 2002; received in revised form 20 November 2002; accepted 20 January 2003. Responsible Editor: I. Nikolaidis

Abstract

This paper presents the design and implementation of the NCTUns 1.0 network simulator, which is a high-fidelity and extensible network simulator capable of simulating both wired and wireless IP networks. By using an enhanced simulation methodology, a new simulation engine architecture, and a distributed and open-system architecture, the NCTUns 1.0 network simulator is much more powerful than its predecessor––the Harvard network simulator, which was released to the public in 1999. The NCTUns 1.0 network simulator consists of many components. In this paper, we will present the design and implementation of these components and their interactions in detail.

 2003 Elsevier Science B.V. All rights reserved.

Keywords: Network simulator; Simulation methodology

1. Introduction

Network simulators implemented in software are valuable tools for researchers to develop, test, and diagnose network protocols. Simulation is economical because it can carry out experiments without the actual hardware. It is flexible because it can, for example, simulate a link with any bandwidth and propagation delay or a router with any queue size and queue management policy. Simulation results are easier to analyze than experimental results because important information at critical points can be easily logged to help researchers diagnose network protocols.

Network simulators, however, have their limitations. A complete network simulator needs to simulate networking devices (e.g., hosts and routers) and application programs that generate network traffic. It also needs to provide network utility programs to configure, monitor, and gather statistics about a simulated network. Therefore, developing a complete network simulator is a large effort. Due to limited development resources, traditional network simulators usually have the following drawbacks:

• Simulation results are not as convincing as those produced by real hardware and software equipment. In order to constrain their complexity and development cost, most existing network simulators can only simulate real-life network protocol implementations with limited detail, and this can lead to incorrect results. For example, OPNET's modeler product [1] uses a simplified finite state machine model to model complex TCP protocol processing. As another example, in the ns-2 [2] package, it is documented that "there is no dynamic receiver's advertised window for TCP."

* Corresponding author. E-mail address: shieyuan@csie.nctu.edu.tw (S.Y. Wang).

1389-1286/03/$ - see front matter © 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S1389-1286(03)00181-6

• These simulators are not extensible in the sense that they lack the standard UNIX POSIX application programming interface (API). As such, existing or to-be-developed real-life application programs cannot run normally to generate traffic for a simulated network. Instead, they must be rewritten to use the internal API provided by the simulator (if there is any) and be compiled with the simulator to form a single big and complex program. For example, since the ns-2 network simulator itself is a user-level program, there is no way to let another user-level application program "run" on top of it. As such, a real-life application program cannot run normally to generate traffic for a network simulated by ns-2.

To overcome these problems, Wang proposed a simulation methodology in [3,4] and used it to implement the Harvard network simulator. The Harvard network simulator has two desirable properties as follows. First, it uses the real-life UNIX TCP/IP protocol stack, real-life network application programs, and real-life network utility programs. As such, it can generate more accurate simulation results than a traditional TCP/IP network simulator that abstracts a lot away from a real-life TCP/IP implementation. Second, it lets the system default UNIX POSIX API (i.e., the standard UNIX system call interface) be provided on every node in a simulated network. Any real-life UNIX application program, either existing or to-be-developed, thus can run normally on any node in a simulated network to generate traffic. One important advantage of this property is that since an application program that is developed for simulation study is a real UNIX program, the program's simulation implementation can be its real implementation on a UNIX machine. As such, when the simulation study is finished, we can quickly implement the real system by reusing its simulation implementation.

Although the methodology proposed in [3,4] can provide the above two advantages, it has several limitations and drawbacks. To remove these problems, we enhanced the methodology, designed a new simulation engine architecture, and used these improvements to develop a new network simulator called "the NCTUns 1.0 network simulator." In the rest of the paper, we will present these enhancements as well as the features, components, design, and implementation of the NCTUns 1.0 network simulator. (For the sake of brevity, we will just call it the "NCTUns 1.0" in the rest of the paper.)

2. Related work

The predecessor of the NCTUns 1.0 is the Harvard network simulator [5], which was authored by Wang in 1999. Since its release in July 1999 and as of January 1, 2002, the Harvard network simulator had been downloaded by more than 2000 universities, research institutions, industrial research laboratories, and ISPs.

As feedback about using the Harvard network simulator gradually came back, it became clear that the Harvard network simulator has several limitations and drawbacks that need to be overcome. It also became clear that some useful features and functions needed to be implemented and added to it. For these reasons, Wang decided to develop the NCTUns 1.0.

In the literature, some approaches also use a real-life TCP/IP protocol stack to generate results [6–10]. However, unlike our approach, these approaches are used for emulation purposes, rather than for simulation purposes. Among these approaches, Dummynet [10] most resembles our simulator. Both Dummynet and our simulator use tunnel interfaces to use the real-life TCP/IP protocol stack on the simulation machine. However, there are some fundamental differences. Dummynet uses the real time, rather than the simulated network's virtual time. Thus the simulated link bandwidth is a function of the simulation speed and the total load on the simulation machine. As the number of simulated links increases, the highest link bandwidth that can be simulated decreases. Moreover, in Dummynet, routing tables are associated with incoming links rather than with nodes. As such, the simulator does not know how to route packets generated by a router, as they do not come from any link.

OPNET, REAL [11], ns-2, and SSFnet [12] represent the traditional network simulation approach. In this approach, the thread-supporting event scheduler, application programs that generate network traffic, utility programs that configure, monitor, or gather statistics about a simulated network, the TCP/IP protocol implementation on hosts, the IP protocol implementation on routers, and links are all compiled together to form a single user-level program. Due to the enormous complexity, such a simulator tends to be difficult to develop and verify. In addition, a simulator constructed using this approach cannot provide the UNIX POSIX API for real-life application programs to run normally to generate network traffic. Although some simulators may provide their own internal API, real-life application programs still need to be rewritten so that they can use the internal API, be compiled with the simulator successfully, and be concurrently executed with the simulator during simulation.

ENTRAPID [9] uses another approach. It uses the virtual machine concept [13] to provide multiple virtual kernels on a physical machine. Each virtual kernel is a process and simulates a node in a simulated network. The system calls issued by an application program are redirected to a virtual kernel. As such, the UNIX POSIX API can be provided by ENTRAPID and real-life application programs can run normally in separate address spaces. However, because the complex kernel needs to be ported to and implemented at the user level, many involved subsystems (e.g., the file, disk I/O, process scheduling, inter-process communication (IPC), and virtual memory subsystems) need to be modified extensively. As such, the porting effort is very large and the correctness of the ported system may need to be extensively verified.

3. High level architecture

The NCTUns 1.0 uses a distributed architecture to support remote simulations and concurrent simulations. It also uses an open-system architecture to enable protocol modules to be easily added to the simulator. Functionally, it can be divided into eight separate components, described below:

• The first component is the fully-integrated GUI environment by which a user can edit a network topology, configure the protocol modules used inside a network node, specify mobile nodes' moving paths, plot performance curves, play back animations of logged packet transfers, etc. From a network topology, the GUI program can generate a simulation job description file suite. Since the GUI program uses Internet TCP/IP sockets to communicate with other components, it can submit a job to a remote simulation machine for execution. When the simulation is finished, the simulation results and generated log files are transferred back to the GUI program. The user then can examine logged data, plot performance curves, play back packet transfer animations, etc.

While a simulation is running at the remote simulation machine, the user can query or set an object's value at any time. For example, the user may query or set the routing table of a router or the switch table of a switch at any time. If the user does not want to do any query or set operation during a simulation, the user can choose to disconnect from the currently running simulation so that he (she) can use the GUI program to handle other simulation cases. The user can later reconnect to a disconnected simulation at any time, whether it is still running or has finished. A user thus can submit many simulation jobs in a short period of time. This can increase simulation throughput if there are many simulation machines available to service these jobs concurrently.

• The second component is the simulation engine. A simulation engine is a user-level program. It functions like a small operating system. Through a defined API, it provides useful and basic simulation services to protocol modules (to be described soon). Such services include virtual clock maintenance, timer management, event scheduling, variable registrations, etc. The simulation engine needs to be compiled with various protocol modules to form a single user-level program, which we call the "simulation server". When executed to service a job, the simulation server takes a simulation job description file suite as its input, runs the simulation, and generates data and packet transfer log files as its output. When a simulation server is running, because it needs to use a lot of kernel resources, no other simulation server can be running at the same time.

• The third component is various protocol modules. A protocol module is like a layer of a protocol stack. It performs a specific protocol or function. For example, the ARP protocol or a FIFO queue is implemented as a protocol module. A protocol module is composed of a set of functions. It needs to be compiled with the simulation engine to form a simulation server. Inside the simulation server, multiple protocol modules can be linked into a chain to form a protocol stack.

• The fourth component is the simulation job dispatcher, which is a user-level program. It should be executed and remain alive all the time to manage multiple simulation machines. We use it to support concurrent simulations on multiple simulation machines. The job dispatcher can operate between a large number of GUI users and a large number of simulation machines. When a user submits a simulation job to the job dispatcher, the dispatcher will select an available simulation machine to service this job. If there is no available machine at this time, the submitted job can be queued in the dispatcher as a background job. Background jobs are managed by the dispatcher. Various scheduling policies can be used to schedule their service order.

• The fifth component is the coordinator, which is a user-level program. On every machine where a simulation server program resides, a coordinator program needs to be executed and remain alive. Its task is to let the job dispatcher know whether this machine is currently busy running a simulation or not. When executed, it immediately registers itself with the dispatcher to join the dispatcher's simulation machine farm. Later on, when its status (idle or busy) changes, it will notify the dispatcher of its new status. This enables the dispatcher to choose an available machine from its machine farm to service a job. When the coordinator receives a job from the dispatcher, it forks (executes) a simulation server to simulate the specified network and protocols. At certain times during a simulation, the coordinator may also fork (start) or kill (end) some real-life application programs, which are specified in the job, to generate traffic for the simulated network. Because the coordinator has the process IDs of these forked traffic generators, the coordinator passes these process IDs into the kernel to register these traffic generators with the kernel. From then on, all time-related system calls issued by these registered traffic generators will be performed based on the virtual time of the simulated network, rather than the real time.

When the simulation server is running, the coordinator communicates with the job dispatcher and the GUI program on behalf of the simulation server. For example, the simulation server periodically sends the current virtual time of the simulated network to the coordinator. The coordinator then forwards this information to the GUI program. This enables the GUI user to know the progress of the simulation. During a simulation, the user can also set or get an object's value on-line (e.g., to query or set a switch's switch table). Message exchanges between the simulation server and the GUI program are all done via the coordinator.

• The sixth component is the modifications that need to be made to the kernel of the simulation machine so that a simulation server can correctly run on it. For example, during a simulation, the timers of TCP connections used in the simulated network need to be triggered by the virtual time rather than by the real time.

• The seventh component is various protocol daemons (programs) running at the user level. Like the routing daemons "routed" or "gated" running on UNIX machines, which exchange routing messages and set up system routing tables, some protocol daemons can run at the user level to perform specific jobs when the NCTUns 1.0 is running to simulate a network. For example, the real-life routed (using the RIP routing protocol) or gated (using the OSPF routing protocol) daemons can run with the NCTUns 1.0 to set up the routing tables used by the routers in a simulated network.

• The last component is all real-life application programs running at the user level. As stated previously, any real-life user-level application program can run on a simulated network to generate network traffic, configure the network, monitor network traffic, etc. For example, the tcpdump program can run on a simulated network to capture packets flowing over a link, and the traceroute program can run on a simulated network to find out the routing path traversed by a packet.

Fig. 1 depicts the distributed architecture of the NCTUns 1.0. It shows that, due to the nature of the distributed architecture, simulation machines can be very far away from the machines where the GUI programs run. For example, the simulation service center may be at NCTU in Taiwan while the GUI users come from many different places around the world.

When the components of the NCTUns 1.0 are run on multiple machines to carry out simulation jobs, we say that the NCTUns 1.0 is operating in the "multiple-machine" mode. This mode can support remote simulations and concurrent simulations. These components can also run on the same machine to carry out simulation jobs. This mode is called the "single-machine" mode and is more suitable for a user who has only one machine. Due to the nature of the IPC design, the NCTUns 1.0 can be used in either mode without changing its program code. Only the mode parameter in its configuration file needs to be changed.

4. Design and implementation

4.1. Fully-integrated GUI environment

The NCTUns 1.0 has a fully-integrated GUI environment by which a user can easily perform simulation studies. The GUI program is composed of four main components. In the following, we will present each of them.

The first component is the topology editor, which is shown in Fig. 2. The topology editor provides a convenient and intuitive way to graphically construct a network topology, specify various parameters of network devices and protocols, and specify the application programs that will be run during simulation to generate traffic.

Fig. 1. The distributed architecture of the NCTUns 1.0.

Fig. 2. The topology editor of the NCTUns 1.0 network simulator.


A constructed network can be either a fixed wired network or a mobile wireless network.

The second component is the performance monitor, which is shown in Fig. 3. The performance monitor can easily and graphically display the plots of some monitored performance metrics such as a link's utilization or a TCP connection's achieved throughput.

The third component is the packet animation player, which is shown in Fig. 4. By using the packet animation player, a logged packet transfer trace can be graphically replayed at any speed. Both wired and wireless networks are supported. The network at the top of Fig. 4 is a fixed wired network. When the packet animation player starts, packets are represented as line segments with arrows flowing smoothly on the links. The network at the bottom is a mobile ad hoc network. When the player starts, a wireless transmission is represented by two circles centered at the transmitting node. These two circles represent the transmission and interference ranges of the wireless network interface. Their display time is proportional to the packet transmission time of this wireless transfer. The packet animation player is a very useful tool because it can help a researcher to visually debug the behaviors of a protocol. It is also very useful for educational purposes.

The last component is the node editor, which is shown in Fig. 5. A node in the NCTUns 1.0 represents a network device such as a switch or an IEEE 802.11(b) wireless LAN access point. The node editor provides a convenient environment to flexibly configure the protocol modules used inside a network node. By using this tool, a user can use the mouse to graphically add, delete, or replace a protocol module with his (her) own module. As such, the node editor enables a user to easily test the functionality and performance of a newly designed protocol. Fig. 5 shows the internal protocol stacks used by a router, which in this case has three network interface ports. In Fig. 5, each square box represents a protocol module. We see that each network interface port is configured with a chain of protocol modules (i.e., a protocol stack). The protocol modules supported by the NCTUns 1.0 are classified into different categories (e.g., MAC, PHY, Packet Scheduling, etc.). They are displayed at the top of the node editor.

Fig. 3. The performance monitor of the NCTUns 1.0 network simulator.

Fig. 4. The animation player of the NCTUns 1.0 network simulator.

4.2. The enhanced simulation methodology

The NCTUns 1.0 uses an enhanced simulation methodology, which enables it to be much more powerful and useful than the Harvard network simulator. The enhancements come from the desires to support multiple subnets in a simulated network, simulate various network devices operating at different layers, simulate various protocols, simulate various types of networks, support both broadcast and unicast transfer modes for application programs, let users use the familiar real-life IP address and port number scheme to specify the network parameters of application programs, etc. In summary, the goal of the enhanced simulation methodology is to allow users to simulate any desired network and operate it in exactly the same way as they operate a physical real network.

In the following, we present the design and implementation of the enhanced simulation methodology.

4.2.1. Tunnel network interface

The tunnel network interface is the key facility in the simulation methodology used. A tunnel network interface, available on most UNIX machines, is a pseudo network interface that does not have a real physical network attached to it. The functions of a tunnel network interface, from the kernel's point of view, are no different from those of an Ethernet network interface. A network application program can send out its packets to its destination host through a tunnel network interface or receive packets from a tunnel network interface, just as if these packets were sent to or received from a normal Ethernet interface.

Each tunnel interface has a corresponding device special file in the /dev directory. If an application program opens a tunnel interface's special file and writes a packet into it, the packet will enter the kernel. To the kernel, the packet appears to come from a real network and to have just been received. From then on, the packet will go through the kernel's TCP/IP protocol stack as an Ethernet packet would do. On the other hand, if the application program reads a packet from a tunnel interface's special file, the first packet in the tunnel interface's output queue in the kernel will be dequeued and copied to the application program. To the kernel, the packet appears to have been transmitted onto a link, and this pseudo transmission is no different from an Ethernet packet transmission.
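As a concrete illustration of this interface, the following minimal C sketch opens a tunnel device's special file, reads one packet out of its output queue, and writes it back in again. The device path /dev/tun0 and the fixed buffer size are assumptions for illustration; on a real system the interface would first be configured (e.g., with ifconfig) before packets flow through it.

```c
/* Minimal sketch: exchanging raw packets with a tunnel interface
 * through its special file.  The device path and buffer size are
 * illustrative assumptions, not the NCTUns code itself. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char buf[2048];
    ssize_t n;

    int fd = open("/dev/tun0", O_RDWR);   /* tunnel interface's special file */
    if (fd < 0) {
        perror("open /dev/tun0");
        return EXIT_FAILURE;
    }

    /* Reading dequeues the first packet from the tunnel interface's
     * output queue in the kernel, as if it had just been transmitted. */
    n = read(fd, buf, sizeof(buf));
    if (n > 0) {
        /* Writing the packet back makes the kernel treat it as a
         * packet that has just arrived from the network. */
        if (write(fd, buf, (size_t)n) < 0)
            perror("write");
    }

    close(fd);
    return EXIT_SUCCESS;
}
```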

4.2.2. Simulating single-hop networks

Using tunnel network interfaces, we can easily simulate the single-hop TCP/IP network depicted in Fig. 6(a), where a TCP sender application program running on host 1 is sending its TCP packets to a TCP receiver application program running on host 2. We set up the virtual simulated network by performing the following two steps. First, we configure the kernel routing table of the simulation machine so that tunnel network interface 1 is chosen as the outgoing interface for the TCP packets sent from host 1 to host 2, and tunnel network interface 2 is chosen for the TCP packets sent from host 2 to host 1. Second, for the two links to be simulated, we run a simulation server to simulate them. For the link from host i to host j (i = 1 or 2, j = 3 - i), the simulation server opens tunnel network interface i's and j's special files in /dev and then executes an endless loop until the simulated time elapses. In each step of this loop, it simulates a packet's transmission on the link from host i to host j by reading a packet from the special file of tunnel interface i, waiting the link's propagation delay time plus the packet's transmission time on the link, and then writing this packet to the special file of tunnel interface j.

Fig. 6. (a) A TCP/IP network to be simulated. (b) By using tunnel interfaces, only the two links need to be simulated. The complicated TCP/IP protocol stack need not be simulated. Instead, the real-life TCP/IP protocol stack is directly used in the simulation.
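A minimal sketch of this per-link loop is shown below, for one direction of the link. The real simulation server waits in virtual time by scheduling a timeout event; here the wait is approximated with a real-time sleep purely for clarity, and the device paths, bandwidth, and delay values are illustrative assumptions.

```c
/* Sketch of the per-link loop for the direction tun_i -> tun_j.
 * Real-time usleep() stands in for the simulator's virtual-time wait. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define LINK_DELAY_US   1000      /* propagation delay: 1 ms (assumed) */
#define LINK_BW_BPS     10000000  /* link bandwidth: 10 Mb/s (assumed) */

static void simulate_link(const char *dev_i, const char *dev_j)
{
    char pkt[2048];
    int in  = open(dev_i, O_RDWR);
    int out = open(dev_j, O_RDWR);
    if (in < 0 || out < 0) {
        perror("open tunnel device");
        return;
    }
    for (;;) {
        ssize_t n = read(in, pkt, sizeof(pkt));   /* packet leaving host i */
        if (n <= 0)
            break;
        /* propagation delay + transmission time (8*n bits / bandwidth) */
        useconds_t tx = (useconds_t)((8.0 * n / LINK_BW_BPS) * 1e6);
        usleep(LINK_DELAY_US + tx);
        if (write(out, pkt, (size_t)n) < 0)       /* packet arriving at host j */
            perror("write");
    }
    close(in);
    close(out);
}

int main(void)
{
    simulate_link("/dev/tun1", "/dev/tun2");  /* host 1 -> host 2 direction */
    return 0;
}
```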

After performing the above two steps, the virtual simulated network has been constructed. Fig. 6(b) depicts this simulation scheme. Since the trick of replacing a real link with a simulated link happens outside the kernel, the kernels on both hosts do not know that their packets actually are exchanged on a virtual simulated network. The TCP sender and receiver programs, which run on top of the kernels, of course do not know the fact either. As a result, all existing real-life network application programs can run on the simulated network, all existing real-life network utility programs can work on the simulated network, and the TCP/IP network protocol stack used in the simulation is the real-life working implementation, not just an abstract or a ported version of it.

Note that in this simulation methodology, the kernel of the simulation machine is shared by all nodes (hosts and routers) in a virtual simulated network. Therefore, although in Fig. 6(b) there are two TCP/IP protocol stacks depicted, actually they are the same one––the protocol stack of the single simulation machine.

4.2.3. Simulating multi-hop networks

The above simulation methodology can only simulate a network composed of two hosts that are directly connected by a full-duplex link. To simulate a multi-hop network composed of layer-1 hubs, layer-2 switches, and layer-3 routers, to allow multiple subnets to exist in a simulated network, and to let packets be routed automatically through routers as they are forwarded toward their destination nodes, we need to enhance the basic simulation methodology. In the following, we use Fig. 7 to illustrate the enhanced simulation methodology. Suppose that we want to simulate the network depicted in Fig. 7(a), which has two subnets. The first subnet is subnet 8 (its network address is 1.0.8.X) while the second subnet is subnet 9 (its network address is 1.0.9.X). A layer-3 router (i.e., router 1) connects these two subnets together and forwards packets between them. In subnet 9, a layer-2 switch (i.e., switch 1) connects to both router 1 and host 2 and switches packets between them. In the following, we define the schemes used in the NCTUns 1.0.

• Interface IP address scheme: In a simulated network, multiple subnets can exist. For each layer-3 or above network node (e.g., a host or a router), if it has multiple network interfaces, each one is simulated by a tunnel network interface. A tunnel network interface has an IP address assigned to it, just like a normal network interface does. Suppose that a tunnel interface connects to subnet A and its host number on this subnet is B; its IP address is then configured as 1.0.A.B in this scheme. (In the rest of the paper, we assume that IPv4 addresses are used to construct a simulated network.) Arbitrarily chosen, 1.0.X.X represents the network address of the whole simulated network. Using the common netmask of 255.255.255.0, a simulated network can have up to 255 subnets, each having up to 255 hosts or routers residing on it. This interface IP address scheme is the same as the standard IP address scheme used in real-life networks. If a tunnel interface is used in a simulation, its IP address needs to be configured. We can use the UNIX ifconfig program to do this task. For example, to configure tun1, we can use the "ifconfig tun1 1.0.8.1 netmask 255.255.255.0" command. Other tunnel interfaces used in Fig. 7(a) are configured in a similar way.


In the NCTUns 1.0, a layer-1 network node (e.g., a hub) or a layer-2 network node (e.g., a switch) does not have any IP address assigned to its interface ports. This is correct, as in real-life networks an IP address is used for addressing a layer-3 network interface. Note that although the familiar network mask of 255.255.255.0 is used as the network mask for a simulated network, it can be set to any valid value as well. In short, the IP address scheme used in this methodology is the same as that used in real-life networks.

Fig. 7(a) shows that tun1 is used by host 1 to connect to subnet 8, tun2 is used by router 1 to connect to subnet 8, tun3 is used by router 1 to connect to subnet 9, tun4 is used by host 2 to connect to subnet 9, and switch 1 does not have any IP address assigned to its interface ports. We see that each tunnel interface is configured with an IP address and a MAC address. These MAC addresses can be arbitrarily chosen as long as they are different on a subnet.

• Source-destination-pair IP address scheme: After assigning an IP address to each tunnel interface used in a simulated network, an application program running on a node can now send packets to an application program running on a different node. Assuming that the sending node has a tunnel interface whose assigned IP address is 1.0.A.B and the receiving node has a tunnel interface whose assigned IP address is 1.0.C.D, in this methodology the sending application program should use A.B.C.D as the destination IP address when sending packets to the receiving node. We call such addresses the "source-destination-pair" addresses. These addresses are not used by any interface in a simulated network. Instead, they are used by sending application programs to indicate their intended destination nodes. Using the source-destination-pair address scheme enables packets to be automatically forwarded through layer-3 routers in a simulated network. The details of the automatic routing scheme will be explained later.

Fig. 7. (a) An example network to be simulated. (b) The automatic routing scheme is used to automatically forward packets across layer-3 routers. (c) The simulation server participates in the simulation to simulate links and switches.

Although using source-destination-pair addresses to specify the address parameters of application programs is unnatural to simulator users, by using the fully-integrated GUI environment, a user need not know the concept and need not use the source-destination-pair address scheme at all. In the GUI program, the user can still use 1.0.C.D as the destination address when specifying the address parameters of application programs. On the simulation machine, the coordinator will automatically translate the destination address to A.B.C.D before launching these application programs.
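The translation the coordinator performs is simple to sketch: given the sending node's interface address 1.0.A.B and the user-specified destination 1.0.C.D, it forms A.B.C.D. The following small C example illustrates the idea; the function and type names are ours, not the NCTUns code.

```c
/* Sketch of forming a source-destination-pair address from the
 * sending node's interface address and the normal destination address. */
#include <stdio.h>

/* Each address is kept here as four bytes a.b.c.d. */
struct ipv4 { unsigned char b[4]; };

static struct ipv4 to_pair_addr(struct ipv4 src_if, struct ipv4 dst)
{
    struct ipv4 pair;
    pair.b[0] = src_if.b[2];   /* A: subnet number of the sending interface */
    pair.b[1] = src_if.b[3];   /* B: host number of the sending interface  */
    pair.b[2] = dst.b[2];      /* C: subnet number of the destination      */
    pair.b[3] = dst.b[3];      /* D: host number of the destination        */
    return pair;
}

int main(void)
{
    struct ipv4 tun1 = {{1, 0, 8, 1}};   /* host 1's interface, 1.0.8.1         */
    struct ipv4 dst  = {{1, 0, 9, 4}};   /* user-specified destination, 1.0.9.4 */
    struct ipv4 p    = to_pair_addr(tun1, dst);
    printf("%u.%u.%u.%u\n", p.b[0], p.b[1], p.b[2], p.b[3]);  /* prints 8.1.9.4 */
    return 0;
}
```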

• Automatic routing scheme: To let the simulation machine's kernel automatically route a packet through many layer-3 routers in a simulated network, we can properly configure the routing entries of the simulation machine's system routing table. The automatic routing design has two main advantages. First, we can use the real-life IP protocol stack of the simulation machine's kernel to forward packets in a simulated network. Simulation results thus can be more accurate. Second, we can reuse the system default routing scheme to add, delete, or change routing entries and look up the routing table. As such, we need not waste time and effort to re-implement the same scheme in the simulator. Note that although there may be many routers in a simulated network, they all share and use the same system routing table.

For example, in Fig. 7(b), several routing entries are added to the system routing table of the simulation machine (a compact sketch of these entries appears at the end of this subsection). When host 1 wants to send packets to host 2, it uses the 8.1.9.4 source-destination-pair address to look up the routing table. The found entry is [8.1.9.4 tun1 1.0.8.2]. This entry indicates that the packet needs to be sent through tun1 and that the gateway IP address to be used should be 1.0.8.2. The ARP module at the sending node then finds the MAC address used by 1.0.8.2 (by using the ARP request/reply protocol) and puts it (i.e., BB) in the MAC header of the packet as the destination MAC address. The MAC module at the sending node then sends out the completed MAC frame, which will then reach the interface whose assigned IP address is 1.0.8.2.

Note that the source-destination-pair address 8.1.9.4 is used only for looking up the routing table. After the corresponding routing entry is found, the source-destination-pair address 8.1.9.4 is no longer used. The destination IP address carried in the IP header of the packet is always 1.0.9.4. It remains the same from the source node to the destination node, no matter how many routers the packet needs to traverse. When the MAC frame arrives at router 1, its MAC header is stripped off by the MAC module at router 1. At the IP layer of the simulation machine's kernel protocol stack, the 1.0.9.4 address carried in the IP header is taken out and translated to the 9.3.9.4 source-destination-pair address for looking up the routing table. The reason why 9.3.9.4 is used is that 1.0.9.3 is one of router 1's IP addresses. Actually, because 1.0.8.2 is also one of router 1's IP addresses, the 8.2.9.4 source-destination-pair address can also be used. In Fig. 7(b), we see that both the [9.3.9.4 tun3] and [8.2.9.4 tun3] routing entries exist in the system routing table. As such, whether 1.0.9.4 is translated to 9.3.9.4 or 8.2.9.4, the found routing entry will indicate that the destination node (i.e., host 2) is already on the same subnet as router 1 (because there is no gateway IP address associated with this entry) and the packet should be sent out via tun3 directly to 1.0.9.4. The ARP module at router 1 then finds the MAC address used by 1.0.9.4 (i.e., DD) and puts it into the MAC header of the packet. The completed MAC frame is then sent out through tun3.


When the MAC frame arrives at switch 1, its destination MAC address is taken out by switch 1 for looking up the switch table. Because the found switch entry is [DD port2], which indicates that this MAC frame should be forwarded out via port 2, this MAC frame is forwarded out without modification via port 2 of switch 1. Note that the switch is simulated by the simulation server (which is compiled and linked with the switch protocol module). Unlike a layer-3 router, which is simulated by letting packets re-enter the kernel IP protocol stack, a layer-2 switch or a layer-1 hub is simulated internally inside the simulation server.

When the MAC frame arrives at host 2, its MAC header is stripped off. The destination IP address 1.0.9.4 is taken out and translated to the source-destination-pair address 9.4.9.4 before the kernel looks up the routing table. Because the first two numbers (9.4) are the same as the second two numbers (9.4) in the source-destination-pair address 9.4.9.4, the kernel knows that this packet has reached its final destination node and that there is therefore no need to look up the routing table. The kernel then delivers the packet to the TCP/UDP layer for further processing.
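To make the walk-through above easier to follow, the sketch below restates the host routes of Fig. 7(b) as a small table with a lookup on the source-destination-pair address. The table layout and names are illustrative only; in the real simulator these entries are installed into the simulation machine's system routing table and the kernel performs the lookup.

```c
/* Illustrative restatement of the routing entries used in Fig. 7(b). */
#include <stdio.h>
#include <string.h>

struct route_entry {
    const char *pair_addr;   /* source-destination-pair address     */
    const char *out_if;      /* outgoing tunnel interface           */
    const char *gateway;     /* next-hop interface IP, NULL if none */
};

static const struct route_entry rtable[] = {
    { "8.1.9.4", "tun1", "1.0.8.2" },  /* host 1  -> host 2, via router 1 */
    { "9.3.9.4", "tun3", NULL      },  /* router 1 -> host 2, same subnet */
    { "8.2.9.4", "tun3", NULL      },  /* router 1 -> host 2 (alt. pair)  */
};

static const struct route_entry *lookup(const char *pair_addr)
{
    for (size_t i = 0; i < sizeof(rtable) / sizeof(rtable[0]); i++)
        if (strcmp(rtable[i].pair_addr, pair_addr) == 0)
            return &rtable[i];
    return NULL;
}

int main(void)
{
    const struct route_entry *r = lookup("8.1.9.4");
    if (r)
        printf("send via %s, gateway %s\n", r->out_if,
               r->gateway ? r->gateway : "(directly attached)");
    return 0;
}
```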

4.3. Simulation engine

The NCTUns 1.0 is a network simulator, not a network emulator. As such, it can simulate networks with a very large number of links and nodes. Links with very high bandwidth can also be simulated. As a simulator, when simulating a network, the simulation engine needs to maintain a virtual clock for the simulated network. Simulation events are triggered and executed based on the virtual clock, rather than the real clock.

The virtual clock in the simulation engine is maintained by a counter. The time unit represented by one tick of the counter can be set to any value (e.g., one nanosecond) to simulate high-speed links. The current virtual time thus is the current value of the counter times the time unit used. The simulation engine uses the discrete-event simulation method to advance its virtual clock. During simulation, the counter is continuously advanced to the timestamp of the event to be processed next.
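The clock-advancing rule can be sketched in a few lines of C. The event structure and the tiny stand-in for the event heap below are illustrative; only the way the virtual clock jumps from one event timestamp to the next reflects the description above.

```c
/* Minimal sketch of a discrete-event virtual clock (1 tick = 1 ns assumed). */
#include <stdint.h>
#include <stdio.h>

struct event {
    uint64_t timestamp;              /* expiration time, in ticks */
    const char *name;
};

static uint64_t virtual_clock;       /* current virtual time, in ticks */

/* A tiny stand-in for the simulation engine's event heap. */
static struct event heap[] = {
    { 1000, "packet arrives at PHY" },
    { 2500, "TCP retransmit timer"  },
    { 7000, "link becomes idle"     },
};
static size_t next_ev;

static struct event *extract_min(void)
{
    return next_ev < sizeof(heap) / sizeof(heap[0]) ? &heap[next_ev++] : NULL;
}

int main(void)
{
    struct event *ev;
    while ((ev = extract_min()) != NULL) {
        virtual_clock = ev->timestamp;   /* jump straight to the next event */
        printf("t = %llu ns: %s\n",
               (unsigned long long)virtual_clock, ev->name);
    }
    return 0;
}
```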

The simulation engine needs to pass the current virtual time down into the kernel. This is required for many purposes. First, the timers of TCP connections used in the simulated network need to be triggered by the virtual time rather than by the real time. Second, for those application programs launched to generate traffic in the simulated network, the system calls issued by them must be performed based on the virtual time rather than the real time. For example, if we launch a ping program in a simulated network to send out a ping request every 1 s, the sleep(1) system call issued by the ping program must be triggered by the virtual time, not the real time. Third, the in-kernel packet logging mechanism (i.e., the Berkeley packet filter (BPF) scheme used by tcpdump) needs to use timestamps based on the virtual time, rather than the real time, to log packets transferred in a simulated network.

The simulation engine needs to pass the current virtual time to the kernel in a low-cost and fine-grained way. The simulation engine could pass the current virtual time into the kernel by periodically making a system call. (For example, the simulation engine could make the system call once every 1 ms in virtual time.) However, the cost of this approach would be too high if we want the virtual time maintained in the kernel to be as precise as that maintained in the simulation engine. For example, the in-kernel packet logging mechanism needs a microsecond-resolution clock to generate timestamps. To solve this problem, the simulation engine uses a memory-mapping technique. The simulation engine maps the memory location that stores the current virtual time in the simulation engine to a memory location in the kernel. As such, at any time the virtual time in the kernel is as precise as that maintained in the simulation engine without any system call overhead.
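At the user level, such a shared location could be set up roughly as sketched below. The device name /dev/simtime and the page layout are assumptions for illustration, not the actual NCTUns kernel interface.

```c
/* Sketch of sharing the virtual time with the kernel through mmap().
 * /dev/simtime is a hypothetical device exported by a modified kernel. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/simtime", O_RDWR);     /* assumed kernel-exported device */
    if (fd < 0) {
        perror("open /dev/simtime");
        return 1;
    }
    volatile uint64_t *vtime = mmap(NULL, sizeof(uint64_t),
                                    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (vtime == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* Advancing the virtual clock is now a plain store; the kernel's
     * copy is updated at the same time because the page is shared. */
    *vtime = 123456789ULL;                     /* e.g., 123456789 ticks */

    munmap((void *)vtime, sizeof(uint64_t));
    close(fd);
    return 0;
}
```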

4.4. Protocol modules

Protocol modules are compiled and linked with the simulation engine to simulate layer-2 and below devices, protocols, and transmission medium. Although the automatic routing scheme enables the simulation machine's kernel to use its layer-3 and above TCP/IP protocol stack to forward packets, layer-2 and below devices, protocols, and transmission medium are not simulated when this scheme is used. As such, the simulation server (i.e., the simulation engine plus protocol modules) needs to simulate the transmission medium and all layer-2 and below protocols and devices. For example, Fig. 7(c) shows that, to simulate the network depicted in Fig. 7(a), the simulation server needs to simulate link 1, link 2, link 3, and switch 1. (It does not need to simulate host 1, host 2, and router 1 because they are "simulated" by using the automatic routing scheme.)

Layer-2 and below devices, protocols, and transmission medium are simulated as protocol modules. Several protocol modules may be chained together to form a protocol stack. A layer-3 interface (i.e., a tunnel interface) uses such a protocol stack to simulate its layer-2 and below processing. For example, a layer-3 interface normally has the following protocol modules. First, an ARP module is required to find the MAC address used by an IP address (i.e., the destination IP address of an outbound packet). Second, a packet scheduling and buffer management (PSBM) module is required for storing and scheduling outbound packets. (The simplest one is a FIFO queue.) Third, a Medium Access Control (e.g., 802.3 or 802.11) module is required for controlling when to send a packet onto the link. Lastly, a physical layer (PHY) module is required to simulate the characteristics of the transmission medium (e.g., delay, bandwidth, Bit-Error-Rate, etc.). These modules are chained together. When a layer-3 interface sends out a packet onto a link, the packet will be passed down module-by-module to the PHY module. In the other direction, when the PHY module of a layer-3 interface receives a packet, the packet will be passed up module-by-module to the layer-3 interface if a lower-layer module does not discard it (e.g., to simulate bit errors).
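A module chain of this kind can be sketched with a small structure holding pointers to its neighbours and to send/receive handlers, as below. The structure layout and function names are illustrative and do not reproduce the actual NCTUns module API.

```c
/* Sketch of a protocol-module chain for one layer-3 interface. */
#include <stdio.h>

struct packet { int len; };

struct module {
    const char *name;
    struct module *lower, *upper;                    /* neighbours in the chain */
    void (*send)(struct module *, struct packet *);  /* outbound direction */
    void (*recv)(struct module *, struct packet *);  /* inbound direction  */
};

static void generic_send(struct module *m, struct packet *p)
{
    printf("%s: packet (%d bytes) going down\n", m->name, p->len);
    if (m->lower)
        m->lower->send(m->lower, p);     /* pass down towards the PHY module */
}

static void generic_recv(struct module *m, struct packet *p)
{
    printf("%s: packet (%d bytes) going up\n", m->name, p->len);
    if (m->upper)
        m->upper->recv(m->upper, p);     /* pass up towards the layer-3 interface */
}

int main(void)
{
    struct module arp  = { "ARP",   NULL, NULL, generic_send, generic_recv };
    struct module fifo = { "FIFO",  NULL, NULL, generic_send, generic_recv };
    struct module mac  = { "802.3", NULL, NULL, generic_send, generic_recv };
    struct module phy  = { "PHY",   NULL, NULL, generic_send, generic_recv };

    /* Chain: ARP -> FIFO (PSBM) -> 802.3 (MAC) -> PHY */
    arp.lower = &fifo; fifo.upper = &arp;
    fifo.lower = &mac; mac.upper = &fifo;
    mac.lower = &phy;  phy.upper = &mac;

    struct packet p = { 1500 };
    arp.send(&arp, &p);   /* an outbound packet read from the tunnel interface */
    return 0;
}
```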

Although by default each layer-3 interface (i.e., tunnel interface) has an output queue (FIFO) associated with it inside the kernel, the NCTUns 1.0 does not use it. Instead, whenever the kernel enqueues a packet into a tunnel interface's output queue, a notification event is immediately passed to the simulation server, which enables the simulation server to immediately dequeue the packet and read it out from the kernel. This operation takes no time in virtual time because the simulator's virtual clock is stopped during this period.

The simulation server then passes the packet to the ARP module associated with this tunnel interface, which in turn passes the packet down to the PSBM module below it. At the PSBM module, any sophisticated PSBM scheme can be used. This design enables a host or a router to use various sophisticated PSBM schemes for its ports. For example, a router's first port can use a PSBM module that implements the Round-Robin scheme while its second port can use a PSBM module that implements the FIFO scheme. Another advantage of this design is that a PSBM module developed for layer-2 switches can be readily used for layer-3 routers. No extra time and effort are needed.

As an example, Fig. 8(b) shows how the simulation server simulates the network depicted in Fig. 8(a). Suppose that the TCP sender sends a packet to the TCP receiver. On host 1, the packet will pass through the TCP/IP protocol stack and be enqueued into the output queue of tun1. The simulation server will immediately dequeue it and read it out from the kernel. The simulation server then delivers it to the protocol stack created for tun1. The packet then passes the ARP module, the PSBM module, the 802.3 module, and finally reaches the PHY module of this protocol stack.

Before being delivered to the other end of the link, the packet needs to wait a certain amount of time to simulate the delay of link 1 and its packet transmission time on link 1. While waiting, it is stored as a timeout event in the simulation engine's event heap. When the packet's timer expires, the simulation server delivers the packet to the protocol stack created for tun2 by moving the packet to the PHY module of the second protocol stack. The packet is then passed up and reaches the 802.3 module. At the 802.3 module, the packet's destination MAC address is checked against tun2's MAC address to see whether this packet should be accepted or discarded. If the packet should be accepted, it is passed to the PSBM module. The PSBM module simply passes the packet to the ARP module because its PSBM functions are for outbound packets, not for inbound packets. When the ARP module receives the packet, because the ARP protocol is for outbound packets only, it simply writes the packet into the kernel. The packet then passes through the TCP/IP protocol stack and finally reaches the TCP receiver.

To enable the user-level simulation server to quickly detect that the kernel has enqueued a packet into a tunnel interface's output queue, a memory-mapping technique similar to that used for passing the current virtual time down into the kernel is used. In the kernel, a bit-map is used to record the empty or non-empty status of every tunnel interface's output queue. The memory location that stores this bit-map in the kernel is mapped to a memory location in the simulation server. By using this technique, the simulation server can immediately detect that a packet has been enqueued into a tunnel interface's output queue without any system call overhead.
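The bit-map check itself is trivial once the mapping is in place, as the following sketch shows. The bit layout and the array size are assumptions, and the mapping setup (analogous to the virtual-clock mapping above) is omitted.

```c
/* Sketch of testing the shared bit-map of tunnel output-queue status. */
#include <stdint.h>
#include <stdio.h>

#define MAX_TUN 256

/* In the real system this array lives in a page shared with the kernel. */
static volatile uint8_t tun_bitmap[MAX_TUN / 8];

static int tun_has_packet(int tun_id)
{
    return (tun_bitmap[tun_id / 8] >> (tun_id % 8)) & 1;
}

int main(void)
{
    tun_bitmap[0] = 0x02;        /* pretend the kernel marked tun1 non-empty */
    for (int i = 0; i < MAX_TUN; i++)
        if (tun_has_packet(i))
            printf("tun%d has a packet waiting to be dequeued\n", i);
    return 0;
}
```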

4.5. Kernel modifications

Some parts of the simulation machine's kernel need to be modified. In the following, we present some important kernel modifications.

4.5.1. IP address translation

The use of the source-destination-pair IP address scheme enables the kernel to automatically forward a packet toward its destination node. However, when a simulator user specifies an application program's destination IP address parameter, he (she) should be able to use the normal IP address scheme for this task. For example, in Fig. 7(a), the destination IP address parameter given to the TCP sender should be 1.0.9.4, rather than 8.1.9.4. The internally used and unnatural source-destination-pair IP address scheme should be hidden from the user. The user need not know how we use the source-destination-pair address to automatically route packets.

Internally, the kernel needs to perform the address translation on each node that is on the path from the source node to the destination node, and use the translated IP address to look up the routing table. However, to translate the address, the kernel first needs to know the identity of the current node. That is, when a packet is forwarded to and enters node i, the kernel should know that the identity of the current node is i. After obtaining this information, the kernel can look up the interface table associated with node i and pick up an interface IP address to perform the translation. (The kernel keeps an interface table for each node, which records the IP addresses used by this node.) As an example, in Fig. 7(a), when a packet is forwarded to node 2, the kernel can pick up an IP address (say 1.0.9.3) from node 2's interface table and translate 1.0.9.4 to 9.3.9.4 before looking up the routing table. (Note: the kernel could pick up 1.0.8.2 and translate 1.0.9.4 to 8.2.9.4 as well. The reason has been explained in Section 4.2.3.)

To pass the current node identity to the kernel when a packet arrives at a node, the simulation server, after simulating the packet's transmission on a link, can put the identity of the destination node of the link (e.g., i) into the packet's header before writing it into the kernel.

Fig. 8. (a) A network to be simulated. (b) Using the simulation server to simulate the network depicted in (a).


Although the above method seems to work successfully for all nodes on the path, it actually cannot work successfully for the source node. For a non-source node, before a packet enters it, the packet must be transmitted on a link. As such, the simulation server knows the identity of the destination node of this link. However, for the source node, since the packet does not come from any link, this information is unavailable and thus cannot be provided by the simulation server.

We solve this problem by explicitly telling the kernel the current node identity when an application program is launched. Since in the NCTUns 1.0 every application program is launched by the coordinator, the coordinator is designed to issue a system call to the kernel before launching an application program. The system call passes the identity of the node on which the application program is intended to run into the kernel. The kernel then stores this information in one of its variables. Very soon afterwards, when the application program is launched, the kernel will store this information in the control block of the launched process. From now on, every packet generated by this application program can carry this information in its header when it is sent down from the socket layer to the IP layer. This solves the address translation problem on the source node.
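The launch sequence can be sketched as follows: tell the kernel which node the next process belongs to, then fork and exec the program. The set_current_node() call below merely stands in for the added system call and is purely illustrative; it is not a standard UNIX interface.

```c
/* Sketch of the coordinator's launch sequence for a traffic generator. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the NCTUns-added system call (hypothetical). */
static int set_current_node(int node_id)
{
    printf("(would tell the kernel: next launched process runs on node %d)\n",
           node_id);
    return 0;
}

static void launch_on_node(int node_id, char *const argv[])
{
    set_current_node(node_id);      /* kernel remembers the node identity ... */
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);      /* ... and copies it into this process's
                                           control block when it is created */
        _exit(127);
    }
    waitpid(pid, NULL, 0);
}

int main(void)
{
    char *cmd[] = { "ping", "-c", "1", "1.0.9.4", NULL };  /* illustrative traffic generator */
    launch_on_node(1, cmd);
    return 0;
}
```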

4.5.2. Port number translation

An inherent problem with the proposed simulation methodology is that application programs cannot bind to the same port in a simulated network, even though they are running on different nodes in the simulated network. The reason is that, since these application programs are running on a single machine (i.e., the simulation machine), they cannot choose the same port to bind to. In real-life networks, however, this is possible and should be allowed. For example, in a network, there may be a Web server binding to port 80 on every host and a RIP routing daemon binding to port 520 on every router.

From an application program's viewpoint, it does not matter which port it uses as long as it can use the port to communicate with its partners. As such, when multiple application programs running on different nodes want to bind to the same port in a simulated network, a network simulator user can solve this problem by letting them choose different port numbers to bind to. Although this solution works and does not affect the simulation result, it makes a simulated network unnatural to the simulator user, which should be avoided. A better solution is that these application programs are still allowed to bind to the same port when they are launched; however, the kernel internally translates the port numbers used by them to different port numbers to avoid port number collisions.

To achieve this goal, the kernel maintains a bit-map to record which port numbers have been used and which have not. During a simulation, suppose that an application program (say A) running on node i wants to bind to port number j; the kernel will find an unused port number (say k) and instead let application program A bind to port number k. The kernel then creates an association (nodeID = i, real port num = j, remapped port num = k) and inserts it into a hash table.

With this arrangement, if an application program (say B) wants to send packets to application program A, application program B can use the port number originally used by application program A (i.e., j) as the destination port number. Application program B need not know the port number translation details. The simulated network looks like a real network to it.

The port number translation process occurs at the destination node(s), not at the source node. When application program B sends a packet to application program A, before the packet reaches the destination node, the destination port number carried in the packet remains j, not k. Only after the packet reaches the destination node is its destination port number translated to k. Finding k is achieved by searching the hash table using the key pair (i, j), where j is readily available from the packet header. As for the value of i (the current node identity), the kernel can obtain this information by using the method described in Section 4.5.1.
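The bookkeeping behind this translation can be sketched as follows. A real implementation lives in the kernel and uses a hash table; the linear table, names, and the choice of the first unused port below are simplifications for illustration.

```c
/* Sketch of the (nodeID, real port, remapped port) associations. */
#include <stdio.h>

struct port_map {
    int node_id;        /* i: node on which the program runs      */
    int real_port;      /* j: port the program asked to bind to   */
    int remapped_port;  /* k: port actually bound on the machine  */
};

static struct port_map table[256];
static int entries;
static int next_free_port = 10000;          /* first unused port (assumed) */

/* Called when a program on node_id binds to real_port. */
static int remap_bind(int node_id, int real_port)
{
    int k = next_free_port++;
    table[entries++] = (struct port_map){ node_id, real_port, k };
    return k;                               /* the port actually bound */
}

/* Called at the destination node: translate the key pair (i, j) back to k. */
static int lookup_remapped(int node_id, int real_port)
{
    for (int n = 0; n < entries; n++)
        if (table[n].node_id == node_id && table[n].real_port == real_port)
            return table[n].remapped_port;
    return -1;
}

int main(void)
{
    remap_bind(1, 80);                      /* web server on node 1 binds to 80 */
    remap_bind(2, 80);                      /* web server on node 2 binds to 80 */
    printf("packets for node 2 port 80 are delivered to port %d\n",
           lookup_remapped(2, 80));
    return 0;
}
```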

Translating the port number at the destination node(s), not at the source node, has two advantages. The first advantage is that it supports broadcast transfers on a subnet. If the translation were performed at the source node, only unicast transfers could be supported. Broadcasting a packet on a subnet to multiple application programs that bind to the same port but run on different machines (e.g., the routing daemon case) would be impossible. At present, we have not investigated how to support multicast transfers. The second advantage is that we can use the tcpdump program to correctly filter and capture packets in a simulated network. The tcpdump program can use port numbers to filter and capture packets. When a user wants to capture the packets sent from application program A to B, naturally he (she) will set the filtering destination port number to j. If we translated the port number at the source node, the destination port number carried in the packet would be k while it is traversing the network. This would make the tcpdump program unable to capture this packet.

4.5.3. Process scheduling

We modified the default UNIX process scheduler so that the processes of the simulation server and all launched traffic generators can be scheduled in a controlled way. The default UNIX process scheduler uses a priority-based dynamic scheme to schedule processes. As such, the order in which the simulation server and traffic generator processes are scheduled cannot be precisely controlled. Also, the CPU cycles allocated to each of these processes cannot be guaranteed. This may result in a potential problem. For example, after getting control of the CPU, the simulation server may use the CPU too long before releasing it to the traffic generators. Because the simulation server is responsible for advancing the virtual clock while it is executing, if it monopolizes the CPU for too long, no network traffic can be generated during this long period of time, which should not occur. To avoid this potential problem, we modified the default UNIX process scheduler so that the simulation server and all traffic generator processes are explicitly scheduled according to the timestamp order of their events.

4.6. System functions

In addition to simulating network devices and protocols, to be a useful piece of software, the NCTUns 1.0 provides many useful system functions. In the following, we present two of them.

4.6.1. Per-node command console shell

For each node in a simulated network, we provide a command console. A GUI user can easily invoke a node's command console by right-clicking the node's icon in the topology editor. Immediately, a terminal window (like the X terminal window) will appear and automatically log into the (possibly remote) simulation machine. On the simulation machine, a shell program is then executed to process the real-life UNIX commands that may be typed in by the GUI user.

The command console is a very useful feature. During a simulation, in a node's command console, a user can launch application programs or execute UNIX commands at run time, just like he (she) is operating in a real-life network node's command console. For example, a user can run the "netstat" command to get the packet transfer statistics of an interface. The user can run the "traceroute" command to see the routing path between any pair of nodes in the simulated network. This is useful for quickly checking the routing paths generated by routing daemons. The user can also run the "tcpdump" command to monitor the packets flowing on an interface. Actually, any real-life command can be executed in the command console. The user can immediately get the output of these commands without waiting until the simulation is finished.

To make a command console totally natural to the user, we modified the system default shell program so that the user will not see anything inconsistent. The modification handles interface name conversion and filtering. On a real-life UNIX machine, a user may execute the "ifconfig" command to check the settings of an (or all) interface(s). The output is useful as it includes the names assigned to the interfaces. (For example, the first Intel EtherExpress Ethernet interface is assigned the name fxp0, the second is assigned the name fxp1, etc.) Knowing an interface's name is important as some utility programs need this information. For example, if we run the tcpdump program to monitor the packets flowing on an interface, we need to know the interface's name and give it as a parameter to the tcpdump program.

If the default shell program is not modified, when the user uses the "ifconfig -a" command to see all interfaces used by this node, he (she) will see all the tunnel interfaces used by the simulation machine and will not know which tunnel interfaces are internally used for the interfaces of this node. As such, the shell program needs to perform two tasks. The first task is to filter out unrelated output, and the second task is to convert interface names between tunXXX and fxpXXX, where XXX represents a number.

For example, suppose that 256 tunnel interfaces (tun0, tun1, ..., tun255) are used by the simulation machine to simulate a network, and among them, tun1, tun8, and tun9 are internally used to simulate the three interfaces used by a node in the simulated network. Suppose that in the topology editor, these three interfaces are given the names fxp0, fxp1, and fxp2, respectively. Now in the node's command console, if the user executes the ifconfig -a command, what he (she) should see is the settings of fxp0, fxp1, and fxp2, rather than the settings of tun0, tun1, ..., and tun255. The shell program needs to internally convert tun1 to fxp0, tun8 to fxp1, and tun9 to fxp2 before displaying the command's output. It also needs to filter out the settings of all other tunnel interfaces before displaying the output. To the user, the names of the interfaces used by this node are fxp0, fxp1, and fxp2. He (she) should be able to use any of these interface names (fxpXXX) as a parameter for any real-life command or program.

To achieve this goal, the interface name conversion and filtering operations must be performed for both the input and output of the shell program. For example, after the user finds that the node has three interfaces named fxp0, fxp1, and fxp2, he (she) may decide to execute the tcpdump program to monitor the packets flowing on fxp2. (The exact command is "tcpdump -i fxp2.") Before launching the tcpdump command, the shell program needs to intercept this command string and convert fxp2 back to tun9 so that the internally-launched tcpdump command will be "tcpdump -i tun9" rather than "tcpdump -i fxp2."
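The name conversion itself amounts to a small mapping between the user-visible fxpN names and the tunnel interfaces used internally, as sketched below with the mapping values of the example above; everything else about the sketch is illustrative.

```c
/* Sketch of converting a user-visible interface name back to the
 * tunnel interface used internally (fxp0 -> tun1, fxp1 -> tun8, fxp2 -> tun9). */
#include <stdio.h>
#include <string.h>

struct if_map { const char *user_name; const char *real_name; };

static const struct if_map node_ifaces[] = {
    { "fxp0", "tun1" },
    { "fxp1", "tun8" },
    { "fxp2", "tun9" },
};

static const char *to_real_name(const char *user_name)
{
    for (size_t i = 0; i < sizeof(node_ifaces) / sizeof(node_ifaces[0]); i++)
        if (strcmp(node_ifaces[i].user_name, user_name) == 0)
            return node_ifaces[i].real_name;
    return user_name;   /* unknown names are passed through unchanged */
}

int main(void)
{
    /* "tcpdump -i fxp2" typed by the user becomes "tcpdump -i tun9". */
    printf("tcpdump -i %s\n", to_real_name("fxp2"));
    return 0;
}
```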

To intercept both the input and output of the shell program, we fork a process and insert it between the shell process and the system terminal device driver. This process acts as a relaying process. All input to and output from the shell process must be relayed by this process. As such, it has a chance to perform its tasks. This process actually performs more tasks than those described here. This is because a command string may contain the shell I/O redirection (i.e., >) and pipe (i.e., |) operators. The interface name conversion and filtering operations must still be handled properly in such cases.
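The sketch below outlines how such a relaying process can be set up with standard UNIX primitives: the shell is forked with its standard input and output redirected to pipes, and the relay copies data between the terminal and the shell, invoking the (omitted) translation and filtering routines on the way. This is only a simplified illustration under our own assumptions, not the actual NCTUns implementation; in particular, a production version would more likely use a pseudo-terminal than plain pipes.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/select.h>

    /* Placeholder hooks; the real work (fxpN <-> tunN renaming and
     * output filtering) would happen here.                           */
    static void rewrite_input(char *buf, ssize_t *len)  { (void)buf; (void)len; }
    static void rewrite_output(char *buf, ssize_t *len) { (void)buf; (void)len; }

    int main(void)
    {
        int in_pipe[2], out_pipe[2];     /* relay -> shell, shell -> relay */
        char buf[4096];

        if (pipe(in_pipe) < 0 || pipe(out_pipe) < 0) { perror("pipe"); exit(1); }

        if (fork() == 0) {               /* child: the command console shell */
            dup2(in_pipe[0], STDIN_FILENO);
            dup2(out_pipe[1], STDOUT_FILENO);
            close(in_pipe[1]); close(out_pipe[0]);
            execlp("sh", "sh", "-i", (char *)NULL);
            _exit(127);
        }

        close(in_pipe[0]); close(out_pipe[1]);
        int maxfd = out_pipe[0] > STDIN_FILENO ? out_pipe[0] : STDIN_FILENO;

        for (;;) {                       /* parent: relay between terminal and shell */
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(STDIN_FILENO, &rfds);
            FD_SET(out_pipe[0], &rfds);
            if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0) break;

            if (FD_ISSET(STDIN_FILENO, &rfds)) {       /* user -> shell */
                ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
                if (n <= 0) break;
                rewrite_input(buf, &n);                /* e.g., fxp2 -> tun9 */
                write(in_pipe[1], buf, n);
            }
            if (FD_ISSET(out_pipe[0], &rfds)) {        /* shell -> user */
                ssize_t n = read(out_pipe[0], buf, sizeof(buf));
                if (n <= 0) break;
                rewrite_output(buf, &n);               /* e.g., tun8 -> fxp1 */
                write(STDOUT_FILENO, buf, n);
            }
        }
        return 0;
    }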

The command console shell needs to perform two other tasks, which are also performed by the coordinator. First, before launching an application program, the shell needs to pass the current node identity into the kernel. (The reason is explained in Section 4.5.1.) Second, after launching an application program, the shell needs to register the forked process with the kernel. (The reason is explained in Section 4.5.)

4.6.2. Tcpdump packet filtering and capturing tool

The tcpdump program is a packet filtering and capturing tool. It is a user-level program that can pass filtering rules to the kernel and display captured packets. Packet filtering operations are actually performed by the BPF module in the kernel. When a packet is sent or received at an interface, the device driver of the interface passes the packet to the BPF module for evaluation. If the BPF module decides to accept this packet, it will associate a timestamp with the packet. The module gives each captured incoming packet a timestamp to record when it is received by the interface. The module also gives each captured outgoing packet a timestamp to record when it is transmitted onto a link.

The tcpdump program operates on an interface. Since, from the kernel's viewpoint, a tunnel interface is no different from a real interface, the tcpdump program should be able to work correctly to capture packets flowing on a node's interface in a simulated network. In the current design, however, some modifications are needed to let the tcpdump program generate correct output.

In Section 4.4, we show that in the current design, the packets that should be sent through a tunnel interface are no longer queued in the output queue of the tunnel interface. Instead, they are immediately dequeued by the simulation server as soon as they are enqueued into the tunnel interface's output queue. The only place where they may be queued (and delayed) is in the PSBM module associated with this tunnel interface, which is in the simulation server. This causes a problem as now the timestamps given by the tunnel interface's device driver to these packets are incorrect. On a real-life machine, the timestamp given to an outgoing packet represents the time when the packet is transmitted to a link rather than the time when the packet is enqueued into the output queue. However, in the current design, if there is no modification, a packet will receive a timestamp that represents the time when it leaves the tunnel interface (or enters the PSBM module; the two are the same), rather than when it is transmitted to a link. To solve this problem, we disabled the part of the tunnel interface device driver that is responsible for passing each outgoing and incoming packet to the BPF module. We also developed a tcpdump module and inserted it between the MAC and PHY modules. This tcpdump module cannot hold any packet. When receiving a packet from the MAC module (if it is an outbound packet) or from the PHY module (if it is an inbound packet), the tcpdump module makes a copy of the packet, gives it a special tag, and associates it with the current timestamp. The tcpdump module then writes the copy into the kernel through the tunnel interface that the user-level tcpdump program is currently operating on. The tunnel interface's device driver, when seeing this special tag, passes the packet to the BPF module for evaluation. If the BPF module decides to accept this packet, it then passes this packet to the user-level tcpdump program.
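Since the MAC, PHY, and tcpdump modules all live in the user-level simulation server, the core of the tcpdump module amounts to copying the packet, marking it, and writing it to the tunnel special file. The fragment below is a rough sketch of that idea only; the tag layout, the packet descriptor, and the helper names are invented for illustration and do not reflect the actual NCTUns data structures, and the bookkeeping that associates the copy with the current virtual timestamp is omitted.

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical packet descriptor used inside the simulation server. */
    struct sim_pkt {
        const uint8_t *data;   /* raw IP packet  */
        size_t         len;    /* packet length  */
    };

    /* Hypothetical 4-byte marker prepended to the copy; the modified
     * tunnel device driver would recognize it and divert the packet to
     * the BPF module only.                                              */
    static const uint8_t TCPDUMP_TAG[4] = { 0xde, 0xad, 0xca, 0xfe };

    /* Called for every packet passing between the MAC and PHY modules.
     * tun_fd is an open descriptor for the tunnel special file (e.g.,
     * /dev/tun4) that the user-level tcpdump program is attached to.    */
    static void tcpdump_module_tap(int tun_fd, const struct sim_pkt *p)
    {
        uint8_t copy[sizeof(TCPDUMP_TAG) + 2048];

        if (p->len > sizeof(copy) - sizeof(TCPDUMP_TAG))
            return;                              /* too large; skip */

        memcpy(copy, TCPDUMP_TAG, sizeof(TCPDUMP_TAG));
        memcpy(copy + sizeof(TCPDUMP_TAG), p->data, p->len);

        /* Write the tagged copy back into the kernel through the tunnel
         * interface; the original packet continues through the protocol
         * module pipeline untouched.                                     */
        (void)write(tun_fd, copy, sizeof(TCPDUMP_TAG) + p->len);
    }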

As an example, suppose that during a simulation a user wants to monitor the traffic of a node's interface and this interface is internally simulated by tun4. In the node's command console, the user will execute the user-level tcpdump program and the command console shell will internally translate this command string to "tcpdump -i tun4." From now on, a user-level tcpdump program is running and monitoring packets on tun4. At the same time, the shell also asks the simulation server to insert a tcpdump module between the MAC and PHY modules that are associated with tun4. From now on, when receiving a packet (either incoming or outgoing), the inserted tcpdump module will make a copy of this packet, give it a special tag, and associate it with the current timestamp. The module will then write it into the special file of tun4 (i.e., /dev/tun4). After entering tun4, due to the special tag, the copy of the packet will be passed to the BPF module for evaluation. If accepted, the copy will be received by the tcpdump program operating on tun4 at the user level.
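At the user level, the capture side of this example can be reproduced with any libpcap-based program; the short program below, which simply opens tun4 and prints the timestamp and length of every packet the BPF module accepts, is a minimal stand-in for the real tcpdump program (the interface name tun4 follows the example above).

    #include <pcap.h>
    #include <stdio.h>

    static void got_packet(u_char *user, const struct pcap_pkthdr *h,
                           const u_char *bytes)
    {
        (void)user; (void)bytes;
        /* h->ts is the timestamp given by the BPF module. */
        printf("%ld.%06ld  captured %u bytes\n",
               (long)h->ts.tv_sec, (long)h->ts.tv_usec, h->caplen);
    }

    int main(void)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *p = pcap_open_live("tun4", 65535, 0, 1000, errbuf);

        if (p == NULL) {
            fprintf(stderr, "pcap_open_live: %s\n", errbuf);
            return 1;
        }
        pcap_loop(p, -1, got_packet, NULL);   /* capture until interrupted */
        pcap_close(p);
        return 0;
    }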

The above design has two advantages. First, we can fully exploit the power of the tcpdump program. Any feature of the real-life tcpdump program can be used. Second, we reuse the user-level tcpdump program and in-kernel BPF module code to the maximum extent. They need not be modified at all. We only need to disable a very small part of the tunnel device driver code and write a very simple tcpdump module.

5. Scalability issues

Because in our scheme a single UNIX machine is used to simulate a whole network (including nodes' protocol stacks, traffic generators, etc.), the scalability of the simulator is a concern. In the following, we discuss several scalability issues.

5.1. Number of nodes

Because our scheme simulates multiple routers and hosts by letting packets re-enter the simulation machine's kernel, there is no limitation on the maximum number of routers and hosts that can be simulated in a network. For hubs and switches, since they are simulated in the simulation server, there is no limitation on them either.

5.2. Number of interfaces

In our scheme, because each layer-3 interface uses a tunnel interface, the maximum number of layer-3 interfaces that can be simulated is limited by the maximum number of tunnel interfaces that a BSD UNIX system can support, which currently is 256. (This limitation is caused by UNIX using an 8-bit integer as a device's identity.) Since this problem can be easily solved (for example, we can clone tunnel interfaces, give the cloned interfaces different names, and use them in the same way as we use the original tunnel interfaces), there is no limitation on the maximum number of layer-3 interfaces that can be simulated in a network. For layer-1 and layer-2 interfaces (used in hubs and switches, respectively), since they are simulated in the simulation server, there is no limitation on them either.
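For reference, each tunnel interface is accessed through a character special file, and moving an IP packet into or out of the kernel is an ordinary read or write on that file. The fragment below sketches this for /dev/tun0 on a BSD system; error handling is minimal, the surrounding simulation-server logic is omitted, and the sketch assumes the default raw-IP tun mode (on some systems a four-byte address-family header may precede each packet, depending on the tun device flags).

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char pkt[2048];
        int fd = open("/dev/tun0", O_RDWR);  /* tun0 must exist and be configured */

        if (fd < 0) { perror("open /dev/tun0"); return 1; }

        /* A read returns one IP packet that the kernel routed out through
         * tun0; writing a buffer injects an IP packet as if it had just
         * arrived on tun0.                                                */
        ssize_t n = read(fd, pkt, sizeof(pkt));
        if (n > 0) {
            printf("read a %zd-byte IP packet from tun0\n", n);
            (void)write(fd, pkt, (size_t)n);  /* loop it straight back in */
        }
        close(fd);
        return 0;
    }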

5.3. Number of routing entries

In our scheme, the kernel routing table needs to store many source-destination-pair IP addresses so that packets can be automatically forwarded across routers by the kernel. Since the kernel routing table is used only by routers, if a simulated network has only one subnet (and thus has no router), the kernel routing table need not be used and can be empty.

The NCTUns 1.0 supports the "subnet" concept. Therefore, the more efficient subnet-routing scheme can be used instead of the less efficient host-routing scheme. For example, for the network depicted in Fig. 7(a), instead of storing the two routing entries [8.1.9.3 tun1 1.0.8.2] and [8.1.9.4 tun1 1.0.8.2] in the kernel routing table, we can store only one routing entry [8.1.9 tun1 1.0.8.2] in the table. Because the subnet-routing scheme can be used, suppose that in a simulated network there are S different subnets and on average there are H hosts residing on a subnet; the number of source-destination-pair routing entries that need to be stored in the kernel routing table would be about S × H × S. As an example, suppose that S is 30 and H is 20; the number of required routing entries would be about 18,000.

We have tested several network configurations that need to store over 60,000 routing entries in the kernel routing table. We found that because the BSD UNIX systems use the radix tree [14] to efficiently store and look up routing entries, using a large number of routing entries in a simulation is feasible and does not slow down the simulation speed much.

5.4. Number of application programs

Since application programs running on a UNIX simulation machine are all real independent programs, the simulation machine's physical memory requirement would be proportional to the number of application programs running on top of it. Although, at first glance, this requirement may seem severe and may greatly limit the maximum number of application programs that can simultaneously run on a UNIX machine, we found that the virtual memory mechanism provided on a UNIX machine, together with the "working set" property of a running program, greatly alleviates the problem. The reason is that, when an application program is running, only a small portion of its code related to network processing needs to be present in the physical memory. In addition, because UNIX machines support the use of shared libraries and shared virtual memory pages, the required memory space for running the same application program multiple times can be greatly reduced.

6. Simulator performance

Here we report the simulation speed of the NCTUns 1.0 under several network and traffic configurations. The machine used for performance testing is an IBM A31 notebook computer equipped with a 1.6 GHz Pentium processor and 128 MB of RAM.

6.1. Variable CBR UDP on a single-hop network case

In this test suite, the network topology is a single-hop network in which a sending host and a receiving host are connected together by a link. The bandwidth of the link is set to 10 Mbps and the delay is set to 10 ms, in both directions. The traffic generated is a one-way constant-bit-rate (CBR) UDP packet stream. Each UDP packet size is set to 576 bytes.

We varied the packet inter-arrival time of the CBR packet stream to see how the simulator's speed will change when it needs to process more events in each simulated second. The packet inter-arrival time is the time interval between two successive packet transmissions. The tested intervals are 0.001, 0.005, 0.025, 0.125, 0.625, and 3.125 s, respectively.
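Such a CBR traffic source can be approximated by a few lines of socket code. The following sketch is our own illustration, not the NCTUns traffic generator: it sends 576-byte UDP datagrams to a fixed destination at a configurable inter-arrival time (the destination address and port are placeholders, and whether the 576 bytes include headers is not specified here; the sketch treats it as the payload size).

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        const double interval = 0.025;        /* seconds between packets */
        char payload[576];
        struct sockaddr_in dst;
        int s = socket(AF_INET, SOCK_DGRAM, 0);

        if (s < 0)
            return 1;
        memset(payload, 0, sizeof(payload));
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(9000);                    /* placeholder port        */
        inet_pton(AF_INET, "1.0.1.2", &dst.sin_addr);  /* placeholder destination */

        for (;;) {
            sendto(s, payload, sizeof(payload), 0,
                   (struct sockaddr *)&dst, sizeof(dst));
            usleep((useconds_t)(interval * 1e6));      /* CBR pacing */
        }
        return 0;                                      /* not reached */
    }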

The performance metric reported is the ratio of the simulated seconds to the elapsed seconds for running the simulation. In all of our tests, the simulated seconds is set to 999 s. A simulation case with a higher ratio means that it can be finished more quickly than a case with a lower ratio. A simulation case with a ratio of 1 means that it needs the same amount of time in real time to finish simulating the amount of time that it wants to simulate.

Fig. 9 shows the ratio vs. CBR packet inter-arrival time performance plot. We see that due to the discrete-event simulation engine design, a simulation case can be finished very quickly if it does not have many events to process (i.e., traffic) per simulated second. We also see that the ratio (2.5) of the high-load case (0.001 s) is still greater than 1. This means that the simulator can still run 2.5 times faster than the real world under this load on the testing machine.

6.2. Multiple greedy TCP connections on a single-hop network case

Since during a simulation the applications that are run to generate traffic are real-world programs, we are interested to see how the simulator's speed will change if more applications need to be run at the same time. To perform this test, we used the network configuration depicted in Fig. 10.

In this configuration, there are six source nodes, one destination node, and a bottleneck router node. The bandwidth and delay of all links are set to 10 Mbps and 10 ms, respectively. The maximum packet queue length of the FIFO queue in the bottleneck router is the default 50 packets. Between a source node and the destination node, we can set up a greedy TCP connection by running the stcp program on the source node and the rtcp program on the destination node. The length of TCP data packets is 1500 bytes, which is Ethernet's MTU.
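A greedy TCP source of the kind stcp implements can be written in a few lines: it connects to the receiver and writes as fast as the socket allows, letting TCP's congestion and flow control pace it. The sketch below is our own illustration; the real stcp and rtcp programs shipped with NCTUns may behave differently, and the destination address and port shown are placeholders.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[1448];                  /* roughly one Ethernet-MTU TCP segment */
        struct sockaddr_in dst;
        int s = socket(AF_INET, SOCK_STREAM, 0);

        memset(buf, 'x', sizeof(buf));
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(8000);                    /* placeholder port        */
        inet_pton(AF_INET, "1.0.7.2", &dst.sin_addr);  /* placeholder destination */

        if (connect(s, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
            perror("connect");
            return 1;
        }
        for (;;) {                       /* greedy: write as fast as TCP allows */
            if (write(s, buf, sizeof(buf)) < 0) {
                perror("write");
                break;
            }
        }
        close(s);
        return 0;
    }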

In this test, we varied the number of greedy TCP connections that are set up to compete for the bottleneck link's bandwidth. The numbers tested are 1, 2, 3, 4, 5, and 6, respectively. In all of these cases, the bottleneck link's bandwidth is always 100% utilized.

Fig. 11 shows that the simulator's speed does not degrade as more applications (stcp and rtcp) are run to generate traffic. This phenomenon can be explained because the number of events that need to be processed per simulated second remains about the same. No matter how many greedy TCP connections (their stcp and rtcp programs) are launched to send and receive their data, the aggregate amount of data that can be pumped into the bottleneck link or received from the bottleneck link per simulated second is always fixed to the 10 Mbps rate. As such, the stcp and rtcp programs of the six greedy TCP connections will take turns to run, and their aggregate context-switching rate is about the same as in the single greedy TCP connection case.

Fig. 9. The simulation performance under various CBR UDP traffic load. (A higher ratio means a better performance.) [Plot: CBR packet inter-arrival time (s) on the x-axis vs. simulated time (999 s)/elapsed time on the y-axis; data points: (0.001, 2.5), (0.005, 12), (0.025, 45), (0.125, 111), (0.625, 143), (3.125, 166).]

Fig. 10. The multi-source-node network topology used to test whether the simulation performance will degrade when more applications need to be run to generate traffic.

6.3. Fixed CBR UDP on multi-hop networks case

For a discrete-event simulation engine, the more events it needs to process in each simulated second, the slower its simulation speed will be. In Section 6.1, we showed that on a single-hop network, if we decrease the CBR packet stream's packet inter-arrival time, the simulator becomes slower. Here we show that, given a fixed CBR packet inter-arrival time, if packets need to go through more hops, the simulator will also become slower.
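The relationship between event density and simulation speed can be seen directly from the structure of a discrete-event loop. The skeleton below is a generic illustration, not the NCTUns simulation engine: the wall-clock cost of simulating one second is essentially the number of events scheduled within that simulated second times the cost of processing one event.

    #include <stdio.h>
    #include <stdlib.h>

    struct event {
        double        time;              /* virtual (simulated) time  */
        void        (*handler)(void *);  /* work to do at that time   */
        void         *arg;
        struct event *next;
    };

    static struct event *event_list;     /* kept sorted by time */
    static double        now;            /* current virtual time */

    static void schedule(double t, void (*h)(void *), void *arg)
    {
        struct event *e = malloc(sizeof(*e)), **p = &event_list;
        e->time = t; e->handler = h; e->arg = arg;
        while (*p && (*p)->time <= t)    /* insert in time order */
            p = &(*p)->next;
        e->next = *p; *p = e;
    }

    static void run(double end_time)
    {
        while (event_list && event_list->time <= end_time) {
            struct event *e = event_list;
            event_list = e->next;
            now = e->time;               /* advance the virtual clock   */
            e->handler(e->arg);          /* elapsed real time grows     */
            free(e);                     /* with the number of events   */
        }
    }

    static void tick(void *arg)
    {
        (void)arg;
        schedule(now + 0.025, tick, NULL);  /* e.g., one CBR packet every 0.025 s */
    }

    int main(void)
    {
        schedule(0.0, tick, NULL);
        run(999.0);                      /* simulate 999 virtual seconds */
        printf("finished at virtual time %.3f\n", now);
        return 0;
    }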

The network configurations used are multi-hop chain networks shown in Fig. 12. The node on the left hand side is the source host while the node on the right hand side is the destination host. The bandwidth and delay of all links are set to 10 Mbps and 10 ms, respectively. The packet inter-arrival time of the CBR UDP packet stream is set to 0.025 s and the packet length of each UDP packet is set to 576 bytes.

We performed two suites of performance tests. In the first suite, all of the intermediate forwarding nodes are routers. In the second suite, all of them are switches. We made these two cases to observe how much more costly a router is in forwarding a packet than a switch.

Fig. 13 shows the performance of these two suites. We see that as the number of hops increases, the simulation performance decreases. This phenomenon is reasonable because, in such a case, the number of events that need to be processed in each simulated second increases.

We also see that a router is more costly than a switch in forwarding a packet. This phenomenon can be explained as follows. When simulating a router's forwarding of a packet, the simulation engine must let the packet re-enter the simulation machine's kernel through a tunnel interface and traverse the kernel protocol stack, whereas a switch's forwarding of a packet is handled entirely within the simulation server.

Fig. 11. The simulation performance remains about the same for different numbers of application programs running as traffic generators. [Plot: number of greedy TCP connections on the x-axis vs. simulated time (999 s)/elapsed time on the y-axis.]

Fig. 12. The multi-hop networks used to test the simulation performance. The source host is on the left while the destination host is on the right. Intermediate forwarding nodes may be routers or switches.

Fig. 13. The simulation performance decreases as the number of hops increases. The cost of forwarding a packet by a switch is less than that of forwarding a packet by a router. [Plot: number of hops (router or switch in between, CBR = 0.025 s) on the x-axis vs. simulated time (999 s)/elapsed time on the y-axis; curves: Router, Switch.]
