Resilient routing overlay network construction with Super-Relay nodes.
Overlay routing has emerged as a promising approach to improve reliability and efficiency of the Internet. The key to overlay routing is the placement and maintenance of the overlay infrastructure, especially, the selection and placement of key relay nodes. Spurred by the observation that a few relay nodes with high betweenness centrality can provide more optimal routes for a large number of node pairs, we propose a resilient routing overlay network construction method by introducing Super-Relay nodes. In detail, we present the K-Minimum Spanning Tree with Super-Relay nodes algorithm (SR-KMST), in which we focus on the selection and connection of Super-Relay nodes to optimize the routing quality in a resilient and scalable manner. For the simultaneous path failures between the default physical path and the overlay backup path, we also address the selection of recovery path. The objective is to select a proper one-hop recovery path with minimum cost in path probing and measurement. Simulations based on a real ISP network and a synthetic Internet topology show that our approach can provide high-quality overlay routing service, while achieving good robustness.
Keywords: overlay topology, Super-Relay nodes, betweenness centrality, resilient routing, one-hop path recovery
A preliminary version of this paper appeared in IEEE Symposium on Computers and Communication (ISCC), 1-6, 2014. This version includes a recovery path selection algorithm and corresponding simulation results and concrete analysis. This work was jointly supported by: (1) the National Basic Research Program of China (No. 2013CB329102); (2) National Natural Science Foundation of China (No. 61271019, 61101119, 61121001, 61072057, 60902051); (3) PCSIRT (No. IRT1049); (4) MICINN (No.TIN2010-19077); (5) CAM (No.S2009TIC-1692); (5) Foundation of Ludong University in China (No. LB2016019, No. LB2016017).
Overlay network has recently emerged as an effective way to support new applications without any changes in the Internet infrastructure. Examples include peer-to-peer (P2P) , content delivery network (CDN) , application-layer multicast , routing overlay network , and service overlay network , etc. In overlay networks, routing overlays have become increasingly a promising approach to improve the reliability and efficiency of the Internet. On the one hand, overlays should be resilient to IP-layer link failures and congestion. On the other hand, overlays should select alternative paths with as low delay as possible to improve the efficiency. To some extent, good network performance depends on a reasonable topology design. The inefficient overlay topology can increase the overhead traffic and limit the performance gain from the overlay routing technology.
An overlay network is formed by a subset of underlying physical nodes. The connections between the overlay nodes are provided by overlay links, each of which is usually composed of one or more physical links. Most previous work on overlay networks can be categorized into two broad categories : P2P overlays and infrastructure overlays. A P2P overlay network is a highly dynamic environment governed by the churn of peer nodes. The peer nodes in a P2P overlay are logically connected together without any support from the network provider. An infrastructure overlay network is deployed and maintained by the third party, which uses fixed nodes distributed in the Internet to facilitate overlay services. Compared with P2P overlays, infrastructure overlays have much better connectivity, higher persistence and availability. Moreover, infrastructure overlays are more effective in fully realizing the potential benefits of overlay routing. Our proposed routing overlay network belongs to this category, which is more suitable for applications requiring resilient and high-performance data delivery, such as A/V multicast, video conference, and multi-path routing, etc.
In order to construct an infrastructure overlay, two problems need to be solved: i). what kind of nodes is selected as overlay relay nodes. ii). How these overlay nodes are connected into the overlay network to meet the overlay routing criteria: reliability, efficiency and scalability. Previous work on the overlay routing just addresses one of the two problems. For example the researches  focus on selecting good relay nodes. They assume that the relay nodes are already deployed. Although other researches  study the overlay node placement problem only for one-hop overlay routing, the selection of relay nodes is ignored. Our proposed scheme in this paper not only considers the selection of relay nodes, but also pays attention to the connection of overlay nodes to provide overlay paths with good quality. Our idea comes from the observation that only a few nodes with high betweenness centrality are repeatedly present in many overlay paths . We call these nodes Super-Relay nodes in this paper. Our objective is to optimize the routing overlay architecture by selecting a small set of Super-Relay nodes and then connecting them properly.
Our approach is significantly different from the previous literatures. In our proposed K-Minimum Spanning Tree with Super-Relay nodes algorithm (SR-KMST), the overlay nodes are composed of Ordinary-Relay nodes and Super-Relay nodes. All overlay nodes are connected into the K-Minimum Spanning Tree (KMST) , where Super-Relay nodes form a full mesh structure. The KMST can ensure the overlay path diversity to promote robust and tolerance. The full mesh structure of Super-Relay nodes is helpful to provide the shortest routing paths. In addition, the full mesh structure consisting of a small number of Super-Relay nodes, rather than all overlay nodes, is beneficial for the scalability of overlay networks. In fact, the technology of super relay nodes is often applied in other fields. For example, the reference  proposes a new method to overcome the mobile edge computing (MEC) system failure by introducing ad-hoc relay nodes. The authors in the reference  refer to some of cloud data servers as super-peers to resolve the problem of link failure recovery in future fifth generation (5G) mobile ad-hoc networks. Specifically, each super-peer in the reference  is designed only as a proxy, by which mobile devices may access the cloud system. Each Super-Relay node in our proposed method is not only a terminal node, but also a routing node, which is able to calculate routing and forward packets.
In our proposed algorithm, although the recovery rate can be improved when IP-layer links suffer from failures or performance degradation, it cannot reach 100%. We cannot ensure that each overlay path is independent completely from the default physical path, because two paths that are disjointed at the overlay layer may share some physical links. As a result, one physical link failure may cause the failure of the default physical path and the overlay backup path simultaneously. This fact motivates us to address the selection of the recovery path in SR-KMST, through which the traffic is rerouted between the source and destination node when the default physical path and the overlay backup path suffer from simultaneous failures.
Restoration methods can be classified as reactive or proactive . In a reactive method, backup paths are not identified before failures happen, and the search for an alternative path is initiated when the existing path fails. In a proactive method, at least one backup path is reserved when the primary path is established. Both reactive and proactive methods can be link-based or path-based. The link-based approach locally reroutes traffic around the failed component, while the path-based method reroutes traffic through a backup path between the source and destination node. Reactive restoration is more suitable for the application-layer overlay over IP network. In addition, reactive restoration does well in the situation when the default physical path and the overlay backup path suffer from simultaneous failures. In this paper, the recovery path routing is a path-based reactive method.
Real-time performance is one of the measurement indexes of reactive restoration. When the default physical path and the overlay backup path suffer from simultaneous failures, it is not feasible to continuously detect all possible alternatives. Because it brings about too much overhead and typically incurs a latency that would not be compatible with the requirements of real-time applications. Conversely, randomly selecting a relay node, while clearly lightweight, can often result in poor choices. Based on this intuition, we introduce a simple and effective one-hop path restoration method, which selects the relay node from the Super-Relay nodes set. We call this method One-Hop overlay Path Recovery model (OHPR). In OHPR, the relay node is selected from the Super-Relay nodes set whose size is much smaller than the size of the overlay network. Moreover, owing to the merit that Super-Relay nodes can provide more optimal routes for a large number of node pairs, OHPR achieves better routing performance.
In this paper, we carry out the simulations and compare SR-KMST and OHPR with some classical overlay algorithms. The simulation results show that our proposed algorithm outperforms other existing approaches.
The rest of the paper is organized as follows. In Section II, we introduce the related work. Our proposed SR-KMST algorithm and OHPR approach are described in Section III. In Section IV, we present the simulation results and analyze the performance of different overlay topology construction methods. Finally, the paper is concluded in Section V.
2. Related Work
There has been considerable research on routing overlay network to optimize and enable applications over the Internet. Resilient Overlay Network (RON)  uses the overlay routing to quickly detect and recover path outages and performance degradation. Due to its full mesh architecture (FM), RON requires that each node actively monitor all the other nodes and broadcast a full copy of its link state table. Its computational complexity is o([|[V.sub.o]|.sup.2]), where |[V.sub.o]| is the number of overlay nodes. So, FM has the poor scalability. KMST  focuses on a distributed algorithm to compute the K minimum spanning trees for minimizing the state maintenance and overlay link performance measurement overhead, which has minimal overlaps between overlay links. Mesh-Tree topology (MT)  is proposed to enhance the resilience of the overlay multicast, which is formed by joining grandchild-grandparent or uncle-nephew relationship links into the Minimum Spanning Tree (MST) overlay topology. Adjacent Connection topology (AC)  uses the knowledge of IP-layer topology for the overlay network construction based on the following method: if no overlay node is directly mapped to the nodes on the IP-layer path between any pair of overlay nodes, there will be an overlay link connecting these two overlay nodes.
Some heuristic approaches have been proposed to construct the overlay topology. In , the authors propose two heuristic methods to construct overlay topologies: Topology-aware K Minimum joint-Spanning Trees (TKMST) and Topology-aware K Random Connection (KRC). TKMST and KRC take the degree constraint to each overlay node. It is well known that the problem of the degree-constrained minimum spanning tree is a NP-hard problem. These two heuristic algorithms have computational complexities of o([|[V.sub.u]|.sup.3]), where |[V.sub.u]| refers to the number of the physical network nodes. In , the authors address three traffic demand-aware heuristic overlay topology construction algorithms by considering the node degree constraint. But they assume that the traffic demand passing through overlay nodes is invariant, which is usually impractical.
In addition, Service Overlay Network (SON) topology construction problem is considered in  and . The authors take SON as an optimization problem, whose objective is to minimize the economic cost while meeting the bandwidth and delay constraints. In  and , the authors construct Service Overlay Network to support service composition, which facilitates the flexible creation of services and the resource provision for QoS. In our earlier work , we proposed an open multi-plane framework for Next Generation Service Overlay Network (NGSON), in which different functional overlays can systematically be coordinated with each other. The problem of dynamic overlay network reconfiguration is addressed in , its aims is to find the optimal reconfiguration policies that can meet the time-varying communication requirements while minimizing the total overlay network cost.
From the perspective of the constructing process, our SR-KMST is similar to KMST. But we consider the effect of Super-Relay nodes on the performance of overlay routing paths. SR-KMST outperforms KMST in terms of the routing performance and the routing overhead.
On the other hand, considerable researches have been proposed to address the issue of backup path selection in the overlay networks. In , the authors develop an overlay backup path model based on a correlated link failure probability. They assume that the joint overlay link failure probabilities are known. However, obtaining these probabilities consume significant resources. The authors in  introduce a heuristic approach based the earliest-divergence rule in AS-level topology to select a backup path, but it is difficult to obtain the accurate AS-level topology information . The proposal in  employs the Grid Quorum System to reduce the end-to-end measurement cost for the one-hop backup path selection, which is restricted to a full-mesh overlay network. The existing literatures (e.g. [23-25]) are mostly proactive methods.
Our work in this paper is an extension of our earlier work  appeared in IEEE Symposium on Computers and Communication (ISCC2014), where we increase a recovery path selection algorithm called OHPR and the corresponding simulation results and concrete analysis.
OHPR employs one-hop source routing to reactively recover from path failures, which is similar to scalable one-hop source routing (SOSR) . SOSR first proposes a one-hop source routing approach to mitigate Internet path failures. SOSR randomly picks up K candidate relay nodes and then chooses the best one to form the one-hop overlay path. Clearly, when K is large, such a system performs well, but it may bring about the heavy-weight measurement cost. When K is small, on the other hand, random selection may potentially rule out the good relay nodes. In comparison with SOSR, OHPR selects the intermediary relay node from a small set of Super-Relay nodes.
3. K-Minimum Spanning Tree overlay routing topology with super-relay nodes
3.1 Problem Formulation
Assume that a graph [G.sub.u] ([V.sub.u], [E.sub.u]) describes an IP-layer network topology, where [V.sub.u] and [E.sub.u] are the set of nodes and the set of links in the physical network respectively. And [V.sub.o] [subset] [V.sub.u] is the set of nodes in the overlay network. N and M are the IP-layer topology size and the overlay network size respectively. [P.sup.mn.sub.ij] is an IP-layer link indicator, where [P.sup.mn.sub.ij] if the IP-layer path between m and n includes the IP-layer link ij; otherwise [P.sup.mn.sub.ij] = 0. [Q.sup.xy.sub.mn] is an overlay link indicator, where [Q.sup.xy.sub.mn] if there exists an overlay path from x to y passing through the overlay link mn, [Q.sup.xy.sub.mn] otherwise. Let [D.sub.ij] be the delay of the link ij in a physical network, the overlay topology construction problem can be formalized as follows:
Find an overlay topology [G.sub.o] ([V.sub.o], [E.sub.o]) (where [E.sub.o] is the set of overlay links) that minimizes the cost function:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1)
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (2)
The objective function in Eq. (1) minimizes the network delay. Constraint (2) states that for each overlay path (x,y), there is no repeated link (i,j) on the corresponding physical path.
3.2 Foundation of Selecting Super-Relay Nodes
From the problem formulation above, the objective of the overlay construction problem is to minimize the end-to-end delays of routing paths between all source and destination pairs. References  indicate that only a few nodes with high betweenness centrality can be repeatedly present in many overlay paths. In other words, a small number of relay nodes can provide optimal routes to a large portion of end-to-end pairs. We call these nodes Super-Relay nodes.
Betweenness centrality  of a node v is the sum of the fraction of all-pairs shortest paths that pass through v:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3)
where V is the set of nodes, [[sigma].sub.st] denotes the number of shortest paths from s to t, and for any v [member of] V, [[sigma].sub.st] (v) is the number of shortest paths from s to t that go through v.
As a result, selecting and connecting the Super-Relay nodes properly can improve the performance of overlay networks. Fig. 1 plots the betweenness centralities of all nodes in CN070  (depicted in detail in Section IV), where in x-axis node IDs are sorted by their betweenness centralities in a decreasing order. From this figure, we can obtain that only a few nodes have extremely high betweenness centralities. Here the circle line and the star line refer to the betweenness centralities of nodes in the unweighted topology and the corresponding weighted topology of CN070 respectively. In the weighted network, the link bandwidth (available bandwidth) is assigned according to a uniform distribution in the range [40, 240] Mb/s, and the weight of a edge is set as the reciprocal of its bandwidth. Assigning different weights to links can generate different network topologies. In Fig. 1, the betweenness centrality of each node in the weighted topology of CN070 is the average value after assigning the link weight 2000 times. From this figure, we can observe that the betweenness centralities of the nodes in the unweighted network and the corresponding weighted network have almost the same distribution. This implies that the good relay nodes can be selected from the unweighted network. Moreover, because the link weight in the weighted network changes dynamically (namely the available bandwidth of the link changes dynamically), it is simple and convenient to compute the beweenness centralities of nodes in the unweighted network. In our proposed SR-KMST, the Super-Relay nodes are selected from the unweighted network according to their betweenness centralities.
[FIGURE 1 OMITTED]
3.3 SR-KMST Algorithm
Based on the analysis above, we use the K-Minimum Spanning Tree with Super-Relay nodes algorithm (SR-KMST) to construct an overlay network, which is composed of Super-Relay nodes and Ordinary-Relay nodes. Super-Relay nodes refer to a certain number of relay nodes whose betweenness centralities are high. Super-Relay nodes are selected from the unweighted physical network directly according to their betweenness centralities. Ordinary-Relay nodes are selected based solely on the requirements of end-systems or users (e.g. the geographical positions of nodes). Therefore, Super-Relay nodes are extracted from the physical network, while Ordinary-Relay nodes are mapped to some physical nodes based on the requirements of end-systems or users. Super-Relay nodes play a crucial role in improving the routing performance of the overlay networks. In Fig. 2, A, B, E, F, G and I are the Ordinary-Relay nodes, while C, D and H are the Super-Relay nodes. The number of Super-Relay nodes to be selected depends on the size of the physical network. The simulation results (in Section IV) show that selecting about 5%-10% of the total nodes as Super-Relay ones can achieve good performance.
In the process of constructing an overlay network by SR-KMST, all overlay nodes are connected into K-minimum spanning trees. If the full-mesh connection is not formed among all Super-Relay nodes, we connect them into the full-mesh structure. In the K-minimum spanning trees, K can be set to different values based on different cost-performance tradeoffs and node degree constraints (how to set the value of K will be studied in our future work). The weight of an overlay link is defined as the number of hops of the IP-layer path that the overlay link passes through.
[FIGURE 2 OMITTED]
As is well-known, a minimum spanning tree is the lowest cost tree among all candidate trees that connect all nodes. We construct K trees (K-trees) to ensure the existence of K disjoin overlay paths between any two overlay nodes, to promote fault- and performance-tolerance, and to enable path diversity. SR-KMST has the advantage of the minimum spanning tree and the K-trees. In addition, Super-Relay nodes play a crucial role in minimizing the routing paths and reducing the overlay link measurement overhead due to their full-mesh structure.
Fig. 2 shows 2-minimum spanning tree (2-MST) overlay topology with Super-Relay nodes, in which the solid lines and dashed lines in the overlay network belong to two least disjoint minimum spanning trees, and the nodes C, D and H are the Super-Relay nodes, which together compose the overlay topology. Because the full-mesh structure is not formed among the nodes C, D and H in the 2-MST topology, we add an edge between C and H. The detailed description of SR-KMST algorithm is as follows.
Algorithm 1 SR-KMST algorithm Let [G.sub.u] ([V.sub.u], [E.sub.u]) represent an unweighted network corresponding to the weighted network [G.sub.u] ([V.sub.u], [E.sub.u]). [V.sub.OR] is the Ordinary-Relay nodes set. [V.sub.SR] is the Super-Relay nodes set. [V.sub.SR] is the number of selected Super-Relay nodes. MST(G) refers to the edges of the minimum weighted spanning tree of graph G. If graph G is not connected, MST(G) would correspond to a forest of disconnected components. Input: [G'.sub.u] ([V.sub.u], [E'.sub.u]) and [V.sub.OR]. Output: overlay topology [G.sub.o]. 1: compute the betweenness centralities BCs of all the nodes in ([V.sub.u], [E'.sub.u]). 2: select [N.sub.r] nodes as Super-Relay nodes according to the descending order of BCs, and obtain [V.sub.SR]. 3: connect [N.sub.o] = |[V.sub.OR] [union] [V.sub.SR]| overlay nodes into a temporary full-mesh topology [G.sub.TO]. 4: initialize each overlay link's weight in [G.sub.TO] as the number of hops of the corresponding IP path. 5: compute K-minimum spanning tree [F.sub.k] in [G.sub.TO] by using the following method: [G.sub.0] = [G.sub.TO]; [F.sub.0] = null; for j = 1... k do [T.sub.j] = MST([G.sub.j-1]); [F.sub.j] = [F.sub.j-1] [union] [T.sub.j]; [G.sub.j] = [G.sub.j-1]\[T.sub.j]; end for 6: connect [N.sub.r] Super-Relay nodes into a temporary full-mesh topology [G.sub.relay], in which the weight of each link is the number of hops of the corresponding IP path. 7: [G.sub.o]= [F.sub.k] [union] [G.sub.relay]
Table 1 shows the computational complexity and the node degree analysis of SR-KMST. N refers to the number of nodes in the physical network. M is the number of nodes in the overlay network. [N.sub.o] is the number of nodes in the overlay network constructed by SR-KMST, where the number of Super-Relay nodes is [N.sub.r]. [L.sub.p] denotes the average number of hops in the IP-layer paths. Node degree determines the number of neighbors that each node needs to maintain in the overlay topology.
From Table I, we can observe that the complexity of SR-KMST is a little greater than that of KMST ([N.sub.0] [greater than or equal to] M). Though there exists full-mesh connections among all Super-Relay nodes in SR-KMST, the node degree of SR-KMST is much less than that of FM (M>> [N.sub.r] and M-1>> K). We believe that the increased complexity will be marginal if the service performance is increased.
3.4 One-Hop overlay Path Recovery model with Super-Relay nodes (OHPR)
OHPR is a reactive restoration method based on the one-hop source routing, which is the supplement of SR-KMST algorithm. When the IP-layer path fails or its performance degrades, the source uses the overlay network constructed by SR-KMST to find an overlay backup path connecting to the destination. If the overlay backup path also fails, OHPR algorithm will be used to recover from the failures occurred between the source and destination.
[FIGURE 3 OMITTED]
As shown in Fig. 3, the overlay backup path [O.sub.A]-[O.sub.C]-[O.sub.D]-[O.sub.B] shares the physical link [L.sub.CD] with the default physical path A-C-D-E-B. When the failure of [L.sub.CD] causes the simultaneous failure of the overlay backup path [O.sub.A]-[O.sub.C]-[O.sub.D]-[O.sub.B] and the physical path A-C-D-E-B, the source A (or B) selects an intermediary node [O.sub.i] (i = 1,2,3 * * *) and attempts to reroute its packets to the destination B (or A) through this intermediary node.
The key to the OHPR algorithm is to find out an optimal intermediary node [R.sub.op] through which the traffic is rerouted with the minimum end-to-end delay. Let [V.sub.SR] be the Super-Relay nodes set, and [M.sub.SR] be the Super-Relay nodes included in the failed overlay backup path. The subset [T.sub.SR] = [V.sub.SR] - [M.sub.SR] is obtained by removing [M.sub.SR] from [V.sub.SR]. OHPR selects [R.sub.op] from [T.sub.SR] as follows: when the default physical path and the overlay backup path suffer from simultaneous failures, the failed overlay links are labeled as unreachable (e.g. [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]) firstly. For each Super-Relay node [O.sub.i] [member of] [T.sub.SR] (i = 1,2,3 * * *), the source node S probes the destination D by using one-hop source routing S [right arrow] [O.sub.i] [right arrow] D. If the one-hop source route S [right arrow] [O.sub.i] [right arrow] D has the minimum end-to-end delay, [O.sub.i] is selected as [R.sub.op]. The detailed description of the intermediary node selection algorithm is as follows.
Algorithm 2 Intermediary node selection algorithm Let [??] and [??] refer to the end-to-end delay of the shortest path S[right arrow][O.sub.i] and [O.sub.i] [right arrow] D respectively. Input: Overlay topology [G.sub.o] ([V.sub.o],[E.sub.o]) and Super-Relay nodes set [V.sub.SR]. Output: Optimal intermediary node [R.sub.op]. 1: Set the weight of each failed overlay link to infinite. 2: [M.sub.SR] [left arrow] [empty set], [D.sub.min] [left arrow] [varies] 3: For each Super-Relay node [Temp.sub.SR] included in the failed overlay path [M.sub.SR] =[M.sub.SR] [union] [Temp.sub.SR] End for 4: [T.sub.SR] =[V.sub.SR] - [M.sub.SR] 5: For each Super-Relay node [O.sub.i] [member of] [T.sub.SR] (i = 1,2,3 * * *) Compute [??] and [??] using Dijkstra algorithm. [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] If [D.sub.min] > [??] [D.sub.min] = [??] [R.sub.op] = [O.sub.i] End if End for
From Algorithm 2 above, we can obtain that the computational complexity of OHPR is O(|[V.sub.SR]|), which is a little greater than that of SOSR (O(|K|)). Where |[V.sub.SR]| is the number of Super-Relay nodes, and |K| is the number of K candidate relay nodes selected in SOSR algorithm. In general, K is set to 4 in SOSR.
4. Performance evaluation
4.1 Simulation settings
To evaluate the performance of our proposed algorithm, we carry out the simulations on top of two IP-layer topologies: a real AS-level topology CN070  with 135 nodes and 338 links, and a random topology GT180 generated by GT-ITM  with 180 nodes and 502 links. CN070 records the interconnection situation of most routers in China in 2006. GT180 is based on the following Waxman probability :
P(u,v) = [[alpha]e.sup.-d(u,v)]/[beta]L (4)
where P(u,v) is the probability that a link between node u and v is created, d(u,v) is the Euclidean distance between u and v, L is the maximum distance between two nodes, and [alpha] and [beta] are parameters in the range (0, 1). Larger values of [beta] result in graphs with higher edge densities, while small values of [alpha] increase the density of short edges relative to longer ones. In the simulation, we take [alpha] = [beta] = 0.03 and L = [square root of 2a], where a is a constant and set as a = 180.
The link bandwidth in CN070 and GT180 is assigned according to a uniform distribution in the range [40, 240] Mb/s. It is worth highlighting that this value represents the available bandwidth of the IP-layer link. In the simulation, the weight of an edge in [G.sub.u] ([V.sub.u],[E.sub.u]) is set as the reciprocal of its bandwidth.
Suppose that the IP-layer always takes the shortest path protocol based on the link-state information as its routing protocol. For the overlay topology constructed by the algorithms except SR-KMST, each overlay node is mapped to one of the physical nodes randomly. The overlay topology constructed by SR-KMST algorithm is composed of Super-Relay nodes and Ordinary-Relay nodes. The mapping mechanism of Ordinary-Relay nodes is random, and that of Super-Relay nodes is deterministic, because Super-Relay nodes are selected from the physical network based on their betweenness centralities. We also assume that the delay of an overlay link is proportional to its link length, that is, the weight of an edge in [G.sub.o] ([V.sub.o],[E.sub.o]) is the number of hops of the corresponding IP-layer path.
4.2 Performance Metrics
To compare the performance of different overlay topology construction algorithms, we focus on the following performance metrics  during the simulation.
a) Failure Recovery Ratio (FRR)
The failure recovery ratio is an important metric to evaluate the overlay network's service performance, which is defined as follows:
FRR=Number of recovered failure paths via overlay/Number of failed IP - layer paths (5)
b) Recovery Path Hop Penalty (RPHP)
We assume that the IP-layer always takes the shortest path protocol to connect the source and destination pairs. This means that the recovered overlay path may have higher number of hops comparing with the default IP-layer path. To some extent, longer IP-layer path means longer latency. In practice, data packets transmission between inter-AS (Autonomous System) may not be along the shortest path , this is because each AS is an independent business entity and the BGP routing policy reflects the inter-AS commercial relationships. In this case, the recovered overlay path may be shorter than the default IP-layer path. We use the following recovery path hop penalty to quantify the overlay path's physical distance compared with the original IP-layer path:
RPHP= Number of hops in recovered path via overlay/Number of hops in the corresponding failed IP - layer paths (6)
c) Average Node Degree (AND)
Average Node Degree reflects the information about the topology overhead caused by the path measurement and route calculation among nodes. The greater the overlay node degree is, the higher the cost of topology maintenance and route calculation is. Let [d.sub.i] be the number of neighbors of the [i.sup.th] overlay node, and M be the number of overlay nodes, Average Node Degree can be defined as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (7)
4.3 Simulation analysis
To evaluate the performance of our SR-KMST algorithm, we compare it with some classical overlay topology construction algorithms: Full Mesh (FM), K-Minimum Spanning Tree (KMST), Adjacent Connection (AC) and Mesh Tree (MT). For SR-KMST and KMST, we set K=2. During the simulation, we vary the following parameters: IP-layer link failure ratio, overlay network size, and the number of Super-Relay nodes. For each simulation scenario, we run the simulation 2000 times and obtain the average value for each performance metric.
a) The comparison of the overlay path length
We randomly select 80 and 120 overlay nodes from CN070 and GT180 respectively, and use different topology construction algorithms to construct the overlay topology. For SR-KMST, 10 and 20 Super-Relay nodes are selected in CN070 and GT180 respectively according to their betweenness centralities in a descending order. Fig. 4 shows the Probability Density Function (PDF) of the path length between all nodes in different overlay topologies, which refers to the relative probability that the path length takes a given value in this paper. The overlay path length refers to the number of link hops of the corresponding IP path. Fig. 4(A) shows the result of CN070. From Fig. 4(A), we can observe that the maximum overlay path length is 6 in AC and SR-KMST, which is less than that in KMST and MT. In addition, nearly half of the overlay paths have the length of 3 in FM and SR-KMST. The result of GT180 is similar to that of CN070. This indicates that SR-KMST can provide better performance than other topology construction algorithms. MT performs the worst, while the performance of KMST is worse than that of AC.
[FIGURE 4 OMITTED]
b) The effect of IP-layer link failure ratio
The simulation setting is similar to a). The overlay network is composed of 80 and 120 nodes in CN070 and GT180 respectively, 10 and 20 Super-Relay nodes are selected for SR-KMST. An IP-layer link failure means all IP-layer routing paths passing through this link fail. The IP-layer link failure ratio is the number of concurrent failed links divided by the total number of links in the physical topology, which is selected between 0.01 and 0.1. The simulation results are shown in Fig. 5 and Fig. 6.
From Fig. 5 and Fig. 6, for all methods, we can see the following trend: with the increase of the IP-layer link failure ratio, the failure recovery ratio decreases while the recovery path hop penalty increases by a small margin. More importantly, the failure recovery ratio of SR-KMST is significantly greater than that of others. Furthermore, SR-KMST and FM have the same performance in terms of the recovery path hop penalty, which are better than AC, KMST and MT. This suggests that:
* Some overlay links in FM can be pruned without scarifying the performance, in other words, a small number of Super-Relay nodes play a key role in improving the overlay network performance.
* SR-KMST decreases the number of shared links between the default IP-layer path and the overlay backup path, and thus improves the failure recovery ratio.
* SR-KMST is better than KMST in term of failure recovery ratio and recovery path hop penalty. This is because the full-mesh structure of Super-Relay nodes can not only decrease the number of shared links between the default IP-layer path and the overlay backup path, but also reduce the overlay path length.
[FIGURE 5 OMITTED]
[FIGURE 6 OMITTED]
However, from Fig. 5(B), we can see that the difference of the failure recovery ratio about GT180 between SR-KMST and the existing algorithms is about 0.01~0.02, which is less than that of CN070 in Fig. 5(A). This result is due to the topology structure of GT180 generated by Waxman random model. The nodes in GT180 are uniformly distributed in the plane and the node degree is relatively uniform. Therefore, each node in GT180 is selected as a relay node with equal probability, which leads to little difference of failure recovery ratio between SR-KMST and the existing algorithms including FM, KMST, AC and MT.
In addition, In Fig. 5, especially for FM, AC, MT and KMST, the failure recovery ratio fluctuates in a narrow range, such as 0.8-0.85 in Fig. 5(A) and 0.975-0.985 in Fig. 5(B). These results are also due to the corresponding topology construction algorithms. Because FM, AC, MT and KMST are flat topologies, during the 2000 times simulations, each overlay node is in the recovery path with equal probability. Because of their different degrees, the resilience of each overlay node is different, which leads to the fluctuation of failure recovery ratio curve. Compared with FM, AC, MT and KMST, SR-KMST belongs to the hierarchical overlay topology, in which the Super Relay Nodes are in the recovery paths with high probability when IP-layer links suffer from failures or performance degradation. The large degrees of the Super Relay Nodes enable SR-KMST to achieve a high failure recovery ratio with relatively smooth curves.
c) The effect of overlay network size
Now, we discuss the overlay network size's effect on the overlay topology performance. We fix the IP-layer link failure ratio to 0.03 and vary the size of overlay networks. For SR-KMST, 10 and 20 Super-Relay nodes are selected to construct the overlay topology in CN070 and GT180 respectively. The simulation results are shown in Fig. 7, Fig. 8, Fig. 9 and Fig. 10.
[FIGURE 7 OMITTED]
[FIGURE 8 OMITTED]
From Fig. 7, we observe that the performance of SR-KMST is much superior to the others. In addition, with the increase of the overlay network size, the recovery ratio of SR-KMST almost remains stable while that of other methods increases. This further indicates the important effect of Super-Relay nodes on the performance of the overlay network.
Fig. 8 shows the recovery path hop penalty with different overlay network sizes. From the figure, we can see that the performance of all methods maintains steady except MT, in which the recovery path hop penalty fluctuates irregularly with the increase of the number of overlay nodes. This is because the links in MT are set up between uncle-nephew or grandfather-grandson nodes, without considering the proximity of nodes in the underlying physical topology. This means that some recovery paths in MT may have higher number of hops than others, which leads to the fluctuation of Recovery Path Hop Penalty. This situation is also shown in Fig. 9. As a result, SR-KMST and FM have the least overlay recovery path hop penalty, which is around 1.2 in CN070 and 1.4 in GT180 respectively.
Failure Recovery Time is a key metric for resilient overlay networks. Lower failure recovery time represents lower end-to-end delay spent on the overlay recovery path. The simulation results are shown in Fig. 9. Note that the failure recovery time in Fig. 9 refers to the computation time of different algorithms and is measured in number of seconds. It is noteworthy that the failure recovery time of SR-KMST is close to that of FM, while far less than that of KMST and AC. Meanwhile, the failure recovery time slightly increases with the increase of the number of overlay nodes. Combined with the information in Fig. 7 and Fig. 8, we can obtain that SR-KMST can achieve greater failure recovery rate with lower delay.
[FIGURE 9 OMITTED]
[FIGURE 10 OMITTED]
Fig. 10 shows the relationship between the average node degree and the overlay network size. The greater the overlay node degree is, the higher the cost of topology maintenance and route calculation will be. From Fig. 10(A) and Fig. 10(B), we can observe that KMST has the least average node degree, and the average node degree of SR-KMST is slightly higher than that of KMST in CN070 and GT180. This suggests that though the Super-Relay nodes are interconnected in a full mesh manner in SR-KMST, they are only a minority of nodes and have less influence on the average node degree. In addition, Fig. 10(C) shows that the average node degree of SR-KMST decreases as the overlay network size increases in CN070 and GT180, which means that SR-KMST does not increase the overlay overhead especially in the large scale overlay application.
d) The effect of Super-Relay node number
In this section, we study the effect of the number of Super-Relay nodes in SR-KMST on the performance, including FRR, RPHP and AND. There are 80 and 120 Ordinary-Relay nodes in CN070 and GT180 respectively, we fix the IP-layer link failure ratio to 0.03 and vary the number of Super-Relay nodes from 5 to 50. The simulation results are shown in Fig. 11. From the figure, we can obtain that the failure recovery ratio and the average node degree increase, and the recovery path hop penalty decreases, when the number of Super-Relay nodes increases. From Fig. 11(A), we can see that the number of Super-Relay nodes "10" in CN070 is an inflection point, where the average increase amplitude of the failure recovery ratio is 6% from 5 to 10 Super-Relay nodes while 0.02% from 10 to 50 Super-Relay nodes. By the same token, "20" is the inflection point in GT180. Using the inflection point, we can determine the suitable number of Super-Relay nodes to be selected. From these results, we can conclude that the Super-Relay nodes can improve the overlay routing performance, as long as we select a suitable number of Super-Relay nodes from the physical networks and connect them properly.
[FIGURE 11 OMITTED]
e) The performance of One-Hop Path Recovery algorithm
In this section, we evaluate the performance of One-Hop Path Recovery algorithm (OHPR). In the simulation, we use SR-2MST (K=2) to construct the overlay topologies in CN070 and GT180 with 10 and 20 Super-Relay nodes respectively. We fix the IP-layer link failure ratio to 0.03 and vary the size of overlay network to analyze the performance of OHPR by comparing with SOSR . For SOSR, we randomly pick up 4 overlay nodes and choose the best one to form a one-hop overlay path. Because both OHPR and SOSR are one-hop source routing approaches and can always find an intermediary node to detour the failed link, we use two metrics Recovery Path Hop Penalty (RPHP) and Failure Recovery Time to evaluate the performance of OHPR. The simulation results are shown in Fig. 12 and Fig. 13. Note that the failure recovery time in Fig. 13 refers to the computation time of different algorithms and is measured in number of seconds. From the figures, we observe that the performance of OHPR is far superior to that of SOSR regardless of the overlay network size.
[FIGURE 12 OMITTED]
[FIGURE 13 OMITTED]
In this paper, a resilient and scalable routing overlay architecture is addressed by taking into account the effect of Super-Relay nodes. Moreover, we develop a method OHPR in SR-KMST to select a proper one-hop recovery path when the default physical path and overlay backup path suffer from simultaneous failures. The simulation results show that adding a few Super-Relay nodes into the overlay network and connecting them properly can improve the routing performance of the overlay network. In addition, selecting an intermediary node from Super-Relay nodes for the one-hop recovery path can enhance the reliability of the overlay network. The simulation results show that SR-KMST outperforms other existing approaches. For the future work, we will focus on the analytical study of Super-Relay nodes to assure a bound of Super-Relay nodes.
 N. Ramzan, E. Quacchio, T. Zgaljic, S. Asioli, L. Celetto, et al. "Peer-to-peer streaming of scalable video in future Internet applications," IEEE Communications Magazine, vol. 49, no. 3, pp. 128-135, January, 2011. Article (CrossRefLink).
 I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup service for internet applications," in Proc. of ACM SIGCOMM, pp. 149-160, August 27-31, 2001. Article (CrossRefLink).
 Z. Li, G. Xie, K. Hwang, and Z. Li, "Churn-resilient protocol for massive data dissemination in P2P networks," IEEE Transaction on Parallel and Distributed Systems, vol. 22, no. 8, pp. 1342-1349, August, 2011. Article (CrossRefLink).
 M. Amad, A. Meddahi, D. Aissani, and G. Vanwormhoudt, "GPM: a generic and scalable P2P model that optimizes tree depth for multicast communications," International Journal of Communication Systems (IJCS), vol. 25, no. 4, pp. 491-514, April, 2012. Article (CrossRefLink).
 M. Hosseini, D. T. Ahmed, S. Shirmohammadi, and N. D. Georganas, "A survey of application-layer multicast protocols," IEEE Communications Surveys and Tutorials, vol. 9, no. 3, pp. 58-74, September, 2007. Article (CrossRefLink).
 D. Anderson, H. Balakrishnan, F. Kaashoek, and R. Morris, "Resilient overlay network," In Proc. of ACM Symposium on Operating System Principle (SOSP), vol. 35, no. 5, pp. 131-145, June, 2001. Article (CrossRefLink).
 Z. Li and P. Mohpratra, "QRON: QoS-aware routing in overlay networks," IEEE Journal on Selected Areas in Communications(JSAC), vol. 22, no. 1, pp. 29-40, January, 2004. Article (CrossRefLink).
 L. Subramanian, I. Stoica, H. Balakrishnan, and R. H. Katz, "OverQoS: an overlay based architecture for enhancing internet QoS," In Proc. of USENIX Symposium on Networked Systems Design and Implementation (NSDI), pp. 29-31, March 29-31, 2004. Article (CrossRefLink).  A. Capone, J. Elias, and F. Martignon, "Routing and resource optimization in service overlay networks," Computer Networks, vol. 53, no. 2, pp. 180-190, February, 2009. Article (CrossRefLink).
 S. Roy, H. Pucha, Z. Zhang, Y. C. Hu, and L. Qiu, "On the placement of infrastructure overlay nodes," IEEE/ACM Transactions on Networking, vol. 17, no. 4, pp. 1298-1311, August, 2009. Article (CrossRefLink).
 R. Cohen, D. Raz, "Cost-effective resource allocation of overlay routing relay nodes," IEEE/ACM Transactions on Networking (TON), vol. 22, no. 2, pp. 636-646, April, 2014. Article (CrossRefLink).
 M. Cha, S. Moon, C. D. Park, and A. Shaikh, "Placing relay nodes for intra-domain path diversity," In Proc. of IEEE INFOCOM, pp. 1-12, April 23-29, 2006. Article (CrossRefLink).
 R. Kawahara, S. Kamei, N. Kamiyama, H. Hasegawa, H. Yoshino, E. K. Lua, and A. Nakao, "A method of constructing QoS overlay network and its evaluation," In Proc. of IEEE GLOBECOM, pp. 1-6, November, 2009. Article (CrossRefLink).
 A. Young, J. Chen, Z. Ma, A. Krishnamurthy, L. Peterson, and R. Y. Wang, "Overlay mesh construction using interleaved spanning trees," In Proc. of IEEE INFOCOM, pp. 396-407, March 7-11, 2004. Article (CrossRefLink).
 T. Fei, S. Tao, L. Gao, and R. Guerin, "How to select a good alternate path in large peer-to-peer systems," In Proc. of IEEE INFOCOM, pp. 1-13, April, 23-29, 2006. Article (CrossRefLink).
 S. Banerjee, S. Jee, B. Bhattacharjee, and A. Srinivasan, "Resilient multicast using overlays," IEEE/ACM Transactions on Networking, vol. 14, no. 2, pp. 237-248, April, 2006. Article (CrossRefLink).
 A. Nakao, L. Peterson, and A. Bavier, "A routing underlay for overlay networks," In Proc. of ACM SIGCOMM, pp. 25-29, August 25-29, 2003. Article (CrossRefLink).
 Z. Li, and P. Mohapatra, "On investigating overlay service topologies," Computer Networks, vol. 51, no. 1, pp. 54-68, January, 2007. Article (CrossRefLink).
 D. Adami, C. Callegari, S. Giordano, G. Pagano, et al, "Design and performance evaluation of service overlay networks topologies," Journal of Networks, vol. 6, no. 4, pp. 556-566, April, 2011. Article (CrossRefLink).
 J. Liao, J. Wang, B. Wu, and W. Wu, "Toward a Multi-plane Framework of NGSON: a Required Guideline to Achieve Pervasive Services and Efficient Resource Utilization," IEEE Communications Magazine, vol. 50, no. 1, pp. 90-97, January, 2012. Article (CrossRefLink).
 Q. Wang, J. Wang, J. Yu, M. Yu, and Y. Zhang, "Trust-aware query routing in P2P social networks," International Journal of Communication Systems (IJCS), vol. 25, no. 10, pp. 1260-1280, October, 2012. Article (CrossRefLink).
 J. Fan, and M. H. Ammar, "Dynamic topology configuration in service overlay networks: a study of reconfiguration policies," In Proc. of IEEE INFOCOM, Vol. 2, no. 9, pp. 1-12, April 23-29, 2006. Article (CrossRefLink).
 W. Cui, I. Stoica, and R. H. Katz, "Backup path allocation based on a correlated link failure probability model in overlay networks," In Proc. of IEEE International Conference on Network Protocol (INCP), pp. 236-245, November, 12-15, 2002. Article (CrossRefLink).
 Z. M. Mao, L. Qiu, J. Wang, and Y. Zhang, "On AS path inference," In Proc. of ACM SIGMETRICS, Vol. 33, no. 1, pp. 339-349, June, 06-10, 2005. Article (CrossRefLink).
 X. Zhou, D. Guo, T. Chen, Z. Shu, and X. Luo, "ABPS: an accurate backup path selecting approach in overlay networks," In Proc. of IEEE 14th International Conference on High Performance Computing and Communications (HPCC), pp. 1247-1252, June 25-27, 2012. Article (CrossRefLink).
 K. P. Gummadi, H. Madhyastha, S. D. Gribble, H. M. Levy, and D. J. Wetherall, "Improving the reliability of internet paths with one-hop source routing," In Proc. of USENIX Symposium on Operating System Design and Implementation (OSDI), pp. 13-27, December, 2004.
 U. Brandes, "On variants of shortest-path betweenness centrality and their generic computation," Social Network, vol. 30, no. 2, pp. 136-145, May, 2008. Article (CrossRefLink).
 G. Zhang, "An algorithm for Internet AS graph betweenness centrality based on Backtrack," Journal of Computer Research and Development, vol. 43, no. 10, pp. 1790-1796, October, 2006. Article (CrossRefLink).
 GT-ITM: Modeling Topology of Large Internetworks [online]. Available from:-http://www.cc.gatech.edu/projects/gtitm/.
 B. M. Waxman, "Routing of multipoint connections," IEEE Journal on Selected Areas in Communication(JSAC), vol. 6, no. 9, pp. 1617-1622, December, 1988. Article (CrossRefLink).
 J. Liao, S. Tian, J. Wang, et al, "Load-Balanced One-hop Overlay Multipath Routing with Path Diversity," KSII Transactions on Internet and Information Systems (TIIS), vol. 8, no. 2, pp. 443-461, February, 2014. Article (CrossRefLink).
 Y. Chen, S. Radhakrishnan, S. Dhall, and S. Karabuk, "The service overlay network design problem for interactive internet applications," Computer & Operations Research, vol. 57, pp. 73-82, May, 2015. Article (CrossRefLink).
 J. Liao, L. Yang, J. Wang, and X. Zhu, "Service composition based on niching particle swarm optimization in service overlay networks," KSII Transactions on Internet and Information Systems (TIIS), vol. 6, no. 4, April, 2012. Article (CrossRefLink).
 J. Kim, S. W. Han, D. Yi, and N. Kim, "Media-Oriented Service Composition with Service Overlay Networks: Challenges, Approaches and Future Trends," Journal of Communications, vol. 5, no. 5, pp. 374-389, May, 2010. Article (CrossRefLink).
 S. Tian, J. Liao, J. Wang, Q. Qi, "Overlay routing network construction by introducing super-relay nodes," IEEE Symposium on Computers and Communication (ISCC), 23-26, September, 2014. Article (CrossRefLink).
 N. D. Han, Y. Chung, and M. Jo, "Green Data Centers for Cloud-assisted Mobile Ad-hoc Networks in 5G", IEEE Network, Vol.29, No.2, pp. 70-76, March, 2015. Article (CrossRefLink).
 D. Satria, D. Park, and M. Jo, "Recovery for Overloaded Mobile Edge Computing", Future Generation Computer Systems, Vol. 70, pp. 138-147, May, 2017. Article (CrossRefLink).
Shengwen Tian (1,*), Jianxin Liao (2), Tonghong Li (3), Jingyu Wang (2) and Guanghai Cui (1)
(1) School of Information and Electrical Engineering, Ludong University, Yantai, 264000 - China
[e-mail: firstname.lastname@example.org, email@example.com]
(2) State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, 100876 - China
[e-mail: firstname.lastname@example.org, email@example.com]
(3) Department of Computer Science, Technical University of Madrid, Madrid, 28000 - Spain
(*) Corresponding author: Shengwen Tian
Received October 6, 2015; revised May 26, 2016; revised October 5, 2016; revised October 30, 2016; accepted February 8, 2017; published April 30, 2017
Shengwen Tian received the Ph.D. degree in computer science and technology from Beijing University of Posts and Telecommunications in 2015. Now he is a lecturer in Ludong University, China. His research interests include overlay networks, P2P network, Next Generation Network, network virtualization, and complex network.
Jianxin Liao received his PhD degree at University of Electronics Science and Techno logy of China in 1996. He is presently a professor of Beijing University of Posts and Tel ecommunications. He has published hundreds of papers in different journals and confere nces. His research interests are mobile intelligent network, broadband intelligent networ k and 3G core networks. He is the Specially-invited Professor of the "Yangtse River Sch olar Award Program" by the China Ministry of Education in 2009.
Tonghong Li received his PhD degree from Beijing University of Posts and Telecommunications in 1999. He is currently an assistant professor with the department of computer science, Technical University of Madrid, Spain. His main research interests include resource management, distributed system, middleware, wireless networks, and sensor networks.
Jingyu Wang received his PhD degree from Beijing University of Posts and Telecommunications in 2008. Now he is an associate professor in Beijing University of Posts and Telecommunications, China. His research interests span broad aspects of performance evaluation for Internet and overlay network, traffic engineering, image/video coding, multimedia communication over wireless network.
Guanghai Cui received the Ph.D. degree in computer science and technology from Dalian University of Technology in 2015. Now he is a lecturer in Ludong University, China. His research interests mainly focus on network economics, including incentive mechanism design, application of evolutionary game theory on communication networks and security issues, performance evaluation and social network analysis.
Table 1. Computational complexity comparison among FM, KMST, MT, AC, SR-KMST Complexity Degree FM O([M.SUP.2]) M-1 KMST O(K*[M.sup.2]) K MT O([M.sup.3]) (>2) & (<M-1) AC O([N.sup.3] +[L.sub.p]* [M.sup.2]) <M-1 SR-KMST O(K*[N.sub.o.sup.2]) Ordinary-Relay nodes: K Super-Relay nodes: [N.sub.r] - 1
|Printer friendly Cite/link Email Feedback|
|Author:||Tian, Shengwen; Liao, Jianxin; Li, Tonghong; Wang, Jingyu; Cui, Guanghai|
|Publication:||KSII Transactions on Internet and Information Systems|
|Date:||Apr 1, 2017|
|Previous Article:||An adaptive-harvest-then-transmit protocol for wireless powered communications: Multiple antennas system and performance analysis.|
|Next Article:||On efficient processing of continuous reverse skyline queries in wireless sensor networks.|