CHAPTER 3 Social-based Routing Approach
3.2 Data replicate and receive strategy
國立政治大學 National Chengchi University
When the data rank computed by the encountered node is higher than the data rank at the sender, the data is replicated and a copy is sent to the encountered node.
When a node A carrying data encounters a node B, the exchange proceeds in three steps. First, the nodes exchange information about the data: the dRank that node A assigns to each data item, together with the personal interest, personal information, and social relation of each item's destination node. Second, node B uses the destination node's information to calculate its own dRank for each data item.
Third, node B compares its dRank with node A's dRank for each data item; if node B's dRank is larger, node B asks node A to transmit that data item.
From the receiver's point of view, when node B encounters a node A that carries data, node B calculates its dRank for every data item it has received information about, and it requests each item for which its dRank is larger than node A's. Before receiving a data item, node B checks whether it has enough free space. If it does not, node B deletes the data item with the lowest data rank in its buffer, and repeats this until there is enough space for the data from node A.
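The receiver-side decision described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation: the names `Message`, `Node`, and `try_receive` are hypothetical, sizes are in bytes, and the guard that rejects a message larger than the whole buffer is an added assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str
    size: int      # bytes
    d_rank: float  # dRank computed by the node that currently holds the copy


@dataclass
class Node:
    capacity: int                               # buffer capacity in bytes
    buffer: dict = field(default_factory=dict)  # msg_id -> Message

    def used(self) -> int:
        return sum(m.size for m in self.buffer.values())

    def try_receive(self, msg_id: str, size: int,
                    sender_rank: float, own_rank: float) -> bool:
        """Return True if this node requests and stores a copy of the data."""
        # Only ask for the data when our dRank is larger than the sender's.
        if own_rank <= sender_rank:
            return False
        # Assumption (not in the text): data larger than the buffer is refused.
        if size > self.capacity:
            return False
        # Delete the lowest-ranked data repeatedly until the new data fits.
        while self.capacity - self.used() < size:
            victim = min(self.buffer.values(), key=lambda m: m.d_rank)
            del self.buffer[victim.msg_id]
        self.buffer[msg_id] = Message(msg_id, size, own_rank)
        return True


node_b = Node(capacity=100, buffer={
    "a": Message("a", 40, 0.2),
    "b": Message("b", 40, 0.9),
})
accepted = node_b.try_receive("c", 50, sender_rank=0.3, own_rank=0.5)
```

In this example node B evicts item "a" (the lowest dRank) to make room for item "c" and keeps item "b".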
The flowcharts of the data replication approach described above are as follows:
Figure 13: Workflow of the sender (Node A)
Figure 14: Workflow of the receiver (Node B)
Figure 15: Transition flowchart
3.3 Rank calculation in social-based routing
We use three aspects of social information to calculate the rank of data: personal interest, personal information, and social relation.
1. Personal Interest
For personal interest, we use n-dimensional cosine similarity to calculate the interest similarity between node A and the destination D. Cosine similarity is computed over two attribute vectors: the personal interest vector of node A and that of node A_x(D), the destination of the x-th data item in A.
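$$\cos\theta = \frac{\vec{A} \cdot \overrightarrow{A_x(D)}}{\|\vec{A}\| \cdot \|\overrightarrow{A_x(D)}\|} = \frac{\sum_{i=1}^{n} A_i \times A_x(D)_i}{\sqrt{\sum_{i=1}^{n} (A_i)^2} \times \sqrt{\sum_{i=1}^{n} (A_x(D)_i)^2}}$$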
The resulting similarity ranges from −1 to 1: −1 means the interests are exactly opposite, 1 means they are exactly the same, 0 usually indicates independence, and in-between values indicate intermediate similarity or dissimilarity. This value tells us how similar the interest of node A is to that of the destination node D.
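As an illustration of the formula above, the following minimal sketch computes the cosine similarity of two interest vectors. The numeric encoding (1 for like, −1 for dislike, 0 for no comment) and the choice to return 0 when one vector is all zeros are assumptions made for this example; the function name `cosine_similarity` is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length interest vectors."""
    if len(a) != len(b):
        raise ValueError("interest vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: a node with no stated interests is treated as independent.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Assumed encoding: 1 = like, -1 = dislike, 0 = no comment.
node_a = [1, -1, 0, 1]
destination = [1, -1, 1, 0]
similarity = cosine_similarity(node_a, destination)
```

Here the two nodes agree on the first two interests and differ on the last two, so the similarity falls between 0 and 1.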
2. Personal information
For personal information, we use the Jaccard index, also known as the Jaccard similarity coefficient, to compare the personal information sets of two users. The Jaccard index is defined as the size of the intersection divided by the size of the union of the sample sets.
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
We apply the Jaccard index as

$$J(A, A_x(D)) = \frac{|A \cap A_x(D)|}{|A \cup A_x(D)|} = \frac{M_{11}}{M_{01} + M_{10} + M_{11}}$$

where M is the question set of the users' information and B stands for $A_x(D)$: $M_{11}$ is the number of questions where A and B both answer 1, $M_{01}$ is the number where A answers 0 and B answers 1, $M_{10}$ is the number where A answers 1 and B answers 0, $M_{00}$ is the number where A and B both answer 0, and $M_{11} + M_{01} + M_{10} + M_{00} = n$ (the number of questions). The result is a real number between 0 and 1 that represents the closeness between node A and $A_x(D)$.
The closer the result is to 1, the more similar the personal information of the two nodes, from which we conjecture that they have a higher probability of meeting each other.
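The binary form of the Jaccard index above can be sketched as follows, assuming each user's personal information is stored as a list of 0/1 answers in the same question order. The function name `jaccard_binary` is hypothetical, and returning 0 when neither user answers 1 to any question is an assumption, since the text does not define that case.

```python
def jaccard_binary(a, b):
    """Jaccard index M11 / (M01 + M10 + M11) over two 0/1 answer lists."""
    if len(a) != len(b):
        raise ValueError("answer lists must cover the same questions")
    m11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    m01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    m10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    denominator = m01 + m10 + m11
    # Assumption: with no positive answers on either side, similarity is 0.
    if denominator == 0:
        return 0.0
    return m11 / denominator


answers_a = [1, 1, 0, 0, 1]
answers_d = [1, 0, 1, 0, 1]
closeness = jaccard_binary(answers_a, answers_d)  # M11=2, M10=1, M01=1
```

Questions that both users answer with 0 ($M_{00}$) do not affect the result, which is why the fourth question is ignored in this example.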
3. Social relation
We use the idea of PageRank to measure social relation. PageRank operates on the World Wide Web and uses the links between pages to calculate the importance of each page. In our algorithm, we establish a social graph between persons who are socially related to each other: the nodes play the role of pages, and the connections play the role of links between pages. A node's rank, its degree of popularity, is defined by the similarity between the node's personal interest and personal information and the common personal interest and personal information of the people in the environment. To represent this common profile, we establish a virtual node whose interest and information are the mode of those of every node in the experiment environment; a node's rank is then calculated from the personal_table of the node and that of the virtual node. In our method, social relation is defined by

$$\frac{\sum_{B \in F(A)} Rank(B)}{|F(A)|}$$

where F(A) denotes the friends of node A. When the friends of node A have higher rank, that is, they are more popular in this environment, node A's social relation value is higher.
In our research, we combine the three estimation methods described above: the non-social data, personal interest and personal information, and the social data, social relation. We consider the scenario in which two people may not know each other but often meet because they share personal information, such as the same interest or the same classroom. Our approach therefore takes the non-social aspect into account. On the other hand, social relation is an important factor in sending data: if a person is more popular in the group, that person has a higher probability of meeting the destination node or a node whose personal table is very similar to that of the destination.
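The social relation term can be illustrated with a short sketch. It assumes that every node's profile is a list of attribute values in the same order, that the virtual node takes the most common value of each attribute, and that each node's popularity rank has already been computed against the virtual node. The names `virtual_node` and `social_relation` are hypothetical, and returning 0 for a node without friends is an assumption.

```python
from statistics import mode


def virtual_node(profiles):
    """Build the virtual node as the per-attribute mode of all node profiles."""
    return [mode(column) for column in zip(*profiles)]


def social_relation(friends, rank):
    """Average rank of a node's friends: sum of Rank(B) over F(A), divided by |F(A)|."""
    # Assumption: a node without friends has a social relation value of 0.
    if not friends:
        return 0.0
    return sum(rank[f] for f in friends) / len(friends)


profiles = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
common_profile = virtual_node(profiles)

# Hypothetical popularity ranks, as if computed against the virtual node.
rank = {"B": 0.8, "C": 0.4}
value_for_a = social_relation(["B", "C"], rank)
```

With friends ranked 0.8 and 0.4, node A's social relation value is their mean, about 0.6.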
The rank value in our approach is calculated as follows:
$$Rank(A_x) = \alpha \cdot Sim_{interest}(A, A_x(D)) + \beta \cdot J(A, A_x(D)) + \gamma \cdot \frac{\sum_{B \in F(A)} Rank(B)}{|F(A)|}$$
In words, this formula is

Rank(Ax) = α · personal interest + β · personal information + γ · social relation

We combine the measures described above by giving each a weight and adding them together. We use cosine similarity to evaluate personal interest because a person's attitude toward an activity may be like, dislike, or no comment; cosine similarity can distinguish these three answers and yields the interest similarity between two nodes. For personal information, the answers of two nodes are either the same or different, so the Jaccard index is suitable for evaluating their similarity; the closer the result is to 1, the higher the probability that the two nodes meet. Finally, we use the virtual node's data to calculate each node's degree of popularity; if a node's friends are very popular, the node is appropriate for helping with data transmission.
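The weighted combination can be sketched in a few lines. The function name `data_rank` is hypothetical, the requirement that the three weights sum to 1 is an assumption based on the percentage weights used in the simulation chapter, and the input values are made up for illustration.

```python
import math


def data_rank(sim_interest, jaccard, social, alpha, beta, gamma):
    """Rank(A_x) = alpha * interest + beta * information + gamma * social relation."""
    # Assumption: the weights are fractions that sum to 1 (100%).
    if not math.isclose(alpha + beta + gamma, 1.0):
        raise ValueError("weights are expected to sum to 1")
    return alpha * sim_interest + beta * jaccard + gamma * social


# Hypothetical inputs: 20% personal data (split evenly) and 80% social relation.
rank_value = data_rank(sim_interest=0.5, jaccard=0.5, social=0.6,
                       alpha=0.1, beta=0.1, gamma=0.8)
# 0.1 * 0.5 + 0.1 * 0.5 + 0.8 * 0.6 = 0.58
```

The even split of the personal data weight between α and β is only one possible choice; the text does not state how that share is divided in each experiment.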
CHAPTER 4 Simulation Results
In this chapter, we discuss the simulation results of our approach and compare it with three other protocols: Epidemic [7], Direct Delivery, and BubbleRap [5].
An ideal routing approach maximizes the delivery ratio while reducing the overhead ratio and delivery delay, so we take these three aspects as the indicators of routing performance.
1. Delivery ratio (successful delivery ratio from the source to the destination)
2. Overhead ratio

$$\frac{\text{Relayed messages} - \text{Successfully delivered messages}}{\text{Successfully delivered messages}}$$

3. Delivery delay (latency of successful delivery)
4.1 Simulation Environment
In our simulation, we use the ONE (Opportunistic Network Environment) simulator [14] (shown in Figure 16) and the map of the area surrounding NCCU (National Chengchi University) to validate our approach. All nodes in the simulation are students of this university, who walk around according to their class schedules or their preferences.
Figure 16: ONE simulator
4.2 Simulation Setting
The simulation settings are shown in Table 2. The map area is 3764 m x 3420 m, which is the main activity area of NCCU (Figure 16). The simulated time runs from 8 a.m. to 6 p.m., that is, 36000 seconds or 10 hours, the period during which students are at school. The node data transmission rate is 2 Mbps, and the transmission range is 10 m. The message size is 500 KB to 1 MB. The interval of message creation is 300 to 450 seconds. The node buffer size is 500 MB, and the message TTL is 18000 seconds, equivalent to 5 hours.
Parameter                       Value
Area                            3764 m x 3420 m
Simulation time                 36000 s
Data rate                       2 Mbps
Radio range                     10 m
Message size                    500 KB to 1 MB
Interval of message creation    300 to 450 s
Buffer size                     500 MB
Time to live                    18000 s
Table 2. Simulation Settings
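To give a feel for these settings, the following back-of-the-envelope sketch derives the raw transfer time of one message and the number of messages a buffer can hold. It assumes decimal units (1 KB = 1000 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead and contact setup time; the function and constant names are hypothetical.

```python
DATA_RATE_BPS = 2_000_000    # 2 Mbps, assuming 1 Mbps = 10**6 bit/s
BUFFER_BYTES = 500_000_000   # 500 MB, assuming decimal units
MIN_MESSAGE_BYTES = 500_000  # 500 KB
MAX_MESSAGE_BYTES = 1_000_000  # 1 MB


def transfer_time_s(size_bytes, rate_bps=DATA_RATE_BPS):
    """Raw time to send one message over a contact, ignoring overhead."""
    return size_bytes * 8 / rate_bps


fastest = transfer_time_s(MIN_MESSAGE_BYTES)   # smallest message
slowest = transfer_time_s(MAX_MESSAGE_BYTES)   # largest message

# Number of messages that fit in one full-size buffer.
max_messages_in_buffer = BUFFER_BYTES // MIN_MESSAGE_BYTES
min_messages_in_buffer = BUFFER_BYTES // MAX_MESSAGE_BYTES
```

Under these assumptions, two nodes must stay within the 10 m radio range for roughly 2 to 4 seconds to complete one transfer.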
Figure 17: NCCU surrounding area
To show the efficiency of our algorithm, we evaluate our scheme with different buffer sizes and different weight distributions. The evaluated items are personal interest, personal information, and social relation. In our simulation, we group these three aspects into two parts: we treat personal interest plus personal information as personal data, which describes the person itself, whereas social relation is the social variable that describes the person's relationship with other people in this environment. We compare our approach with the three routing methods mentioned at the beginning of this chapter.
Figure 18: Frame of data rank
4.3.1. Performance of different routing methods in our environment
First, Figure 19 compares the delivery ratio at different times.
Figure 19: Delivery ratio at different time
Figure 20: Overhead in each algorithm
Figure 21: Delivery delay in each algorithm
In Figure 19, Figure 20, and Figure 21, SB denotes our approach, Social-Based Routing. α is the weight of the node's personal interest, β is the weight of the node's personal information, and γ is the weight of the node's social relation. We combine personal information and personal interest as personal data, separate from social relation, because they have different properties.
As Figure 19 shows, Epidemic has the highest delivery ratio, but it produces a large number of copies. As a result, many copies are dropped before they reach the destination, and the delivery delay increases. Epidemic therefore has high overhead and high delivery delay, so it is not an ideal algorithm in the campus environment. Direct Delivery has no overhead, but it suffers from a large delivery delay and a low delivery ratio because it does not produce any copies, so the data has a very low probability of meeting the destination. BubbleRap has low overhead, and its delivery delay is not the highest, but its delivery ratio is also low.
In our algorithm, we use personal interest, personal information, and social relation to decide whether data should be transmitted or dropped. In the simulation, the best performance appears when we use 20% personal data and 80% social relation: the delivery ratio is close to that of Epidemic, but with low overhead and low delivery delay. We conjecture that social relation is a more important factor than personal data in the campus environment. However, if social relation is the only criterion, the delivery ratio decreases and the delivery delay increases, because a destination node may have unique personal data and be far from other nodes, and using social relation alone may ignore this kind of node.
Looking at the other weight distributions, the curves for 100% personal data and for 80% personal data plus 20% social relation overlap in the graph. Their delivery ratio is not as good as that obtained with more weight on social relation, but it is still better than that of the other routing algorithms we compare, with a low overhead ratio and low delivery delay. When personal data and social relation are weighted equally, the delivery ratio is lower than with either a high personal data weight or a high social relation weight. This result suggests that focusing either on the destination node's characteristics or on social relation works better than hesitating between the two.
4.3.2. Performance of different buffer sizes in our environment
Figure 22, Figure 23, and Figure 24 show the delivery ratio, overhead ratio, and delivery delay for different buffer sizes.
Figure 22: Delivery ratio vs. different buffer size
Figure 23: Overhead ratio vs. different buffer size
Figure 24: Delivery delay vs. different buffer size
As Figure 22, Figure 23, and Figure 24 show, when the buffer size increases, nodes can store more data, so the delivery ratio increases. Epidemic needs a larger buffer than the others to reach its best performance, and its overhead and delivery delay are higher than those of the other algorithms. Direct Delivery and BubbleRap share the same problem: their delivery ratio is lower than that of the other algorithms, and they need a lot of time to transmit data.
In our approach, when the buffer is full, we drop the data with the lowest data rank instead of the oldest data. This dropping decision gives us better performance than the other algorithms at small buffer sizes and lets our approach reach its best performance with a small buffer.
4.3.3. Performance of different personal data weight distributions in our environment
Figure 25: Delivery ratio vs. different personal data weight distributions
In Figure 25, we use different weights of personal data to see whether personal interest or personal information is more important for routing in the campus environment. α is the weight of the node's personal interest, and β is the weight of the node's personal information. The graph shows that using 100% personal information, or 80% personal information plus 20% personal interest, gives the higher delivery ratio. This result reveals that in the campus environment personal information has more impact on student movement than personal interest. We can infer that students need to go to particular classrooms during school hours and can go to the areas they like only when they have no class, so personal interest may have a low impact on data transmission.
CHAPTER 5 Conclusions and Future Work
In this thesis, we proposed a social-based routing approach. We use three aspects, personal interest, personal information, and social relation, to determine which node is suitable for helping to transmit data. When two nodes meet, they exchange information about their data. When the sender meets a node that is the destination of a data item in its buffer, or a node whose data rank for the item is higher than the sender's, the receiver checks whether it has enough space for the data. If it does not, the receiver deletes the data with the smallest data rank in its buffer until there is enough space, and then asks the sender to replicate the data to it. Finally, we evaluated our approach against other algorithms, using several indicators to verify that it performs better than the others: although its delivery ratio is lower than that of Epidemic, it has a low overhead ratio and low delivery delay. Moreover, we find that social relation can have more influence without real contact.
This method can save the node's moving time. For example, when a node wants to send data to a destination 3 km away, using a relay node with Internet access may reduce this distance to only a few meters. In addition, using the Internet can find more nodes than real contact can, so it may find nodes that are more appropriate for transmitting data, increase the transmission speed, and reduce the transmission cost.
Figure 26: Concept of virtual contact
Personal and Ubiquitous Computing, Vol 10(4):255–268, May 2006.
[2] Zhensheng Zhang. Routing in Intermittently Connected Mobile Ad Hoc Networks and Delay Tolerant Networks: Overview and Challenges. IEEE
Communications Surveys & Tutorials, 8(1):24–37, 2006.
[3] Christoph P. Mayer. Hybrid Routing in Delay Tolerant Networks. KIT Scientific
Publishing, July 3, 2012.
[4] Bulut. E, Szymanski, B.K., “Friendship Based Routing in Delay Tolerant Mobile Social Networks” in Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE.
[5] P. Hui, J. Crowcroft, and E. Yoneki, “Bubble rap: Social-based forwarding in delay tolerant networks,” in Proc. ACM MobiHoc, 2008, pp. 241–250.
[6] Jiuxin Cao, Liu Yang, Xiao Zheng, Bo Liu, Lei Zhao, Xudong Ni, Fang Dong and Bo Mao, “Social attribute based web service information publication mechanism in Delay Tolerant Network,” in IEEE International Conference on Computational Science and Engineering CSE/I-SPAN
[7] VAHDAT, A., AND BECKER, D. Epidemic routing for partially connected ad hoc networks. Technical Report CS-200006, Duke University (2000).
[8] LINDGREN, A., DORIA, A., AND SCHELÉN, O. Probabilistic routing in intermittently connected networks. Lecture Notes in Computer Science 3126 (2004), 239–254.
[9] SPYROPOULOS, T., PSOUNIS, K., AND RAGHAVENDRA, C. S. Spray and wait: an efficient routing scheme for intermittently connected mobile networks.
In proc. WDTN ’05 (2005), ACM Press, pp. 252–259.
[10] A. Mtibaa, M. May, M. Ammar, and C. Diot. PeopleRank: combining social and
contact information for opportunistic forwarding. INFOCOM, 2010.
[11] S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search
engine. In Seventh International World Wide Web Conference, Brisbane,
Australia, 1998.
[12] K. Jahanbakhsh, G.C. Shoja, V. King, Social-greedy: a socially-based greedy
routing algorithm for delay tolerant networks, MobiOpp’10: Proceedings of the
Second International Workshop on Mobile Opportunistic Networking, ACM, New York, NY, USA (2010), pp. 159–162
[13] Karinthy, Frigyes. Chain-Links. Translated from Hungarian and annotated by Adam Makkai and Enikö Jankó.
[14] Ari Keränen, Jörg Ott, Teemu Kärkkäinen, “The ONE Simulator for DTN Protocol Evaluation”, In Proc. SimuTools, March 2009.
[15] N. Eagle, A. Pentland, and D. Lazer (2009), Inferring Social Network Structure using Mobile Phone Data, Proceedings of the National Academy of Sciences (PNAS),106(36), pp. 15274-15278.
[16] Pan Hui, People are the network: experimental design and evaluation of social-based forwarding algorithms, Computer Laboratory Technical Reports - Cambridge University(2008)
[17] P. Jaccard. Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles, 37:547–579, 1901.
[18] J. Kleinberg. The small-world phenomenon: an algorithm perspective. In STOC '00: Proceedings of the thirty-second annual ACM symposium on Theory of computing, pages 163–170, New York, NY, USA, 2000. ACM.
[19] S. Milgram. The small world problem. Psychology Today, 1:60–67, 1967.
[20] T. Zhou, R. R. Choudhury, K. Chakrabarty, Diverse Routing: Exploiting Social Behavior for Routing in Delay-Tolerant Networks, Pro. Conf. Computational Science and Engineering, Canada, 2009.