Growing Social Networks - Complex Networks Applied to Friend-making Simulations

Chapter 2 Complex Networks Applied to Friend-making Simulations

2.5. Growing Social Networks

The most classical network growth models are those of the Internet or the World Wide Web, such as Barabási and Albert’s (1999) growth and preferential attachment mechanisms, and Bianconi and Barabási’s (2001) addition of fitness to the vertices of the BA model. These models continuously add both vertices and edges to the network as time passes (growth), and edges are more likely to connect to vertices of high degree than to ones of low degree (preferential attachment).

Growth models of the Internet or the Web, however, are quite inappropriate as models of the growth of social networks or acquaintance networks, for three reasons: (a) the degree distribution of many acquaintance networks does not appear to follow a power-law distribution; (b) the preferential attachment mechanism is not an important one in acquaintance networks; and (c) social networks such as acquaintance networks usually show high clustering, whereas growth models of the Web or Internet show weak clustering. Jin, Girvan, and Newman (2001) therefore made a first attempt at modeling the evolution of the structure of social networks; their model produces a sharply peaked distribution that conforms to observations in many real social networks.

Chapter 3 Friendship Evolution and the Three-Rule Model

Davidsen et al. (2002) propose a two-rule model of acquaintance network evolution. The first rule addresses how people make new friends, via introductions or meetings by chance. The second rule is that friendships are broken when one partner dies. The model is formulated with a fixed number N of nodes and undirected links between pairs of nodes representing individuals who know each other. We introduce a third, friend-remembering rule that allows for the weakening and strengthening of friendships (Fig. 1).

The model repeats the three rules until the acquaintance network in question reaches a statistically stationary state.

Rule 1. Friend Making: Randomly chosen persons introduce two of their friends to each other. If this is their first meeting, a new link is formed between them. Randomly chosen persons with fewer than two friends introduce themselves to one other random person. Thus, we use the term “introduce” to describe meetings by chance as well as meetings via a common friend.

Rule 2. Leaving and Arriving: With probability p, a randomly chosen individual and all associated links are removed from the network and replaced by another person. Accordingly, acquaintances can be viewed as circles of friends whose members can leave for reasons other than death and enter the circle for reasons other than birth.

Rule 3. Friend Remembering: A certain number of friendships are updated, with the number depending on an update proportion b. This proportion and details about updating will be explained in the next two sections.

Figure 1. Three-rule model flow diagram.
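As a rough illustration of rules 1 and 2, the following minimal Python sketch applies them to an adjacency dictionary. It is not the authors' implementation: the function names, the choice of one randomly chosen person per step, and the way the newcomer reuses the departed person's slot are assumptions made here for illustration. Rule 3 is omitted because it depends on the update equation given later.

```python
import random

def friend_making(adj, rng):
    """Rule 1 (sketch): a random person introduces two friends, or meets a stranger."""
    person = rng.choice(list(adj))
    friends = list(adj[person])
    if len(friends) < 2:
        # Fewer than two friends: meet one other random person by chance.
        other = rng.choice([x for x in adj if x != person])
        adj[person].add(other)
        adj[other].add(person)
    else:
        # Introduce two distinct friends; set.add is a no-op if they already know each other.
        a, b = rng.sample(friends, 2)
        adj[a].add(b)
        adj[b].add(a)

def leave_and_arrive(adj, p, rng):
    """Rule 2 (sketch): with probability p, replace a random person with a newcomer."""
    if rng.random() < p:
        person = rng.choice(list(adj))
        for friend in adj[person]:
            adj[friend].discard(person)
        adj[person] = set()  # assumption: the newcomer takes over the slot with no links

rng = random.Random(0)
adj = {i: set() for i in range(50)}  # N = 50 persons, no friendships yet
for _ in range(2000):
    friend_making(adj, rng)
    leave_and_arrive(adj, 0.01, rng)
```

Storing neighbours in sets keeps links undirected and makes the "first meeting" check implicit, since adding an existing friend changes nothing.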

3.1. Friendship Selection Methods

We considered three selection methods for updating friendships. In the first, person selection, b × N persons are chosen and a specific friend is then picked for each; updating is canceled if the chosen person does not have any friends. In this method, b is a proportion factor for deciding how many persons are chosen and N represents the number of persons in the network. In the second method, pair selection, b × N pairs of persons are chosen and their friendships are updated. Updating is canceled if the paired persons do not know each other, a frequent occurrence since the network in question is sparse in comparison to a complete graph. In this method, b is a proportion factor for deciding how many pairs are chosen. In the last, edge selection, b × M friendships are selected directly for updating. In this method, b is a proportion factor for deciding how many friendships are chosen and M is the number of friendships (or edges) at a specific moment.

We rejected the first two methods because in both cases the number of chosen friendships is in proportion to N (the number of nodes or persons). Since N × (N – 1) / 2 (the upper boundary of the number of friendships) is directly proportional to M (the number of edges or friendships), we adopted the edge selection method for choosing friendships.
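A minimal sketch of edge selection follows, assuming the edge list is held as a Python list of pairs and that b × M is rounded to the nearest integer. The text does not say how fractional counts are handled, so the rounding and the function name `select_friendships` are illustrative assumptions.

```python
import random

def select_friendships(edges, b, rng):
    """Pick round(b * M) distinct friendships, where M = len(edges)."""
    count = round(b * len(edges))  # assumption: round to the nearest integer
    return rng.sample(edges, count)

# Hypothetical sparse edge list on 100 persons, used only to exercise the function.
edges = [(u, v) for u in range(100) for v in range(u + 1, 100) if (u + v) % 10 == 0]
chosen = select_friendships(edges, 0.1, random.Random(1))
```

Because the count scales with the current number of edges M rather than with N, the number of updated friendships tracks the network's density as it evolves.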

3.2. Friendship Update Equation

During friend remembering, the model uses the selection tactic described in the preceding section for choosing a specific number of friendships. If a selected friendship links person u with person v, their friendship is updated using Equation (9), dependent upon individual remembering, resource, and breakup threshold factors:

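\[
f_{u,v}^{\mathrm{new}} =
\begin{cases}
q\, f_{u,v}^{\mathrm{old}} + (1-q)\, J\!\left(D\!\left(\frac{r_u}{k_u}\right), D\!\left(\frac{r_v}{k_v}\right)\right), & \text{if this value is at least } \theta,\\
0, & \text{otherwise,}
\end{cases}
\qquad (9)
\]

where f is the friendship between u and v, q the friend remembering factor, r_u and k_u the resource and number of friends of person u (likewise r_v and k_v for person v), θ the breakup threshold, D the distribution function, and J the joint function. For convenience, the friend remembering q, resource r, and breakup threshold θ parameters are normalized between 0 and 1.

Simplification without loss of generality is behind our decision to use D(x) = x as the distribution function and J(a, b) = (a + b) / 2 as the joint function. The updated equation is written as

\[
f_{u,v}^{\mathrm{new}} =
\begin{cases}
q\, f_{u,v}^{\mathrm{old}} + (1-q)\, \dfrac{1}{2}\left(\dfrac{r_u}{k_u} + \dfrac{r_v}{k_v}\right), & \text{if this value is at least } \theta,\\
0, & \text{otherwise.}
\end{cases}
\]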

The equation is divided into two parts by the breakup threshold, θ. The first part consists of the terms q (representing the effect of old friendships) and (1 – q) (representing the effect of limited resources). The newly updated friendship may be weakening or strengthening. It may also theoretically equal zero if the new friendship is below the breakup threshold, as shown in the second part of the equation.
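The update can be sketched in a few lines of Python. The text states that the old friendship is weighted by q and the resource term by (1 – q), that D(x) = x and J(a, b) = (a + b) / 2, and that the result becomes zero below the breakup threshold. The sketch additionally assumes that each person's resource term is the resource r divided by that person's current number of friends k, and that a value exactly equal to the threshold survives; both are assumptions of this example, not statements from the text.

```python
def update_friendship(f_old, q, r_u, k_u, r_v, k_v, theta):
    """Return the updated friendship strength, or 0.0 if the friendship breaks up."""
    # Assumption: each person's resource is shared evenly among k current friends.
    share_u = r_u / k_u
    share_v = r_v / k_v
    joint = (share_u + share_v) / 2  # J(a, b) = (a + b) / 2 with D(x) = x
    f_new = q * f_old + (1 - q) * joint
    # Assumption: the friendship survives when f_new equals the threshold exactly.
    return f_new if f_new >= theta else 0.0

kept = update_friendship(0.5, 0.9, 0.5, 5, 0.5, 10, 0.1)     # stays above the threshold
broken = update_friendship(0.1, 0.6, 0.5, 20, 0.5, 20, 0.1)  # falls below it
```

Since u and v are friends when the update is applied, both k_u and k_v are at least 1, so the divisions are well defined.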

3.3. Expected Effects of Local Rules

Acting locally, the three rules influence several aspects of an acquaintance network: (a) the friend-making rule adds links, thereby increasing the average number of friends; (b) the leaving and arriving and friend-remembering rules both remove links, thereby reducing the average number of friends; (c) increases in average number of friends <k> lead to decreases in the average shortest path length L; and (d) the direction of the clustering coefficient C and average shortest path length L will reverse.

As opposed to the large number of factors associated with the friend-remembering rule, the leaving and arriving rule has a single parameter (probability p). The factor q denotes a person’s ability to remember friends, thus increasing that person’s number of friends. The resource factor r determines an individual’s resources for making friends, thereby setting an upper limit. The breakup threshold θ determines the difficulty of cutting off a friendship, which is a negative influence. The initial friendship factor f0 is a reflection of how much attention a person is paying when making a new acquaintance, which is a positive contribution to friend-making. We expect that parameters q, r, and f0 will exert positive (increasing) influences on <k> and that parameters p and θ will exert negative influences on <k>.

3.4. Fitting a Normal Distribution

For sensitivity analyses of skewness and critical parameters affecting distribution, a feasible probability density function (pdf) must be applied. In most situations a normal distribution is considered the best choice, but it does not fit our purposes in this study. Since critical parameters such as initial friendship, old friend remembering, resources, and breakup thresholds have ranges of 0 to 1, we chose a beta distribution, a two-parameter family of continuous probability distributions defined on the interval [0, 1] with a probability density function of

\[
g(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}, \qquad 0 \le x \le 1,
\]

where α > 0 and β > 0 are the two shape parameters and B(α, β) is the beta function.

Figure 3. Comparison of beta and normal distributions.
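To illustrate how a beta distribution can supply parameter values in [0, 1] with a chosen mean, the sketch below uses the standard fact that a beta distribution with shape parameters α and β has mean α / (α + β). The helper name `beta_with_mean` and the `concentration` argument (the sum α + β) are assumptions of this example; the value 14 is chosen only for illustration and is not claimed to be what the label "beta14" denotes.

```python
import random

def beta_with_mean(mu, concentration, rng):
    """Draw one value in [0, 1] from a beta distribution with mean mu."""
    alpha = mu * concentration        # alpha / (alpha + beta) == mu
    beta = (1 - mu) * concentration
    return rng.betavariate(alpha, beta)

rng = random.Random(42)
samples = [beta_with_mean(0.9, 14, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)  # should lie close to 0.9
```

Unlike a normal distribution, every draw is guaranteed to fall inside [0, 1], so no clipping of parameter values is needed.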

Once a simulation reaches a statistically stationary level, then clustering coefficient, average path length, average degree, average of degree squared, and degree distribution statistics can be collected. Degree distributions in our simulations involved some random rippling, especially for smaller populations. However, since large populations consume dramatically greater amounts of simulation time, we applied Bruce’s (2001) ensemble average as follows:

\[
\bar{p}(k) = \frac{1}{M} \sum_{i=1}^{M} p_i(k),
\]

where M is the number of curves to be averaged and each p_i(k) is a curve that represents, for example, a degree distribution, a time series, and so forth.
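The ensemble average is a point-by-point mean over M curves. A minimal sketch, assuming all curves are sampled at the same k values and stored as equally long lists (the function name and the toy data are illustrative):

```python
def ensemble_average(curves):
    """Average M equally long curves point by point."""
    m = len(curves)
    return [sum(values) / m for values in zip(*curves)]

# Three hypothetical degree distributions over the same three degree values.
runs = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.6, 0.2, 0.2]]
averaged = ensemble_average(runs)
```

Averaging several runs of a small population smooths the random rippling without requiring a single, much larger simulation.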

Chapter 4 Experiment

A simulation of our model begins with parameter initialization and ends once the acquaintance network reaches a statistically stationary state. Initialized parameters included the number of persons N, leaving and arriving probability p, updated friendship proportion b, old friend remembering q, breakup threshold θ, distribution of friend-making resources r, and distribution of initial friendship f₀. Statistically stationary states were determined by observing average degree <k>, average square of degree <k²>, clustering coefficient C, and average path length L. Each of these four statistics eventually converged to values with slight ripples.
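The statistics used to judge stationarity can be computed directly from an adjacency structure. The sketch below is one straightforward way to do so in plain Python. Two conventions are assumptions of this example rather than statements from the text: nodes with fewer than two friends contribute zero to the clustering coefficient, and the average path length is taken over connected pairs only.

```python
from collections import deque

def average_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def clustering_coefficient(adj):
    """Mean over nodes of the fraction of neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # assumption: such nodes contribute 0
        nb = list(nbrs)
        links = sum(1 for i in range(k) for j in range(i + 1, k) if nb[j] in adj[nb[i]])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean breadth-first-search distance over all connected ordered pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Toy network: a triangle 0-1-2 with one extra person 3 attached to 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
k_mean = average_degree(toy)
c_mean = clustering_coefficient(toy)
l_mean = average_path_length(toy)
```

Recording these values at regular intervals during a run makes it possible to see when they settle into the slight ripples described above.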

A statistically stationary state for parameters initialized at N = 1,000, p = 0, b = 0.001, q = 0.9, θ = 0.1, r with a fixed value of 0.5, and a beta14 f0 (μ = 0.9) is shown in Figure 4. Solid lines indicate the acquaintance network, and the dashed lines in Figures 4c and 4d (which are calculated using Equation 6 and Equation 7) indicate the ER random model at the same average degree as the acquaintance model.


Figure 4. Example of a statistically stationary state using the proposed model.

4.1. Effects of Leaving and Arriving

For comparison, we reproduced Davidsen et al.’s (2002) simulations using their original parameters of N = 7,000 and p at 0.04, 0.01, and 0.0025. We then changed N to 1,000 and tested a broader p range. As noted in an earlier section, the leaving and arriving probability p is the only parameter in rule 2. In addition to using various degree distribution diagrams, we gathered <k>, C, and L values for varying p and analyzed their correlations to determine the effects of p on the acquaintance network.

The degree distribution P(k) from the two-rule model is shown in Figure 5. All <k>, C, and L values with parameter initializations for various probability p values are shown in Figure 6. Correlations among <k>, C, and L are shown in Figure 7. The solid lines in Figure 6 reflect the application of Davidsen et al.’s two-rule model; the dashed lines (which are calculated using Equation 6 and Equation 7) reflect the application of the ER model at the same average degree. Contrasts between the two lines in Figures 6b and 6c indicate that the acquaintance network has the small world characteristic. Figure 6a shows that the number of friends increases as the lifespan of an individual lengthens.

According to Figure 7d, the clustering coefficient closely follows average degree but not average path length.

A larger p indicates a higher death rate and a lower p a longer life span. Thus, parameter p acts as an aging factor. Relative to other species, humans require more time to make friends; Davidsen et al. therefore focused only on the p << 0.1 regime. For theoretical completeness, we also explored larger values of p and found that mean degree <k> decreased as p increased from 0 to 0.5, with the decrease slowing once p > 0.1 (Fig. 6a).

Figure 5. Two-rule model degree distribution P(k), plotted against degree k on logarithmic axes for p = 0.5, 0.1, 0.04, 0.01, and 0.0025.

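Table 1. <k>, <k²>, C, and L for different values of leaving and arriving probability p.

| p | 0.5 | 0.1 | 0.05 | 0.04 | 0.01 | 0.0025 |
|---|---|---|---|---|---|---|
| <k> | 1.55 | 5.246 | 10.02 | 12.64 | 39.71 | 115.34 |
| <k²> | 4.88 | 60.13 | 314.18 | 467.23 | 4708.25 | 29652.69 |
| C | 0.213 | 0.413 | 0.453 | 0.465 | 0.577 | 0.697 |
| L | 17.076 | 4.226 | 3.262 | 3.053 | 2.416 | 2.111 |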

Figure 6. <k>, C, and L as functions of leaving and arriving probability p.

Figure 7. Correlations among <k>, C, and L as leaving and arriving probability p varies.
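The direction of each pairwise relationship can be checked from the Table 1 values. The text does not say which correlation measure was used, so the sketch below assumes the Pearson coefficient and should be read only as an illustration of the signs.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Table 1 values for p = 0.5, 0.1, 0.05, 0.04, 0.01, 0.0025.
k = [1.55, 5.246, 10.02, 12.64, 39.71, 115.34]
c = [0.213, 0.413, 0.453, 0.465, 0.577, 0.697]
l = [17.076, 4.226, 3.262, 3.053, 2.416, 2.111]

r_ck = pearson(c, k)  # C versus <k>
r_lk = pearson(l, k)  # L versus <k>
r_cl = pearson(c, l)  # C versus L
```

With these data C rises and L falls as <k> grows, so the C–<k> coefficient is positive and the other two are negative.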

Note that in our model, a leaving and arriving probability of 0 means that rule 2 is inactive, and a friendship update proportion of 0 means that rule 3 is inactive. Once rule 3 becomes inactive, our three-rule model becomes the equivalent of Davidsen et al.’s two-rule model. In all of the experiments described in the following sections, N was initialized at 1,000 and b at 0.001.

4.2. Effects of Breakup Threshold

To determine the effects of the breakup threshold on the acquaintance network, experiments were performed with parameters initialized at different levels of the friendship-breakup threshold θ. Other initialized parameters were q = 0.6 and the constants r = 0.5 and f0 = 0.5. The solid lines in Figure 8 represent <k>, C, and L statistics without rule 2 (p = 0) and the dashed lines represent the same statistics with rule 2 included (p = 0.0025). The data indicate that rule 2, which acts as an aging factor on acquaintances in the network, reduced both average degree <k> and clustering coefficient C and increased average path length L.

According to the data presented in Figure 8, a higher breakup threshold θ lowers the average degree <k> and raises both the clustering coefficient C and average path length L. The Figure 9 data show that the C–<k> and L–<k> correlations are negative and the C–L correlation is positive. The threshold reflects the ease with which a friendship is broken. As expected, a higher θ results in a smaller average number of friends and greater separation between individuals.

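Table 2. <k>, <k²>, C, and L for different values of breakup threshold θ and leaving and arriving probability p.

| p | Statistic | θ = 0.2 | θ = 0.1 | θ = 0.05 | θ = 0.025 | θ = 0.0125 |
|---|---|---|---|---|---|---|
| 0 | <k> | 5.99 | 9.94 | 14.39 | 22.05 | 38.21 |
| 0 | <k²> | 58.84 | 153.62 | 256.06 | 533.08 | 1528.89 |
| 0 | C | 0.3778 | 0.3146 | 0.1899 | 0.1284 | 0.1085 |
| 0 | L | 4.0986 | 3.2369 | 2.8368 | 2.5431 | 2.1414 |
| 0.0025 | <k> | 6.00 | 9.61 | 14.30 | 21.89 | 34.84 |
| 0.0025 | <k²> | 60.48 | 146.94 | 257.08 | 531.31 | 1365.39 |
| 0.0025 | C | 0.3876 | 0.3112 | 0.1532 | 0.1108 | 0.1484 |
| 0.0025 | L | 4.0543 | 3.3064 | 2.8975 | 2.6230 | 2.3075 |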

Figure 8. <k>, C, and L as functions of breakup threshold θ for different values of leaving and arriving probability p.

Figure 9. Correlations among <k>, C, and L as breakup threshold θ varies.

4.3. Effects of Resources

To determine the effects of resources and memory factors on acquaintance networks, we ran a series of experiments using parameters initialized with different friend-making resource r and friend-remembering q values. Initialized parameters also included p = 0, θ = 0.1, and a fixed f0 value of 1. According to our results, a larger r raised the average degree <k> but lowered the clustering coefficient C and average path length L (Fig. 10). It is not obvious, however, that the statistical characteristics are influenced by different resource distributions. The Figure 10 data also show that an increase in q raised <k> and lowered both C and L. Furthermore, the C–<k> and L–<k> correlations were negative while the C–L correlation was positive (Fig. 11).

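Table 3. <k>, <k²>, C, and L for different friend-remembering q values and different distributions of friend-making resource r.

| r | Statistic | q = 0.0 | q = 0.2 | q = 0.4 | q = 0.6 | q = 0.8 | q = 0.9 |
|---|---|---|---|---|---|---|---|
| beta14(μ=0.1) | <k> | 4.00 | 4.05 | 6.00 | 8.17 | 18.00 | 39.95 |
| beta14(μ=0.1) | <k²> | 24.62 | 25.60 | 68.48 | 135.10 | 616.24 | 2338.91 |
| beta14(μ=0.1) | C | 0.3588 | 0.3816 | 0.3582 | 0.3590 | 0.2969 | 0.1780 |
| beta14(μ=0.1) | L | 0.5254 | 5.4701 | 3.9116 | 3.4143 | 2.7518 | 2.2644 |
| beta14(μ=0.5) | <k> | 4.96 | 6.03 | 7.99 | 11.43 | 21.20 | 41.25 |
| beta14(μ=0.5) | <k²> | 30.59 | 49.21 | 105.52 | 237.55 | 753.64 | 2438.32 |
| beta14(μ=0.5) | C | 0.4554 | 0.3840 | 0.3642 | 0.3245 | 0.2382 | 0.1802 |
| beta14(μ=0.5) | L | 6.0173 | 4.4481 | 3.4747 | 3.0483 | 2.5915 | 2.2003 |
| beta14(μ=0.9) | <k> | 8.18 | 9.73 | 11.60 | 14.00 | 23.97 | 43.99 |
| beta14(μ=0.9) | <k²> | 71.30 | 109.03 | 161.57 | 252.06 | 821.56 | 2625.60 |
| beta14(μ=0.9) | C | 0.1905 | 0.2246 | 0.1737 | 0.1778 | 0.1705 | 0.1461 |
| beta14(μ=0.9) | L | 3.8626 | 3.3340 | 3.1260 | 2.8912 | 2.4840 | 2.1564 |
| fixed_value(μ=0.5) | <k> | 4.68 | 6.02 | 8.01 | 11.37 | 19.97 | 40.30 |
| fixed_value(μ=0.5) | <k²> | 26.03 | 46.23 | 95.12 | 226.93 | 659.40 | 2356.56 |
| fixed_value(μ=0.5) | C | 0.486 | 0.386 | 0.348 | 0.319 | 0.275 | 0.189 |
| fixed_value(μ=0.5) | L | 6.954 | 4.611 | 3.575 | 3.067 | 2.624 | 2.201 |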

Figure 10. <k>, C, and L as functions of friend-remembering q for different distributions of friend-making resource r.

Figure 11. Correlations among <k>, C, and L for the different distributions of friend-making resource r.

4.4. Effects of Initial Friendship

Experiments were run using parameters initialized at different initial-friendship f0 and friend-remembering q values for the purpose of determining the effects of those factors on acquaintance networks. Other initialized parameters were p = 0, θ = 0.1, and a fixed r value of 0.5. Our results show that a larger f0 raised the average degree <k> but lowered both the clustering coefficient C and average path length L (Fig. 12). It was not obvious that different distributions of initial friendship influenced the statistical characteristics, but different averages of initial friendship clearly did. In other words, <k>, C, and L were affected by different initial friendship averages but not by different initial friendship distributions. The Figure 12 data also show that the friend remembering factor q raised <k> and lowered both C and L. Both the C–<k> and L–<k> correlations were negative and the C–L correlation was positive (Fig. 13).

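Table 4. <k>, <k²>, C, and L for different friend-remembering q values and different distributions of initial friendship f0.

| f0 | Statistic | q = 0.0 | q = 0.2 | q = 0.4 | q = 0.6 | q = 0.8 | q = 0.9 |
|---|---|---|---|---|---|---|---|
| beta14(μ=0.1) | <k> | 4.01 | 4.14 | 4.58 | 5.05 | 6.00 | 8.00 |
| beta14(μ=0.1) | <k²> | 19.00 | 19.59 | 24.33 | 30.8 | 44.01 | 84.36 |
| beta14(μ=0.1) | C | 0.5007 | 0.4788 | 0.4539 | 0.4512 | 0.3547 | 0.2886 |
| beta14(μ=0.1) | L | 9.2204 | 9.7683 | 7.7325 | 6.6268 | 4.7783 | 3.7074 |
| beta14(μ=0.5) | <k> | 4.74 | 6.00 | 7.03 | 9.52 | 15.96 | 29.98 |
| beta14(μ=0.5) | <k²> | 25.41 | 47.06 | 69.95 | 141.19 | 442.85 | 1342.34 |
| beta14(μ=0.5) | C | 0.4892 | 0.4023 | 0.3652 | 0.3232 | 0.2939 | 0.2065 |
| beta14(μ=0.5) | L | 8.1108 | 4.5738 | 3.8898 | 3.2932 | 2.7749 | 2.3597 |
| beta14(μ=0.9) | <k> | 4.66 | 5.99 | 7.99 | 10.59 | 20.04 | 40.04 |
| beta14(μ=0.9) | <k²> | 25.24 | 45.92 | 92.82 | 179.43 | 625.14 | 2330.23 |
| beta14(μ=0.9) | C | 0.5064 | 0.4068 | 0.3370 | 0.3031 | 0.2313 | 0.1691 |
| beta14(μ=0.9) | L | 8.2640 | 4.7841 | 3.6030 | 3.1511 | 2.6172 | 2.2277 |
| fixed_value(μ=0.5) | <k> | 4.66 | 6.00 | 7.37 | 9.94 | 16.01 | 30.46 |
| fixed_value(μ=0.5) | <k²> | 24.85 | 46.29 | 76.41 | 153.62 | 395.63 | 1352.15 |
| fixed_value(μ=0.5) | C | 0.4950 | 0.4299 | 0.3670 | 0.3146 | 0.2312 | 0.1862 |
| fixed_value(μ=0.5) | L | 7.9614 | 4.8324 | 3.8830 | 3.2369 | 2.7998 | 2.3541 |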

Figure 12. <k>, C, and L as functions of friend-remembering q for different distributions of initial friendship f₀.

Figure 13. Correlations among <k>, C, and L for the different distributions of initial friendship f₀.

We analyzed the effects of different parameters on our proposed model by relationally cross-classifying all experiments; our results are shown in Tables 5 and 6.

The plus/minus signs in Table 5 denote positive/negative relations between parameters and statistics. In Table 6 the plus or minus signs denote the strength and direction of correlations. As Table 5 indicates, in addition to the effects of rule 1, q, r, and f0 had positive correlations with average degree <k>, and p and θ had negative correlations with <k>. Furthermore, each average degree had a negative relationship with its corresponding average path length. Each of the rule 3 parameters moved the clustering coefficient C and average path length L in the same direction (q, r, and f0 lowered both, while θ raised both), whereas rules 1 and 2 moved C and L in opposite directions. Note that friendships are initialized in rule 1 and updated in rule 3.

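Table 5. Effective directions of the parameters on <k>, C, and L.

| Statistic | Rule 1 | Rule 2: p | Rule 3: q | Rule 3: θ | Rule 3: r | Rule 3: f0 |
|---|---|---|---|---|---|---|
| <k> | +* | - | + | - | + | + |
| C | +* | - | - | + | - | - |
| L | -* | + | - | + | - | - |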

Table 6. Summary of the correlations among <k>, C, and L from the above experiments.

| Experiment | Varied parameters | C–<k> | L–<k> | C–L |
|---|---|---|---|---|
| 4.1 | p | +++ | -- | --- |
| 4.2 | θ, p | -- | --- | +++ |
| 4.3 | q, r | --- | -- | +++ |
| 4.4 | q, f0 | --- | -- | +++ |

4.5. Sampling

Surveys, questionnaires, and sampling techniques stand at the center of traditional social science research and are considered cheaper and more practical than collecting large amounts of census data. However, the effectiveness of these methods for analyzing social networks has not been examined. We therefore ran a simulation of our model and, once it reached a statistically stationary state, collected samples of nodes. Initialized parameters were N = 1,000, p = 0, b = 0.001, q = 0.4, and θ = 0.1; constants were r = 0.5 and f0 = 0.5.

Figure 14 presents the degree distribution P(k) after sampling at 100, 300, 500, and 700 nodes. Figures 14a and 14b are log-log plots, with log scaling on both the x and y axes; these were used to determine if distributions were scale-free. Figures 14c and 14d are semi-log plots with log scaling on the x axis only; these were used to determine if distributions were exponential. Degrees in Figures 14b and 14d are normalized, as required for comparing different numbers of sampled nodes. Each curve in Figure 14 represents an ensemble average of 100 sampling repetitions. The solid lines in Figure 14 reflect a lower sampling ratio of 0.1, considered common for traditional surveys and sampling techniques. The dotted lines reflect a higher sampling ratio (0.7), considered common for a census. Turns in the direction of the y-axis were observed for high sampling ratios but not for low ones. The degree distribution clearly lost its original shape after sampling.
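The sampling step can be sketched as follows. The text does not say how degrees were counted after sampling, so this example assumes that only links between two sampled persons are observed, as when a survey reaches only its respondents. The function name and the ring-shaped toy network are likewise illustrative.

```python
import random
from collections import Counter

def sampled_degree_distribution(adj, sample_size, rng):
    """P(k) among sampled nodes, counting only links inside the sample (assumption)."""
    sample = set(rng.sample(sorted(adj), sample_size))
    degrees = [len(adj[node] & sample) for node in sample]
    counts = Counter(degrees)
    return {k: n / sample_size for k, n in sorted(counts.items())}

# Toy network: 20 persons on a ring, each linked to the two nearest on either side.
n = 20
adj = {i: {(i + d) % n for d in (-2, -1, 1, 2)} for i in range(n)}

full = sampled_degree_distribution(adj, n, random.Random(0))        # everyone sampled
partial = sampled_degree_distribution(adj, 10, random.Random(0))    # sampling ratio 0.5
```

In the full sample every person has degree 4, whereas a partial sample spreads the observed degrees over smaller values, which is one simple way a sampled distribution can lose the shape of the original.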

Chapter 5 Conclusion

Most small world models of social networks are analyzed by mixing regular graphs with random networks. This practice is based on Watts and Strogatz’s (1998) model, which mixes one regular and one random graph to facilitate theoretical analysis. While the WS model has had a strong impact, there is growing awareness that social network research should not be restricted to issues associated with separation and clustering (Newman & Park, 2003).

Exploring how people make new friends is a meaningful task. In most cases, people make new social connections via introductions by friends in common, but there are many cases in which strangers become friends through chance meetings with no introductions.

With few exceptions, most of us can give only limited attention or spend limited resources on friend-making; therefore, friends who were once considered close can become distant over time. To gain a better understanding of acquaintance networks, we propose a three-rule model of network evolution. In rule 1, acquaintances are made via introductions and chance meetings; an aging factor is added in rule 2; in rule 3, friendships are altered according to such factors as limited resources, friend remembering, breakup thresholds, and initial friendships.

In our model, small world statistics (especially mean degree for each node) were solely dependent on the average for each parameter. For example, we used a fixed friend-making resource value to compare resources with beta14 distributions of different averages, and found that the <k> statistic says more about average than resource distribution. A similar phenomenon was also found for the initial friendship factor f₀.

Experimental simulations are a necessary aspect of social network research, not only because of the expenses and other difficulties involved with fieldwork, but also because widely used sampling approaches cannot capture real social network distributions, since distributions for higher sampling rates differ from those for lower sampling rates.

Taking a bottom-up, human-interaction-based simulation approach to modeling is a reflection of the evolution mechanism of real social networks. Building on insights from previous studies, we applied local and interactive rules to acquaintance network evolution. This approach produced new findings that can be used to explore human activity in specific social networks—for example, rumor propagation and disease outbreaks.

Appendix A

Terms and Abbreviations

Table 7. Terms and abbreviations for parameters.

| Abbreviation | Description |
|---|---|
| N | Number of persons (nodes) in acquaintance network. |
| M | Number of friendships (edges) in acquaintance network. |
| f0, f(t=0) | Initial friendship distribution. |
| p | Leaving and arriving probability in rule 2. |
| b | Proportion of updated friendships in rule 3. |
| f | Friendships (strength of a friendship). |
| q | Old friend remembering. |
| r | Friend-making resource distribution. |
| θ, th | Breakup threshold. |
| μ, mu | Mean. |

Table 8. Terms and abbreviations for statistics.

| Abbreviation | Description |
|---|---|
| <k> | Average degree. |
| <k²> | Average square of degree. |
| C | Average clustering coefficient. |
| L | Average shortest path length. |

Table 9. Terms and abbreviations for initial distributions.

| Abbreviation | Description |
|---|---|
| fixed_value(μ) | A distribution that sets its random variable as a fixed value μ. |
