VoIP - Case Study - A General Statistical Methodology for Quantifying Internet User Satisfaction

Chapter 4 Case Study

4.2 VoIP

Among the various VoIP services, Skype is by far the most successful. There are over 200 million Skype downloads and approximately 85 million users worldwide.

However, fundamental questions, such as whether VoIP services like Skype are good enough in terms of user satisfaction, have not been formally addressed. In this sub-section, we quantify Skype user satisfaction based on the call duration measured from actual Skype traces, and propose an objective and perceptual index called the User Satisfaction Index (USI).

To collect Skype traffic traces, we set up a packet sniffer to monitor all traffic entering and leaving a campus network. In addition, to capture more Skype traces, a powerful Linux machine was set up to elicit more relay traffic passing through it during the course of the trace collection. However, given the huge amount of monitored traffic and the low proportion of Skype traffic, we used two-phase filtering to identify Skype VoIP sessions. In the first stage, we filtered and stored possible Skype traffic on a disk. Then, in the second stage, we applied an off-line identification algorithm to the captured packet traces to extract actual Skype sessions. Since we could not deduce round-trip times (RTT) and their jitter simply from packet traces, we sent out probe packets for each active flow while capturing Skype traffic. The trace was collected over two months in late 2005. We obtained 634 VoIP sessions, of which 462 sessions were usable because they had more than five RTT samples. Among the 462 sessions, 253 were directly-established and 209 were relayed.

Figure 4.3: Correlation of bit rate with session time

4.2.1 Performance Factor Identification

Skype uses a wideband codec that adapts to the network environment by adjusting the bandwidth used. Thus, when we explore the relationship between call duration and network conditions, we must also consider the source rate, along with network delay and loss. However, we do not have exact information about the source rate of remote Skype hosts. Thus, we use the received data rate as an approximation of the source rate. For brevity, we use the bit rate to denote the received data rate. We illustrate the correlation of the bit rate and call duration in Fig. 4.3, where the median call durations and their standard errors are plotted. The effect of the bit rate is clear, as we find that users tend to have longer conversations when the bit rate is higher. In fact, the median duration of the top 40% of calls is ten times longer than that of the shortest 15%.

Figure 4.4: Correlation of jitter with session time

We also consider the jitter and round-trip time (RTT) variables, where jitter is the standard deviation of the bit rate sampled every second. It can capture the level of network delay variations and packet loss. We observe that when network impairment is more serious, users are more likely to terminate a call. For instance, as shown in Fig. 4.4, users who experienced jitter of less than 1 Kbps talked for a median of 21 minutes, whereas users who experienced jitter of more than 2 Kbps talked for a median of only 3 minutes, which gives a high ratio of 7:1.

Figure 4.5: Predicted vs. actual median duration of session groups sorted by their User Satisfaction Indexes. (x-axis: User Satisfaction Index; y-axis: median session time (min); curves: prediction and 50% confidence band.)
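The jitter metric defined in this sub-section (the standard deviation of the bit rate sampled every second) can be illustrated with a minimal sketch. The function names, the `(timestamp, size)` packet representation, the use of 1 Kbps = 1000 bit/s, and the choice of the population standard deviation are all assumptions made for illustration; the text does not specify them.

```python
import statistics


def per_second_bit_rates(packets, duration_s):
    """Return the received bit rate in Kbps for each one-second bin.

    packets: iterable of (timestamp_s, size_bytes) pairs, with timestamps
             measured from the start of the session (hypothetical format).
    duration_s: session length in whole seconds.
    """
    bytes_per_bin = [0] * duration_s
    for timestamp_s, size_bytes in packets:
        index = int(timestamp_s)
        if 0 <= index < duration_s:
            bytes_per_bin[index] += size_bytes
    # Assumption: 1 Kbps = 1000 bits per second.
    return [b * 8 / 1000 for b in bytes_per_bin]


def bit_rate_and_jitter(packets, duration_s):
    """Return (mean bit rate, jitter) in Kbps for one session.

    Jitter is taken as the standard deviation of the per-second bit rates.
    Assumption: population standard deviation; the sample version
    (statistics.stdev) would be an equally plausible reading of the text.
    """
    rates = per_second_bit_rates(packets, duration_s)
    return statistics.fmean(rates), statistics.pstdev(rates)
```

A seconds-long gap in received packets lowers one bin sharply and therefore raises this jitter value, which is consistent with the remark that the metric reflects both delay variation and packet loss.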

4.2.2 Impact of Individual Factors

To understand the impact of individual factors, we use regression analysis to model call duration as a response to QoS factors. Although we could simply put all potential QoS factors into the regression model, the result would be ambiguous if the predictors were strongly interrelated [15]. In [2], we analyze the level of correlation between QoS factors and classify them into three collinear groups. Then, we pick the bit rate, jitter, and RTT, one from each group, and incorporate them into the model, since they are the most significant predictors compared with their interrelated variables. For simplicity and parsimony of the model, we omit the interaction terms of these three factors, although correlations between them have been observed. The developed User Satisfaction Index (USI) model is then used to evaluate the satisfaction levels of Skype users.

As mentioned in Section 3.2.3, the risk score $\beta^{t}Z$ is used to represent the level of the instantaneous hang-up probability, as it can be taken as a measure of user intolerance. Accordingly, we define the User Satisfaction Index of a session as the negative of its risk score:

$$
\begin{aligned}
\mathrm{USI} &= -\beta^{t}Z \\
&= 2.15 \times \log(\text{bit rate}) - 1.55 \times \log(\text{jitter}) - 0.36 \times \mathrm{RTT}.
\end{aligned}
$$
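As a worked illustration of the USI definition, the sketch below evaluates the formula for a single session. The coefficients are those given in the text; the function name, the use of the natural logarithm, and the units (bit rate and jitter in Kbps, RTT in seconds) are assumptions, since this section does not state them.

```python
import math


def user_satisfaction_index(bit_rate, jitter, rtt):
    """Evaluate USI = 2.15*log(bit rate) - 1.55*log(jitter) - 0.36*RTT.

    Assumptions (not stated in this section): natural logarithm,
    bit_rate and jitter in Kbps, rtt in seconds.
    """
    if bit_rate <= 0 or jitter <= 0:
        raise ValueError("bit_rate and jitter must be positive to take logs")
    return 2.15 * math.log(bit_rate) - 1.55 * math.log(jitter) - 0.36 * rtt


# Two hypothetical sessions: the second has the same bit rate and RTT
# but higher jitter, so its USI is lower.
stable = user_satisfaction_index(bit_rate=32.0, jitter=1.0, rtt=0.2)
unstable = user_satisfaction_index(bit_rate=32.0, jitter=2.5, rtt=0.2)
```

Because the risk score is the negative of the USI, a lower value here corresponds to a higher instantaneous hang-up probability under the model.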

We can further verify the proposed model by comparing the predicted call duration based on the proposed USI with the actual call duration. In Fig. 4.5, we group sessions by their USI, and plot the actual median duration, the predicted duration, and the 50% confidence band of the latter for each group. The results show that the predicted duration is rather close to the actual median time; moreover, for most groups the actual median time is within the 50% confidence band of the prediction.
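The grouping step behind this comparison might look like the following sketch. It covers only the "actual median duration" side; the predicted duration and its confidence band come from the fitted survival model and are not reproduced here. The function name, the equal-size grouping, and the plain (uncensored) median are assumptions for illustration, not the procedure confirmed by the text.

```python
import statistics


def median_duration_by_usi_group(sessions, n_groups):
    """Sort sessions by USI, split them into n_groups groups of nearly
    equal size, and return (median USI, median duration) per group.

    sessions: list of (usi, duration_min) pairs (hypothetical format).
    Assumption: every duration is fully observed; censored sessions
    would need a survival estimator instead of a plain median.
    """
    ordered = sorted(sessions, key=lambda s: s[0])
    size, extra = divmod(len(ordered), n_groups)
    groups = []
    start = 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        chunk = ordered[start:end]
        start = end
        if not chunk:
            continue  # fewer sessions than groups
        usis = [u for u, _ in chunk]
        durations = [d for _, d in chunk]
        groups.append((statistics.median(usis), statistics.median(durations)))
    return groups
```

If the model is adequate, the median duration should increase from the lowest-USI group to the highest.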

Although not shown here, we also validate USI with a set of independent metrics derived from patterns of user interactivity [2]. A strong correlation between the call duration and user interactivity suggests that our model, which is based on call duration, is representative of Skype user satisfaction.

4.2.3 Findings and Discussion

By deriving the objective perceptual index, we can quantify the relative impact of the bit rate, the compound of delay jitter and packet loss, and network latency on the duration of Skype calls. In [2], we also derive that the relative importance of these three factors is approximately 46:53:1, respectively. The delay jitter and loss rate are known to be critical to the perception of real-time applications. To our surprise, the above results show that network latency has relatively little effect; however, the source rate is almost as critical as jitter, which is the compound of the delay jitter and packet loss.

We believe these discoveries indicate that adaptations for a stable, higher bandwidth channel would probably be the most effective way to increase user satisfaction with Skype. The selection of relay nodes based on network delay optimization, a technique often used to find a quality detour by peer-to-peer overlay multimedia applications, is less likely to make a significant difference to Skype in terms of user satisfaction.

Chapter 5 Application

By understanding the most significant performance factors and their impacts on user satisfaction, we can further improve user experience and optimize resource allocation.

Given the quantified risk score of users leaving an application due to unsatisfactory service, systems can be modified accordingly. For example, for network applications, systems can be designed to automatically adapt to network quality in real time in order to improve user satisfaction. On the other hand, we might enhance the smoothness of usage in high-risk sessions by increasing the packet rate or the degree of data redundancy; thus, users would have better experiences and be less likely to leave an application prematurely. Resource allocation could be deliberately biased toward high-risk sessions. For example, scarce resources, such as processing power or network bandwidth, could be allocated more effectively based on session risk scores.
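One way to bias allocation toward high-risk sessions is sketched below. This is a hypothetical policy, not one proposed in the text: each session receives a share of a fixed budget proportional to the exponential of its risk score, mirroring the way the hazard scales with the risk score in a proportional hazards model. The function name and the session identifiers are illustrative.

```python
import math


def allocate_by_risk(risk_scores, total_budget):
    """Split total_budget across sessions in proportion to exp(risk score).

    risk_scores: dict mapping a session id to its risk score
                 (the negative of the session's USI).
    Returns a dict mapping each session id to its share of the budget.
    Hypothetical policy for illustration only.
    """
    if not risk_scores:
        return {}
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting proportions.
    max_score = max(risk_scores.values())
    weights = {sid: math.exp(score - max_score)
               for sid, score in risk_scores.items()}
    total_weight = sum(weights.values())
    return {sid: total_budget * w / total_weight
            for sid, w in weights.items()}
```

A real system would also need per-session minimum and maximum limits so that low-risk sessions are not starved; those are omitted here for brevity.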

The developed model could also provide useful hints to resolve design trade-offs. For instance, as the results in Section 4.1 indicate, players in ShenZhou Online are less tolerant of large delay variations than of high latency. Thus, providing a smoothing buffer at the client side, though incurring additional delay, would improve overall user experience. Also, the concept of session time can be used to design an alarm system for abnormal system conditions. As we know, to provide continuous high-quality services, providers must monitor system performance around the clock and detect problems in real time, i.e., before customer complaints flood the customer service center. However, monitoring a large-scale system in this way would be prohibitively expensive or even impractical. Instead, operators can track user session times, which is much more cost-effective. Since users are more sensitive to certain system performance factors, a series of unusual departures over a short period might indicate abnormal system conditions and thus automatically trigger appropriate remedial action.
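The alarm idea described above can be sketched as a sliding-window counter over session departures. The class name, the fixed window length, and the fixed threshold are assumptions for illustration; in practice the threshold would have to be calibrated against the normal departure rate, which the text does not specify.

```python
from collections import deque


class DepartureAlarm:
    """Signal when too many sessions end within a short time window.

    Hypothetical sketch: window_s and threshold are illustrative
    parameters, and departures are assumed to be reported in
    non-decreasing timestamp order.
    """

    def __init__(self, window_s, threshold):
        self.window_s = window_s
        self.threshold = threshold
        self._departures = deque()

    def record_departure(self, timestamp_s):
        """Record one session end; return True if the alarm should fire."""
        self._departures.append(timestamp_s)
        # Drop departures that have fallen out of the window.
        cutoff = timestamp_s - self.window_s
        while self._departures and self._departures[0] <= cutoff:
            self._departures.popleft()
        return len(self._departures) >= self.threshold


# Usage: fire when at least 3 sessions end within any 60-second window.
alarm = DepartureAlarm(window_s=60, threshold=3)
fired = [alarm.record_departure(t) for t in (0, 10, 100, 110, 120)]
```

Counting only departures that are unusually early for their session, rather than all departures, would be closer to the "unusual departures" wording but requires a baseline model of session time.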

Chapter 6 Conclusion

Unlike system-level performance, user satisfaction is intangible and cannot be measured directly. The key to addressing this problem is our ability to measure user opinions objectively and efficiently. In this work, we have proposed a generalizable methodology, based on survival analysis, to quantify user satisfaction from session times, i.e., the length of time users stay with an application. The results of two case studies show that session time is strongly related to system performance factors, such as network QoS, and is thus a potential indicator of user satisfaction. With the derived model, service providers can further improve user experience and optimize resource allocation.

Bibliography

[1] K.-T. Chen, P. Huang, G.-S. Wang, C.-Y. Huang, and C.-L. Lei, “On the sensitivity of online game playing time to network QoS,” in Proceedings of IEEE INFOCOM’06, Barcelona, Spain, Apr. 2006, pp. 1–12.

[2] K.-T. Chen, C.-Y. Huang, P. Huang, and C.-L. Lei, “Quantifying Skype user satisfaction,” in Proceedings of ACM SIGCOMM’06, Pisa, Italy, Sept. 2006, pp. 399–410.

[3] D. R. Cox and D. Oakes, Analysis of Survival Data. Chapman & Hall/CRC, June 1984.

[4] ITU-T Recommendation P.800, “Methods for subjective determination of transmission quality,” 1996.

[5] U. Jekosch, Voice and Speech Quality Perception: Assessment and Evaluation. Springer, 2005.

[6] A. Rix, J. Beerends, M. Hollier, and A. Hekstra, “Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, 2001, pp. 73–76.

[7] ITU-T Recommendation P.862, “Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs,” Feb 2001.

[8] E. L. Kaplan and P. Meier, “Nonparametric estimation from incomplete observations,” Journal of the American Statistical Association, vol. 53, pp. 457–481, 1958.

[9] T. M. Therneau and P. M. Grambsch, Modeling Survival Data: Extending the Cox Model, 1st ed. Springer, August 2001.

[10] D. R. Cox and E. J. Snell, “A general definition of residuals (with discussion),” Journal of the Royal Statistical Society, vol. B 30, pp. 248–275, 1968.

[11] “ShenZhou Online,” http://www.ewsoft.com.tw/.

[12] D. M. S. Ila and D. Lam, “Comparing the effect of habit in the online game play of Australian and Indonesian gamers,” in Proceedings of the Australia and New Zealand Marketing Association Conference, Adelaide, Australia, Dec. 2003.

[13] T. Beigbeder, R. Coughlan, C. Lusher, J. Plunkett, E. Agu, and M. Claypool, “The effects of loss and latency on user performance in Unreal Tournament 2003,” in NetGames ’04: Proceedings of the 3rd Workshop on Network and System Support for Games. ACM Press, 2004, pp. 144–151.

[14] K.-T. Chen, P. Huang, and C.-L. Lei, “Game traffic analysis: An MMORPG perspective,” Computer Networks, vol. 50, no. 16, pp. 3002–3023, 2006.

[15] F. E. Harrell, Regression Modeling Strategies, with Applications to Linear Models, Survival Analysis and Logistic Regression. Springer, 2001.
