RELATED WORK - Smoothness Evaluation of Graphical User Interfaces on Handheld Devices

CHAPTER 2 BACKGROUND

2.3 RELATED WORK

Indexes of smoothness

Several indexes have been proposed to evaluate the performance of a network.

For network quality, Rohani Bakar et al. [3] adopted jitter and latency to evaluate QoS. Their experimental results were validated against the standard quality management scale defined by ITU-T P.862. Chang et al. [4] quantified the network quality requirements, such as network delay, packet loss rate, and delay jitter, of different kinds of games. Based on network delay, delay jitter, client packet loss rate, and server packet loss rate, Chen et al. [5] developed a model to predict when players will leave a game. Chen et al. [6] also established the relationship between call duration and network quality, such as network delay, packet loss rate, and delay jitter, to quantify user satisfaction with VoIP applications. None of these network-based indexes can fully evaluate the smoothness of smartphones, because they are closely tied to the quality of the network. It is hard to quantify the relationship between user interactions, such as clicking and long pressing, and the network-based indexes.

To evaluate system-wide performance, several benchmarks have been developed to measure the performance of each hardware component of a smartphone. One example is AnTuTu-Benchmark, which includes “Memory Performance”, “CPU Integer Performance”, “CPU Floating point Performance”, “2D 3D Graphics Performance”, “SD card reading/writing speed”, and “Database IO Performance”. Hyeon-Ju et al. [7] mentioned that hardware performance may not fully represent software performance: implementing the same software function on a platform with two different strategies will result in different performance. Hence, they adopted an Android utility named Dalvik Debug Monitor Server (DDMS) to measure execution time.

Although their method can evaluate software performance, it requires the source code of the application under test. Our method, in contrast, does not need source code and can perform black-box testing.

Tian et al. [1, 2] demonstrated that the average frame rate cannot fully reflect the smoothness of a video, because the burst drop frame rate, which is the rate at which frames are suddenly dropped, can significantly affect user satisfaction. As a result, they extracted motion vectors (MVs) from a video to evaluate the smoothness. However, MVs are not suitable for static frames captured with an external camera. MVs can be captured more precisely by internal recorders than by an external camera. For example, some dark frames on smartphones are static, so their MVs are zero. With an external camera, however, the MVs of these frames may be measured incorrectly because of the light intensity of the testing environment. Therefore, the MV index is not suitable for the external camera.
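The observation that the average frame rate can hide burst drops can be illustrated with a minimal Python sketch. The two frame sequences, the 60 Hz nominal rate, and the helper names `average_frame_rate` and `longest_drop_burst` are assumptions made for illustration; they are not taken from Tian et al. [1, 2].

```python
from itertools import groupby


def average_frame_rate(shown, duration_s):
    """Frames actually displayed divided by the elapsed time in seconds."""
    return sum(shown) / duration_s


def longest_drop_burst(shown):
    """Length of the longest run of consecutive dropped frames."""
    runs = [
        len(list(group))
        for dropped, group in groupby(shown, key=lambda s: not s)
        if dropped
    ]
    return max(runs, default=0)


# One second of a nominal 60 Hz stream (hypothetical data).
# True means the frame was displayed, False means it was dropped.
even = [i % 5 != 0 for i in range(60)]            # every fifth frame dropped
burst = [not (24 <= i < 36) for i in range(60)]   # 12 consecutive frames dropped

for name, shown in (("even", even), ("burst", burst)):
    print(name, average_frame_rate(shown, 1.0), longest_drop_burst(shown))
```

Both sequences drop 12 of 60 frames and therefore share the same average frame rate, but the second one contains a single long gap. An index based only on the average cannot tell them apart.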

Xiao Feng [10] discovered that four indexes, namely maximal frame time, frame time variance, frame rate, and frame drop rate, may influence the smoothness of user interactions. He first tested the same fling touch event on two different smartphones. He then found that the smartphone with the lower hardware specification delivered a better user experience than the one with the higher hardware specification. The reason was that the frame time variance and the maximal frame time of the low-end smartphone were quite low. Users perceive sluggishness when frames are not displayed smoothly.

However, he used only the fling operation for benchmarking, which cannot represent every aspect of smartphone smoothness. In contrast, in this work we extended the four indexes Xiao Feng found and translated the frame time to frame intervals for consistency. However, the frame drop rate of an operation sequence is unknown. The number of frame intervals decreases as the frame drop rate increases.

Therefore, the four indexes we used are the mean of frame intervals (MFI), variance of frame intervals (VFI), maximal frame interval (MaxFI), and number of frame intervals (NFI). In addition, the touch screen of a smartphone may be insensitive, and users will end a task if the delay is longer than 10 seconds. For these reasons, we also used two other indexes, frame no response (FNR) and times of maximal frame interval (TMaxFI), to evaluate the smoothness of operations. Table 1 compares the related work on indexes of smoothness.
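As a rough sketch of how the four interval-based indexes could be computed from the display times of the frames in one operation sequence, consider the Python below. The function name `frame_interval_indexes`, the millisecond timestamps, and the use of the population variance for VFI are assumptions for illustration. FNR and TMaxFI are left out because this section does not define them precisely.

```python
from statistics import mean, pvariance


def frame_interval_indexes(timestamps_ms):
    """Compute MFI, VFI, MaxFI and NFI from frame display times (hypothetical helper).

    timestamps_ms: increasing display times of consecutive frames, in milliseconds.
    """
    if len(timestamps_ms) < 2:
        raise ValueError("at least two frames are needed to form an interval")
    # A frame interval is the time between two consecutive displayed frames.
    intervals = [
        later - earlier
        for earlier, later in zip(timestamps_ms, timestamps_ms[1:])
    ]
    return {
        "MFI": mean(intervals),        # mean of frame intervals
        "VFI": pvariance(intervals),   # variance (population form assumed)
        "MaxFI": max(intervals),       # maximal frame interval
        "NFI": len(intervals),         # number of frame intervals
    }


# Hypothetical sequence: three smooth 16 ms intervals followed by a 100 ms stall.
indexes = frame_interval_indexes([0, 16, 32, 48, 148])
print(indexes)
```

In this made-up sequence the single stall dominates both VFI and MaxFI, while NFI counts only the frames that were actually displayed, so a higher frame drop rate over the same duration yields a smaller NFI.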

Table 1 The comparison of related work on indexes of smoothness

Paper Works [Reference #]   Indexes                            Insufficient reasons
Video Smoothness [1]        Frame rates                        Same frame rates with different users’ experience
This work                   Mean of frame interval (MFI)
                            Variance of frame interval (VFI)
                            [...]

[...] and objective methods. A subjective method requires users’ opinions to assess the QoE, while an objective method adopts QoS parameters to assess the QoE. Most objective methods were evaluated by users’ or applications’ behaviors. For example, Chen et al. [5, 6] collected packet traces to analyze the relationship between user behaviors and user experience, such as the time at which users leave a game or end a phone call. However, low satisfaction is not the only reason that users leave a game or end a phone call. As a result, their argument may not apply to every scenario.

Rohani Bakar et al. [3] evaluated the Skype application with an existing standard, Standard Quality Management (SQM), defined by ITU-T P.862. Although SQM works well for a perfect network, it may not be applicable to a network environment with packet losses and propagation delay. More QoS parameters are required to evaluate Skype-like applications. Chang et al. [4, 16] used a subjective method that adopted paired comparison to assess satisfaction with games or multimedia. They first asked users to compare two similar samples, such as two videos or two pictures, and select the one with better quality. Based on the users’ selections, they then adopted the Bradley-Terry-Luce model to determine the probability of users’ choices. The higher the probability of a sample, the higher the user satisfaction with it. However, the comparison is not fair because the users’ selection may be influenced by similar samples. For example, when similar samples are shown consecutively, users consider the second sample non-smooth in comparison with the first. When similar samples are not shown consecutively, however, users consider the same second sample smooth on its own. In this work, we asked a yes-or-no question for each sample to avoid the possible influence of similar samples and to fairly evaluate the smoothness of different smartphones. Table 2 compares the related work on QoE models.
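To make the paired-comparison step more concrete, the following Python sketch fits Bradley-Terry-Luce scores to a matrix of pairwise choices. It uses a standard minorization-maximization iteration, not necessarily the procedure of Chang et al. [4, 16]. The names `fit_btl` and `preference_probability`, the iteration count, and the sample data are all assumptions.

```python
def fit_btl(wins, iterations=200):
    """Estimate Bradley-Terry-Luce scores from pairwise choices.

    wins[i][j] is the number of times sample i was chosen over sample j.
    Returns scores that sum to 1; a higher score means a more preferred sample.
    """
    n = len(wins)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each comparison involving i is weighted by the current scores.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (scores[i] + scores[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom else scores[i])
        norm = sum(updated)
        scores = [s / norm for s in updated]
    return scores


def preference_probability(scores, i, j):
    """BTL probability that sample i is chosen over sample j."""
    return scores[i] / (scores[i] + scores[j])


# Hypothetical data: three samples, ten comparisons per pair.
wins = [
    [0, 8, 9],
    [2, 0, 6],
    [1, 4, 0],
]
scores = fit_btl(wins)
print(scores, preference_probability(scores, 0, 1))
```

Under the model, the probability that one sample is chosen over another depends only on the ratio of their scores, which is what allows many pairwise choices to be summarized as a single ranking.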

Table 2 The comparison of related work on QoE models

Paper Works [Reference #]   Quantifiable method of users’ experiences            Objectivity
VPOW-4G [3]                 Objective methods                                    Low
Game’s QoE - Leave [5]      Objective methods                                    Low
Skype’s QoE [6]             Objective methods                                    Low
Game’s QoE-Pair [4]         Subjective methods, continuous similar samples       Medium
Media’s QoE [16]            Subjective methods, continuous similar samples       Medium
This work                   Subjective methods, non-continuous similar samples   High
