In document 網站舒適度的評價與估量 (Page 18-26)

A two-phase study was designed to explore the identification and classification of visual comfort and the flicker factor. Figure 3-1 shows the architecture we designed to verify the relation between the degree of visual comfort and the video-processing module. Our method for evaluating and measuring visual comfort focuses on flicker that makes viewers feel uncomfortable.

There are two major parts: Matlab video-processing code for signal processing, and analysis of the questionnaire results to capture how viewers feel.

Figure 3-1 Architecture of questionnaire, screen capturing, image processing, classification and SVM training model

This is the process we follow to construct the evaluation tool. First, we capture a web-surfing video clip to record a website's visual performance. The next steps are the human rating steps, which collect whether people feel comfortable or not while browsing the website; we prepared questionnaires for students to answer. On the other hand, following the WCAG 2.0 guidelines, we use an image-processing approach to process each frame of the video clip. This yields a graph showing whether flashes occur, together with the corresponding analyzed data set. To make our evaluation tool predict human ratings, we chose SVM for classification. The analyzed data set serves as the features, and the questionnaire results serve as the labels used to train the SVM model. The SVM model is used to predict the rating of the next incoming video clip and will be embedded in our evaluation tool.

3.1 Experiment Data Preparation

To measure website visual comfort, we chose the top 30 websites in Alexa's traffic ranking as candidates. We used screen-capture software to record 30-second video clips and selected clips with an average flicker rating for the questionnaire. The video clips contain normal Internet surfing using Microsoft IE. We recorded a screencast of each website three times, under different conditions depending on whether the site was cached or not. These websites are grouped into 23 individual websites, since some of them belong to the same company.

The hardware and software environment is listed below:

SW/HW                  Specification
OS                     Windows XP SP3
CPU                    Intel Core 2 6300 @ 1.86GHz
Browser                Firefox 3.0.3
Screen-Casting Tool    AutoScreen Recorder Pro 3.0
Frame rate             30 frames/second

Table 3-1 Hardware and software environment used to record the web-surfing video clips

We replayed the 23 captured video clips to 42 students majoring in computer science and 46 students majoring in other fields. After each clip was played, we asked the viewers to rate it from 1 to 5, and repeated this until all clips had been rated. Outliers among the rating scores were removed, leaving 81 valid questionnaires. These rating scores are used to label each website as good or not.

3.2 Video Processing Model

Figure 3-2 shows the process by which we transform the color space into the luminance domain.

Figure 3-2 Luminance translation from sRGB color space into relative luminance domain

[Figure 3-2 flow: original frame sequence -> gamma correction -> luminance translation -> normalization -> relative luminance sequence]

A video clip is a sequence of frames. Flicker is the rate of change of relative luminance over time.

First, we process each frame using gamma correction. Second, we translate each frame from the standard RGB color space into the YUV color space. Since we focus on luminance information, we discard the color data and keep only the luminance value. The last step normalizes the luminance value to the range 0 to 1. The whole screen is now a grayscale image in which each pixel is stored as a double-precision value.

In the following step, we process these luminance images with discrete differentiation as shown in Figure 3-3.

Figure 3-3 Extraction process to calculate pixel variation in relative luminance domain

These sequences of grayscale images are the raw data for our calculation. W3C WCAG 2.0 states that "a general flash is defined as a pair of opposing changes in relative luminance of 10% or more of the maximum relative luminance where the relative luminance of the darker image is below 0.80; and where 'a pair of opposing changes' is an increase followed by a decrease, or a decrease followed by an increase". Therefore, we first differentiate the image sequence between adjacent frames and then select the pixels whose relative luminance varies by more than 10%. The result is a mask that hides imperceptible changes, and this mask is then processed for area connectedness. Since a visual effect must cover a sufficiently large area before our eyes notice it, we produce a mask without pepper noise.
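The differencing, thresholding, and connectedness steps can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code used in the study. The function names, the 4-connectivity choice, and the `min_area` parameter are assumptions introduced here, since the text does not state how the area filter is configured.

```python
from collections import deque


def change_mask(prev, curr, threshold=0.10, darker_limit=0.80):
    """Mark pixels whose relative luminance changes by at least `threshold`
    between two adjacent frames while the darker value is below
    `darker_limit`. Frames are grids (lists of rows) of values in [0, 1]."""
    return [
        [abs(c - p) >= threshold and min(p, c) < darker_limit
         for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def remove_small_regions(mask, min_area):
    """Keep only 4-connected regions with at least `min_area` pixels.
    Smaller regions are treated as pepper noise (min_area is hypothetical)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                neighbors = ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1))
                for ny, nx in neighbors:
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_area:
                for ry, rx in region:
                    out[ry][rx] = True
    return out
```

Applying `change_mask` to every pair of adjacent luminance frames and then `remove_small_regions` to each mask gives one cleaned mask per frame transition. Detecting a full flash additionally requires pairing an increase with a following decrease (or the reverse), which this sketch does not attempt.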

3.3 K-means clustering and Support vector machine learning

We introduce the two classification methods employed in our experiments: k-means clustering and the support vector machine.

3.3.1 K-means clustering

K-means clustering is a cluster-analysis method that partitions n observations into k clusters. Each observation belongs to the cluster with the nearest mean. K-means groups the data by minimizing the sum of squared distances between the data points and their corresponding centroids. It is a standard and popular algorithm for unsupervised classification.

K-means uses a two-phase iterative algorithm to minimize the sum of squared point-to-centroid distances. The first phase is the assignment step and the second is the update step.

The assignment step labels each observation with the cluster that has the closest mean, which temporarily partitions the observations into several clusters. The update step computes the mean of the observations in each cluster and uses it as the new centroid. The algorithm is deemed to have converged when the assignment step no longer changes any label.
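The two steps can be sketched in a few lines of plain Python. This is an illustrative sketch, not the Matlab function used in the study. The function name `kmeans`, the caller-supplied initial centroids, and the `max_iter` cap are assumptions made for the example.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain k-means on tuples of floats, starting from the given centroids.
    Returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: label each observation with its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        if new_labels == labels:
            break  # The assignment no longer changes: converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids
```

The result depends on the initial centroids, so in practice the procedure is often repeated from several starting points and the partition with the smallest sum of squared distances is kept.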

The Statistics Toolbox of Matlab provides a k-means function, and we prepare our raw data to fit its input matrix format [9].

3.3.2 Support vector machine

Figure 3-4 Architecture of support vector machine training and prediction model

Support Vector Machine (SVM) is a supervised machine learning method used for classification. The input data of an SVM are two sets of vectors in a high-dimensional space. The SVM constructs a hyperplane between these two sets. The hyperplane, which maximizes the margin between the two sets, is used to predict which set the next incoming data point belongs to. Without discussing which model is better for our situation, we use SVM as a black box. Figure 3-4 shows how we use the support vector machine in our study.

LIBSVM, developed by Chih-Jen Lin at National Taiwan University [10], is an SVM library that is easy to use. It provides a simple interface that users can link with their own programs. The library includes source code in C++, Java, and other languages. We develop our evaluation tool in Matlab, and LIBSVM provides an interface for Matlab.
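To make the train-then-predict flow concrete, the sketch below fits a linear SVM in plain Python. It does not use LIBSVM and is not the solver LIBSVM implements: it swaps in simple subgradient descent on the regularized hinge loss, and it handles the bias by appending a constant 1 to each feature vector (so the bias is regularized too, a simplification). The function names, the hyperparameters `lam`, `eta`, and `epochs`, and the toy data are all assumptions made for illustration.

```python
def train_linear_svm(features, labels, lam=0.01, eta=0.1, epochs=50):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss.
    Labels must be +1 or -1. Returns the weight vector, whose last entry
    plays the role of the bias."""
    # Append a constant 1 to every feature vector to act as the bias input.
    data = [(list(x) + [1.0], y) for x, y in zip(features, labels)]
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # The L2 penalty shrinks w on every step.
            w = [(1 - eta * lam) * wi for wi in w]
            if margin < 1:
                # Margin violated: push w toward classifying x correctly.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 for feature vector x (given without the appended 1)."""
    score = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1


# Toy, hypothetical feature vectors: two per class.
X = [(-2.0, -1.0), (-1.0, -2.0), (1.0, 2.0), (2.0, 1.0)]
y = [-1, -1, 1, 1]
weights = train_linear_svm(X, y)
```

In the study's setting, each row of `X` would hold the features extracted from one video clip and `y` the questionnaire-derived label, with `predict` applied to the features of a new clip.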

Feature selection is a very important part of classification. In previous research by Ming-Yu Wei [19], the average transition count was found to be meaningful. We chose the summation of transitions, which is analogous to exposure. Furthermore, the number of peaks in the luminance-difference signal may represent the changes.
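A minimal sketch of these features is given below. It assumes the input is a per-frame signal such as the number of pixels flagged as changed between adjacent frames, and it defines a peak as a sample strictly greater than both neighbors. Both choices, and the function name `extract_features`, are assumptions made here and are not confirmed by the text.

```python
def extract_features(signal):
    """Return (sum, mean, peak count) of a per-frame transition signal."""
    total = sum(signal)
    mean = total / len(signal) if signal else 0.0
    # A peak is a sample strictly greater than both of its neighbors.
    peaks = sum(
        1 for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] > signal[i + 1]
    )
    return total, mean, peaks
```

Under this definition a flat-topped maximum (two equal neighboring samples) is not counted as a peak, so a different rule may be needed for signals with plateaus.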


Deciding how to classify these websites based on the questionnaire ratings is difficult. We can divide the 23 websites into two partitions, good and bad, or into three partitions: good, bad, and neutral. In the two-partition case, a negative rating is treated as a special index indicating a viewer's bad feelings: once any questionnaire rates a website as 1 (uncomfortable), we consider that website bad. If these negative ratings are disregarded, they may be diluted in the statistical process. Another way to divide the websites into two groups is to use the mean or median value as the cut: websites with a higher rating are good, and the rest are bad.
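The two-partition rules described above can be sketched as follows. The function names, the use of per-site average ratings, and the treatment of a site exactly at the cut as "bad" are assumptions made for this example.

```python
from statistics import mean, median


def label_by_negative_rating(ratings):
    """Two-class rule: a site is 'bad' if any viewer rated it 1."""
    return "bad" if 1 in ratings else "good"


def label_by_center(site_ratings, center=median):
    """Split sites at the mean or median of their per-site average rating.
    Sites strictly above the cut are 'good'; the rest are 'bad'."""
    averages = {site: mean(r) for site, r in site_ratings.items()}
    cut = center(list(averages.values()))
    return {site: "good" if avg > cut else "bad"
            for site, avg in averages.items()}
```

Passing `center=mean` switches the cut from the median to the mean. A three-partition variant would need a second threshold, which the text does not specify.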

3.4 Evaluation threshold

WCAG 2.0 guideline 2.3.1 gives the following formula.

Note 1: For the sRGB colorspace, the relative luminance of a color is defined as L = 0.2126 * R + 0.7152 * G + 0.0722 * B where R, G and B are defined as:

• if RsRGB <= 0.03928 then R = RsRGB/12.92 else R = ((RsRGB+0.055)/1.055) ^ 2.4
• if GsRGB <= 0.03928 then G = GsRGB/12.92 else G = ((GsRGB+0.055)/1.055) ^ 2.4
• if BsRGB <= 0.03928 then B = BsRGB/12.92 else B = ((BsRGB+0.055)/1.055) ^ 2.4

and RsRGB, GsRGB, and BsRGB are defined as:

• RsRGB = R8bit/255
• GsRGB = G8bit/255
• BsRGB = B8bit/255

The "^" character is the exponentiation operator [18].

Note 2: Almost all systems used today to view Web content assume sRGB encoding. Unless it is known that another color space will be used to process and display the content, authors should evaluate using sRGB color space. If using other color spaces, see [8].

Note 3: If dithering occurs after delivery, then the source color value is used. For colors that are dithered at the source, the average values of the colors that are dithered should be used (average R, average G, and average B).
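The formula in Note 1 translates directly into code. The following is a minimal Python sketch; the function names and the representation of a frame as rows of (R, G, B) tuples are assumptions made for illustration, not the Matlab implementation used in the study.

```python
def linearize(c8bit):
    """Convert one 8-bit sRGB channel value to its linear value (Note 1)."""
    c = c8bit / 255
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r8bit, g8bit, b8bit):
    """Relative luminance L of an 8-bit sRGB color, in the range 0 to 1."""
    r, g, b = linearize(r8bit), linearize(g8bit), linearize(b8bit)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def frame_to_luminance(frame):
    """Map a frame (rows of (R, G, B) tuples) to a grid of relative luminance."""
    return [[relative_luminance(*pixel) for pixel in row] for row in frame]
```

Black maps to 0 and white to approximately 1, so the output is already in the normalized range that the flash threshold is expressed in.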
