Concentration Assessment - A Study of Students' Classroom Concentration Analysis Using Crowdsourcing

CLAS uses DAWM and model-driven crowdsourcing to assess students' concentration levels.

To make the two parts cooperate, we develop Algorithm 1, which assesses students' concentration levels one after another.

else if DD.Prob > Conf. || random ≤ (1 − Conf.) then currentState ← Doze;

When a video segment is input, the start and end parameters hold the numbers of its first and last time slots, and currentState, which records the most recently assessed concentration level of the student, is initialized by crowdsourcing. The rest of the process then repeats until the full segment has been assessed.

The body of the for-loop describes how CLAS triggers DAWM and crowdsourcing cooperatively. We separate two cases according to currentState, which is either Doze (D) or Wake (W). If currentState is Wake, the process enters the first case. Because students' Doze and Wake levels in class can be affected by other factors that occur randomly, such as interruption by another student, laughter or clapping, we draw a random number from UniformDistribution() to account for this randomness. If random falls within the Wake-to-Doze probability (WD.Prob.) interval, the student's level is checked by AMT() crowdsourcing. If not, the process checks whether the Wake-to-Wake probability (WW.Prob.) is greater than the confidence threshold (Conf.) we set, or whether the random number falls within the acceptable error interval. If either holds, it predicts that the student's concentration level is still Wake; otherwise the process enters the next case, which uses AMT() to assess. After currentState has been determined, the student's concentration level in this time slot is recorded. The case in which currentState is Doze runs in a similar way.
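The loop described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Algorithm 1 itself: the function name `assess_segment`, the `probs` dictionary layout, the `amt` callable standing in for AMT(), and the `rng` argument are all hypothetical. It also assumes that "random falls within the change interval" means `r < change`, and that the crowdsourced initial state covers the first time slot so that the loop starts at the second one.

```python
import random
from typing import Callable, Dict, List

WAKE, DOZE = "Wake", "Doze"


def assess_segment(
    start: int,
    end: int,
    probs: Dict[str, float],
    conf: float,
    amt: Callable[[int], str],
    rng: random.Random,
) -> List[str]:
    """Assess one student's state for every time slot in [start, end].

    probs holds the DAWM transition probabilities under the keys
    "WW", "WD", "DD" and "DW" (hypothetical layout).
    amt(slot) stands in for the AMT() crowdsourcing call.
    """
    current = amt(start)  # assumption: the initial state covers the first slot
    states = [current]
    for slot in range(start + 1, end + 1):
        r = rng.random()  # plays the role of UniformDistribution()
        if current == WAKE:
            change, stay = probs["WD"], probs["WW"]
        else:
            change, stay = probs["DW"], probs["DD"]
        if r < change:
            # Assumption: r < change models "random is in the change interval".
            current = amt(slot)
        elif stay > conf or r <= 1 - conf:
            pass  # trust DAWM: the state is predicted to stay the same
        else:
            current = amt(slot)  # not confident enough, so ask the crowd
        states.append(current)  # record the level for this time slot
    return states
```

Passing the random generator and the crowdsourcing call as arguments keeps the sketch testable: a stub `amt` can count how often the crowd is consulted, which is the quantity that drives the cost.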


4 Analysis

In our analysis, we use the expert's ground-truth assessment instead of AMT crowdsourcing results to determine the appropriate parameter settings for CLAS. Because DAWM is trained on past assessments, and the number of past occurrences differs from state to state, we vary the window size W = i time slots, ∀i ∈ N, 1 ≤ i ≤ T/t (T and t as in section 3.1). Comparing the minimum AMT reward per worker per assignment against the 2012 Taiwanese minimum hourly wage (103 NTD), we set the duration of each time slot to about t = 10 sec, which separates a T-sec video into T/(t × W) segments. We also vary the confidence threshold (Conf., section 3.4) to obtain the effective transition interval of DAWM and the threshold value. To estimate the cost of crowdsourcing without DAWM, we assume that each HIT needs 3 assignments to achieve majority voting. We then look for the correlation between W, MoneySaved and Doze Accuracy.
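To make the segmentation concrete, the following sketch splits a T-sec video into segments of W time slots each. The function name `split_into_segments` is hypothetical, and rounding the segment count up when W does not divide the number of time slots evenly is an assumption, since the text only gives the ratio T/(t × W).

```python
import math
from typing import List, Tuple


def split_into_segments(total_sec: int, slot_sec: int, window: int) -> List[Tuple[int, int]]:
    """Return the inclusive (start, end) time-slot indices of each segment."""
    n_slots = total_sec // slot_sec            # T / t time slots
    n_segments = math.ceil(n_slots / window)   # T / (t * W), rounded up (assumption)
    return [
        (k * window, min((k + 1) * window, n_slots) - 1)
        for k in range(n_segments)
    ]
```

For example, a 100-sec clip with 10-sec slots and W = 4 yields three segments covering slots 0 to 3, 4 to 7 and 8 to 9. Each segment start is a point where the initial state has to be assessed by crowdsourcing.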

On AMT, we use the lowest reward setting, 0.01 USD, plus a commission of 0.005 USD per assignment. With this setting, shown in table 3, we evaluate by Eq. 7 the total cost of crowdsourcing without DAWM (84.6 USD) and compare it with the cost when DAWM is used, which gives the money saved. Using the dataset shown in table 2, we assess the concentration levels of the 12 students in the video over about 800 runs. Fig. 5 displays the money saved and the accuracy of doze assessment. Because our long-term observation shows that doze states are a minority during class, we focus on the assessment of these states.
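The quoted baseline can be checked with a short calculation. Eq. 7 is not reproduced in this section, so the cost model below (every assignment costs reward plus commission) is an assumption, and the names `total_cost` and `implied_hits` are hypothetical. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

REWARD = Fraction("0.01")       # USD per assignment (from the text)
COMMISSION = Fraction("0.005")  # USD per assignment (from the text)


def total_cost(hits: int, assignments_per_hit: int) -> Fraction:
    """Assumed cost model: each assignment costs reward + commission."""
    return hits * assignments_per_hit * (REWARD + COMMISSION)


def implied_hits(cost: Fraction, assignments_per_hit: int) -> Fraction:
    """Invert the assumed model to get the number of HITs behind a quoted cost."""
    return cost / (assignments_per_hit * (REWARD + COMMISSION))


baseline = Fraction("84.6")        # cost without DAWM quoted in the text
hits = implied_hits(baseline, 3)   # 3 assignments per HIT for majority voting
```

Under this assumed model, 84.6 USD corresponds to 1,880 HITs, and the same 1,880 HITs at 20 assignments each cost 564 USD, which agrees with the Total Cost quoted in section 5.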

Tab. 2: Dataset

Input: video of students' faces in a seminar class, Oct 02, 2012.

Video length: 78 mins.

Assess target: 12 students, separated into 4 groups (based on seating row or nearest neighbors); 8 males with glasses, 2 males and 2 females without glasses.

Note:

a) Assessable students only.

b) Students who arrived late or left early are excluded.

c) Covered by a license agreement addressing privacy issues.


(a) MoneySaved vs. window size (b) Accuracy vs. window size

Fig. 5: MoneySaved and Doze Accuracy under different confidence thresholds (Conf.) and window sizes

Figure 5a shows that the smaller W is, the less money is saved. This agrees with our assessment in section 3.4: because we use crowdsourcing to assess the students' initial concentration level for each video segment, a smaller W means more video segments for CLAS to assess, and hence more frequent use of crowdsourcing. For the same reason, figure 5b shows that accuracy is much higher for small W. We conclude that using crowdsourcing more frequently improves the accuracy of assessment but also costs more.

In figure 5a, the MoneySaved curve becomes flat for W ≥ 50: the video is separated into no more than 2 segments, so no more than 2 initial states are assessed by crowdsourcing, and the cost converges. A similar phenomenon occurs in figure 5b for W ≥ 300, where Doze Accuracy converges to a lower bound. Within this decreasing curve, some values of W yield high Doze Accuracy, for example 125 ≤ W ≤ 175. There are two reasons. First, the cut points of the video segments happen to coincide with initial states that are Doze, and we use ground truth instead of crowdsourcing to assess the initial state of every segment. Second, as mentioned in the second paragraph of this section, doze states are a minority during class, so a small improvement noticeably raises doze accuracy and also helps DAWM predict more accurately.

For Conf. between 75% and 85% in figure 5, both MoneySaved and Doze Accuracy converge to the same curve. Confidence values between 90% and 95%, on the other hand, provide better results for doze accuracy and the money saved. Furthermore, we want to obtain a better trade-off setting of W and Conf., so in this study we use the Pareto frontier to reduce the number of candidate combinations of W and Conf.
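As an illustration of how a Pareto frontier narrows the candidate settings, the sketch below keeps only those (MoneySaved, Doze Accuracy) points that no other point beats on both objectives at once. The function name `pareto_frontier` is hypothetical, the sweep shown is one common way to compute a two-objective frontier rather than necessarily the procedure used in this study, and the sample points are invented for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (money_saved, doze_accuracy)


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (both objectives maximized)."""
    frontier: List[Point] = []
    best_accuracy = float("-inf")
    # Sort by money saved (descending), breaking ties by accuracy (descending).
    for money, acc in sorted(points, key=lambda p: (-p[0], -p[1])):
        if acc > best_accuracy:  # more accurate than every point saving at least as much
            frontier.append((money, acc))
            best_accuracy = acc
    return frontier


# Invented sample points, not measurements from this study.
sample = [(80.0, 0.70), (67.0, 0.85), (60.0, 0.80), (53.0, 0.92), (50.0, 0.90)]
front = pareto_frontier(sample)
```

In the invented sample, (60.0, 0.80) is discarded because (67.0, 0.85) saves more money and is more accurate, and (50.0, 0.90) is discarded for the same reason relative to (53.0, 0.92).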

(a) Pareto frontier for 0.75 ≤ Conf. ≤ 0.95 (b) Pareto frontier for Conf. = 0.9 and 0.95

Fig. 6: Pareto frontier applied to each Conf. to find the trade-off point between Doze Accuracy and MoneySaved

Figure 6 displays Doze Accuracy against MoneySaved after the Pareto frontier has been applied to each confidence threshold value. In figure 6a, the values for 0.75 ≤ Conf. ≤ 0.85 converge to the same curve, so we discard this Conf. interval and focus on values between 0.9 and 0.95, as in figure 6b. Figure 6b shows that when Conf. is higher, DAWM's transition probabilities are less likely to exceed it, which triggers CLAS to use crowdsourcing instead and leads to higher Doze Accuracy and lower MoneySaved. Figure 6a also shows that Conf. = 0.9 provides a better trade-off between MoneySaved and Doze Accuracy. As a result, table 4 gives the recommended settings of W, with the corresponding MoneySaved, for distinct Doze Accuracy intervals when CLAS is used for assessment.

Tab. 4: Recommended options for distinct accuracy levels

Doze Accuracy range    Money saved    Window size
0.9 ≤ ··· < 1          $53.17         10
0.8 ≤ ··· < 0.9        $67.19         20
0.7 ≤ ··· < 0.8        $81.76         170
0.65 ≤ ··· < 0.7       $81.86         230
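The table can be read as a lookup from a target Doze Accuracy to a window size. The sketch below encodes the four rows of table 4 as given; the names `RECOMMENDED` and `recommend` are hypothetical, and returning `None` for targets outside the listed ranges is an assumption.

```python
from typing import Optional, Tuple

# (lower bound, upper bound, money saved in USD, window size), copied from Tab. 4
RECOMMENDED = [
    (0.90, 1.00, 53.17, 10),
    (0.80, 0.90, 67.19, 20),
    (0.70, 0.80, 81.76, 170),
    (0.65, 0.70, 81.86, 230),
]


def recommend(target_accuracy: float) -> Optional[Tuple[float, int]]:
    """Return (money_saved, window_size) for the range containing the target."""
    for low, high, money, window in RECOMMENDED:
        if low <= target_accuracy < high:
            return money, window
    return None  # target falls outside the ranges listed in the table
```

A target of 0.85, for instance, falls in the 0.8 to 0.9 row and maps to a window size of 20 with $67.19 saved.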


5 Experiment

CLAS is capable of robust concentration level assessment with stable accuracy, given appropriate parameter settings, a pre-trained DAWM and a strategic crowdsourcing method. In the previous section, we verified the W and Conf. parameters by analyzing their different combinations, under the assumption that crowdsourcing always returns the ground truth.

In this section, we implement our system on AMT, using the dataset and cost evaluation of table 2, table 3 and Eq. 7, but set the number of workers requested to 20 per HIT (which means Total Cost = $564) in an attempt to find the trade-off between the number of workers and Doze Accuracy. We also use the workers' personal information (section 3.3) for the demographic calculations.
