
A Crowdsourcing-based Study of Student In-class Concentration Analysis (運用群眾外包之學生課堂專注度分析研究)


Academic year: 2021


National Taiwan Normal University
Computer Science and Information Engineering

A Crowdsourcing-based Solution to Assess Concentration Levels of Students during Class

Author: Hu-Cheng Lee
Supervisor: Dr. Ling-Jyh Chen

July 25, 2013

To my parents, for providing me with the best education; to my supervisor, for advising and teaching me with the greatest patience; and to my girlfriend, Hsin-Hung Hsieh, for her wonderful love.

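Abstract (translated from the Chinese original)

Students' level of concentration in class has an important influence on their academic performance. When students start to doze, their learning efficiency starts to drop. By assessing and analyzing students' concentration levels, we can improve students' after-class review efficiency and give lecturers feedback on their lectures so that the next one goes better. Current approaches to assessing mental state, such as image processing or anti-doze alarm devices, suffer from high cost, constraints on equipment deployment, and the need to wear a device. Hiring experts to assess students' mental state tends to produce inaccurate and non-objective results because of fatigue and subjectivity. In this study, we introduce the Concentration Level Assessment System (CLAS), an easily deployable mental-state assessment system. Through the Doze-and-Wake Model (DAWM) and model-driven crowdsourcing, it draws on contributions from workers in various fields to provide highly accurate assessments of student concentration, and it finds the balance point between assessment cost and assessment accuracy. Because CLAS is a lightweight, accurate, and low-cost mental-state assessment system, it can also be used in other settings. In the future, we hope to use it in offices to monitor employees' mental state, making it easier for managers to manage their subordinates.

Keywords: students, mental state, dozing, crowdsourcing.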

Abstract

Concentration is important for students to learn efficiently in class. An effective assessment of students' concentration level in class helps students review class materials after lectures and helps lecturers adjust their teaching strategies for self-improvement. Although a number of concentration assessment approaches have been proposed and deployed, these approaches are generally inaccurate (e.g., computer vision-based approaches), intrusive (e.g., wearable anti-doze alarm devices), or expensive in time and money (e.g., expert assessment). In this study, we propose a novel approach, called the Concentration Level Assessment System (CLAS), which combines a Markovian Doze-and-Wake Model (DAWM) with an emerging crowdsourcing marketplace to provide a concentration assessment service. Using a comprehensive set of real-world experiments and extensive data analysis, we demonstrate that the proposed system is capable of striking a good balance between assessment accuracy and monetary expense. Moreover, CLAS is simple, accurate, affordable, and promising in facilitating concentration assessment in different application scenarios.

Keywords: students, concentration, doze, crowdsourcing.

Contents

1. Introduction
2. Problem statement
3. CLAS: Concentration Level Assessment System
   3.1. Video Preprocessing
   3.2. Concentration model
   3.3. Model-Driven Crowdsourcing
   3.4. Concentration Assessment
4. Analysis
5. Experiment
   5.1. System accuracy and Cost comparison
   5.2. AMT results accuracy
   5.3. Demographics of Mechanical Turk
6. Related works
7. Conclusion
8. Future works
References

List of Tables

1. Confusion matrix
2. Dataset
3. Cost Evaluation
4. Recommended Option with distinct level accuracy

List of Figures

1. Four primary parts of CLAS and their correlation
2. A video frame after video preprocessing
3. Schematic diagram of DAWM
4. The worker interface
5. Money Saved versus Doze accuracy
6. Use Pareto frontier
7. System Accuracy and Cost comparison with and W/O DAWM
8. AMT results 1/3
9. AMT results 2/3
10. AMT results 3/3
11. Demographics of AMT worker 1/2
12. Demographics of AMT worker 2/2

1. Introduction

Concentration level assessment helps students review class material after lectures. Recently, an increasing number of universities have started to offer e-learning services (e.g., web-based courses, class videos, and virtual learning environments [24]). Concentration assessment can mark the parts of a lesson where students lost their attention. Furthermore, the assessment result reflects the quality of the lecture content and points out which parts of the lesson are more abstruse or boring, so lecturers can use it to improve their teaching.

Concentration level assessment can be done by image processing [7], by anti-doze alarm devices, or by hiring expert assessors. However, each of these methods has disadvantages and restrictions. Image processing achieves a basic level of accuracy: it detects the positions of the students' eyes and analyzes the frequency of blinks to assess their concentration level. The real problems are the camera angle and the effect of background color: the camera must be placed where it can detect the students' eyes, otherwise no assessment is possible. Unfortunately, students who are dozing or sleeping may lower their heads, prop their cheeks on their hands, or even rest their heads on the table. Some students also wear glasses or hats, which leave the face dark or in shadow. All of these reduce the accuracy of image-processing methods.

Anti-doze alarm devices are another way to detect dozing in class, and they can warn students in real time while they doze. However, such a device detects the head position [2] or pulse [6] of the wearer, so its accuracy can be low for students who are taking notes or shifting their gaze between the presenter and their textbook. Moreover, using an intrusive device for assessment can be costly and can interfere with students' learning.

Hiring expert assessors may give a higher recognition rate for dozing, but over a long session experts become exhausted and make mistakes. When multiple targets must be assessed, experts may be distracted and unable to focus on every single student. In addition, an individual assessor's results may be subjective and inconsistent.

In this study, we introduce the Concentration Level Assessment System (CLAS), which provides a low-cost, easily deployable, and more accurate assessment process. Using the Doze-and-Wake Model (DAWM), CLAS takes in-class video of students' faces as input and automatically generates a high-accuracy assessment of the students in the video. CLAS also uses model-driven crowdsourcing: when the DAWM cannot confidently predict a student's concentration state, CLAS picks out the corresponding video segments and hires a crowd of workers to assess the students' concentration level in those segments. After collecting the crowd's assessment, the DAWM continues its state prediction based on the results collected from the crowd.

We simulate our system with different video segment lengths, use the Pareto frontier to narrow down the options of window size versus cost, evaluate our system on Amazon Mechanical Turk, and validate that CLAS can successfully generate high-accuracy estimates of students' concentration states. Because CLAS is easy to deploy and has stable accuracy, it is suitable for other scenarios, such as monitoring office workers, where it could help managers manage their staff conveniently, adjust break times, and improve company productivity.

Our contributions are three-fold:
• We propose the Concentration Level Assessment System (CLAS), build a Markovian Doze-and-Wake Model (DAWM), and introduce an algorithm for model-driven crowdsourcing assessment.
• We conduct experiments on real-world in-class video of students' faces and implement CLAS on Amazon Mechanical Turk for analysis.
• The results show that our system achieves high accuracy at low assessment cost.

The remainder of this paper is organized as follows. Section 2 states the problem. Section 3 gives an overview of our system and its technical details. Section 4 introduces our dataset and analysis. Section 5 presents our experimental results. Section 6 reviews related work. Section 7 concludes the study, and Section 8 discusses future work.

2. Problem statement

Students' concentration is generally considered the most important factor that directly affects their course performance. In this study, we propose a model-driven crowdsourcing method to assess the concentration level of students in class. Our model is a Markovian model trained on expert assessments. Based on the training results, the model uses conditional probabilities to predict the next state of a student's concentration level from the current state. We define the Doze and Wake states as follows.

Definition 1 (Doze state): The target student shows signs of dozing, and our experts or workers identify that the student has dozed off at least once in the current t-second time slot of the video. We then set the student's state to Doze for those t seconds.

Definition 2 (Wake state): The target student shows signs of being awake, and our experts or workers identify that the student stays awake throughout the current t-second time slot of the video. We then set the student's state to Wake for those t seconds.

We ask experts and workers to follow the definitions above when assessing students' concentration levels, and we record the results for model training and system accuracy analysis. In the accuracy analysis, we use the experts' assessments as the ground truth and compare them with the results of our approach. We define the confusion matrix as follows.

Tab. 1: Confusion matrix

| Ground truth | System result: Doze (1) | System result: Wake (0) |
|---|---|---|
| Doze (1) | True Positive (TP) | False Negative (FN) |
| Wake (0) | False Positive (FP) | True Negative (TN) |

TP is the number of states for which our system correctly predicted Doze. TN is the number of states for which our system correctly predicted Wake. FP and FN are the numbers of states for which our system predicted Doze and Wake, respectively, when the ground truth was the opposite. Based on these counts, we define the Doze accuracy, Wake accuracy, and Total accuracy. We use recall, that is,

$$\frac{\#\,\text{predicted states that match the ground truth}}{\#\,\text{ground-truth states}},$$

for both the Doze accuracy (Eq. 1) and the Wake accuracy (Eq. 2), and we combine both classes in the Total accuracy (Eq. 3). We use these measures to evaluate our system in the implementation and experiments of the following sections.

Definition 3 (Doze accuracy):

$$\text{Doze ACC} = \frac{TP}{TP + FN} \tag{1}$$

Definition 4 (Wake accuracy):

$$\text{Wake ACC} = \frac{TN}{FP + TN} \tag{2}$$

Definition 5 (Total accuracy):

$$\text{Total ACC} = \frac{TP + TN}{TP + FN + FP + TN} \tag{3}$$
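As a minimal sketch of Eqs. 1 to 3, the following Python snippet counts the confusion-matrix cells from two label sequences and derives the three accuracies. The function names and the sample labels are illustrative assumptions, not part of the original study; labels follow Table 1 (1 = Doze, 0 = Wake).

```python
def confusion_counts(truth, predicted):
    """Count (TP, FN, FP, TN) with Doze = 1 as the positive class."""
    tp = fn = fp = tn = 0
    for t, p in zip(truth, predicted):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def accuracies(tp, fn, fp, tn):
    """Return (Doze ACC, Wake ACC, Total ACC) as in Eqs. 1-3.

    Assumes the ground truth contains at least one Doze and one Wake
    state; otherwise the corresponding recall is undefined.
    """
    doze_acc = tp / (tp + fn)
    wake_acc = tn / (fp + tn)
    total_acc = (tp + tn) / (tp + fn + fp + tn)
    return doze_acc, wake_acc, total_acc


# Hypothetical labels for eight time slots (not data from the study).
truth = [1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 1, 0, 1, 0]
counts = confusion_counts(truth, predicted)
print(counts)               # (2, 1, 1, 4)
print(accuracies(*counts))  # Doze 2/3, Wake 0.8, Total 0.75
```

Because doze states are rare, Total ACC alone can look high even when Doze ACC is poor, which is why the two recalls are reported separately.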

3. CLAS: Concentration Level Assessment System

The Concentration Level Assessment System (CLAS) has four primary parts: A) video preprocessing, B) the concentration model, C) crowdsourcing, and D) concentration assessment. Their functions are introduced in the following subsections.

Fig. 1: Four primary parts of CLAS and their correlation.

3.1. Video Preprocessing

Video preprocessing prepares the video for the later crowdsourced state tagging and consists of several steps. When a video stream of T seconds is input, the system divides it into T/t time slots of t seconds each and marks the target students with red rectangles, which helps workers focus on the students whose states they are going to label. After processing, all time slots of the video are saved in the database, ready for crowd labeling. Because crowdsourced state tagging raises privacy issues, CLAS masks the faces of students who do not want to participate in our experiment with a black veil to protect their right of publicity.
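The slot division can be sketched in a few lines of Python. The function name `time_slots` is hypothetical, and keeping a shorter final slot when T is not a multiple of t is an assumption; the paper does not say how a remainder is handled.

```python
def time_slots(total_seconds: int, slot_seconds: int) -> list[tuple[int, int]]:
    """Return (start, end) offsets in seconds for each time slot.

    Assumption: if total_seconds is not a multiple of slot_seconds,
    the last slot is simply shorter.
    """
    if slot_seconds <= 0:
        raise ValueError("slot_seconds must be positive")
    return [
        (start, min(start + slot_seconds, total_seconds))
        for start in range(0, total_seconds, slot_seconds)
    ]


# A 4680-second video (78 minutes) cut into 10-second slots.
slots = time_slots(78 * 60, 10)
print(len(slots))  # 468
print(slots[:2])   # [(0, 10), (10, 20)]
```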

Fig. 2: A video frame after video preprocessing.

3.2. Concentration model

The DAWM is a Markovian model whose transition probabilities help CLAS assess students' concentration levels. The DAWM is trained on concentration assessments made by the experts in our project. Experts can survey the whole in-class video of the students' faces, observe the causes and effects of all student behavior in class, and play, pause, rewind, and fast-forward the video stream, so they can assign the most accurate state to each student. From the training data, the DAWM builds a transition probability table that records the mean transition probabilities of the students' doze patterns.

To obtain the transition probabilities of the DAWM, we first count the occurrences of each state for both the Doze ($D_x$) and Wake ($W_y$) conditions, as in Eq. 4. Second, we accumulate these counts to obtain the number of times a student dozed off for at least x slots ($Dl_x$) and stayed awake for at least y slots ($Wl_y$), as in Eq. 5. Finally, we use $Dl_x$, $D_{x,num}$, $Wl_y$, and $W_{y,num}$ to calculate the DAWM's transition probabilities, as in Eq. 6.

$$\begin{cases} D_{x,num} = \#\text{ of } D_x, & \forall x \in \mathbb{N},\ 1 \le x \le \text{max observed doze-off length} \\ W_{y,num} = \#\text{ of } W_y, & \forall y \in \mathbb{N},\ 1 \le y \le \text{max observed wake length} \end{cases} \tag{4}$$

$$\begin{cases} Dl_x = \sum_{i=x}^{m} D_{i,num}, & m = \text{max observed doze-off length} \\ Wl_y = \sum_{j=y}^{n} W_{j,num}, & n = \text{max observed wake length} \end{cases} \tag{5}$$

$$\begin{cases} P(D_x, D_{x+1}) = \dfrac{Dl_{x+1}}{Dl_x} \\[2ex] P(D_x, W_1) = \dfrac{D_{x,num}}{Dl_x} \\[2ex] P(W_y, W_{y+1}) = \dfrac{Wl_{y+1}}{Wl_y} \\[2ex] P(W_y, D_1) = \dfrac{W_{y,num}}{Wl_y} \end{cases} \quad \begin{aligned} &\forall x \in \mathbb{N},\ 1 \le x \le \text{max observed doze-off length} \\ &\forall y \in \mathbb{N},\ 1 \le y \le \text{max observed wake length} \end{aligned} \tag{6}$$

Figure 3 gives an overview of the DAWM. Each state represents a student's concentration level. We classify the states into Doze ($D_x$) and Wake ($W_y$), and there are four types of transition probabilities: $P(W_y, D_1)$ (Wake-to-Doze), $P(W_y, W_{y+1})$ (Wake-to-Wake), $P(D_x, D_{x+1})$ (Doze-to-Doze), and $P(D_x, W_1)$ (Doze-to-Wake). We use crowdsourcing to identify the student's initial state, which is either $D_1$ or $W_1$.

Fig. 3: Schematic diagram of the DAWM. States represent students' concentration levels, Doze (D) and Wake (W); the subscript is the number of consecutive time slots spent in that condition; transition probabilities are used to predict the students' states.

When CLAS runs the assessment process, the DAWM provides the transition probability that corresponds to the target student's current state.
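The three steps of Eqs. 4 to 6 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: it assumes that $D_{x,num}$ counts the maximal runs of exactly x consecutive Doze slots in the expert labels, and that a run cut off by the end of the video is counted like any other run. The function names and the sample sequence are hypothetical.

```python
from collections import Counter
from itertools import groupby


def run_length_counts(states, label):
    """Eq. 4: number of maximal runs of `label` for each run length."""
    return Counter(len(list(group)) for key, group in groupby(states) if key == label)


def transition_probs(run_counts):
    """Eqs. 5-6: map run length x to (P(stay), P(switch)).

    P(stay)   = (runs of length >= x+1) / (runs of length >= x)
    P(switch) = (runs of length == x)   / (runs of length >= x)
    """
    if not run_counts:
        return {}
    longest = max(run_counts)
    at_least = {}
    total = 0
    for x in range(longest, 0, -1):  # Eq. 5: accumulate from the longest run down
        total += run_counts.get(x, 0)
        at_least[x] = total
    return {
        x: (at_least.get(x + 1, 0) / at_least[x], run_counts.get(x, 0) / at_least[x])
        for x in range(1, longest + 1)
    }


# Hypothetical expert labels for 12 time slots (D = Doze, W = Wake).
labels = list("WWDDDWDWWWDD")
doze_table = transition_probs(run_length_counts(labels, "D"))
print(doze_table[2])  # (0.5, 0.5): after 2 doze slots, stay or wake with equal probability
```

For every x the two probabilities sum to 1, which matches the two outgoing edges of each state in the schematic diagram.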

3.3. Model-Driven Crowdsourcing

In this paper, we use Amazon Mechanical Turk (AMT) [1], one of the most popular online crowdsourcing platforms, to hire a crowd of workers to assess students' concentration levels. CLAS uses crowdsourcing strategically: when the DAWM cannot confidently generate a student's state, CLAS requests a crowdsourced assessment of the student's state in the current video fragment. A Human Intelligence Task (HIT) is generated in this step. Each HIT has S assignments, which require S unique workers, and is uploaded to AMT for assessment by the crowd. In addition, because the DAWM predicts a student's state from the previous state, we use crowdsourcing directly to assess each student's initial state.

Fig. 4: The worker interface.

Workers who participate in our HIT see the worker interface shown in Fig. 4. It contains a video segment with three highlighted students. Workers only need to focus on the students enclosed by red rectangles and, for each of them, choose the state that most precisely represents the student's concentration level. Below the state-assessment part, we also collect the workers' personal information in order to analyze the demographic distribution of AMT workers. The workers in our study are recruited without any restriction and come from a variety of backgrounds.

3.4. Concentration Assessment

CLAS uses the DAWM and model-driven crowdsourcing to assess students' concentration levels. To make both parts cooperate, we developed Algorithm 1, which assesses the students' concentration levels one after another.

Algorithm 1: Assessment algorithm of CLAS
    start ← video start time slot
    end ← video length
    currentState ← AMT()
    for i ← start + 1; i < end; i++ do
        random ← UniformDistribution()
        switch currentState do
            case Wake
                if random ≤ WD.Prob. then
                    currentState ← AMT()
                else if WW.Prob. > Conf. || random ≤ (1 − Conf.) then
                    currentState ← Wake
                else
                    currentState ← AMT()
                end if
            case Doze
                if random ≤ DW.Prob. then
                    currentState ← AMT()
                else if DD.Prob. > Conf. || random ≤ (1 − Conf.) then
                    currentState ← Doze
                else
                    currentState ← AMT()
                end if
        record currentState for student's concentration level
    end for

When a video segment is input, start and end hold the numbers of the first and last time slots of the video. currentState, which records the student's most recently assessed concentration level, is initialized by crowdsourcing, and the rest of the process repeats until the whole segment has been assessed. The body of the for-loop describes how CLAS triggers the DAWM and crowdsourcing cooperatively. We distinguish two cases according to currentState, which is either Doze (D) or Wake (W).

If currentState is Wake, the process enters the first case. Because students' Doze and Wake levels can be affected by randomly occurring factors such as interruptions from other students, laughter, or applause, we draw a random number with UniformDistribution() to account for this randomness. If random falls within the Wake-to-Doze probability (WD.Prob.) interval, the student's level is checked by AMT() crowdsourcing. Otherwise, the process checks whether the Wake-to-Wake probability (WW.Prob.) is greater than the confidence threshold (Conf.) we set, or whether the random number falls within the acceptable error interval (1 − Conf.). If so, it predicts that the student's concentration level remains Wake; if not, it uses AMT() to assess the state. Once currentState has been determined, the student's concentration level in this time slot is recorded. The case in which currentState is Doze runs in the same way, using the Doze-to-Wake (DW.Prob.) and Doze-to-Doze (DD.Prob.) probabilities.
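A minimal Python sketch of Algorithm 1 for a single student is shown below. Several details are assumptions made for illustration: `ask_crowd` is a hypothetical stand-in for AMT() that returns the crowd's label for a slot, the probability tables map the current run length to a (stay, switch) pair in the spirit of Eq. 6, and a run length missing from the table is always sent to the crowd.

```python
import random

WAKE, DOZE = "Wake", "Doze"


def assess(num_slots, ask_crowd, wake_probs, doze_probs, conf, rng=None):
    """Label every time slot; return (states, number_of_crowd_calls).

    wake_probs / doze_probs: {run_length: (P(stay), P(switch))}.
    ask_crowd(i): hypothetical callable returning WAKE or DOZE for slot i.
    """
    rng = rng or random.Random()
    state = ask_crowd(0)  # the initial state always comes from the crowd
    run = 1               # consecutive slots spent in the current state
    states, calls = [state], 1
    for i in range(1, num_slots):
        table = wake_probs if state == WAKE else doze_probs
        # Assumption: an unseen run length gives the model no confidence.
        stay, switch = table.get(run, (0.0, 1.0))
        r = rng.random()
        if r > switch and (stay > conf or r <= 1.0 - conf):
            new_state = state  # the model is confident: keep the state
        else:
            new_state = ask_crowd(i)  # otherwise fall back to the crowd
            calls += 1
        run = run + 1 if new_state == state else 1
        state = new_state
        states.append(state)
    return states, calls


# With empty tables every slot is sent to the crowd.
truth = [WAKE, WAKE, DOZE, DOZE, WAKE]
print(assess(5, lambda i: truth[i], {}, {}, 0.9, random.Random(0)))
```

The returned call count is what drives the cost: each crowd call corresponds to one HIT, so fewer calls means more money saved at the risk of missing a state change.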

4. Analysis

In our analysis, we use the experts' ground-truth assessments in place of AMT crowdsourcing results to determine appropriate parameter settings for CLAS. Because the DAWM is trained on past assessments, and the number of past occurrences differs from state to state, we vary the window size W = i time slots, $\forall i \in \mathbb{N},\ 1 \le i \le T/t$ (T and t as in Section 3.1). Comparing the minimal AMT reward per assignment with the 2012 Taiwanese minimum hourly wage (103 NTD), we set the duration of each time slot to about t = 10 sec. We then separate the T-second video into $T/(t \times W)$ segments and try different confidence thresholds (Conf. in Section 3.4) to find the effective transition interval of the DAWM and the threshold value. We assume that each HIT needs 3 assignments to allow majority voting, estimate the cost of crowdsourcing without the DAWM, and look for the correlation between W, MoneySaved, and Doze Accuracy.

On AMT, we use the lowest reward setting, 0.01 USD, plus a commission of 0.005 USD per assignment. With the settings shown in Table 3, we evaluate the total cost of crowdsourcing without the DAWM by Eq. 7 (84.6 USD) and compare it with the cost when the DAWM is used to obtain the money saved. Using the dataset shown in Table 2, we assess the concentration levels of the 12 students in the video over about 800 runs. Figure 5 displays the money saved and the accuracy of doze assessment. Because our long-term observation shows that doze states are the minority during class, we focus on the assessment of these states.

Tab. 2: Dataset

| Item | Description |
|---|---|
| Input | Oct 02, 2012 video of students' faces in a seminar class |
| Video length | 78 mins |
| Assessment targets | 12 students, separated into 4 groups (based on seating row or nearest neighbors): 8 males with glasses, 2 males and 2 females without glasses |
| Note | a) Assessable students only. b) Excludes students who arrived late or left early. c) A license agreement covering privacy issues is in place. |

Tab. 3: Cost Evaluation

| Item | Setting | Value |
|---|---|---|
| Each HIT | Workers | 3 |
| Each assignment | Reward | $0.01 |
| Each assignment | Commission | $0.005 |
| Total HITs | | 1880 (470 HITs × 4 groups) |

$$\text{Total cost} = \text{Workers} \times [(\text{Reward} + \text{Commission}) \times \text{Total HITs}] \tag{7}$$

(a) MoneySaved vs. window size. (b) Accuracy vs. window size.
Fig. 5: MoneySaved and Doze Accuracy under different confidence thresholds (Conf.) and window sizes.

Figure 5a shows that the smaller W is, the less money we save. This agrees with the procedure in Section 3.4: we use crowdsourcing to assess the students' initial concentration level for each video segment, so the smaller W is, the more video segments CLAS needs to assess. Because CLAS uses crowdsourcing more frequently when W is small, Figure 5b also shows that accuracy is much higher for small W. We conclude that increasing the frequency of crowdsourcing improves assessment accuracy but also costs more. In Figure 5a, the MoneySaved curve becomes flat for W ≥ 50 because the video is separated into no more than 2 segments, which means that no more than 2 initial states are assessed by crowdsourcing, so the cost has converged. A similar phenomenon occurs in Figure 5b for W ≥ 300, where Doze Accuracy converges to a lower bound. Within this decreasing curve, some values of W give high Doze Accuracy, for example 125 ≤ W ≤ 175. There are two reasons. First, the cut points of the video segments happen to coincide with initial states that are Doze, and we use the ground truth instead of crowdsourcing to assess the initial state of every segment. Second, as mentioned in the second paragraph of this section, doze states are the minority during class, so a small improvement can raise the doze accuracy and also help the DAWM predict more accurately.

For Conf. between 75% and 85% in Figure 5, both MoneySaved and Doze Accuracy converge to the same curve, whereas confidence values between 90% and 95% give better results for both doze accuracy and money saved. To obtain a better trade-off setting for W and Conf., we use the Pareto frontier to reduce the number of W and Conf. combinations.

(a) Pareto frontier, 0.75 ≤ Conf. ≤ 0.95. (b) Pareto frontier, Conf. = 0.9 and 0.95.
Fig. 6: Pareto frontier for each Conf., used to find the trade-off point between Doze Accuracy and MoneySaved.

Figure 6 displays Doze Accuracy against MoneySaved after applying the Pareto frontier to each confidence threshold value. In Figure 6a, the values 0.75 ≤ Conf. ≤ 0.85 converge to the same curve, so we discard this Conf. interval and focus on the values 0.9 and 0.95, as in Figure 6b. It shows that the higher Conf. is, the lower the chance that the DAWM's transition probabilities exceed it, which triggers CLAS to use crowdsourcing instead and leads to higher Doze Accuracy and lower MoneySaved. Figure 6a also shows that Conf. = 0.9 provides a better trade-off between MoneySaved and Doze Accuracy. As a result, Table 4 gives recommended settings of W and the corresponding MoneySaved for distinct Doze Accuracy ranges when using CLAS.

Tab. 4: Recommended options for distinct accuracy levels

| Doze Accuracy range | Money saved | Window size |
|---|---|---|
| 0.9 ≤ ... < 1 | $53.17 | 10 |
| 0.8 ≤ ... < 0.9 | $67.19 | 20 |
| 0.7 ≤ ... < 0.8 | $81.76 | 170 |
| 0.65 ≤ ... < 0.7 | $81.86 | 230 |
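The Pareto filtering step can be sketched as follows: a (MoneySaved, Doze Accuracy) setting is kept only if no other setting is at least as good on both objectives. The function name and the sample points are hypothetical and are not measurements from the study.

```python
def pareto_frontier(points):
    """Keep settings that are not dominated when both objectives are maximized.

    Each point is (money_saved, doze_accuracy, window_size).
    """
    frontier = []
    best_accuracy = float("-inf")
    # Walk from the most to the least money saved; a point survives only
    # if it beats the accuracy of every point that saves at least as much.
    for point in sorted(points, key=lambda p: (-p[0], -p[1])):
        if point[1] > best_accuracy:
            frontier.append(point)
            best_accuracy = point[1]
    return frontier


# Hypothetical simulation results: (money saved in USD, doze accuracy, W).
results = [
    (53.2, 0.92, 10),
    (67.2, 0.85, 20),
    (60.0, 0.80, 15),
    (81.8, 0.72, 170),
    (81.9, 0.66, 230),
    (70.0, 0.70, 100),
]
for saved, accuracy, window in pareto_frontier(results):
    print(window, saved, accuracy)
```

In this example the settings with W = 15 and W = 100 are dropped because another setting saves more money and is also more accurate; the remaining points are the candidates from which a table of recommended options could be read off.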

5. Experiment

With appropriate parameter settings, a pre-trained DAWM, and strategic crowdsourcing, CLAS is capable of robust concentration level assessment with stable accuracy. In the previous section, we examined the W and Conf. parameters by analyzing their different combinations under the assumption that crowdsourcing always returns the ground truth. In this section, we implement our system on AMT, using the dataset and cost evaluation of Table 2, Table 3, and Eq. 7, but setting the number of workers to 20 per HIT (which means Total Cost = $564), in order to find the trade-off between the number of workers and Doze Accuracy. We also use the workers' personal information (Section 3.3) for the demographic analysis.

5.1. System accuracy and Cost comparison

Figure 7 shows our system's accuracy and a cost comparison of crowdsourcing with and without the DAWM. In Figure 7a, we vary the number of workers to find a suitable range. We focus on the doze and total accuracy, because in our long-term observation the average wake accuracy is always higher than the others, and doze states account for only a small part of the data. The curves flatten when the number of workers reaches 5 to 9 for both average doze accuracy and total accuracy; we consider only odd numbers because of the majority voting process. We therefore recommend that users of our system set the number of workers to about 5 to 9 to get the most efficient and affordable results. In Figure 7b, we compare the cost of crowdsourced assessment of the class in our experiment with and without the DAWM. There is an obvious gap between the two methods: although window = 10 costs much more than the other window sizes, it still saves a lot compared with crowdsourcing without the DAWM. These results indicate that CLAS achieves higher accuracy and lower cost than traditional crowdsourcing.
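The cost figures quoted above follow directly from Eq. 7, and the reason for choosing an odd number of workers is visible in a simple majority vote. The sketch below uses hypothetical function names; the reward, commission, and HIT counts are the ones given in Table 3.

```python
from collections import Counter


def total_cost(workers, reward, commission, total_hits):
    """Eq. 7: Workers x [(Reward + Commission) x Total HITs], in USD."""
    return workers * ((reward + commission) * total_hits)


def majority_vote(labels):
    """Return the most common label among the workers' answers.

    With two possible labels, an odd number of workers cannot tie.
    """
    return Counter(labels).most_common(1)[0][0]


print(round(total_cost(3, 0.01, 0.005, 1880), 2))   # 84.6 (analysis setting)
print(round(total_cost(20, 0.01, 0.005, 1880), 2))  # 564.0 (experiment setting)
print(majority_vote(["Doze", "Wake", "Doze"]))      # Doze
```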

(a) System accuracy. (b) Cost comparison with and without the DAWM.
Fig. 7: System accuracy and cost comparison with and without the DAWM.

5.2. AMT results accuracy

Figure 8 shows the accuracy of the AMT results by student gender and by whether the student wears glasses. In this dataset, no female students dozed off in class, so the doze accuracy for female students cannot be compared with that for male students in Figure 8a; the Wake accuracy for the two genders differs only slightly. For students who doze off while wearing glasses, Figure 8b shows that the accuracy is less stable than for students without glasses. This may be because the glasses block the view of the students' eyes, making assessment more difficult for workers.

(a) Students' gender. (b) Students with and without glasses.
Fig. 8: AMT results 1/3.

Figure 9 shows the accuracy of workers of different genders and races. In our study, neither attribute has an impact on the accuracy of AMT results.

(a) Workers' gender. (b) Workers' race.
Fig. 9: AMT results 2/3.

In Figure 10a, workers aged 50 to 59 achieve better accuracy. On the other hand, although workers aged 20 to 39 are the most numerous, their average accuracy is not the best; this may be because older workers assess more carefully and maturely than younger ones. Finally, Figure 10b shows the relation between accuracy and the average time spent per assignment. Accuracy is higher and more stable for assignments finished in 60 sec; although most workers finished their assignments in 20 sec, their accuracy is not as good as the others'. Some assignments took almost 10 min to finish. This could be due to workers' low network bandwidth, since loading the video in our assignments may take some time on a slow connection. In addition, AMT works on an honor system: after workers accept a HIT, they should not cancel it, or they will get a bad record in their work history and have fewer chances to obtain high-reward HITs.

(a) Workers' age. (b) Time spent.
Fig. 10: AMT results 3/3.

5.3. Demographics of Mechanical Turk

In this subsection, we analyze the information collected from the workers who participated in our HITs and compare it with Panos Ipeirotis's demographic data [9, 10]. We focus on four attributes: workers' country, gender, age, and education level. Figure 11a shows the changes in demographic composition from 2008 to 2013. Indians have become the largest group of AMT workers because AMT allows workers in India to receive their payment in rupees; only workers from the US and India can be paid in their own country's currency. In Figure 11b, the number of male workers has caught up with the number of female workers and was almost double it in 2013. The ratio of male to female workers in India has not changed much, but in the US the number of male workers has also caught up with the number of female workers.

(a) Country. (b) Gender.
Fig. 11: Demographics of AMT workers 1/2.

Workers' age structure is almost unchanged, as shown in Figures 12a and 12b: workers aged 20 to 40 are still the main group, especially those aged 20 to 30. This result is consistent with Figures 12c and 12d, which show that workers with a bachelor's degree are the most numerous on AMT. In our study, from 2010 to 2013, more workers in India went on to a master's degree after graduating from college (Fig. 12d).

(a) Age. (b) Age. (c) EDU. (d) EDU.
Fig. 12: Demographics of AMT workers 2/2.

6. Related works

Crowdsourcing is a popular technique in this century of Internet explosion and overpopulation [8]. It provides access to a scalable workforce and distributed problem solving online [23]: users only need to define their requests, prepare funds, and find a suitable crowdsourcing platform, and workers handle the rest. There are many crowdsourcing platforms, such as iStockphoto [11], uTest [21], TopCoder [20], PeerToPatent [18], and Amazon Mechanical Turk. With the right platform and an efficient implementation, crowdsourcing can be low-cost and time-saving. Many research topics surround it, such as assistance for people with disabilities [4, 5, 12], activity learning [25, 26], real-time processing [3], speech recognition [16, 17], crowd-based audio quality assessment [19], and video annotation [15, 22].

Crowdsourcing is suitable for tasks that are easy for humans but hard for computers, such as human activity recognition (AR) [14] using camera monitoring or body-worn surveillance; crowdsourcing makes human AR easier to deploy and extend. A human AR system needs a training dataset, and building one is a time-consuming and repetitive process. Via crowdsourcing, batches of training tasks can be distributed to workers on the Internet and carried out in parallel, which shortens the lengthy training procedure. Furthermore, model-driven crowdsourcing triggers crowdsourcing selectively in human AR [13], reducing cost and time consumption. It uses a model, a Hidden Markov Model (HMM), to predict the next possible activity and uses crowdsourcing only when the probabilities cannot support the prediction, which improves both system performance and cost. In CLAS, we propose DAWM-driven crowdsourcing and implement our system on AMT to apply crowdsourcing to human behavior assessment, namely students' concentration level.

(29) 7 Conclusion

In this study, we propose a model-driven crowdsourcing method to assess students' concentration levels in class. We introduce the Doze-and-Wake Model (DAWM), which uses its conditional probabilities to predict students' concentration states and triggers crowdsourcing when the model's probabilities do not support a prediction. We implemented our system on Amazon Mechanical Turk and hired unrestricted, untrained workers to assess class videos of students' faces. With majority voting, our results show that even untrained workers can produce reliable assessments. The results also show that CLAS can achieve high accuracy with proper parameter settings, a suitable number of workers, and DAWM-driven crowdsourcing.
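As a small illustration of the majority voting step mentioned above, the following Python sketch aggregates the labels that several workers give to one video segment. The function name `majority_vote` and the label strings are hypothetical, and the tie handling (returning `None` so that the caller can request more votes) is an assumption, since this section does not state how ties are resolved.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(votes: Sequence[str]) -> Optional[str]:
    """Return the most common label, or None when the top two labels tie."""
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumed behaviour: a tie yields no decision
    return ranked[0][0]
```

Using an odd number of workers per segment avoids ties when there are only two possible labels.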

(30) 8 Future works

CLAS can output the distribution of students' or audiences' concentration levels during a class or a presentation. Based on these results, we can integrate CLAS with online course or class video platforms to help students who doze off in class review the parts of the lesson they missed. We can also give lecturers feedback that points out the abstruse or boring parts of their presentations and helps them perform better in their next lectures. Because CLAS is easy to deploy, we can also build it into office monitoring systems to help managers manage their staff, adjust office break times, or identify conscientious workers for future promotion, thereby improving company productivity.

(31) References

[1] Amazon. Amazon Mechanical Turk.
[2] Amazon. Wake up anti-doze earpiece alarm.
[3] M. S. Bernstein, J. Brandt, R. C. Miller, and D. R. Karger. Crowds in two seconds: Enabling realtime crowd-powered interfaces. In Proceedings of the 24th annual ACM symposium on User interface software and technology, pages 33–42, 2011.
[4] J. Bigham, E. Brady, S. White, and C. Esposti. Human-backed access technology. Proceedings of the CHI 2011, 2011.
[5] E. Brady, M. R. Morris, Y. Zhong, S. White, and J. P. Bigham. Visual challenges in the everyday lives of blind people. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2117–2126. ACM, 2013.
[6] P.-Y. Chen, P.-H. Wu, W. J. Ong, Y.-J. Huang, W.-C. Lin, and T.-L. Pan. Development of a brand new system using rfid combining with wireless sensor network (wsns) for real-time doze alarm. In Anti-counterfeiting, Security, and Identification in Communication, 2009. ASID 2009. 3rd International Conference on, pages 197–201, 2009.
[7] Y.-L. Chien. Eye opening detection with application for in-class attention monitoring. Master's thesis, National Taiwan Normal University, July 2012.
[8] J. Howe. The rise of crowdsourcing. Wired magazine, 14(6):1–4, 2006.
[9] P. Ipeirotis. Mechanical turk: The demographics. 2008.
[10] P. Ipeirotis. Demographics of mechanical turk. Working paper CeDER-10-01, New York University, Stern School of Business. Available at http://archive.nyu.edu/handle/2451/29585, 2010.
[11] iStockphoto. http://www.istockphoto.com/.

(32)
[12] W. Lasecki, C. Miller, A. Sadilek, A. Abumoussa, D. Borrello, R. Kushalnagar, and J. Bigham. Real-time captioning by groups of non-experts. In Proceedings of the 25th annual ACM symposium on User interface software and technology, pages 23–34. ACM, 2012.
[13] W. S. Lasecki, Y. C. Song, H. Kautz, and J. P. Bigham. Real-time crowd labeling for deployable activity recognition. In Proceedings of the 2013 conference on Computer supported cooperative work, CSCW '13, pages 1203–1212, New York, NY, USA, 2013. ACM.
[14] L.-V. Nguyen-Dinh, C. Waldburger, D. Roggen, and G. Tröster. Tagging human activities in video by crowdsourcing. In Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, pages 263–270. ACM, 2013.
[15] S. Nowak and S. Rüger. How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation. In Proceedings of the international conference on Multimedia information retrieval, pages 557–566. ACM, 2010.
[16] G. Parent and M. Eskenazi. Toward better crowdsourced transcription: Transcription of a year of the let's go bus information system data. In Spoken Language Technology Workshop (SLT), 2010 IEEE, pages 312–317, 2010.
[17] G. Parent and M. Eskenazi. Speaking to the crowd: looking at past achievements in using crowdsourcing for speech and predicting future challenges. In Proceedings of Interspeech, 2011.
[18] PeerToPatent. http://peertopatent.org/.
[19] F. Ribeiro, D. Florêncio, C. Zhang, and M. Seltzer. Crowdmos: An approach for crowdsourcing mean opinion score studies. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, pages 2416–2419. IEEE, 2011.

(33)
[20] TopCoder. http://www.topcoder.com/.
[21] uTest. http://www.utest.com/.
[22] C. Vondrick and D. Ramanan. Video annotation and tracking with active learning. In Neural Information Processing Systems (NIPS), 2011.
[23] M. Vukovic. Crowdsourcing for enterprises. In Proceedings of the 2009 Congress on Services - I, SERVICES '09, pages 686–692, Washington, DC, USA, 2009. IEEE Computer Society.
[24] Wikipedia. Virtual learning environment.
[25] L. Zhao, G. Sukthankar, and R. Sukthankar. Incremental relabeling for active learning with noisy crowdsourced annotations. In Privacy, security, risk and trust (passat), 2011 ieee third international conference on and 2011 ieee third international conference on social computing (socialcom), pages 728–733, 2011.
[26] L. Zhao, G. Sukthankar, and R. Sukthankar. Robust active learning using crowdsourced annotations for activity recognition. In AAAI 2011 Workshop on Human Computation, 2011.
