Boosting Students’ Learning Motivation: Competition-Based In-Class Final Projects
Hsuan-Tien Lin (林軒田) htlin@csie.ntu.edu.tw
Department of Computer Science
& Information Engineering
National Taiwan University
(國立台灣大學資訊工程系)
My Two Classes with
Competition-Based Final Projects
Machine Learning elective (junior+)
• 2008
• 2009
• . . .
• 2014 (7 years)
Data Structures & Algorithms required (freshmen)
• 2011
• 2012
• . . .
• 2015 (5 years)
Students’ Reactions (Selected)
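游靖堂、宋彥頡, 2013.06.27
Several nights spent debugging or writing new algorithms and data structures; the night before the discrete math exam, we even worked until 4 a.m. to beat the competition deadline.
We pulled quite a few all-nighters,
all for this... DSA Final Project!!!!!
From our first submission (75%, 46 seconds, with only command c implemented) to our last submission (100%, 4 seconds),
we went through more than 180 revisions.
From 40-odd seconds down to 10 seconds (and adding -O2 instantly cut it to 4 seconds). From 2249162 bytes down to 1499446 bytes.
From array + binary search to Trie + Binary Search Tree. Looking back, 20 days of hard work are finally... done!!!!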
Students’ Reactions (Selected)
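游靖堂、宋彥頡, 2013.06.27
While coding, we also grew more and more familiar with classes. It's a pity we did not manage to hand-code an AVL Tree to replace the BST. It's a pity we did not learn how powerful -O2 is before the competition ended. It's a pity...
Being able to pull together all the spell checker features like this,
and to quickly code up a Trie, which we had never learned and whose name we did not even know before we started coding,
we can't help feeling pretty awesome ourselves (((((runs away XDDDDDDD
My partner 游靖堂 talks a lot of trash, but he is also really good; many of the ideas came entirely from him.
Also, the "gods" 洪湧 Cheng Yueh Han Han W stayed far ahead the whole time, which motivated us to keep fighting our way up.
Whatever the competition result or the Final Project Report score turns out to be, I think we did our best.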
Students’ Reactions (Selected)
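• I think the most brilliant part is the final project competition. It is truly brilliant: coding for it is downright addictive and makes you want to rewrite again and again, faster and still faster, smaller and still smaller. The shock of being overtaken by other teams even makes you want to give up on the finals of other courses and go all out to write a super-powerful program. (DSA2013)
• The Final Project is just too tempting... final exams are almost here, yet every day I still can't help tweaking it a bit, poking it a bit, beating it a bit. (DSA2013)
• Last year I spent all my time writing this, and then my calculus ended in tears (QQ). (DSA2012)
• I learned a lot; implementing the final project made me think through many things. It felt great. (ML2014)
• As for the final project, I was hit pretty hard: other teams' accuracy was so high, while we, even though we used weka, could not even get the algorithms to finish running. (ML2013)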
Excitement of Competition
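How Stanford Teaches Innovation (史丹佛這樣教創新)
http://www.cw.com.tw/article/article.action?id=5059685
“Sixth, encourage students to compete. Nothing makes people forget to eat and sleep and work around the clock without tiring the way ‘competition’ does. We encourage students to join all kinds of international competitions. Our students have built a solar house, made electric cars and robots, entered the DARPA (Defense Advanced Research Projects Agency) challenges, and also taken part in business plan competitions.”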
Another “In-Class” Competition: KDD Cup
Background
• an annual competition on KDD (knowledge discovery and data mining)
• organized by ACM SIGKDD, starting from 1997, now the most prestigious data mining competition
• usually lasts 3-4 months
• participants include famous research labs (IBM, AT&T) and top universities (Stanford, Berkeley)
KDD Cups: 2008 to 2013 I
2008
• organizer: Siemens
• topic: breast cancer prediction (medical)
• data size: 0.2M
• teams: > 200
• NTU: co-champion with IBM (led by Prof. Shou-De Lin)
2009
• organizer: Orange
• topic: customer behavior prediction (business)
• data size: 0.1M
• teams: > 400
• NTU: 3rd place in the slow track
KDD Cups: 2008 to 2013 II
2010
• organizer: PSLC Data Shop
• topic: student performance prediction (education)
• data size: 30M
• teams: > 100
• NTU: champion and student-team champion
2011
• organizer: Yahoo!
• topic: music preference prediction (recommendation)
• data size: 300M
• teams: > 1000
• NTU: double champions
KDD Cups: 2008 to 2013 III
2012
• organizer: Tencent
• topic: web user behavior prediction (Internet)
• data size: 150M
• teams: > 800
• NTU: champion of track 2
2013
• organizer: Microsoft
• topic: paper-author relationship prediction (text mining)
• data size: 700M
• teams: > 500
• NTU: double champions
KDD Cup 2011
Music Recommendation Systems
• host: Yahoo!
• 11 years of Yahoo! music data
• 2 tracks of competition
• official dates: March 15 to June 30
• 1878 teams submitted to track 1;
1854 teams submitted to track 2
NTU Team for KDD Cup 2011
• 3 faculty members:
Profs. Chih-Jen Lin, Hsuan-Tien Lin and Shou-De Lin
• 1 course
Data Mining and Machine Learning: Theory and Practice
• 3 TAs and 19 students:
most were inexperienced in music recommendation in the beginning
• official classes: April to June;
actual classes: December to June
our motto: study state-of-the-art approaches and then creatively improve them
Previously: How Much Did You Like These Movies?
http://www.netflix.com
(1M-dollar competition, 2007 to 2009)
goal: use “movies you’ve rated” to automatically
predict your preferences on future movies
My Other Motivations
I HATE exams
even more than my students...
My Design: Time Line
key dates:
• report due (i.e. overall competition end): as late as possible, often 4 days before I need to submit the scores to NTU
• award ceremony (i.e. early competition end): usually the last class
• announcement: best right after the midterm, but may depend heavily on the TAs’ schedules
• start designing: two or more weeks before the announcement
My Design: Story/Topic
an interesting story makes the competition exciting!
• ML2014:
In this final project, you are going to be part of an exciting machine learning competition. Consider a startup company that features a coming product on the mobile phone. The core of the product is a robust character recognition system... To win the prize, you need to fight for the leading positions on the score board. Then, you need to submit a comprehensive report that describes not only the
recommended approaches, but also the reasoning behind your recommendations. Well, let’s get started!
• more interesting ones:
  • ML2014, ML2013: optical character recognition
  • ML2012: ad click prediction (derived from KDD Cup 2012)
  • DSA2014, DSA2012: email searcher
  • DSA2013, DSA2011: spell checker
• often okay to reuse a topic with modifications
My Design: Team Size
• the ideal team size IMHO is 3:
  • collaborative, dispute resolution, fewer free riders, etc.
  • but can also allow 4 if the class is too big for the TAs to grade
• usually allow ≤ 3:
  • so students do not have the burden of finding exactly 3
  • students can flexibly break up teams if needed
  • but evaluate with the workload of 3 for fairness
• still sometimes hard for some students to find team members:
  • motto: provide a matching mechanism, but do not force anyone onto any team
• prevent free riders: require a workload distribution in the report
My Design: Scoreboard
• the core place that makes the game exciting
• thanks to my TAs in all those years for creating and maintaining the service
• basically, a simple submit-judge-scoreboard system
• usually provide the students an additional description field to interact, though few use it for serious purposes
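The submit-judge-scoreboard loop can be sketched in a few lines of Python. This is only a minimal illustration, not the TAs' actual service: the class name `Scoreboard`, the plain-accuracy metric, and the rule of keeping each team's best score are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Scoreboard:
    """Toy submit-judge-scoreboard loop (hypothetical design)."""

    answers: list                              # ground-truth labels kept on the server
    best: dict = field(default_factory=dict)   # team -> (best score, description)

    def judge(self, predictions):
        """Score a submission as plain accuracy (an assumed metric)."""
        if len(predictions) != len(self.answers):
            raise ValueError("submission has the wrong number of predictions")
        correct = sum(p == a for p, a in zip(predictions, self.answers))
        return correct / len(self.answers)

    def submit(self, team, predictions, description=""):
        """Judge one submission; keep it if it beats the team's previous best."""
        score = self.judge(predictions)
        if team not in self.best or score > self.best[team][0]:
            self.best[team] = (score, description)
        return score

    def ranking(self):
        """Teams sorted from best to worst score; ties broken by team name."""
        return sorted(
            ((team, score, desc) for team, (score, desc) in self.best.items()),
            key=lambda row: (-row[1], row[0]),
        )


board = Scoreboard(answers=[1, 0, 1, 1])
board.submit("team_a", [1, 0, 0, 0], description="baseline")
board.submit("team_b", [1, 0, 1, 0], description="faster, smaller")
board.submit("team_a", [1, 0, 1, 1], description="finally!")
for rank, (team, score, desc) in enumerate(board.ranking(), start=1):
    print(rank, team, f"{score:.2f}", desc)
```

The free-text `description` argument plays the role of the description field mentioned above; a real service would also need authentication, submission limits, and persistent storage.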
My Design: Team Names
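• good (humorous) team names make the competition interesting
  • 我是暴民拍拍肩膀好棒棒<3 (roughly: “I'm part of the mob, pat on the shoulder, well done <3”)
  • 耕者有軒田 (a pun on the land-reform slogan 耕者有其田, “land to the tiller”, using the instructor's given name 軒田)
  • DSAGG (Don’t Submit A Goddamn Garbage)
  • DSA ≠ SAD
  • HTLIN (Have To Learn In NTU)
• “bad” team names?
  • 2014 ITSA online programming contest: 「閉上眼睛深吸氣,想想妹妹就打出來囉」 (a crude innuendo, roughly: “close your eyes, take a deep breath, think of a girl, and out it comes”)
  • I don’t know whether I should “educate” students about this, but so far no students have crossed my line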
My Design: Award Ceremony
• purpose: to add more fun
• light presents (postcards, paper notebooks, etc.)
• some students list their good-performing awards in their resumes
• may serve some educational purposes
• in addition to good-performing awards, can also give interesting awards
ML2012: How Much Overfitting Can We Get?
9472 submissions from 52 teams within 1.5 months...
Award 1: First Submission Award
team                       scoreboard  hidden  algorithm  time
Not Here~ Combo Three!!!   0.5018      0.4998  Random     2012/11/27 20:28:38
Award 4: Happy 2013 Award
team                       scoreboard  hidden  algorithm  time
Minimaximizer              0.7632      0.7407  rwa        2013/01/01 00:00:08
Award 5: Goodbye 2012 Award
team                       scoreboard  hidden  algorithm  time
anything                   0.7704      0.7527  b          2012/12/31 23:59:24
Award 7-8: Hard Working Awards
team                       submission count
A                          1097
anything                   1149
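Awards like these could plausibly be picked straight from the submission log. The sketch below is a hedged illustration only: the award criteria are inferred from the award names and timestamps on the slide (earliest submission overall, first submission of 2013, last submission of 2012, most submissions), and the log entries and team names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical submission log: (team, submission time). Team names are made up.
log = [
    ("team_a", datetime(2012, 11, 27, 20, 28, 38)),
    ("team_b", datetime(2012, 12, 31, 23, 59, 24)),
    ("team_c", datetime(2013, 1, 1, 0, 0, 8)),
    ("team_b", datetime(2013, 1, 2, 9, 15, 0)),
    ("team_b", datetime(2013, 1, 3, 9, 15, 0)),
]

new_year = datetime(2013, 1, 1)


def by_time(row):
    """Sort key: the submission timestamp."""
    return row[1]


# Assumed criteria, inferred from the award names.
first_submission = min(log, key=by_time)
happy_2013 = min((row for row in log if row[1] >= new_year), key=by_time)
goodbye_2012 = max((row for row in log if row[1] < new_year), key=by_time)
hard_working = Counter(team for team, _ in log).most_common(1)[0]

print("First Submission Award:", first_submission[0])
print("Happy 2013 Award:", happy_2013[0])
print("Goodbye 2012 Award:", goodbye_2012[0])
print("Hard Working Award:", hard_working[0], "with", hard_working[1], "submissions")
```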
My Design: Grade
• generally based on the report, not the competition, but the two are correlated
  • too much emphasis on competition ⇒ utilitarianism
  • too little emphasis on competition ⇒ less interesting game
• ask TAs to act as “bosses”: the grading TAs grade qualitatively with letters: A++[210], A+[196], A[186], B+[176], B[166], C+[156], C[146], D+[136], D[126], F+[116], F[76], F-[36], Z[0]
• list basic requirements corresponding to B
  • to get B, students only need to work ≈ usual homework
  • to get more, they need to do more to convince the TAs
• generally “loose” about basic requirements, since most students perform way beyond the basic requirements anyway
• generally a team grade, but adjust individual grades if the workload is unbalanced
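The letter scale can be written down as a small lookup table. In the sketch below, the point values are copied from the slide, but the per-member adjustment rule (moving a member a few steps along the letter ladder when the workload is unbalanced) is purely an assumption for illustration; the slide does not say how individual grades are actually adjusted.

```python
# Letter-to-point table copied from the slide, ordered from best to worst.
POINTS = {
    "A++": 210, "A+": 196, "A": 186,
    "B+": 176, "B": 166,
    "C+": 156, "C": 146,
    "D+": 136, "D": 126,
    "F+": 116, "F": 76, "F-": 36,
    "Z": 0,
}
LADDER = list(POINTS)  # dicts preserve insertion order in Python 3.7+


def adjusted_letter(team_letter, steps_down=0):
    """Hypothetical rule: shift a member's letter along the ladder.

    Positive steps_down lowers the grade, negative raises it;
    the result is clamped to the ends of the ladder.
    """
    index = LADDER.index(team_letter) + steps_down
    index = max(0, min(index, len(LADDER) - 1))
    return LADDER[index]


team_letter = "A"
print(team_letter, POINTS[team_letter])      # the team grade
member_letter = adjusted_letter(team_letter, steps_down=1)
print(member_letter, POINTS[member_letter])  # a member who contributed less
```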
My Design: Loading
• ideal: a bit harder than homework
• estimate: 60 to 90 man-hours to finish the basic requirements (30 man-hours per member)
• sometimes need to adjust the loading of other homework, which is not an easy task
My Design: Coverage
• motto: try not to restrict the tools that students use
• but the competition sometimes needs some restrictions, for fairness and for the focus of the project
  • parallel programming for freshmen?
  • external data for optical character recognition?
• decision criterion: which choice makes most people in the game most excited?
• try being super-flexible in the report to still reward creativity
My Design: TAs
• good TAs’ help is essential. I cannot thank them enough!
  • design, system setup, discussions with students
• unfortunately, NTU cannot pay many TAs
  • many of our TAs are volunteers (they joined entirely of their own free will, even my lab students)
• some TAs even play in the competition (good or bad)
My Design: TAs
always note: TAs are busy!!
My Design: Instructor
my main job: heat up the competition
My Design: Instructor
my two other jobs:
• participate seriously in the design
• maintain fairness of the competition
Less Successful Stories
• DSA2015: announced late, hard homework
• DSA2011: decided to do a final project too late
• ML2013: task too easy in some sense
Some Summary Thoughts
Positive Side
• fun for most students, TAs and instructor
• students, TAs and instructor learn a lot
Negative Side
• exhausting for most students, TAs and instructor
• can be disappointing for some students
Questions and Discussions?