(1)

Teaching Machine Learning to a Diverse Audience:

the Foundation-based Approach

Hsuan-Tien Lin, National Taiwan University
Malik Magdon-Ismail, Rensselaer Polytechnic Institute
Yaser S. Abu-Mostafa, California Institute of Technology

Teaching Machine Learning Workshop @ ICML 2012, June 30, 2012


(2)

Diversity in ML classes

NTU ML 2011 Fall (77 students)

background diversity and “maturity” diversity

“maturity”: junior: 8, senior: 20, master: 44, PhD: 5

similarly diverse at RPI and at Caltech (online course) [1]

challenge: serving CS students while accommodating the needs of a diverse non-CS audience

mindset of the audience?

[1] http://work.caltech.edu/telecourse

(3)

Observed Mindsets of the Diverse Audience

highly motivated to learn

—not satisfied with only shallow comic-book stories

often with minimum but non-empty math/programming background

—capable of downloading and trying the latest packages

words of a student from industry (Caltech online course 2012) [2]

demand: solid foundation (and better intuition)!

[2] http://book.caltech.edu/bookforum/showthread.php?p=3107


(4)

Our Proposed Teaching Approach

foundation-based, and foundation-first

then, complement the foundation with a couple of useful algorithms/techniques

comparison to techniques-based

techniques-based: hops through the forest of many latest-and-greatest techniques

foundation-based: illustrate the map (core) first to prevent getting lost in the forest

foundation-based:

prepare students for easy learning of untaught/future techniques

(5)

Our Proposed Teaching Approach [Cont.]

foundation-based, and foundation-first

then, complement the foundation with a couple of useful algorithms/techniques

comparison to foundation-later

foundation-later:

first, techniques to raise interest

then, foundations to consolidate understanding

foundation-first: build the basis (core) first to perceive the techniques from the right angle

foundation-first:

let students know when and how to use the powerful tools before getting addicted to the power


(6)

Our Proposed Foundation: Three Concepts

understand learnability, approximation and generalization

—when can we learn and what are the tradeoffs?

—conducting machine learning properly

use simple models first

—the linear model coupled with some nonlinear transforms is typically enough for most applications

—conducting machine learning safely

deal with noise and overfitting carefully

—how to tackle the “dark side” of learning?

—conducting machine learning professionally

our experience: worth starting with those foundations, even for a diverse audience

(7)

learnability, approximation & generalization

—conducting machine learning properly

good learning (test performance)

= good approximation (training performance) + good generalization (complexity penalty)

a must-teach key message

can be illustrated in different forms (e.g. VC bound, bias-variance, even human-learning philosophy)

make learning non-trivial and fascinating to students
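one standard way to write this key message is the VC bound (a generic textbook form, holding with probability at least 1 − δ; the exact bound shown in class may differ):

\[
E_{\text{out}}(g) \;\le\;
\underbrace{E_{\text{in}}(g)}_{\text{approximation}}
\;+\;
\underbrace{\sqrt{\tfrac{8}{N}\,\ln\tfrac{4\,m_{\mathcal{H}}(2N)}{\delta}}}_{\text{generalization (complexity penalty)}}
\]

where N is the number of training examples and m_H is the growth function of the hypothesis set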


(8)

learnability, approximation & generalization

—conducting machine learning properly [Cont.]

wrong use of learning (beginner’s mistakes)

ensure good approximation, pray for good generalization

—praying for something out-of-control

right use of learning

ensure good generalization, try best for good approximation

—trying something possibly in-control

We cannot guarantee learning. We can “guarantee” no disasters. That is, after we learn, we will either declare success or failure, and in both cases we will be right.

(9)

linear models

—conducting machine learning safely

linear models

= good generalization

with established optimization tools for good approximation

after knowing approximation/generalization:

a good stage for learning safe techniques

sufficiently useful for many practical problems (Yuan et al., 2012)

building block in sophisticated techniques through feature transforms

make learning concrete to students
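a minimal sketch of the “linear model + feature transform” building block, assuming only NumPy (the toy circular data and the 2nd-order transform are illustrative placeholders, not the course's exact example):

import numpy as np

def transform(X):
    # 2nd-order polynomial transform: (1, x1, x2, x1^2, x1*x2, x2^2)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2, x1 * x2, x2 ** 2])

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sign(X[:, 0] ** 2 + X[:, 1] ** 2 - 0.6)   # circular target: not linearly separable in X

Z = transform(X)                                 # ...but (nearly) linear in the transformed space
w = np.linalg.pinv(Z) @ y                        # linear regression via the pseudo-inverse
print("training 0/1 error:", np.mean(np.sign(Z @ w) != y))

the same linear machinery does all the work; only the representation of the input changes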


(10)

linear models

—conducting machine learning safely [Cont.]

wrong use of learning (beginner’s mistakes)

start with the “greatest” techniques first

—a point of no return

right use of learning

start with the simplest techniques first

—and yes, it can work well

a rich and representative family of linear techniques:

classification: approx. combinatorial optimization (perceptron-like)

regression: analytic optimization (pseudo-inverse)

logistic regression: iterative optimization (SGD)

Students coming from diverse backgrounds get not only the big picture but also the finer details in a concrete setting.
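a minimal sketch of the other two family members, again assuming only NumPy (data, iteration counts, and the SGD step size are illustrative placeholders):

import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.uniform(-1, 1, size=(100, 2))])  # prepend a bias coordinate
y = np.sign(X @ np.array([0.1, 1.0, -1.0]))                             # linearly separable toy labels

# classification: perceptron-like, fix one misclassified point per step
w_pla = np.zeros(3)
for _ in range(1000):
    mistakes = np.where(np.sign(X @ w_pla) != y)[0]
    if len(mistakes) == 0:
        break
    i = rng.choice(mistakes)
    w_pla += y[i] * X[i]

# logistic regression: iterative optimization with stochastic gradient descent
w_log, eta = np.zeros(3), 0.1
for _ in range(5000):
    i = rng.integers(len(X))
    grad = -y[i] * X[i] / (1.0 + np.exp(y[i] * (X[i] @ w_log)))  # gradient of ln(1 + exp(-y w·x))
    w_log -= eta * grad

print("perceptron mistakes:", np.sum(np.sign(X @ w_pla) != y))
print("logistic 0/1 training error:", np.mean(np.sign(X @ w_log) != y))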

(11)

deal with noise and overfitting

—conducting machine learning professionally

overfit = difficult to ensure good generalization/learning with stochastic or deterministic noise on finite data

regularization = tools for further guaranteeing good generalization

validation = tools for certifying good learning

the degree of overfitting depends on both data size and noise level

turn amateur students into professionals

make learning artistic to students
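in symbols, one common instantiation of the two tools (weight-decay regularization and a hold-out validation estimate; λ, the validation set D_val of size K, and the pointwise error e are generic placeholders):

\[
\min_{\mathbf{w}}\; E_{\text{aug}}(\mathbf{w}) = E_{\text{in}}(\mathbf{w}) + \frac{\lambda}{N}\,\mathbf{w}^{\mathsf{T}}\mathbf{w},
\qquad
E_{\text{val}}(g^{-}) = \frac{1}{K}\sum_{(\mathbf{x}_k,\,y_k)\in\mathcal{D}_{\text{val}}} e\!\left(g^{-}(\mathbf{x}_k),\,y_k\right)
\]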


(12)

deal with noise and overfitting

—conducting machine learning professionally [Cont.]

wrong use of learning (beginner’s mistakes)

apply all possible techniques and choose by best approximation result

—high risk of overfitting

right use of learning

apply a reasonable number of well-regularized techniques and choose by best validation result

—relatively immune to noise and overfitting

Complex situations call for simpler models.
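a minimal sketch of “choose by best validation result”, assuming only NumPy; the candidates (ridge-regularized linear classifiers over a few λ values) and the synthetic data are illustrative placeholders:

import numpy as np

def ridge_fit(Z, y, lam):
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)   # well-regularized linear model

def zero_one_error(Z, y, w):
    return np.mean(np.sign(Z @ w) != y)

rng = np.random.default_rng(2)
Z = np.column_stack([np.ones(200), rng.normal(size=(200, 5))])
y = np.sign(Z @ rng.normal(size=6) + 0.3 * rng.normal(size=200))  # noisy linear target

train, val = slice(0, 150), slice(150, 200)                       # hold out a validation set
results = [(zero_one_error(Z[val], y[val], ridge_fit(Z[train], y[train], lam)), lam)
           for lam in (0.0, 0.01, 0.1, 1.0, 10.0)]
best_error, best_lam = min(results)
print(f"chosen lambda = {best_lam}, validation error = {best_error:.2f}")

the selection step looks only at the validation error, never at the training (approximation) error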

(13)

Teaching/Learning Life After the Foundations

Support Vector Machine

generalization: large-margin bound
approximation: quadratic programming
linear model: basic formulation
feature transform: through kernel
regularization: large-margin
validation: #-SV bound

Neural Network

generalization: #-neuron bound
approximation: gradient descent et al.
linear model: neurons
feature transform: through cascading
regularization: weight-decay or early-stopping
validation: for choices in regularization

[libsvm-2.9]$ ./svm-train -t 2 -g 0.05 -c 100 heart_scale
optimization finished, #iter = 1966
Total nSV = 113

good approximation (by choosing kernel and optimization)

good generalization (by regularization)

good learning (by using #SV as validation indicator)
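in the libsvm call above, -t 2 selects the RBF kernel, -g sets the kernel parameter γ, and -c sets the soft-margin cost C; a rough scikit-learn equivalent, sketched under the assumption that the same heart_scale file is available locally (the support-vector count may differ slightly from libsvm's output):

from sklearn.datasets import load_svmlight_file
from sklearn.svm import SVC

X, y = load_svmlight_file("heart_scale")        # same libsvm-format data file as above
model = SVC(kernel="rbf", gamma=0.05, C=100)    # -t 2, -g 0.05, -c 100
model.fit(X, y)
print("total number of support vectors:", len(model.support_))   # cf. "Total nSV = 113"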


(14)

Teaching/Learning Life After the Foundations [Cont.]

Caltech 2012: (mixed) 7 weeks of foundations, 0.5 week of NNet, 0.5 week of RBF Net, 1 week of SVM

NTU 2011: (sequential) 10 weeks of foundations, 2.5 weeks of SVM, 2.5 weeks of bagging/boosting

—with an in-class data mining competition [3] where students exploited taught/not-taught techniques with ease

often incremental efforts to teach/learn a new technique after solid foundations

[3] http://main.learner.csie.ntu.edu.tw/php/ml11fall/

(15)

Conclusion

foundation-based, foundation-first

—works well in our experience

learnability: philosophical understanding, make learning non-trivial, conduct learning properly

linear models: algorithmic modeling, make learning concrete, conduct learning safely

overfitting: practical tuning, make learning artistic, conduct learning professionally

Thank you. Questions?

