Finale: Machine Learning in Practice

## NTU KDDCup 2011 Track 1 World Champion Model

"A linear ensemble of individual and blended models for music rating prediction," Chen et al., KDDCup 2011

NNet, DecTree-like, and then linear blending of:

- Matrix Factorization variants, including probabilistic PCA
- Restricted Boltzmann Machines: an ‘extended’ autoencoder
- k Nearest Neighbors
- Probabilistic Latent Semantic Analysis: an extraction model that has ‘soft clusters’ as hidden variables
- linear regression, NNet, & GBDT

**yes, you can ‘easily’ understand everything! :-)**
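The Matrix Factorization component of such a rating-prediction pipeline can be sketched as plain SGD over the observed entries of a user-by-item rating matrix. Everything below (the toy matrix, latent dimension, step size, penalty) is illustrative, not the champion model's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-by-song rating matrix; 0 marks an unobserved rating.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
users, items = np.nonzero(R)          # indices of observed ratings

d, lam, eta = 2, 0.02, 0.01           # latent dim, L2 penalty, step size
W = 0.1 * rng.standard_normal((R.shape[0], d))   # user factors
V = 0.1 * rng.standard_normal((R.shape[1], d))   # item factors

for _ in range(2000):                 # SGD passes over observed entries
    for u, i in zip(users, items):
        err = R[u, i] - W[u] @ V[i]   # residual for this rating
        W[u], V[i] = (W[u] + eta * (err * V[i] - lam * W[u]),
                      V[i] + eta * (err * W[u] - lam * V[i]))

pred = W @ V.T                        # dense matrix of predicted ratings
```

The unobserved entries of `pred` are the model's rating predictions; the champion system used far richer variants (biases, implicit feedback, probabilistic formulations) on top of this skeleton.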

## NTU KDDCup 2012 Track 2 World Champion Model

"A two-stage ensemble of diverse models for advertisement ranking in KDD Cup 2012," Wu et al., KDDCup 2012

NNet, GBDT-like, and then linear blending of:

- Linear Regression variants, including linear SVR
- Logistic Regression variants
- Matrix Factorization variants
- . . .

the ‘key’ is to **blend properly without overfitting**
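The "blend properly without overfitting" point can be made concrete: fit the blending weights on predictions over a held-out validation set, never on the data the base models were trained on. A minimal numpy sketch with simulated base-model predictions (all numbers are stand-ins, not the champion pipeline):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in setup: T base models, each already trained elsewhere,
# give predictions on a held-out validation set and on the test set.
n_val, n_test, T = 200, 100, 3
y_val = rng.standard_normal(n_val)
y_test = rng.standard_normal(n_test)
# each column = one base model's predictions: truth + its own noise
G_val = y_val[:, None] + 0.3 * rng.standard_normal((n_val, T))
G_test = y_test[:, None] + 0.3 * rng.standard_normal((n_test, T))

# Fit blend weights by ridge-regularized least squares on the
# VALIDATION predictions only; the base models never saw y_val,
# which is what keeps the blend from overfitting to training noise.
lam = 1e-3
alpha = np.linalg.solve(G_val.T @ G_val + lam * np.eye(T),
                        G_val.T @ y_val)
blend_test = G_test @ alpha           # blended test predictions
```

Because each base model contributes independent noise, the blend averages the noise away and typically beats every individual model on test error; the regularizer `lam` guards against the blend itself overfitting the validation set.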


## NTU KDDCup 2013 Track 1 World Champion Model

"Combination of feature engineering and ranking models for paper-author identification in KDD Cup 2013," Li et al., KDDCup 2013

linear blending of:

- Random Forest with many, many, many trees
- GBDT variants

with tons of effort in designing features;
‘another key’ is to **construct features with domain knowledge**
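The "construct features with domain knowledge" lesson means turning each (author, paper) candidate pair into scalar features a human expert would look at: name agreement, affiliation overlap, productivity counts. A minimal sketch, assuming hypothetical record fields (the real KDD Cup 2013 data uses a different schema):

```python
# Hypothetical (author, paper) records: the field names below are
# illustrative, NOT the actual KDD Cup 2013 schema.
def author_paper_features(author, paper):
    """Turn one (author, paper) candidate pair into scalar features."""
    names = [n.lower() for n in paper["author_names"]]
    return {
        # exact / partial name agreement
        "name_exact": int(author["name"].lower() in names),
        "last_name_match": int(any(author["name"].split()[-1].lower() in n
                                   for n in names)),
        # affiliation keyword overlap between author and paper
        "affil_overlap": len(set(author["affiliation"].lower().split())
                             & set(paper["affiliations"].lower().split())),
        # simple priors: author productivity, paper co-author count
        "n_papers": len(author["paper_ids"]),
        "coauthor_count": len(paper["author_names"]) - 1,
    }

author = {"name": "Ada Lovelace", "affiliation": "University of London",
          "paper_ids": [7, 12, 31]}
paper = {"author_names": ["Ada Lovelace", "Charles Babbage"],
         "affiliations": "University of London"}
f = author_paper_features(author, paper)
```

Feature vectors like this one then feed the Random Forest and GBDT rankers; the winning teams' actual feature lists were far longer, but each entry followed this pattern of encoding one piece of domain intuition.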


## ICDM 2006 Top 10 Data Mining Algorithms

1. C4.5: another **decision tree**
2. k-Means
3. SVM
4. Apriori: for frequent itemset mining
5. EM: an **‘alternating optimization’** algorithm for some models
6. PageRank: for link analysis, similar to **matrix factorization**
7. AdaBoost
8. k Nearest Neighbor
9. Naive Bayes: a simple **linear model** with ‘weights’ decided by data statistics
10. C&RT

personal view of five missing ML competitors:
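The claim in item 9, that Naive Bayes is a linear model with weights decided by data statistics, can be verified directly: for binary features, the Bernoulli Naive Bayes log-odds is an affine function of x, and every coefficient comes from class-conditional frequency counts. A small self-contained sketch (toy data, Laplace smoothing assumed):

```python
import numpy as np

# Bernoulli Naive Bayes over binary features reduces to a linear
# classifier sign(x @ w + b), with w and b set purely by counts.
def nb_as_linear(X, y, alpha=1.0):
    stats = {}
    for c in (0, 1):
        Xc = X[y == c]
        p = (Xc.sum(axis=0) + alpha) / (len(Xc) + 2 * alpha)  # P(x_i=1|y=c), smoothed
        stats[c] = (p, len(Xc) / len(X))                      # (feature probs, prior)
    (p1, pi1), (p0, pi0) = stats[1], stats[0]
    # expand log P(y=1|x) - log P(y=0|x) and collect terms in x:
    w = np.log(p1 / p0) - np.log((1 - p1) / (1 - p0))
    b = np.log(pi1 / pi0) + np.log((1 - p1) / (1 - p0)).sum()
    return w, b

X = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0]])
y = np.array([1, 1, 0, 0])
w, b = nb_as_linear(X, y)
pred = (X @ w + b > 0).astype(int)    # recovers the training labels here
```

Note that no iterative optimization happens anywhere: `w` and `b` are closed-form functions of the count statistics, which is exactly what "weights decided by data statistics" means.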