
Experimental Studies of Selective Sampling

We follow the same setup as in Section 3.5. For IWAL, we again use the SVM as the learning algorithm, and we tune the parameter C0 to 2, which makes IWAL perform better than the original setting of C0 = 8 used by Beygelzimer et al. (2010).
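
To illustrate how the importance weighting in IWAL interacts with an SVM learner, the sketch below trains a weighted SVM on the queried examples. This is a minimal sketch, not the authors' code: it assumes scikit-learn's SVC and simply gives each queried example the weight 1/p, where p is the probability with which it was queried, so that the weighted training set remains an unbiased sample of the underlying distribution.

```python
import numpy as np
from sklearn.svm import SVC

def train_weighted_svm(X_queried, y_queried, query_probs, C=1.0):
    """Fit an SVM on queried examples, weighting each example by 1/p."""
    weights = 1.0 / np.asarray(query_probs, dtype=float)  # importance weights
    clf = SVC(kernel="rbf", C=C)
    clf.fit(np.asarray(X_queried), np.asarray(y_queried), sample_weight=weights)
    return clf
```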

The results are shown in Figure 5. Clearly, IWAL generally performs worse than ALHS-SVM under the pool-based setup. One possible reason for the large difference is that, through the pool-to-stream simulation, IWAL can access only one instance per iteration rather than the whole unlabeled pool. Such partial information makes it difficult to query useful instances.
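
To make the pool-to-stream simulation concrete, here is a minimal sketch under stated assumptions: query_probability stands in for the underlying IWAL rejection-sampling rule and label_oracle for the human annotator; both are hypothetical handles, not names from the paper. The key point is that the learner sees the shuffled pool one instance at a time and never the rest of the pool.

```python
import random

def pool_to_stream_iwal(pool, label_oracle, query_probability, budget):
    """Simulate a stream from an unlabeled pool and run IWAL-style sampling."""
    queried, labels, probs = [], [], []
    stream = list(pool)
    random.shuffle(stream)                  # fix an arbitrary stream order
    for x in stream:
        if budget == 0:                     # stop once the label budget is spent
            break
        p = query_probability(x, queried, labels, probs)
        if random.random() < p:             # query with probability p
            queried.append(x)
            labels.append(label_oracle(x))  # ask the annotator for the label
            probs.append(p)                 # keep p for the 1/p importance weight
            budget -= 1
    return queried, labels, probs
```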

To make a fairer comparison with IWAL, we further design another algorithm, IWAL-Pool. In each iteration, IWAL-Pool estimates the query probability pr(x̃) of every instance x̃ in the unlabeled pool and queries the instance with the highest probability. The results are also presented in Figure 5.
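
Under the same assumptions as above, the selection step of IWAL-Pool can be sketched as follows: evaluate the query probability on every pool instance and return the argmax, instead of flipping a coin on a single streamed instance.

```python
def iwal_pool_select(unlabeled_pool, query_probability, queried, labels, probs):
    """Return the pool instance with the highest estimated query probability."""
    scores = [query_probability(x, queried, labels, probs) for x in unlabeled_pool]
    best = max(range(len(unlabeled_pool)), key=scores.__getitem__)
    return best, scores[best]               # index and score of the next query
```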

We see that IWAL-Pool is competitive with IWAL and significantly outperforms it on diabetes and letterV vs Y. Nevertheless, ALHS-SVM, with its simple yet direct use of the unlabeled pool, generally remains better than IWAL-Pool. The results demonstrate the importance of acquiring direct information about the unlabeled pool in the pool-based setup. They also highlight the difference between the pool-based and stream-based setups in active learning.

Figure 5: Comparison between IWAL, IWAL-Pool, and ALHS-SVM on different datasets (accuracy versus number of queried instances): (a) australian, (b) diabetes, (c) german, (d) letterM vs N, (e) letterV vs Y, (f) segment, (g) splice, (h) wdbc

6 Conclusion

We propose a new framework for active learning, hinted sampling, which exploits unlabeled instances as hints. Hinted sampling takes both uncertainty and representativeness into account concurrently, in a more natural and simpler way than existing approaches. Within this framework, we design a novel active learning algorithm, ALHS, and couple it with a promising hint selection strategy. Because ALHS models representativeness through hints, it avoids the potential problems of the more sophisticated approaches employed by other representative sampling algorithms. As a result, ALHS achieves significantly better and more stable performance than other state-of-the-art algorithms, and it can be used to immediately improve SVM-based uncertainty sampling and TSVM-based representative sampling. Moreover, compared with selective sampling algorithms that take both uncertainty and representativeness into account, ALHS also performs better. This not only justifies the effectiveness of ALHS once more but also demonstrates the importance of considering the whole unlabeled pool during the querying stage of pool-based active learning.

Given the simplicity and effectiveness of hinted sampling, the framework deserves further study. One promising research direction is to couple hinted sampling with other classification algorithms and to investigate hint selection strategies more deeply. While we use the SVM in ALHS, the framework can be generalized to other classification algorithms. In the future, we plan to investigate more general hint selection strategies and to extend hinted sampling from binary classification to other classification problems.

Acknowledgments

A preliminary version of this paper appeared in the Asian Conference on Machine Learning 2012. We thank the reviewers of the conference, as well as the reviewers of all versions of this paper, for their many useful suggestions. This research has been supported by the National Science Council of Taiwan via NSC 101-2628-E-002-029-MY2.

References

Abu-Mostafa, Y. S. (1995). Hints. Neural Computation, 7:639–671.

Balcan, M.-F., Beygelzimer, A., and Langford, J. (2006). Agnostic active learning. In Proceedings of the 23rd International Conference on Machine Learning.

Bennett, K. P. and Demiriz, A. (1998). Semi-supervised support vector machines. In Advances in Neural Information Processing Systems 11, pages 368–374.

Beygelzimer, A., Hsu, D., Langford, J., and Zhang, T. (2010). Agnostic active learning without constraints. In Advances in Neural Information Processing Systems 23.

Cao, H., Bao, C., Li, X., and Woon, D. Y.-K. (2014). Class augmented active learning. In Proceedings of the SIAM International Conference on Data Mining.

Chang, C.-C. and Lin, C.-J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, pages 27:1–27:27.

Chen, T., Chen, Y., Guo, Q., Zhou, Z.-H., Li, L., and Xu, Z. (2014). Effective and efficient microprocessor design space exploration using unlabeled design configurations. ACM Transactions on Intelligent Systems and Technology.

Cohn, D., Atlas, L., and Ladner, R. (1994). Improving generalization with active learning. Machine Learning.

Cohn, D. A., Ghahramani, Z., and Jordan, M. I. (1996). Active learning with statistical models. Journal of Artificial Intelligence Research, 4:129–145.

Dasgupta, S. (2011). Two faces of active learning. Theoretical Computer Science.

Dasgupta, S. and Hsu, D. (2008). Hierarchical sampling for active learning. In Proceedings of the 25th International Conference on Machine Learning, pages 208–215.

Dasgupta, S., Hsu, D., and Monteleoni, C. (2007). A general agnostic active learning algorithm. In Advances in Neural Information Processing Systems 20.

Donmez, P., Carbonell, J. G., and Bennett, P. N. (2007). Dual strategy active learning. In Proceedings of the 18th European Conference on Machine Learning, pages 116–127.

Frank, A. and Asuncion, A. (2010). UCI machine learning repository.

Guo, Y. and Greiner, R. (2007). Optimistic active learning using mutual information. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 823–829.

Hanneke, S. (2007). Teaching dimension and the complexity of active learning. In Proceedings of the Annual Conference on Learning Theory.

Hoi, S. C. H., Jin, R., Zhu, J., and Lyu, M. R. (2008). Semi-supervised SVM batch mode active learning for image retrieval. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1–7.

Huang, S.-J., Jin, R., and Zhou, Z.-H. (2010). Active learning by querying informative and representative examples. In Advances in Neural Information Processing Systems 23, pages 892–900.

Joachims, T. (1999a). Making large-scale support vector machine learning practical. In Advances in Kernel Methods: Support Vector Learning.

Joachims, T. (1999b). Transductive inference for text classification using support vector machines. In Proceedings of the 16th International Conference on Machine Learning.

Langford, J. and Zhang, T. (2007). The epoch-greedy algorithm for contextual multi-armed bandits. In Advances in Neural Information Processing Systems 20.

Lewis, D. D. and Gale, W. A. (1994). A sequential algorithm for training text classi-fiers. In Proceedings of the 17th ACM International Conference on Research and Development in Information Retrieval, pages 3–12.

Li, C.-L., Ferng, C.-S., and Lin, H.-T. (2012). Active learning with hinted support vector machine. In Proceedings of the 4th Asian Conference on Machine Learning.

Melville, P. and Mooney, R. J. (2004). Diverse ensembles for active learning. In Proceedings of the 21st International Conference on Machine Learning, pages 584–591.

Nguyen, H. T. and Smeulders, A. (2004). Active learning using pre-clustering. In Proceedings of the 21st International Conference on Machine Learning, pages 623–630.

Gu, Q., Zhang, T., and Han, J. (2014). Batch-mode active learning via error bound minimization. In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence.
