DOI: 10.1145/1390156.1390303
research-article

Deep learning via semi-supervised embedding

Published: 05 July 2008

Abstract

We show how nonlinear embedding algorithms, popular for use with shallow semi-supervised learning techniques such as kernel methods, can be applied to deep multilayer architectures, either as a regularizer at the output layer or on each layer of the architecture. This provides a simple alternative to existing approaches to deep learning while yielding error rates competitive with those methods and with existing shallow semi-supervised techniques.
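
The abstract does not spell out the objective, so the following is only a minimal sketch of the general idea: a supervised loss on labeled examples plus a weighted embedding penalty on pairs of examples that need no labels. It assumes a margin-based contrastive pair loss of the kind used in siamese networks. The function names, the hinge loss, the weight `lam`, and the toy `embed` and `classify` callables are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def embedding_loss(f_i, f_j, is_neighbor, margin=1.0):
    """Contrastive pair loss (assumed form, not taken from the abstract).

    Neighbors are pulled together; non-neighbors are pushed apart
    until they are at least `margin` away from each other.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_i, f_j)))
    if is_neighbor:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


def hinge_loss(score, label):
    """Standard hinge loss for a label in {-1, +1} (illustrative choice)."""
    return max(0.0, 1.0 - label * score)


def total_loss(labeled, pairs, embed, classify, lam=1.0, margin=1.0):
    """Supervised loss plus `lam` times the embedding regularizer.

    labeled: list of (x, y) with y in {-1, +1}
    pairs:   list of (x_a, x_b, is_neighbor); labels are not required
    embed:   maps x to the representation being regularized
    classify: maps x to a real-valued score
    """
    supervised = sum(hinge_loss(classify(x), y) for x, y in labeled)
    regularizer = sum(
        embedding_loss(embed(a), embed(b), w, margin) for a, b, w in pairs
    )
    return supervised + lam * regularizer


# Toy stand-ins for a network: identity embedding, first coordinate as score.
toy_embed = lambda x: x
toy_classify = lambda x: x[0]
```

In this sketch, passing the network's output as `embed` would correspond to regularizing at the output layer; summing the same penalty over each hidden layer's activations would correspond to the per-layer variant mentioned in the abstract.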


Published In

ICML '08: Proceedings of the 25th international conference on Machine learning
July 2008
1310 pages
ISBN: 9781605582054
DOI: 10.1145/1390156

Sponsors

  • Pascal
  • University of Helsinki
  • Xerox
  • Federation of Finnish Learned Societies
  • Google Inc.
  • NSF
  • Machine Learning Journal/Springer
  • Microsoft Research
  • Intel
  • Yahoo!
  • Helsinki Institute for Information Technology
  • IBM

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

ICML '08
Sponsor:
  • Microsoft Research
  • Intel
  • IBM

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Cited By

  • (2025) Exploring the Chameleon Effect of Contextual Dynamics in Temporal Knowledge Graph for Event Prediction. Tsinghua Science and Technology, 30:1 (433-455). DOI: 10.26599/TST.2024.9010067. Online publication date: Feb-2025
  • (2024) Data Augmentation and Graph Regularization for Adversarial Training. Graph Theory - A Comprehensive Guide [Working Title]. DOI: 10.5772/intechopen.1006511. Online publication date: 3-Oct-2024
  • (2024) Graph Adaptive Attention Network with Cross-Entropy. Entropy, 26:7 (576). DOI: 10.3390/e26070576. Online publication date: 4-Jul-2024
  • (2024) Labeling small-degree nodes promotes semi-supervised community detection on graph convolutional network. The European Physical Journal B, 97:11. DOI: 10.1140/epjb/s10051-024-00817-x. Online publication date: 16-Nov-2024
  • (2024) Semi-Supervised and Unsupervised Deep Visual Learning: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46:3 (1327-1347). DOI: 10.1109/TPAMI.2022.3201576. Online publication date: Mar-2024
  • (2024) Refining Euclidean Obfuscatory Nodes Helps: A Joint-Space Graph Learning Method for Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 35:9 (11720-11733). DOI: 10.1109/TNNLS.2024.3405898. Online publication date: Sep-2024
  • (2024) A Novel Composite Graph Neural Network. IEEE Transactions on Neural Networks and Learning Systems, 35:10 (13411-13425). DOI: 10.1109/TNNLS.2023.3268766. Online publication date: Oct-2024
  • (2024) Reconstructed Graph Constrained Auto-Encoders for Multi-View Representation Learning. IEEE Transactions on Multimedia, 26 (1319-1332). DOI: 10.1109/TMM.2023.3279988. Online publication date: 1-Jan-2024
  • (2024) A Novel Semi-Supervised Learning Model for Smartphone-Based Health Telemonitoring. IEEE Transactions on Automation Science and Engineering, 21:1 (428-441). DOI: 10.1109/TASE.2022.3218132. Online publication date: Jan-2024
  • (2024) GReAT: A Graph Regularized Adversarial Training Method. IEEE Access, 12 (63130-63141). DOI: 10.1109/ACCESS.2024.3395976. Online publication date: 2024
