Visual-based Deep Learning for Clothing from Large Database

Published: 07 October 2015
DOI: 10.1145/2818869.2818902

Abstract

Mining information from Big Data can yield large benefits. Conventional machine learning techniques and computational analysis are limited in their ability to analyze large volumes of consumption-behavior data, and this becomes a critical problem as Big Data is examined. Furthermore, now that pictures have become a core content component on the Internet, powerful visual-based analytics tools are needed. In this study, we therefore explore deep learning with convolutional neural networks to address clothing style classification and retrieval. To reduce training complexity, we incorporate transfer learning by fine-tuning models pre-trained on large-scale datasets. Because any given deep net has a vast number of parameters, we design an architecture inspired by AdaBoost that uses multiple deep nets, each trained on a sub-dataset. Training can thus be accelerated if each net is computed on one client node in a distributed computing environment. Moreover, to increase system flexibility, we propose two architectures built from multiple two-output deep nets for binary-class classification, so that when new classes are added, no additional computation over all the training data is needed. Experiments compare the proposed system with existing systems that use hand-crafted features and conventional learning models. According to the results, the proposed system provides significant improvements in style classification on three public clothing datasets.
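
The idea of keeping one binary (two-output) model per class, so that a new class can be added without retraining the existing models, can be illustrated with a small sketch. The abstract does not give the architecture in detail, so everything below is an assumption made for illustration: `BinaryNet` is a hypothetical logistic-regression stand-in for a deep net, `BinaryNetBank` and `make_cluster` are invented names, and the three-dimensional "features" are toy data rather than CNN activations.

```python
import math
import random


class BinaryNet:
    """Hypothetical stand-in for one two-output deep net: a logistic scorer."""

    def __init__(self, dim, lr=0.5, epochs=200, seed=0):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr
        self.epochs = epochs
        self.rng = random.Random(seed)

    def score(self, x):
        """Estimated probability that x belongs to this net's class."""
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, xs, ys):
        """Plain stochastic gradient descent on the logistic loss."""
        order = list(range(len(xs)))
        for _ in range(self.epochs):
            self.rng.shuffle(order)
            for i in order:
                err = self.score(xs[i]) - ys[i]
                self.w = [wi - self.lr * err * xi
                          for wi, xi in zip(self.w, xs[i])]
                self.b -= self.lr * err
        return self


class BinaryNetBank:
    """One binary net per class; adding a class trains one new net only."""

    def __init__(self, dim):
        self.dim = dim
        self.nets = {}

    def add_class(self, label, positives, negatives):
        # Only the new net is trained; nets already in the bank are untouched.
        xs = positives + negatives
        ys = [1.0] * len(positives) + [0.0] * len(negatives)
        self.nets[label] = BinaryNet(self.dim).fit(xs, ys)

    def predict(self, x):
        # Pick the class whose binary net is most confident.
        return max(self.nets, key=lambda label: self.nets[label].score(x))


def make_cluster(center, n, rng, noise=0.1):
    """Toy features: n points scattered around a class center."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]


rng = random.Random(42)
casual = make_cluster([1.0, 0.0, 0.0], 20, rng)
formal = make_cluster([0.0, 1.0, 0.0], 20, rng)
sporty = make_cluster([0.0, 0.0, 1.0], 20, rng)

bank = BinaryNetBank(dim=3)
bank.add_class("casual", casual, formal)
bank.add_class("formal", formal, casual)

# Snapshot the existing nets, add a new class, and compare.
before = {k: (list(n.w), n.b) for k, n in bank.nets.items()}
bank.add_class("sporty", sporty, casual + formal)
after = {k: (list(bank.nets[k].w), bank.nets[k].b) for k in before}
unchanged = before == after  # existing nets were not retrained
```

In this sketch the nets trained earlier never see the new class as negative examples, so their scores on it are not calibrated. How the paper's two architectures handle this is not stated in the abstract.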

Published In

ASE BD&SI '15: Proceedings of the ASE BigData & SocialInformatics 2015
October 2015
381 pages
ISBN: 9781450337359
DOI: 10.1145/2818869
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Big Data Analytics
  2. Clothing Image Retrieval
  3. Convolution Neural Network
  4. Deep Learning
  5. Style Recognition

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASE BD&SI '15
ASE BD&SI '15: ASE BigData & SocialInformatics 2015
October 7 - 9, 2015
Kaohsiung, Taiwan

Article Metrics

  • Total Citations: 0
  • Total Downloads: 594
  • Downloads (Last 12 months): 22
  • Downloads (Last 6 weeks): 1

Reflects downloads up to 09 Nov 2024
