Imbalanced Learning: Foundations, Algorithms, and Applications
July 2013
Publisher: Wiley-IEEE Press
ISBN: 978-1-118-07462-6
Published: 01 July 2013
Pages: 216
Abstract

The first book of its kind to review the current status and future direction of the exciting new branch of machine learning/data mining called imbalanced learning.

Imbalanced learning focuses on how an intelligent system can learn when it is provided with imbalanced data. Solving imbalanced learning problems is critical in numerous data-intensive networked systems, including surveillance, security, Internet, finance, biomedical, defense, and more. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. The first comprehensive look at this new branch of machine learning, this book offers a critical review of the problem of imbalanced learning, covering the state of the art in techniques, principles, and real-world applications. Featuring contributions from experts in both academia and industry, Imbalanced Learning: Foundations, Algorithms, and Applications provides chapter coverage on:

  • Foundations of Imbalanced Learning
  • Imbalanced Datasets: From Sampling to Classifiers
  • Ensemble Methods for Class Imbalance Learning
  • Class Imbalance Learning Methods for Support Vector Machines
  • Class Imbalance and Active Learning
  • Nonstationary Stream Data Learning with Imbalanced Class Distribution
  • Assessment Metrics for Imbalanced Learning

Imbalanced Learning: Foundations, Algorithms, and Applications will help scientists and engineers learn how to tackle the problem of learning from imbalanced datasets, and gain insight into current developments in the field as well as future research directions.

Cited By

  1. Dong S, Yu T, Farahmand H and Mostafavi A (2020). A hybrid deep learning model for predictive flood warning and situation awareness using channel network sensors data, Computer-Aided Civil and Infrastructure Engineering, 36:4, (402-420), Online publication date: 12-Mar-2021.
  2. Piras L, Boratto L and Ramos G Evaluating the Prediction Bias Induced by Label Imbalance in Multi-label Classification Proceedings of the 30th ACM International Conference on Information & Knowledge Management, (3368-3372)
  3. De Falco I, De Pietro G and Sannino G (2019). Evaluation of artificial intelligence techniques for the classification of different activities of daily living and falls, Neural Computing and Applications, 32:3, (747-758), Online publication date: 1-Feb-2020.
  4. Iosifidis V and Ntoutsi E (2019). Sentiment analysis on big sparse data streams with limited labels, Knowledge and Information Systems, 62:4, (1393-1432), Online publication date: 1-Apr-2020.
  5. Ding H, Wei B, Gu Z, Yu Z, Zheng H, Zheng B and Li J (2019). KA-Ensemble: towards imbalanced image classification ensembling under-sampling and over-sampling, Multimedia Tools and Applications, 79:21-22, (14871-14888), Online publication date: 1-Jun-2020.
  6. Zhu Z, Wang Z, Li D, Du W and Zhang J (2019). Efficient matrixized classification learning with separated solution process, Neural Computing and Applications, 32:14, (10609-10632), Online publication date: 1-Jul-2020.
  7. Kim T, Jung I and Hu Y (2020). Automatic, location-privacy preserving dashcam video sharing using blockchain and deep learning, Human-centric Computing and Information Sciences, 10:1, Online publication date: 26-Aug-2020.
  8. Starovoitov V and Golub Y (2020). New Function for Estimating Imbalanced Data Classification Results, Pattern Recognition and Image Analysis, 30:3, (295-302), Online publication date: 1-Jul-2020.
  9. Liu T, Zhu X, Pedrycz W and Li Z (2020). A design of information granule-based under-sampling method in imbalanced data classification, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 24:22, (17333-17347), Online publication date: 1-Nov-2020.
  10. Chen T, An S, Zhang Y, Ma C, Wang H, Guo X and Zheng W Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets Computer Vision – ECCV 2020, (90-108)
  11. Georgakopoulos S, Tasoulis S, Mallis G, Vrahatis A, Plagianakos V and Maglogiannis I (2020). Change detection and convolution neural networks for fall recognition, Neural Computing and Applications, 32:23, (17245-17258), Online publication date: 1-Dec-2020.
  12. Guzmán-Ponce A, Valdovinos R and Sánchez J A Cluster-Based Under-Sampling Algorithm for Class-Imbalanced Data Hybrid Artificial Intelligent Systems, (299-311)
  13. Heidari A, McGrath J, Ilyas I and Rekatsinas T HoloDetect Proceedings of the 2019 International Conference on Management of Data, (829-846)
  14. Tian F, Wu F, Fei X, Shah N, Zheng Q and Wang Y (2019). Improving generalization ability of instance transfer-based imbalanced sentiment classification of turn-level interactive Chinese texts, Service Oriented Computing and Applications, 13:2, (155-167), Online publication date: 1-Jun-2019.
  15. Yu H, Sun C, Yang X, Zheng S and Zou H (2019). Fuzzy Support Vector Machine With Relative Density Information for Classifying Imbalanced Data, IEEE Transactions on Fuzzy Systems, 27:12, (2353-2367), Online publication date: 1-Dec-2019.
  16. Cao K, Wei C, Gaidon A, Arechiga N and Ma T Learning imbalanced datasets with label-distribution-aware margin loss Proceedings of the 33rd International Conference on Neural Information Processing Systems, (1567-1578)
  17. Benítez-Peña S, Blanquero R, Carrizosa E and Ramírez-Cobo P (2019). On support vector machines under a multiple-cost scenario, Advances in Data Analysis and Classification, 13:3, (663-682), Online publication date: 1-Sep-2019.
  18. Camiña J, Medina-Pérez M, Monroy R, Loyola-González O, Villanueva L and Gurrola L (2019). Bagging-RandomMiner, Machine Vision and Applications, 30:5, (959-974), Online publication date: 1-Jul-2019.
  19. Roy A, Cruz R, Sabourin R and Cavalcanti G (2018). A study on combining dynamic selection and data preprocessing for imbalance learning, Neurocomputing, 286:C, (179-192), Online publication date: 19-Apr-2018.
  20. Sun B, Chen H, Wang J and Xie H (2018). Evolutionary under-sampling based bagging ensemble method for imbalanced data classification, Frontiers of Computer Science: Selected Publications from Chinese Universities, 12:2, (331-350), Online publication date: 1-Apr-2018.
  21. Lu X, Chen M, Wu J, Chang P and Chen M (2018). A novel ensemble decision tree based on under-sampling and clonal selection for web spam detection, Pattern Analysis & Applications, 21:3, (741-754), Online publication date: 1-Aug-2018.
  22. Akkasi A, Varoğlu E and Dimililer N (2018). Balanced undersampling, Applied Intelligence, 48:8, (1965-1978), Online publication date: 1-Aug-2018.
  23. Guo H, Diao X, Liu H and Gutierrez P (2018). Embedding Undersampling Rotation Forest for Imbalanced Problem, Computational Intelligence and Neuroscience, 2018, Online publication date: 1-Jan-2018.
  24. Santhiappan S, Chelladurai J and Ravindran B A novel topic modeling based weighting framework for class imbalance learning Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, (20-29)
  25. Krawczyk B, Minku L, Gama J, Stefanowski J and Woźniak M (2017). Ensemble learning for data stream analysis, Information Fusion, 37:C, (132-156), Online publication date: 1-Sep-2017.
  26. Peng L, Zhang H, Chen Y and Yang B (2017). Imbalanced traffic identification using an imbalanced data gravitation-based classification model, Computer Communications, 102:C, (177-189), Online publication date: 1-Apr-2017.
  27. Márquez-Chamorro A, Resinas M, Ruiz-Cortés A and Toro M (2017). Run-time prediction of business process indicators using evolutionary decision rules, Expert Systems with Applications: An International Journal, 87:C, (1-14), Online publication date: 30-Nov-2017.
  28. Brzezinski D and Stefanowski J (2017). Prequential AUC, Knowledge and Information Systems, 52:2, (531-562), Online publication date: 1-Aug-2017.
  29. Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H and Bing G (2017). Learning from class-imbalanced data, Expert Systems with Applications: An International Journal, 73:C, (220-239), Online publication date: 1-May-2017.
  30. Wang L, Zhao L, Gui G, Zheng B, Huang R and Liu A (2017). Adaptive Ensemble Method Based on Spatial Characteristics for Classifying Imbalanced Data, Scientific Programming, 2017, Online publication date: 1-Jan-2017.
  31. Mills C Towards the automatic classification of traceability links Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, (1018-1021)
  32. Liu S, Zhang J and Xiang Y Statistical Detection of Online Drifting Twitter Spam Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, (1-10)
  33. Branco P, Torgo L and Ribeiro R (2016). A Survey of Predictive Modeling on Imbalanced Domains, ACM Computing Surveys, 49:2, (1-50), Online publication date: 11-Nov-2016.
  34. Tian F, Wu F, Chao K, Zheng Q, Shah N, Lan T and Yue J (2016). A topic sentence-based instance transfer method for imbalanced sentiment classification of Chinese product reviews, Electronic Commerce Research and Applications, 16:C, (66-76), Online publication date: 1-Mar-2016.
  35. Peersman C, Schulze C, Rashid A, Brennan M and Fischer C (2016). iCOP, Digital Investigation: The International Journal of Digital Forensics & Incident Response, 18:C, (50-64), Online publication date: 1-Sep-2016.
  36. Cruz N, Taboada M and Mitkov R (2016). A machine-learning approach to negation and speculation detection for sentiment analysis, Journal of the Association for Information Science and Technology, 67:9, (2118-2136), Online publication date: 1-Sep-2016.
  37. Öztürk M and Zengin A (2016). How repeated data points affect bug prediction performance, Applied Soft Computing, 49:C, (1051-1061), Online publication date: 1-Dec-2016.
  38. Timsina P, Liu J and El-Gayar O (2016). Advanced analytics for the automation of medical systematic reviews, Information Systems Frontiers, 18:2, (237-252), Online publication date: 1-Apr-2016.
  39. Rizzo G, D'Amato C, Fanizzi N and Esposito F Inductive Classification Through Evidence-Based Models and Their Ensembles Proceedings of the 12th European Semantic Web Conference on The Semantic Web. Latest Advances and New Domains - Volume 9088, (418-433)
  40. Lango M and Stefanowski J The usefulness of roughly balanced bagging for complex and high-dimensional imbalanced data Proceedings of the 4th International Conference on New Frontiers in Mining Complex Patterns, (93-107)
  41. Napoletano P, Boccignone G and Tisato F (2015). Attentive Monitoring of Multiple Video Streams Driven by a Bayesian Foraging Strategy, IEEE Transactions on Image Processing, 24:11, (3266-3281), Online publication date: 1-Nov-2015.
  42. Krempl G, Žliobaite I, Brzeziński D, Hüllermeier E, Last M, Lemaire V, Noack T, Shaker A, Sievi S, Spiliopoulou M and Stefanowski J (2014). Open challenges for data stream mining research, ACM SIGKDD Explorations Newsletter, 16:1, (1-10), Online publication date: 25-Sep-2014.
  43. Peng L, Zhang H, Yang B and Chen Y (2014). A new approach for imbalanced data classification based on data gravitation, Information Sciences: an International Journal, 288:C, (347-373), Online publication date: 20-Dec-2014.
  44. Rafi-Ur-Rashid M, Mahbub M and Adnan M Breaking the Curse of Class Imbalance: Bangla Text Classification, ACM Transactions on Asian and Low-Resource Language Information Processing, 0:0
Contributors
  • The University of Rhode Island


Reviews

CK Raju

Imagine an imbalanced dataset of cancer patients with highly skewed data having only 0.01 percent positive cancer cases. A naive or dumb machine that calls out "no cancer" to all queries would appear to be 99.99 percent accurate, and could even be misconstrued as a better prediction model than a competing machine learning algorithm. Additionally, the consequences of such a misclassification could be disastrous for patients with cancer. A comprehensive knowledge of machine learning, therefore, would be incomplete without a fair understanding of such predicaments and how to resolve them.

This book promises to engage the reader by providing a vivid picture of the problems associated with imbalanced datasets, specific aspects of and approaches to solving those problems, and assessment metrics. The narrative is ordered and easy to understand. A dozen authors contribute to the book's eight chapters: "Introduction," "Foundations of Imbalanced Learning," "Imbalanced Datasets: From Sampling to Classifiers," "Ensemble Methods for Class Imbalance Learning," "Class Imbalance Learning Methods for Support Vector Machines," "Class Imbalance and Active Learning," "Nonstationary Stream Data Learning with Imbalanced Class Distribution," and "Assessment Metrics for Imbalanced Learning."

There aren't any competing books on imbalanced learning. Leaving aside the usual issues associated with multiple contributors (for example, the high chance of repetition or the difficulty of maintaining uniformity in presentation style), the book does justice to machine learning by bringing out issues related to imbalanced datasets. The significance of precision and recall is introduced or explained in multiple chapters, but because it is presented from varying perspectives, the repetition does not diminish the reader's interest. The editors have succeeded in maintaining coherence and consistency while presenting content. For instance, while the terms F-score or F1 score could also have been used, the consistent use of F-measure throughout the book is noteworthy. Consistency is also visible in the illustrations involving precision and recall.

With different authors assigned to different chapters, it is extremely difficult to track down errors. Only one instance was detected: a welding flaw was introduced as an example of an imbalanced dataset in chapter 2. The case associated with welding flaws is a more apt example for discussions of anomaly detection and outliers. Anomalies, by definition, do not constitute a class or cluster by themselves, even if skewness is present as an attribute of the data.

This book certainly qualifies as a reference for graduate studies in machine learning. Research students are sure to find it highly valuable and a prized possession, especially taking into account the wealth of supporting literature that the authors have brought to the fore.

Online Computing Reviews Service
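The accuracy figure in the review's opening example can be checked with a short sketch. It is not taken from the book; the cohort size of 10,000 patients (one positive case, which gives the 0.01 percent rate the reviewer mentions), the function names, and the convention of reporting an undefined precision as 0.0 are all assumptions made for illustration. It compares plain accuracy with precision, recall, and the F-measure for a classifier that always answers "no cancer".

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F-measure (the F1 form)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Assumption: when a ratio is undefined (zero denominator), report 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical cohort: 10,000 patients, exactly one positive case (0.01 percent).
y_true = [1] + [0] * 9999
# The "naive" machine from the review: it predicts "no cancer" for everyone.
always_negative = [0] * 10000

result = metrics(y_true, always_negative)
print(result)
```

Under these assumptions the sketch reports an accuracy of 0.9999 while recall and the F-measure are both 0.0, because the single positive case is missed. That gap is why assessment metrics other than accuracy matter for imbalanced data.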
