The first book of its kind to review the current status and future direction of the exciting new branch of machine learning/data mining called imbalanced learning
Imbalanced learning focuses on how an intelligent system can learn when it is provided with imbalanced data. Solving imbalanced learning problems is critical in numerous data-intensive networked systems, including surveillance, security, Internet, finance, biomedical, defense, and more. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation.
The first comprehensive look at this new branch of machine learning, this book offers a critical review of the problem of imbalanced learning, covering the state of the art in techniques, principles, and real-world applications. Featuring contributions from experts in both academia and industry, Imbalanced Learning: Foundations, Algorithms, and Applications provides chapter coverage on:
- Foundations of Imbalanced Learning
- Imbalanced Datasets: From Sampling to Classifiers
- Ensemble Methods for Class Imbalance Learning
- Class Imbalance Learning Methods for Support Vector Machines
- Class Imbalance and Active Learning
- Nonstationary Stream Data Learning with Imbalanced Class Distribution
- Assessment Metrics for Imbalanced Learning
Imbalanced Learning: Foundations, Algorithms, and Applications will help scientists and engineers learn how to tackle the problem of learning from imbalanced datasets, and gain insight into current developments in the field as well as future research directions.
Cited By
- Dong S, Yu T, Farahmand H and Mostafavi A (2020). A hybrid deep learning model for predictive flood warning and situation awareness using channel network sensors data, Computer-Aided Civil and Infrastructure Engineering, 36:4, (402-420), Online publication date: 12-Mar-2021.
- Piras L, Boratto L and Ramos G Evaluating the Prediction Bias Induced by Label Imbalance in Multi-label Classification Proceedings of the 30th ACM International Conference on Information & Knowledge Management, (3368-3372)
- De Falco I, De Pietro G and Sannino G (2019). Evaluation of artificial intelligence techniques for the classification of different activities of daily living and falls, Neural Computing and Applications, 32:3, (747-758), Online publication date: 1-Feb-2020.
- Iosifidis V and Ntoutsi E (2019). Sentiment analysis on big sparse data streams with limited labels, Knowledge and Information Systems, 62:4, (1393-1432), Online publication date: 1-Apr-2020.
- Ding H, Wei B, Gu Z, Yu Z, Zheng H, Zheng B and Li J (2019). KA-Ensemble: towards imbalanced image classification ensembling under-sampling and over-sampling, Multimedia Tools and Applications, 79:21-22, (14871-14888), Online publication date: 1-Jun-2020.
- Zhu Z, Wang Z, Li D, Du W and Zhang J (2019). Efficient matrixized classification learning with separated solution process, Neural Computing and Applications, 32:14, (10609-10632), Online publication date: 1-Jul-2020.
- Kim T, Jung I and Hu Y (2020). Automatic, location-privacy preserving dashcam video sharing using blockchain and deep learning, Human-centric Computing and Information Sciences, 10:1, Online publication date: 26-Aug-2020.
- Starovoitov V and Golub Y (2020). New Function for Estimating Imbalanced Data Classification Results, Pattern Recognition and Image Analysis, 30:3, (295-302), Online publication date: 1-Jul-2020.
- Liu T, Zhu X, Pedrycz W and Li Z (2020). A design of information granule-based under-sampling method in imbalanced data classification, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 24:22, (17333-17347), Online publication date: 1-Nov-2020.
- Chen T, An S, Zhang Y, Ma C, Wang H, Guo X and Zheng W Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets Computer Vision – ECCV 2020, (90-108)
- Georgakopoulos S, Tasoulis S, Mallis G, Vrahatis A, Plagianakos V and Maglogiannis I (2020). Change detection and convolution neural networks for fall recognition, Neural Computing and Applications, 32:23, (17245-17258), Online publication date: 1-Dec-2020.
- Guzmán-Ponce A, Valdovinos R and Sánchez J A Cluster-Based Under-Sampling Algorithm for Class-Imbalanced Data Hybrid Artificial Intelligent Systems, (299-311)
- Heidari A, McGrath J, Ilyas I and Rekatsinas T HoloDetect Proceedings of the 2019 International Conference on Management of Data, (829-846)
- Tian F, Wu F, Fei X, Shah N, Zheng Q and Wang Y (2019). Improving generalization ability of instance transfer-based imbalanced sentiment classification of turn-level interactive Chinese texts, Service Oriented Computing and Applications, 13:2, (155-167), Online publication date: 1-Jun-2019.
- Yu H, Sun C, Yang X, Zheng S and Zou H (2019). Fuzzy Support Vector Machine With Relative Density Information for Classifying Imbalanced Data, IEEE Transactions on Fuzzy Systems, 27:12, (2353-2367), Online publication date: 1-Dec-2019.
- Cao K, Wei C, Gaidon A, Arechiga N and Ma T Learning imbalanced datasets with label-distribution-aware margin loss Proceedings of the 33rd International Conference on Neural Information Processing Systems, (1567-1578)
- Benítez-Peña S, Blanquero R, Carrizosa E and Ramírez-Cobo P (2019). On support vector machines under a multiple-cost scenario, Advances in Data Analysis and Classification, 13:3, (663-682), Online publication date: 1-Sep-2019.
- Camiña J, Medina-Pérez M, Monroy R, Loyola-González O, Villanueva L and Gurrola L (2019). Bagging-RandomMiner, Machine Vision and Applications, 30:5, (959-974), Online publication date: 1-Jul-2019.
- Roy A, Cruz R, Sabourin R and Cavalcanti G (2018). A study on combining dynamic selection and data preprocessing for imbalance learning, Neurocomputing, 286:C, (179-192), Online publication date: 19-Apr-2018.
- Sun B, Chen H, Wang J and Xie H (2018). Evolutionary under-sampling based bagging ensemble method for imbalanced data classification, Frontiers of Computer Science: Selected Publications from Chinese Universities, 12:2, (331-350), Online publication date: 1-Apr-2018.
- Lu X, Chen M, Wu J, Chang P and Chen M (2018). A novel ensemble decision tree based on under-sampling and clonal selection for web spam detection, Pattern Analysis & Applications, 21:3, (741-754), Online publication date: 1-Aug-2018.
- Akkasi A, Varoğlu E and Dimililer N (2018). Balanced undersampling, Applied Intelligence, 48:8, (1965-1978), Online publication date: 1-Aug-2018.
- Guo H, Diao X, Liu H and Gutierrez P (2018). Embedding Undersampling Rotation Forest for Imbalanced Problem, Computational Intelligence and Neuroscience, 2018, Online publication date: 1-Jan-2018.
- Santhiappan S, Chelladurai J and Ravindran B A novel topic modeling based weighting framework for class imbalance learning Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, (20-29)
- Krawczyk B, Minku L, Gama J, Stefanowski J and Woźniak M (2017). Ensemble learning for data stream analysis, Information Fusion, 37:C, (132-156), Online publication date: 1-Sep-2017.
- Peng L, Zhang H, Chen Y and Yang B (2017). Imbalanced traffic identification using an imbalanced data gravitation-based classification model, Computer Communications, 102:C, (177-189), Online publication date: 1-Apr-2017.
- Márquez-Chamorro A, Resinas M, Ruiz-Cortés A and Toro M (2017). Run-time prediction of business process indicators using evolutionary decision rules, Expert Systems with Applications: An International Journal, 87:C, (1-14), Online publication date: 30-Nov-2017.
- Brzezinski D and Stefanowski J (2017). Prequential AUC, Knowledge and Information Systems, 52:2, (531-562), Online publication date: 1-Aug-2017.
- Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H and Bing G (2017). Learning from class-imbalanced data, Expert Systems with Applications: An International Journal, 73:C, (220-239), Online publication date: 1-May-2017.
- Wang L, Zhao L, Gui G, Zheng B, Huang R and Liu A (2017). Adaptive Ensemble Method Based on Spatial Characteristics for Classifying Imbalanced Data, Scientific Programming, 2017, Online publication date: 1-Jan-2017.
- Mills C Towards the automatic classification of traceability links Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, (1018-1021)
- Liu S, Zhang J and Xiang Y Statistical Detection of Online Drifting Twitter Spam Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, (1-10)
- Branco P, Torgo L and Ribeiro R (2016). A Survey of Predictive Modeling on Imbalanced Domains, ACM Computing Surveys, 49:2, (1-50), Online publication date: 11-Nov-2016.
- Tian F, Wu F, Chao K, Zheng Q, Shah N, Lan T and Yue J (2016). A topic sentence-based instance transfer method for imbalanced sentiment classification of Chinese product reviews, Electronic Commerce Research and Applications, 16:C, (66-76), Online publication date: 1-Mar-2016.
- Peersman C, Schulze C, Rashid A, Brennan M and Fischer C (2016). iCOP, Digital Investigation: The International Journal of Digital Forensics & Incident Response, 18:C, (50-64), Online publication date: 1-Sep-2016.
- Cruz N, Taboada M and Mitkov R (2016). A machine-learning approach to negation and speculation detection for sentiment analysis, Journal of the Association for Information Science and Technology, 67:9, (2118-2136), Online publication date: 1-Sep-2016.
- Öztürk M and Zengin A (2016). How repeated data points affect bug prediction performance, Applied Soft Computing, 49:C, (1051-1061), Online publication date: 1-Dec-2016.
- Timsina P, Liu J and El-Gayar O (2016). Advanced analytics for the automation of medical systematic reviews, Information Systems Frontiers, 18:2, (237-252), Online publication date: 1-Apr-2016.
- Rizzo G, D'Amato C, Fanizzi N and Esposito F Inductive Classification Through Evidence-Based Models and Their Ensembles Proceedings of the 12th European Semantic Web Conference on The Semantic Web. Latest Advances and New Domains - Volume 9088, (418-433)
- Lango M and Stefanowski J The usefulness of roughly balanced bagging for complex and high-dimensional imbalanced data Proceedings of the 4th International Conference on New Frontiers in Mining Complex Patterns, (93-107)
- Napoletano P, Boccignone G and Tisato F (2015). Attentive Monitoring of Multiple Video Streams Driven by a Bayesian Foraging Strategy, IEEE Transactions on Image Processing, 24:11, (3266-3281), Online publication date: 1-Nov-2015.
- Krempl G, Žliobaite I, Brzeziński D, Hüllermeier E, Last M, Lemaire V, Noack T, Shaker A, Sievi S, Spiliopoulou M and Stefanowski J (2014). Open challenges for data stream mining research, ACM SIGKDD Explorations Newsletter, 16:1, (1-10), Online publication date: 25-Sep-2014.
- Peng L, Zhang H, Yang B and Chen Y (2014). A new approach for imbalanced data classification based on data gravitation, Information Sciences: an International Journal, 288:C, (347-373), Online publication date: 20-Dec-2014.
- Rafi-Ur-Rashid M, Mahbub M and Adnan M Breaking the Curse of Class Imbalance: Bangla Text Classification, ACM Transactions on Asian and Low-Resource Language Information Processing, 0:0
Recommendations
Learning from Imbalanced Data
With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and ...
Multiset feature learning for highly imbalanced data classification
AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence
With the expansion of data, increasing imbalanced data has emerged. When the imbalance ratio of data is high, most existing imbalanced learning methods decline in classification performance. To address this problem, a few highly imbalanced learning ...
Imbalanced Sentiment Classification with Multi-Task Learning
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
Supervised learning methods are widely used in sentiment classification. However, when sentiment distribution is imbalanced, the performance of these methods declines. In this paper, we propose an effective approach for imbalanced sentiment ...