[Front cover]
Presents the front cover or splash screen of the 2012 IEEE 12th International Conference on Data Mining proceedings record.
[Title page i]
Presents the title page of the proceedings record.
[Title page iii]
Presents the title page of the proceedings record.
Message from General Co-chairs
This is the twelfth annual IEEE International Conference on Data Mining. It has grown in size and stature, and is considered today a premier international research conference on data mining. After San Jose, USA (2001), Maebashi City, Japan (2002), ...
Message from Program Co-chairs
The ICDM conference is truly an international forum. During its twelve-year history, the conference has been held in nine countries around the world. This year's conference continues this global trend: our organizing and program committee members ...
Program Committee
Provides a listing of current committee members and society officers.
Keynotes [3 abstracts]
Provides an abstract for each of the three keynote presentations and a brief professional biography of each presenter.
Differentially Private Histogram Publishing through Lossy Compression
Differential privacy has emerged as one of the most promising privacy models for private data release. It can be used to release different types of data, and, in particular, histograms, which provide useful summaries of a dataset. Several differentially ...
Spotting Culprits in Epidemics: How Many and Which Ones?
Given a snapshot of a large graph, in which an infection has been spreading for some time, can we identify those nodes from which the infection started to spread? In other words, can we reliably tell who the culprits are? In this paper we answer this ...
Self-Adjusting Models for Semi-supervised Learning in Partially Observed Settings
We present a new direction for semi-supervised learning where self-adjusting generative models replace fixed ones and unlabeled data can potentially improve learning even when labeled data is only partially-observed. We model each class data by a ...
Stream Classification with Recurring and Novel Class Detection Using Class-Based Ensemble
Concept-evolution has recently received a lot of attention in the context of mining data streams. Concept-evolution occurs when a new class evolves in the stream. Although many recent studies address this issue, most of them do not consider the scenario ...
Feature Weighting and Selection Using Hypothesis Margin of Boosting
Utilizing the concept of hypothesis margins to measure the quality of a set of features has been a growing line of research in the last decade. However, most previous algorithms have been developed under the large hypothesis margin principles of the 1-...
GPU-Accelerated Feature Selection for Outlier Detection Using the Local Kernel Density Ratio
Effective outlier detection requires the data to be described by a set of features that captures the behavior of normal data while emphasizing those characteristics of outliers which make them different than normal data. In this work, we present a novel ...
Sequential Alternating Proximal Method for Scalable Sparse Structural SVMs
Structural Support Vector Machines (SSVMs) have recently gained wide prominence in classifying structured and complex objects like parse-trees, image segments and Part-of-Speech (POS) tags. Typical learning algorithms used in training SSVMs result in ...
Computational Television Advertising
Ever wonder why that Kia Ad ran during Iron Chef? Traditional advertising methodology on television is a fascinating mix of marketing, branding, measurement, and predictive modeling. While still a robust business, it is at risk with the recent growth of ...
Topic-Aware Social Influence Propagation Models
We study social influence from a topic modeling perspective. We introduce novel topic-aware influence-driven propagation models that prove experimentally to be more accurate in describing real-world cascades than the standard propagation models studied ...
GUISE: Uniform Sampling of Graphlets for Large Graph Analysis
Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, the computation of GFD for a network requires the exact count of embedded graphlets in that network, which is a computationally expensive task. ...
Hierarchical Multilabel Classification with Minimum Bayes Risk
Hierarchical multilabel classification (HMC) allows an instance to have multiple labels residing in a hierarchy. A popular loss function used in HMC is the H-loss, which penalizes only the first classification mistake along each prediction path. However,...
The Mixture of Multi-kernel Relevance Vector Machines Model
We present a new regression mixture model where each mixture component is a multi-kernel version of the Relevance Vector Machine (RVM). In the proposed model, we exploit the enhanced modeling capability of RVMs due to their embedded sparsity enforcing ...
Diffusion of Information in Social Networks: Is It All Local?
Recent studies on the diffusion of information in social networks have largely focused on models based on the influence of local friends. In this paper, we challenge the generalizability of this approach and revive theories introduced by social ...
Efficient Pattern-Based Time Series Classification on GPU
The time series shapelet discovery algorithm finds subsequences from a set of time series for use as primitives for time series classification. This algorithm has drawn a lot of interest because of the interpretability of its results. However, computation ...
Inferring the Root Cause in Road Traffic Anomalies
We propose a novel two-step mining and optimization framework for inferring the root cause of anomalies that appear in road traffic data. We model road traffic as a time-dependent flow on a network formed by partitioning a city into regions bounded by ...
Student-t Based Robust Spatio-temporal Prediction
This paper describes an efficient and effective design of Robust Spatio-Temporal Prediction based on Student's $t$ distribution, namely, St-RSTP, to provide estimations based on observations over spatio-temporal neighbors. The proposed St-RSTP is more ...
Efficient Kernel Clustering Using Random Fourier Features
Kernel clustering algorithms have the ability to capture the non-linear structure inherent in many real world data sets and thereby, achieve better clustering performance than Euclidean distance based clustering algorithms. However, their quadratic ...
Detecting Anomalies in Bipartite Graphs with Mutual Dependency Principles
Bipartite graphs can model many real-life applications, including users-rating-products in online marketplaces, users-clicking-webpages on the World Wide Web, and users-referring-users in social networks. In these graphs, the anomalousness of nodes in ...
Link Prediction and Recommendation across Heterogeneous Social Networks
Link prediction and recommendation is a fundamental problem in social network analysis. The key challenge of link prediction comes from the sparsity of networks due to the strong disproportion of links that they have potential to form to links that do ...
Multi-task Semi-supervised Semantic Feature Learning for Classification
Multi-task learning has proven to be useful to boost the learning of multiple related but different tasks. Meanwhile, latent semantic models such as LSA and LDA are popular and effective methods to extract discriminative semantic features of high ...
Robust Nonnegative Matrix Factorization via Half-Quadratic Minimization
Nonnegative matrix factorization (NMF) is a popular technique for learning parts-based representation and data clustering. It usually uses the squared residuals to quantify the quality of factorization, which is optimal specifically to zero-mean, ...
RankTopic: Ranking Based Topic Modeling
Topic modeling has become a widely used tool for document management due to its superior performance. However, there are few topic models distinguishing the importance of documents on different topics. In this paper, we investigate how to utilize the ...
Index Terms
- Proceedings of the 2012 IEEE 12th International Conference on Data Mining