Distributing a trust framework for utilitarian data exchanges in inter-organizational collaborations
Inter-organizational collaborations involve exchange of sensitive, utilitarian data. Such data exchanges require efficiently designed trust frameworks to explain data accesses in terms of the business logic of the collaboration, without becoming an ...
A knowledge reuse framework for improving novelty and diversity in recommendations
A recommender system (RS) is an important instrument in e-commerce that provides personalized recommendations to individual users. Classical recommender system algorithms mainly emphasize recommendation accuracy in order to match individual user's ...
Multi-sensor event detection using shape histograms
Vehicular sensor data consists of multiple time-series arising from a number of sensors. Using such multi-sensor data we would like to detect occurrences of specific events that vehicles encounter, e.g., corresponding to particular maneuvers that a ...
Fast approximate dynamic warping kernels
The dynamic time warping (DTW) distance is a popular similarity measure for comparing time series data. It has been successfully applied in many fields like speech recognition, data mining and information retrieval to automatically cope with time ...
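For orientation, the exact DTW distance that such kernels approximate can be sketched with the textbook dynamic program. This is a minimal illustration, not the paper's method: the function name `dtw_distance` and the absolute-difference local cost are assumptions made here, and the abstract above is truncated before it describes the approximation itself.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

Because the warping path may repeat elements, two series that differ only in local timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, are at distance zero.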
"Whom-to-interact": does conference networking boost your citation count?
Recently, conference publications have gained wide popularity, especially in the domain of computer science. In conferences, the opportunity for personal interaction between fellow researchers opens up a new dimension for the citation network ...
Categorising videos using a personalised category catalogue
Video is an extremely effective way of reaching farmers with the latest agricultural technology and stories of other farmers. With a well-organised multifaceted video library, we can provide the farmers with services such as easy navigation, search and ...
Measuring network centrality using hypergraphs
Networks abstracted as graphs lose some information related to the super-dyadic relations among the nodes. Hyperedges occur naturally in co-authorship, co-citation, social, e-mail, and weblog networks, among others. Treating these ...
Community reaction: from blogs to Facebook
Online social media is all pervasive in this digitally connected world. It provides a great platform to share information and news, and have public discussions on these topics. These interactions happen on owned-sites as well as on earned social media. ...
Correlating night-time satellite images with poverty and other census data of India and estimating future trends
Given India's night-time satellite images and census data, this paper proposes a method to correlate light intensity from images with state-wise poverty, population, GDP, and forest cover, and forecast future values of the same for each state. We use ...
Direct acyclic graph based multi-class twin support vector machine for pattern classification
In this paper, we propose a novel Multi-class Twin Support Vector Machine (MTWSVM) classifier on the basis of Direct Acyclic Graph (DAG) approach. MTWSVM is the multi-class extension of the recently proposed binary Twin Support Vector Machine (TWSVM) ...
Enhancement to community-based multi-relational link prediction using co-occurrence probability feature
Predicting future links or missing links is one of the useful application tasks in the analysis of social networks. Time and memory are major challenges for the link prediction task in large multi-relational social networks. This challenge is addressed ...
Monotonous (semi-)nonnegative matrix factorization
Nonnegative matrix factorization (NMF) factorizes a non-negative matrix into the product of two non-negative matrices, namely a signal matrix and a mixing matrix. NMF suffers from scale and ordering ambiguities. Often, the source signals can be ...
Efficiently discovering frequent motifs in large-scale sensor data
While analyzing vehicular sensor data, we found that frequently occurring waveforms could serve as features for further analysis, such as rule mining, classification, and anomaly detection. The discovery of waveform patterns, also known as time-series ...
From multiple views to single view: a neural network approach
In most general learning problems, data is obtained from multiple sources. Hence, the features can be inherently partitioned into multiple views or feature sets. For example, a media clip can have both audio and video features. If we concatenate these ...
Time stamp based set covering greedy algorithm
Influence maximization deals with finding a small set of target nodes to activate initially such that the resulting influence spread maximizes the expected number of activated nodes in the network. Most of the existing algorithms ...
SimCat: an entity similarity measure for heterogeneous knowledge graph with categories
Establishing similarity between heterogeneous entities in a complex knowledge graph is a challenging task due to the unrestricted nature of categories and relation types. In large graphs, the semantic roles of relation types and entity categories are ...
Parallel algorithms for merging topic trees and their application in meta search engines
This paper describes the design of three parallel algorithms for merging the topic hierarchies generated by a probabilistic topic model. These algorithms have been implemented on a shared memory multi-processor workstation and are primarily suitable to ...
What the user does not want?: query reformulation through term inclusion-exclusion
In information retrieval, keyword-based queries often fail to capture actual information need, especially when the need is very specific and particular. Using natural language, however, a user can clearly tell what she wants (positive part) and what she ...
A biclustering approach for crowd judgment analysis
Collection of multiple annotations from the crowd workers is useful for diverse applications. In this paper, the problem of obtaining the final judgment from such crowd-based annotations has been addressed in an unsupervised way using a biclustering-...
Spatio-textual similarity joins using variable prefix filtering
Spatio-textual similarity join retrieves a set of pairs of objects which are close spatially and have similar textual contents. Due to the high cost of matching complex objects, most of the algorithms proposed for join run in two phases. In the first ...
Pattern set kernel
Frequent pattern mining has been used in many applications of data mining. One of the reasons for the effectiveness of frequent pattern methods is that frequently occurring patterns can capture crucial aspects of the underlying semantics of the data. ...
GPU-based out-of-core MDL clustering algorithm
The time complexity of the Minimum Description Length based greedy agglomerative clustering algorithm is poor for large data sets. In this paper, we propose three different versions of GPU-based parallel algorithms, namely, time-efficient, memory-efficient ...
A quick algorithm for incremental mining closed frequent itemsets over data streams
In this paper, we propose QMINE, an efficient algorithm for finding closed frequent itemsets over a data stream. Our approach performs only a few operations. Experiments show that it significantly outperforms previous approaches.
A fuzzy version of generalized DBSCAN clustering algorithm
In this paper, we propose a fuzzy version of GDBSCAN called generalized fuzzy density based clustering algorithm (GFDBSCAN) that can be used to cluster people around key socio-economic parameters. GFDBSCAN can also be used to cluster geographical ...
An approach for search result topic identification and labeling
Organizing search results is one of the challenging tasks for search engines because of the varied and dynamic intents behind queries. As a consequence, search engines are not able to understand the exact user context, and thus retrieve large volumes of ...
Online data stream classification with incremental semi-supervised learning
This paper proposes an online data stream classification that learns with limited labels using selective self-training. Data partitioning steps are proposed to improve stream mining efficiency. Simulation on Cambridge and KDD'99 datasets shows up to ...
Unsupervised gene selection using particle swarm optimization and k-means
Microarray experiments generate large scale data in the form of gene expression values. An unsupervised feature selection approach to perform sample based clustering on gene expression data is proposed. The proposed work uses Particle Swarm Optimization(...
Mutual information based weighted clustering for mixed attributes
A large number of clustering algorithms exist for either numeric or categorical data sets. Relatively few algorithms exist for clustering mixed attributes. This paper proposes Mutual Information based Weighted Clustering for Mixed ...
Relaxed neighbor based approach for improving protein function prediction
Protein-protein interaction (PPI) networks are a valuable biological data source containing rich information useful for protein function prediction. The PPI network data obtained from high-throughput experiments is known to be noisy and incomplete. In ...
Using social connections to improve collaborative filtering
In this paper, we test the hypothesis that, for a particular item recommendation, the rating given by a friend (i.e., another user who is socially connected) matters more than the rating given by a random user. To test this, we propose a matrix factorization based ...
Index Terms
- Proceedings of the 2nd ACM IKDD Conference on Data Sciences
Acceptance Rates
| Year | Submitted | Accepted | Rate |
|---|---|---|---|
| CoDS COMAD 2020 | 275 | 78 | 28% |
| CODS-COMAD '19 | 198 | 62 | 31% |
| CODS-COMAD '18 | 150 | 50 | 33% |
| CODS '14 | 57 | 7 | 12% |
| Overall | 680 | 197 | 29% |
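The figures in the table are consistent if each rate is read as accepted papers divided by submitted papers, rounded to a whole percent, with the Overall row summing the four listed editions. That reading is an assumption, since the page does not state its rounding rule; the short check below uses hypothetical names (`EDITIONS`, `acceptance_rate`) to reproduce the numbers.

```python
# (submitted, accepted) per edition, copied from the table above.
EDITIONS = {
    "CoDS COMAD 2020": (275, 78),
    "CODS-COMAD '19": (198, 62),
    "CODS-COMAD '18": (150, 50),
    "CODS '14": (57, 7),
}


def acceptance_rate(submitted, accepted):
    """Acceptance rate as a whole-number percentage (assumed rounding rule)."""
    return round(100 * accepted / submitted)


# Per-edition rates, keyed by edition name.
rates = {name: acceptance_rate(s, a) for name, (s, a) in EDITIONS.items()}

# The Overall row is the column-wise sum over the listed editions.
total_submitted = sum(s for s, _ in EDITIONS.values())
total_accepted = sum(a for _, a in EDITIONS.values())
overall_rate = acceptance_rate(total_submitted, total_accepted)
```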