Bio-Inspired Algorithms for Preserving the Privacy of Data

D. Menaga1, I. Humaira Begum2
1Assistant Professor (Sr. Scale), Department of Computer Science and Engineering, KCG College of Technology (Affiliated to Anna University), Karapakkam, 600097, India.
2Assistant Professor, Department of Computer Science and Engineering, KCG College of Technology (Affiliated to Anna University), Karapakkam, 600097, India.
Email: dev.menaga@gmail.com

ABSTRACT
Data security is closely tied to data privacy, since data and information can easily be disclosed; data sharing likewise plays a key role in security. Patterns disclosed through association rule mining can expose sensitive information, one of the pressing threats to the security aspects of data mining. Preserving both the data and the privacy of the user through Privacy Preserving Data Mining (PPDM) approaches restricts such sensitive information to authorized access. These security threats are addressed by developing a sanitization process, which is considered one of the biggest challenges in data mining. In this paper, approaches such as GA-based and PSO-based algorithms for preserving the privacy of data are surveyed and analyzed, and the purpose of data sanitization and the use of bio-inspired algorithms such as Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) are discussed.

Keywords: PPDM, PSO, GA

1. INTRODUCTION
The rapid growth of social media and other internet-based utilities exposes personal information to unrelated public users, creating a vulnerable environment. Hence the need for a privacy-preserving scenario arises abruptly. The implemented privacy protocols ought to preserve user privacy while the user's normal activities are still permitted without restriction. The data are sanitized and then archived in public servers.
Sanitizing the data of user identities involves a series of transformations that lead to effective identity preservation. The data are disguised into a different form so that no identity disclosure occurs, and an effective sanitization procedure should also prohibit general analysis of individual information. A user's personal opinion toward a public issue might otherwise be exposed and lead to some sort of personal complication. At the same time, sanitization should retain the utility of sensitive information where it contributes to the usefulness of personal information. The requirement is therefore to enhance inaccuracy for the least-utilized information while limiting the accuracy available to third-party users for the information of utmost utility. This is accomplished only after weighing the pros and cons of applying a sanitization procedure to a complete data set in a public forum. Fig. 1 illustrates the overall process flow involved in sanitizing the data. The stages of the data-sanitizing approach apply the concept of privacy utility to the recognized information and quantify clusters, supplemented with various robust data mining mechanisms. The transformation of the information in the transaction database exploits the Pearson correlation [1] to obtain a Similarity Matrix (SM) from the actual information matrix, whose scaled attributes reside in particular periodic intervals. After this first space transformation of the data matrix, the SM is further processed with additive noise [2] according to the user requirements expressed through metrics. The aforementioned privacy utility is obtained by applying a Gaussian distribution [3] to the processed information.
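The two-stage transformation just described (a Pearson-correlation similarity matrix followed by additive Gaussian noise) can be sketched as follows. The function name, the noise scale, and the way the user's privacy metric maps onto the noise parameter are illustrative assumptions, not the exact procedure of the cited works.

```python
import numpy as np

def sanitize(data, noise_scale=0.1, seed=0):
    """Sketch of the transformation pipeline described above:
    1. Build a Pearson-correlation similarity matrix (SM) over the
       attribute columns of the data matrix.
    2. Perturb the SM with additive Gaussian noise; `noise_scale`
       stands in for the user-supplied privacy metric (assumption).
    """
    rng = np.random.default_rng(seed)
    # Pearson correlation between attribute columns -> similarity matrix
    sm = np.corrcoef(data, rowvar=False)
    # Additive Gaussian noise realises the privacy/utility trade-off
    return sm + rng.normal(0.0, noise_scale, size=sm.shape)

records = np.random.default_rng(1).random((100, 4))  # 100 records, 4 attributes
print(sanitize(records).shape)  # (4, 4)
```

A larger `noise_scale` hides the correlation structure more strongly at the cost of analytic utility, which is exactly the trade-off the survey discusses.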
After this sequential procedure, the required knowledge is ultimately acquired with minimal search complexity with respect to the obtained user requirement. Since the sanitized database holds all information in sanitized form, gathering knowledge from frequently mined patterns satisfies the user requirement with reduced memory consumption.

Fig. 1 Sanitization Process Flow

The remainder of this paper is organized as follows: Section 2 reviews mechanisms assisting privacy preservation and data hiding built on conventional Particle Swarm Optimization (PSO) and Genetic Algorithm (GA), and Section 3 concludes the discussion.

2. Data Sanitizing Schemes

2.1 Data Sanitization:
Data sanitization is essential to safeguard valuable data such as sensitive business information, personal information, or confidential and legal information [4]. Preserving privacy and maintaining it to an optimal extent usually secures this information. Sanitization is categorized into varied types such as nulling out, data masking, substitution, record shuffling, number variance, gibberish generation, and encryption/decryption strategies. The data archived in the database as transactions are abstracted through association rules. These rules, however, typically extract other related information as well, so all sorts of private information can be unveiled without any barrier to privacy preservation. The available data are first parsed for sensitive content, via syntactic parsing in the case of documents, and quantified by segregating the valued terms [5]. Beimel et al.
[6] devised a recursive approach for choosing a low-sensitivity function, usually termed a bounded-growth utility. Sanitization is achieved through a proper learning algorithm that also assures differential privacy; mining data while preserving privacy is thus a current trend. The impact of the generated association rules is mitigated by deploying the Privacy Preserving Data Mining (PPDM) methodology. Many such sanitizing methodologies exist, such as the Collaborative Search Log Sanitization (CELS) protocol [7]. Basically, this mechanism efficiently produces a probabilistic outcome for an NLP problem. The search logs are horizontally partitioned into r shares possessed by the participating groups. Probabilistic outcomes are generated under the supposition that the query-url pairs of the NLP formulation are segregated on the basis of linear inequality checks. Homomorphic encryption is then employed to protect the itemset pairs from exposure to external users, and the generated arbitrary matrix pairs are multiplied via the homomorphic cryptosystem, which induces a variable permutation. The protocol offers three guarantees:
1. Differential privacy is satisfied.
2. A sampling mechanism extracts the most out of the acquired outcome.
3. Throughout the procedure, the privacy between the groups is never interrupted or broken.
A semi-honest model is employed, which accounts for the privacy preservation between groups while the tendency of the participating parties to probe and deduce each other's information is also contained. The input data are sanitized in a way that enhances the efficacy of the outcome and generates an optimal solution for the NLP problem.
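The horizontal partitioning of search logs into r shares can be illustrated with plain additive secret sharing over integers: any r−1 shares reveal nothing about a count, while all r shares together reconstruct it. This is a hypothetical stand-in for the CELS sharing step only, and does not reproduce its homomorphic-encryption protocol.

```python
import random

MODULUS = 2**31 - 1  # a Mersenne prime; any sufficiently large modulus works

def share(value, r):
    """Split an integer count (e.g. the frequency of one query-url pair)
    into r additive shares modulo MODULUS.  Illustrative only."""
    shares = [random.randrange(MODULUS) for _ in range(r - 1)]
    shares.append((value - sum(shares)) % MODULUS)  # last share fixes the sum
    return shares

def reconstruct(shares):
    """Summing all r shares modulo MODULUS recovers the original count."""
    return sum(shares) % MODULUS

count = 1234
parts = share(count, r=3)
assert reconstruct(parts) == count
```

Each group would hold one share; only by pooling all shares (or, as in CELS, by computing on them under encryption) can the joint statistic be recovered.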
The sensitivity of the data is analyzed to assure the utility of the data through iterative Bernoulli trials over the attributes in a binomially randomized manner. The probability function defining the utility is given in (1); from the surrounding definitions it is the standard binomial probability, reconstructed here as:

P(k) = C(n, k) * p^k * q^(n-k),  where q = 1 - p   (1)

Here, k accounts for the overall number of trials that succeeded (reflecting the sensitive values s and their true frequency), n signifies the overall trial count, p gives the success probability, and q the failure probability.
H. Abdullah et al. [8] devised an inventive data-sanitizing methodology that incorporates noise into the prevailing information and mines it for knowledge by clustering. The secrecy of the acquired sensitive information is significantly enhanced through noise intensification, and the sanitized data are produced by applying a first space transformation to the processed information. Preserving differential privacy is quite challenging with respect to abruptly growing data sizes, and providing a utility measure along with privacy preservation is complicated. Various data sanitization techniques are discussed in Table 1.

Table 1: Data Sanitization Techniques

2.2 Data Hiding Mechanisms for Sensitive Information:
Some sensitive information can be derived by prediction from information leaked in social media through inference attacks [8]. The correlation between the interlinked classes of information among associated persons is explored through a Naïve Bayes relation. The association a_{i,j} between any two persons i and j is defined in (2); in Naïve Bayes form it can be reconstructed as:

P(a_{i,j} | F) ∝ P(a_{i,j}) * Π_{f ∈ F} P(f | a_{i,j})   (2)

Where a_{i,j} resembles the association of one person with the other's information and F signifies the feature set on the association edges that exemplifies the relationship between two friendship links.
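Assuming Eq. (1) is the standard binomial probability its surrounding definitions suggest, it can be evaluated directly:

```python
from math import comb

def success_probability(n, k, p):
    """Binomial probability of k successful Bernoulli trials out of n,
    as in Eq. (1): P(k) = C(n, k) * p**k * q**(n-k), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

# e.g. probability that exactly 3 of 10 randomised trials succeed at p = 0.5
print(round(success_probability(10, 3, 0.5), 4))  # 0.1172
```

Summing this probability over all k from 0 to n yields 1, which is a quick sanity check that the reconstruction is a proper distribution.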
The friendship associations are alleviated through a link-manipulation strategy that preserves privacy, as a replacement for inserting fake or ideal links between the profiles. By removing friendship links between nodes, the likelihood gap between any two different classes is pushed to a maximized level. This is accomplished by means of the likelihood between the classes of a node, given in (3) and reconstructed approximately from the surrounding definitions as:

d_i = [P(C_t | n_i) - P(C_a | n_i)] / NORM,  i* = argmax_i d_i   (3)

Where d infers the likelihood gap for a node, NORM indicates the normalizing constant over all d_i incurred, i* selects the maximizing factor of d, C_t accounts for the true class, C_a signifies the adjacent likelihood class, and P denotes the probability assessed over the friendship links. Thus, the friendly associativity between links is assessed and alleviated between nodes to avoid the side effects of associative rules. Lin et al. [10] deliberated a skilful approach to select sensitive links on demand for preserving secrecy in transactions by exploiting the PSO2DT algorithm. A maximum support threshold is evaluated for mitigating the overall dimension of the itemsets, which are treated as particles. This principally reveals the itemsets that possess an extreme amount of sensitivity; hence an oversized itemset can also be optimized into the predefined concept of analysis, which certainly alleviates the need to traverse the entire dataset. A clustering-based approach [11] deployed bottom-up is analyzed for its effective processing strategy. A comprehensive clustering methodology is followed for updating the clusters, and the approach defined on the distance between clusters incurs a utility loss for the required information. Lin et al.
[12] suggested an overlapping methodology for hiding data that tends to be sensitive, along with mitigating the computational complexity of the entire procedure of eradicating the victim transactions. Cai et al. [13] endorsed another perturbing link-handling mechanism, which introduces an unavoidable trade-off between the utility and privacy of user information obtained from social media. Various data hiding mechanisms are discussed in Table 2.

Table 2: Sensitive Data Hiding Mechanisms

2.3 Privacy Preserving Approaches:
Privacy lapses in acquiring personal data, whether from a Patient Health Record (PHR) or from social media, ease a third person's ability to disclose one's identity, which may lead to further issues. Creating anonymity plays a crucial role in determining one's privacy and mitigating the risk of identity revelation. Privacy-preserving criteria are employed through a varied set of approaches such as quasi-identifier transformation in a manifold set of stages: micro-aggregation, which generates a static summary by swapping the individual identity values of a person for a feasible centroid or median, and generalization, which applies other consistent standards to disguise the personal identity. The procedure involves both disguising the identity and de-identifying it; the de-identification approach unveils the direct identifiers through Named Entity Recognition (NER). These methodologies are accomplished by incorporating privacy models designed on a semantic basis. The utility of the preserved data is assessed through a mathematically formulated model, given in (4) and recoverable here only in its general form:

U = f(D_sanitized, D_original)   (4)

Where U acquires the utility measure of the sanitized information by comparison with the original data.
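As one concrete instance of the quasi-identifier transformations mentioned above, a toy micro-aggregation can replace each numeric identifier value with its group centroid so that no individual value is published. The grouping strategy here (sorted, k at a time) is a simplification; production schemes such as MDAV choose the groups to minimise information loss.

```python
import numpy as np

def microaggregate(values, k=3):
    """Toy micro-aggregation of one numeric quasi-identifier: sort the
    values, group them k at a time, and replace each value by its
    group centroid (mean).  Illustrative, not an optimal scheme."""
    order = np.argsort(values)
    out = np.empty_like(values, dtype=float)
    for start in range(0, len(values), k):
        idx = order[start:start + k]   # the next group of k nearest values
        out[idx] = np.mean(values[idx])  # centroid replaces every member
    return out

ages = np.array([23, 25, 24, 41, 39, 40], dtype=float)
print(microaggregate(ages, k=3))  # each age replaced by its group mean
```

After the transformation every published value is shared by at least k individuals, which is the anonymity property micro-aggregation is meant to provide.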
Viejo and Sánchez [16] devised a sequential procedure for sanitizing user information as follows:
Step 1: Privacy requirements are gathered in person from the user according to their own sensitivity preferences.
Step 2: Once the sensitive terms are fully identified, formulate a generalization with respect to the user preferences.
Step 3: Preserve the data found to be sensitive.
Step 4: Hide the sensitive data even from the operating authority.
Step 5: Sanitize user communications dynamically.
Lin et al. [17] suggested two categories of approaches for preserving the privacy of sensitive information, termed Maximum Sensitive Utility-MAximum item Utility (MSU-MAU) and Maximum Sensitive Utility-MInimum item Utility (MSU-MIU), and achieved maximum utility under the defined requisites, though still confined to similarity identification of data on a semantic basis. The work focuses intensively on preserving the privacy of user information through Privacy Preserving Utility Mining (PPUM), utilizing the appropriate itemsets under rigorous user preferences within a reduced time span. Various privacy preserving approaches are discussed in Table 3.

Table 3: Privacy Preserving Approaches

2.4 Genetic Algorithm based Approaches for Data Sanitization:
Important information in the database is identified safely using the privacy-preserving data mining approach. Recognizing an appropriate set of transactions from the vitally preserved information while inhibiting access to sensitive information forms an NP-hard problem. One algorithm that preserves significant information and hides sensitive information is the Genetic Algorithm (GA), which performs transaction deletion in the original database.
To obtain a near-optimal solution for variants of NP-hard problems at low cost, the population-based GA approach from evolutionary computing is utilized [12]. Generally, a solution obtained from the GA is known as a chromosome and is evaluated using a fitness function. The GA comprises three operations: selection, crossover, and mutation.
Selection: The best offspring are selected as the surviving chromosomes by applying the fitness function. Features of the preeminent offspring are transmitted to successive generations toward an optimal solution.
Crossover: A portion of the bits between two chromosomes is swapped to generate offspring of the population, which inherit attributes or characteristics of both parent chromosomes.
Mutation: One or more bits are changed randomly to produce variations of the parent characteristics. The evolutionary procedure thus pursues a solution that is optimal in the vicinity of the search instead of settling for a merely locally optimal one.
Based on these three operations, the GA proceeds as follows:
Step 1: Define a type of chromosome to represent possible solutions, and acquire the initial set of feasible solutions.
Step 2: Apply selection, crossover, and mutation to the chromosomes to produce the next generation.
Step 3: Assess the goodness of the chromosomes by evaluating each against the designed fitness function.
Step 4: Repeat from Step 2 until the stopping criterion is reached.
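The steps above can be sketched for the transaction-deletion task as follows. A chromosome is one bit per transaction (1 = delete), matching the GA-based deletion approaches surveyed here; the fitness weights and parameters are illustrative assumptions, not those of any cited algorithm.

```python
import random

def ga_sanitize(db, sensitive, pop_size=20, generations=50, seed=0):
    """Minimal GA following the selection/crossover/mutation loop above.
    `db` is a list of transactions (sets of items); `sensitive` lists
    sensitive itemsets to hide.  The fitness is an illustrative guess:
    heavily penalise surviving sensitive itemsets, lightly penalise
    each deleted transaction (a stand-in for side effects)."""
    rng = random.Random(seed)
    n = len(db)

    def fitness(chrom):
        kept = [t for t, bit in zip(db, chrom) if not bit]
        exposed = sum(1 for t in kept for s in sensitive if s <= t)
        return -(10 * exposed + sum(chrom))

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]               # selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)                  # single-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] ^= 1               # mutation: flip one bit
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

db = [{"a", "b"}, {"a", "b", "c"}, {"b", "c"}, {"a", "c"}]
best = ga_sanitize(db, sensitive=[{"a", "b"}])
print(best)
```

On this tiny database the GA converges to deleting exactly the two transactions that support the sensitive itemset {a, b}, illustrating the trade-off between hiding and side effects encoded in the fitness.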
An ideal solution is acquired by deploying the heuristic GA, a soft-computing mechanism, as a means of hiding sensitive rules together with association rules manipulated in a single pass. The support and confidence values of the patterns flagged per the user preference are curtailed below a threshold quantified by the user [21]. Fig. 2 depicts the overall workflow of the genetic algorithm.

Fig. 2 Work flow of Genetic Algorithm

In the genetic algorithm, the population is initialized arbitrarily and the fitness function is defined from the required constraints. When the constraints are satisfied, the fitness value assessed for a specific population is fixed as the required optimal outcome. If the satisfying criterion is not met, the population is varied and the fitness function is evaluated in the next iteration with an arbitrarily varied population; once the criterion is met, the corresponding solution is chosen as optimal. This methodology operates on a binary dataset comprising sensitive items at a maximum level and non-sensitive items at a minimized level. Both statistical analytics and System of Systems (SoS) concepts are combined to optimize a micro-grid SoS using Computational Intelligence (CI) and statistical analysis tools [22]. Security test cases are generated automatically by combining the genetic algorithm with concrete-symbolic execution [23]; the GA reduces the time required to generate the test cases, while the concrete-symbolic execution covers a higher number of vulnerabilities.
A specific fitness function with three adjustable weights is employed to remove transactions associated with sensitive itemsets in the compact prelarge GA-based (cpGA2DT) algorithm [24]. Gains such as diminished population size, lessened execution time, and curbed side effects are obtained from the GA. Various Genetic Algorithm approaches are discussed in Table 4.

Table 4: Genetic Algorithm (GA) Approaches

2.5 Particle Swarm Optimization based Approaches for Data Sanitization:
Particle Swarm Optimization (PSO) is a population-based approach motivated by the flocking activity of birds seeking the best food source. Each particle in PSO represents a candidate solution to the problem and carries a velocity value defining its flying direction toward other solutions. The traditional PSO algorithm proceeds as follows:
Step 1: Initialize the position and velocity values of the particles.
Step 2: Identify the location and velocity of movement of every single particle.
Step 3: Use the formulated fitness function to obtain the global best value (gbest) and personal best value (pbest).
Step 4: Based on the pbest and gbest values, update the velocity and position of each particle.
Step 5: Depending on the designed termination criteria, either proceed to the next iteration or output the final gbest solution.
Fig. 3 shows the flowchart of the PSO algorithm. Initially, the PSO process finds the velocity and position of each particle. The global best and candidate solutions are determined using the fitness function value; after the best solution is determined, the velocities and positions are updated and the termination of the process is analyzed.
The global solution is maintained over all candidate solutions; if the evaluated fitness is not sufficient for termination, the PSO process continues to the next iteration, otherwise the global solution is returned as the best optimal solution of the NP-hard problem. Initially the particles are placed arbitrarily, and the procedure is consistently reiterated for every distinct particle; the particle locations are updated on the basis of the computed gbest and pbest values.

Fig 3: Work flow of PSO algorithm

The gbest solution holds the best resolution among all pbest solutions acquired within the generated population. Using these two best values, the velocity of each particle is evaluated and updated with the update rules, reconstructed here in their conventional form:

v_i(t+1) = w * v_i(t) + c1 * r1 * (pbest_i - x_i(t)) + c2 * r2 * (gbest - x_i(t))   (5)
x_i(t+1) = x_i(t) + v_i(t+1)   (6)

Where w signifies the harmonizing (inertia) factor balancing the global and local search influences and v_i denotes the velocity of particle i in the population. The constants c1 and c2 are named the individual weight and social weight, respectively; they weigh the significance of the two best solutions pbest and gbest and are usually both set to two. In every iteration, distinct arbitrary numbers r1 and r2 are drawn from the range (0, 1). The velocity of each particle is updated per iteration as in (5), and the position, and hence the global solution, is updated as in (6). Because PSO yields solutions to continuous problems, adaptations are required to apply it to different applications. Zuo et al. [25] recognized a robust self-adaptive structure-based learning methodology using the PSO approach for managing inter-cloud resources via scheduling.
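A minimal continuous PSO implementing the velocity and position updates of (5) and (6) is sketched below, shown minimising the sphere function. One caveat: the classic setting c1 = c2 = 2 mentioned in the text needs velocity clamping to stay stable, so this sketch assumes the common constriction-style constants instead; all names and parameters are illustrative.

```python
import random

def pso(f, dim=2, swarm=15, iters=200, w=0.729, c1=1.494, c2=1.494, seed=0):
    """Plain PSO following (5)-(6):
    v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x);  x = x + v."""
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    V = [[0.0] * dim for _ in range(swarm)]
    pbest = [x[:] for x in X]                 # personal best positions
    gbest = min(pbest, key=f)[:]              # global best position
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()   # fresh randoms in (0, 1)
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])   # cognitive term
                           + c2 * r2 * (gbest[d] - X[i][d]))     # social term
                X[i][d] += V[i][d]                               # rule (6)
            if f(X[i]) < f(pbest[i]):
                pbest[i] = X[i][:]
                if f(X[i]) < f(gbest):
                    gbest = X[i][:]
    return gbest

sphere = lambda x: sum(v * v for v in x)
best = pso(sphere)
print(sphere(best))  # close to 0
```

Sanitization variants such as PSO2DT discretise this scheme (particles become bit vectors over transactions), but the update structure of (5) and (6) is the same.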
Bonam et al. [26] developed a PSO-based approach using data distortion to address association rule mining while preserving the privacy of sensitive data. Sarath and Ravi [27] then mined association rules by designing an appropriate PSO optimization approach. To efficiently mine high-utility itemsets, a binary PSO-based algorithm was developed by Tian et al. [28] to overcome the computational complexity problem. In communication networks, the multicast routing problem is overcome by applying a discrete PSO approach [29]. Lin et al. [30] applied PSO to discrete patterns obtained from a database to accomplish proficient association in scheduling. Various PSO techniques are discussed in Table 5.

Table 5: Particle Swarm Optimization (PSO) Techniques

3. Conclusion and Future Work:
Several prevailing data-sanitizing methodologies are analyzed for their proficiency and limitations in processing sensitive information. The inference drawn from this survey highlights the vital characteristics of the soft-computing methodologies deployed to inhibit access to sensitive information while preserving the usefulness of the association rules employed. Optimization via the genetic algorithm substantially mitigates the confidence level of sensitive rules framed from associations in accordance with variations in the database. Applicable crossover and mutation rates are chosen for realizing non-trivial rules to obtain an ideal solution. Moreover, the choice of initial fitness function relies purely on the initialized population and works well only when that population is modest. At this juncture, PSO mitigates the complexity of choosing the initial deciding criterion, which makes it sound superior to GA.
Conversely, the GA functions properly in creating subsequent deciding criteria for the fitness function, capable of fixing the level of differential privacy for preserving sensitive information; the same optimal decisions are not achievable with PSO. Hence, the effectiveness of GA and PSO may be collated into a fused approach particularly capable of preserving the usefulness of association rules along with providing differential privacy. Deploying a robust sanitization procedure capable of handling sensitive information and preserving privacy avoids its exposure to unauthorized access with mitigated complexity.

References
[1] Eisinga R, Grotenhuis Mt, Pelzer B. The reliability of a two-item scale: Pearson, Cronbach, or Spearman-Brown? International Journal of Public Health. 2013:1-6.
[2] Sankar L, Rajagopalan SR, Poor HV. Utility-privacy tradeoffs in databases: An information-theoretic approach. IEEE Transactions on Information Forensics and Security. 2013;8(6):838-52.
[3] Allard T, Nguyen B, Pucheral P. MET: revisiting Privacy-Preserving Data Publishing using secure devices. Distributed and Parallel Databases. 2014;32(2):191-244.
[4] Xu L, Jiang C, Wang J, Yuan J, Ren Y. Information security in big data: privacy and data mining. IEEE Access. 2014;2:1149-76.
[5] Sanchez D, Batet M, Viejo A. Automatic general-purpose sanitization of textual documents. IEEE Transactions on Information Forensics and Security. 2013;8(6):853-62.
[6] Beimel A, Nissim K, Stemmer U. Private learning and sanitization: Pure vs. approximate differential privacy. In: Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques. Springer; 2013. p. 363-78.
[7] Hong Y, Vaidya J, Lu H, Karras P, Goel S. Collaborative search log sanitization: Toward differential privacy and boosted utility. IEEE Transactions on Dependable and Secure Computing. 2015;12(5):504-18.
[8]
Abdullah H, Siddiqi A, Bajaber F. A novel approach of data sanitization by noise addition and knowledge discovery by clustering. In: 2015 World Symposium on Computer Networks and Information Security (WSCNIS). IEEE; 2015.
[9] Fu AW-C, Wang K, Wong RC-W, Wang J, Jiang M. Small sum privacy and large sum utility in data publishing. Journal of Biomedical Informatics. 2014;50:20-31.
[10] Lin JC-W, Liu Q, Fournier-Viger P, Hong T-P, Pan J-S. A swarm-based sanitization approach for hiding confidential itemsets. In: 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP). IEEE; 2015.
[11] Bonomi L, Fan L, Jin H. An information-theoretic approach to individual sequential data sanitization. In: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining. ACM; 2016.
[12] Lin JC-W, Liu Q, Fournier-Viger P, Hong T-P, Voznak M, Zhan J. A sanitization approach for hiding sensitive itemsets based on particle swarm optimization. Engineering Applications of Artificial Intelligence. 2016;53:1-18.
[13] Cai Z, He Z, Guan X, Li Y. Collective data-sanitization for preventing sensitive information inference attacks in social networks. IEEE Transactions on Dependable and Secure Computing. 2016.
[14] Heatherly R, Kantarcioglu M, Thuraisingham B. Preventing private information inference attacks on social networks. IEEE Transactions on Knowledge and Data Engineering. 2013;25(8):1849-62.
[15] Kukkala VB, Saini JS, Iyengar S. Privacy preserving network analysis of distributed social networks. In: Information Systems Security. Springer; 2016. p. 336-55.
[16] Viejo A, Sánchez D. Enforcing transparent access to private content in social networks by means of automatic sanitization. Expert Systems with Applications. 2016;62:148-60.
[17] Lin JC-W, Wu T-Y, Fournier-Viger P, Lin G, Zhan J, Voznak M. Fast algorithms for hiding sensitive high-utility itemsets in privacy-preserving utility mining. Engineering Applications of Artificial Intelligence. 2016;55:269-84.
[18] Gkoulalas-Divanis A, Loukides G, Sun J. Publishing data from electronic health records while preserving privacy: A survey of algorithms. Journal of Biomedical Informatics. 2014;50:4-19.
[19] Fang Y, Godavarthy A, Lu H. A utility maximization framework for privacy preservation of user generated content. In: Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval. ACM; 2016.
[20] Lin JC-W, Gan W, Fournier-Viger P, Yang L, Liu Q, Frnda J, et al. High utility-itemset mining and privacy-preserving utility mining. Perspectives in Science. 2016;7:74-80.
[21] Darwish SM, Madbouly MM, El-Hakem MA. A database sanitizing algorithm for hiding sensitive multi-level association rule mining. International Journal of Computer and Communication Engineering. 2014;3(4):285.
[22] Tannahill BK, Jamshidi M. System of Systems and Big Data analytics - Bridging the gap. Computers & Electrical Engineering. 2014;40(1):2-15.
[23] Avancini A, Ceccato M. Comparison and integration of genetic algorithms and dynamic symbolic execution for security testing of cross-site scripting vulnerabilities. Information and Software Technology. 2013;55(12):2209-22.
[24] Lin C-W, Zhang B, Yang K-T, Hong T-P. Efficiently hiding sensitive itemsets with transaction deletion based on genetic algorithms. The Scientific World Journal. 2014.
[25] Zuo X, Zhang G, Tan W. Self-adaptive learning PSO-based deadline constrained task scheduling for hybrid IaaS cloud. IEEE Transactions on Automation Science and Engineering. 2014;11(2):564-73.
[26] Bonam J, Reddy AR, Kalyani G. Privacy preserving in association rule mining by data distortion using PSO. In: ICT and Critical Infrastructure: Proceedings of the 48th Annual Convention of Computer Society of India, Vol. II. Springer; 2014.
[27] Sarath K, Ravi V. Association rule mining using binary particle swarm optimization. Engineering Applications of Artificial Intelligence. 2013;26(8):1832-40.
[28] Tian Y, Liu D, Yuan D, Wang K. A discrete PSO for two-stage assembly scheduling problem. The International Journal of Advanced Manufacturing Technology. 2013;66(1-4):481-99.
[29] Shen M, Zhan Z-H, Chen W-N, Gong Y-J, Zhang J, Li Y. Bi-velocity discrete particle swarm optimization and its application to multicast routing problem in communication networks. IEEE Transactions on Industrial Electronics. 2014;61(12):7141-51.
[30] Lin JC-W, Yang L, Fournier-Viger P, Wu M-T, Hong T-P, Wang LS-L. A swarm-based approach to mine high-utility itemsets. In: International Conference on Multidisciplinary Social Networks Research. Springer; 2015.