DOI: 10.1145/3277104.3277108
research-article

Performance Analysis Using Apriori Algorithm Along with Spark and Python

Published: 08 September 2018

Abstract

We propose an improved Apriori algorithm, developed by comparing different data structures, that achieves better performance than currently available approaches. The approach targets large transaction datasets, where space and time management are central concerns. The improved algorithm builds on the existing Apriori approach and produces its output in less time. We implement it on the Spark framework through PySpark, which can process data at a much higher rate than the Hadoop framework. We also argue that using Python as the programming language yields a faster computational rate. The data are stored on a local file system. In addition, we measure the time efficiency of the proposed algorithm on the Spark framework and generate a report from those measurements for comparison. The proposed method can also be applied effectively to big data mining optimization.
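
As background, the classic Apriori procedure that the abstract takes as its starting point can be sketched in plain Python. This is only a baseline illustration: it is not the improved algorithm proposed in the paper (the abstract does not say which data structures it uses), and it runs without Spark. The function name `apriori`, the absolute-count threshold `min_support`, and the sample transactions are assumptions introduced here.

```python
from collections import defaultdict
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for every frequent itemset.

    min_support is an absolute count (assumption made for this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep those meeting the threshold.
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}

    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one scan of the transactions per level.
        counts = defaultdict(int)
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, for illustration only.
sample = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
for itemset, count in sorted(apriori(sample, 3).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The count step rescans every transaction at each level, which is the cost that data-structure choices and distributed frameworks such as Spark are typically meant to reduce.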

Cited By

  • (2022) Time-Based Distributed Collaborative Filtering Recommendation Algorithm. Mobile Networks and Management, 245-255. DOI: 10.1007/978-3-030-94763-7_19. Online publication date: 17-Jan-2022

    Published In

    ICCBD '18: Proceedings of the 2018 International Conference on Computing and Big Data
    September 2018
    103 pages
    ISBN:9781450365406
    DOI:10.1145/3277104
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Apriori Algorithm
    2. association rule
    3. item set

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCBD '18
