
Top 20 R Machine Learning and Data Science Packages


8/7/2017 Top 20 R Machine Learning and Data Science packages

KDnuggets



Top 20 R Machine Learning and Data Science packages



Tags: CRAN, Data Science, Machine Learning, R, R Packages, Top list

We list the 20 most popular machine learning R packages, based on the most downloaded R packages from January to May 2015.


By Geethika Bhavya Peddibhotla, KDnuggets.


The CRAN package repository features 6778 active packages. Which of these should you know? Here is an analysis. See also the link to the raw data at the bottom of the post.

http://www.kdnuggets.com/2015/06/top-20-r-machine-learning-packages.html 1/5

Most of these R packages are favorites of Kagglers and are endorsed by many authors. Packages can also be rated by how many other packages depend on them, and users rate and review them on Crantastic.org, a crowdsourced service. However, these user ratings are too few to base an analysis on.

Let us explore how often machine learning packages were downloaded from January to May 2015 by analysing CRAN daily downloads.

1. e1071 Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, etc. (142479 downloads)
2. rpart Recursive Partitioning and Regression Trees. (135390)
3. igraph A collection of network analysis tools. (122930)
4. nnet Feed-forward Neural Networks and Multinomial Log-Linear Models. (108298)
5. randomForest Breiman and Cutler's random forests for classification and regression. (105375)
6. caret Classification And REgression Training: a set of functions that attempt to streamline the process for creating predictive models. (87151)
7. kernlab Kernel-based Machine Learning Lab. (62064)
8. glmnet Lasso and elastic-net regularized generalized linear models. (56948)
9. ROCR Visualizing the performance of scoring classifiers. (51323)
10. gbm Generalized Boosted Regression Models. (44760)
11. party A Laboratory for Recursive Partitioning. (43290)
12. arules Mining Association Rules and Frequent Itemsets. (39654)
13. tree Classification and regression trees. (27882)
14. klaR Classification and visualization. (27828)
15. RWeka R/Weka interface. (26973)
16. ipred Improved Predictors. (22358)
17. lars Least Angle Regression, Lasso and Forward Stagewise. (19691)
18. earth Multivariate Adaptive Regression Spline Models. (15901)
19. CORElearn Classification, regression, feature evaluation and ordinal evaluation. (13856)
20. mboost Model-Based Boosting. (13078)
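
The download counts above come from tallying daily download logs. As a rough illustration only, and not the author's actual method, the Python sketch below counts downloads per package across several daily logs and ranks the result. It assumes each log is CSV text with a header row, that each row records one download, and that the package name sits in a `package` column. The function name `count_downloads`, the `ML_PACKAGES` set and the two sample days are made up for this example.

```python
import csv
import io
from collections import Counter


def count_downloads(daily_logs, packages=None):
    """Count log rows per package across several daily logs.

    daily_logs: iterable of CSV text, one string per day (assumed format).
    packages:   optional set of package names to keep; None keeps all.
    """
    counts = Counter()
    for log_text in daily_logs:
        # DictReader uses the header row, so rows are indexed by column name.
        for row in csv.DictReader(io.StringIO(log_text)):
            name = row["package"]  # assumption: one row equals one download
            if packages is None or name in packages:
                counts[name] += 1
    return counts


# Made-up sample data standing in for two daily log files.
DAY_1 = """date,package,country
2015-01-01,e1071,US
2015-01-01,rpart,DE
2015-01-01,ggplot2,US
"""
DAY_2 = """date,package,country
2015-01-02,e1071,IN
2015-01-02,e1071,US
2015-01-02,rpart,GB
"""

# Hypothetical shortlist of machine learning packages to rank.
ML_PACKAGES = {"e1071", "rpart", "nnet"}

ranking = count_downloads([DAY_1, DAY_2], ML_PACKAGES).most_common()
for rank, (name, total) in enumerate(ranking, start=1):
    print(f"{rank}. {name} ({total})")
```

With the sample data this prints `1. e1071 (3)` and `2. rpart (2)`. Packages on the shortlist that never appear in the logs, such as `nnet` here, are simply absent from the ranking.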

It is interesting to note that some open source R tools are gaining popularity such as Rattle, a GUI for data mining using R (35539 downloads), and
fastcluster, fast hierarchical clustering routines for R and Python (14214 downloads).

Did we miss your favorites? Light up this space and contribute to the community by letting us know which R packages you use!

For completeness, here is data on 135 R package downloads, from Jan to May 2015.

Bio: Bhavya Geethika is pursuing a master's in Management Information Systems at the University of Illinois at Chicago. Her areas of interest include Statistics & Data Mining for Business, Machine Learning and Data-Driven Marketing.

Related:

Top 10 R Packages to be a Kaggle Champion


Machine Learning 201: Does Balancing Classes Improve Classifier Performance?
Top KDnuggets tweets, Mar 14-16: Is Apache Spark the Next Big Thing? R Meta-Book – best CRAN posts assembled


9 Comments

Bhavya Geethika • 2 years ago


Hello! This data is based on an analysis of daily downloads from http://cran-logs.rstudio.com/, from Jan to May 2015.
Yes, here is the list of all CRAN mirrors, http://cran.r-project.org/m..., and this analysis can be performed on each one.

Thomas W Dinsmore > Bhavya Geethika • 2 years ago


Actually, it can't -- RStudio is the only mirror that publishes its logs.

You can't assume that RStudio's downloaders represent the R population at large.

Bhavya Geethika > Thomas W Dinsmore • 2 years ago


Yes, agreed! This analysis is based on a sample of R users who download from this mirror. It does not represent the R community at large, who download from other mirrors or from GitHub and other external sources. It also does not account for users who already have certain packages and download only new ones. Nevertheless, the download numbers (see the Excel link) are very high for certain packages, so this analysis, specific to Jan-May 2015, still gives an idea.

Nicholas Miller > Bhavya Geethika • 2 years ago


Thank you for sharing!

Shashank Iyer • 2 years ago


Good work! :)

Thomas W Dinsmore • 2 years ago


Where did this data come from? There are ~100 CRAN mirrors worldwide; do these statistics represent downloads from all of them, some of them...?

Vivek Otari > Thomas W Dinsmore • a year ago


The data comes as a contribution from all the developers, scientists and general forum users of R. One needs to have a good web server with reasonably significant bandwidth. The numbers presented in the article are from all the possible mirrors available on CRAN.

Hope it helps.

Thomas W Dinsmore > Vivek Otari • a year ago


According to the author, the data comes solely from the RStudio CRAN repository, and not all CRAN mirrors.

Sam > Thomas W Dinsmore • 8 months ago


Both (the author and the above poster) are of Indian origin. They just support each others. That's how so many
Indians get employed in American corporations.
© 2017 KDnuggets.
