Big Data and Analytics: The key concepts and practical applications of big data analytics (English Edition)
About this ebook
This book will help you understand the different types of analysis: descriptive, predictive, and prescriptive. It introduces NoSQL databases and their benefits over SQL. The book centers on Hadoop, explaining its features, versions, and main components: HDFS (storage) and MapReduce (processing). It explores MapReduce and YARN for efficient data processing and offers insights into MongoDB and Hive, popular tools in the big data landscape.
Big Data and Analytics - Dr. Jugnesh Kumar
Chapter 1
Introduction to Big Data
Introduction
The amount of data produced by humanity is increasing exponentially because of the rapid development of technology, the proliferation of devices, and the widespread use of social networking sites. To put things in perspective, humankind produced 5 billion gigabytes of data from the beginning of recorded history up to 2003; stored on physical discs, that data could cover an entire football pitch.
Amazingly, however, the same amount of data was generated every ten minutes in 2013, up from every two days in 2011, and the rate has continued to climb. Even though this vast amount of information holds many valuable insights and the potential to be helpful when processed, it is frequently underutilized and ignored. The enormous volume of data being produced at an unprecedented rate worldwide is called big data. It can include both structured and unstructured data. Businesses heavily rely on data in today's knowledge-based economy to fuel their success. So, it becomes crucial and enormously rewarding to make sense of this data, identify patterns, and expose hidden connections within this vast ocean of information. The urgent need is to turn big data into easily usable, actionable business intelligence for enterprises. Businesses of all sizes, locations, market shares, and customer segments can develop successful strategies by accessing and analyzing high-quality data. This is where Hadoop, the go-to platform for processing enormous volumes of data, comes into play.
Structure
In this chapter, we will discuss the following topics:
Diverse facets of big data
Digital data and its types
Characteristics of big data
Types of big data
Evolution of big data
Applications and challenges of big data
3Vs of big data
Non-definitional traits of big data
Big data work flow management
Business intelligence versus big data
Data science process steps
Foundations for big data systems and programming
Distributed filesystems
Data warehouse and Hadoop environment
Coexistence
Diverse facets of big data
Alternatively, we can define big data as a collection of datasets so large that they cannot be processed efficiently using traditional computing techniques. It has developed into a broad discipline that includes various tools, techniques, and frameworks, not just a single technique or tool. The data itself is the enormous volume produced by different devices and applications. The following industries fall under the umbrella of big data, as shown in Table 1.1:
Table 1.1: Involvement of big data in various organizations
Digital data and its types
Digital data can be classified into several types based on its characteristics and format. Some common types are:
Textual data: This type includes written or typed text, such as documents, emails, webpages, and social media posts. Textual data is typically represented as a sequence of characters.
Numeric data: Numeric data consists of numbers and mathematical values. It can be discrete (whole numbers) or continuous (decimal numbers). Examples of numeric data include measurements, financial data, and statistical records.
Image data: Image data represents visual information through pictures or graphical content. It consists of a grid of pixels, where each pixel contains color or grayscale information. Image data is commonly used in photography, digital art, and computer vision applications.
Audio data: Audio data represents sound or audio signals. It can be in the form of speech, music, or other audio recordings. Audio data is typically stored as waveform samples, capturing variations in air pressure over time.
Video data: Video data consists of a sequence of images (frames) presented in rapid succession. It combines image and audio data to represent moving visual content. Video data is commonly used in movies, television, surveillance systems, and video streaming platforms.
Geospatial data: Geospatial data refers to data with geographical or spatial information. It includes coordinates, maps, satellite imagery, and location-based data. Geospatial data is widely used in navigation, urban planning, mapping, and environmental analysis.
Time series data: Time series data captures measurements or observations taken at different points in time. It includes data points recorded at regular intervals, such as stock prices, weather data, sensor readings, and device logs.
Structured data: This type of data follows a predefined format and schema. It is organized in a tabular or relational form, with well-defined rows and columns. Structured data is stored in databases and spreadsheets and can be easily queried and analyzed.
Unstructured data: Unstructured data refers to data that does not have a predefined format or structure. It includes free-form text, multimedia content, social media posts, emails, and documents. Unstructured data requires advanced techniques like machine learning and natural language processing to extract meaningful insights.
Metadata: Metadata provides descriptive information about other types of data. It includes file names, creation dates, author information, data sources, and formats. Metadata helps in organizing, managing, and understanding other data types.
Characteristics of big data
Data can possess several characteristics that impact its management, analysis, and interpretation. Some important features of data include:
Volume: Volume denotes the amount or size of data. It can range from small-scale data sets to massive volumes of data from various sources.
Velocity: Velocity denotes the speed at which data is created, processed, collected, and analyzed. Real-time data requires fast processing capabilities to extract timely insights.
Variety: The diversity of data types and formats is called variety. Text, images, audio, video, and other data types can exist in structured, unstructured, or semi-structured forms.
Semi-structured: The data shows some organization but lacks a strict structure, in contrast to structured data, which is prearranged in a tabular format with a predetermined schema. A certain level of hierarchy or relationship is possible because this kind of data frequently contains elements like tags, keys, or attributes.
Veracity: Veracity refers to the quality and reliability of data. Data may contain errors, inconsistencies, or inaccuracies that must be addressed to ensure data veracity.
Value: Value refers to data's usefulness, relevance, and potential insights. Extracting value from data involves analysis, interpretation, and decision-making based on the obtained insights.
Variability: Variability refers to the dynamic landscape of data. Data can exhibit variations in volume, velocity, and variety over time. Handling data variability requires adaptability and flexibility in data processing.
Types of big data
Big data can be classified into three main types based on the nature of the data and its characteristics. These types are mentioned in Table 1.2:
Table 1.2: Difference between structured and unstructured data based on different criteria
Structured data
Structured data is information that has been organized so that it can be processed, stored, and retrieved in a fixed format. It is typically kept in databases and is readily accessible using simple algorithms. Since the data format is known beforehand, managing structured data is straightforward. Structured data includes information that a business keeps in databases, tables, and spreadsheets. In big data, structured data refers to data that has a predefined format and fits into a well-defined schema or model. It is organized and stored in a tabular, relational format, typically found in traditional databases. Structured data follows a consistent and predefined structure, making it easier to query, analyze, and process using conventional database management systems. Figure 1.1 depicts structured data in different colors:
Figure 1.1: Structured data illustrated in different colors
(Source: https://dryviq.com/unstructured-vs-structured-data-4-key-management-differences/)
Key characteristics of structured data in big data include:
Fixed schema: Structured data has a fixed and predefined schema that defines the structure of the data. The schema determines the kinds of data that can be stored, the relationships among different data elements, and the constraints on data values. This fixed schema enables efficient data storage, indexing, and retrieval.
Organized format: Structured data is organized into tables of rows and columns, where each column represents a specific attribute or data field, and each row represents an individual record or data instance. This tabular format allows for easy organization, storage, and manipulation of data.
Consistent data types: Structured data adheres to consistent data types, such as integers, floats, strings, dates, or Booleans, which ensure uniformity and facilitate data processing and analysis. These predefined data types provide clarity on the nature of the data and enable efficient storage and computation.
Querying and analysis: Structured data can be easily queried, filtered, and analyzed using Structured Query Language (SQL) or similar database query languages. The structured nature of the data enables efficient indexing and optimized query execution, allowing for fast and precise retrieval of desired information.
Relational Database Management Systems (RDBMS): Structured data is commonly stored and managed using RDBMS. It provides robust mechanisms for creating, storing, and manipulating structured data, ensuring data integrity, transaction management, and security.
Examples of structured data in big data include transactional data in e-commerce systems (customer orders, product details, purchase history, and so on), financial data (stock prices, sales reports, and so on), sensor data with fixed attributes (temperature, pressure readings, and so on), and customer data (demographics, contact information, and so on). Structured data is relatively easy to work with because of its organized and predictable nature. However, it is important to note that big data encompasses not only structured data but also semi-structured and unstructured data. Incorporating and integrating structured data with other data types adds complexity to big data analytics and requires advanced techniques to extract meaningful insights from the larger data landscape.
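The fixed-schema, SQL-queryable nature of structured data described above can be sketched with Python's built-in sqlite3 module. The orders table, its columns, and its rows are hypothetical examples, not taken from any real system:

```python
import sqlite3

# In-memory database; the "orders" table and its columns are a
# hypothetical example of a fixed, predefined schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL,
        amount     REAL NOT NULL,
        order_date TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
    [("Asha", 120.50, "2023-01-15"),
     ("Ravi", 89.99, "2023-01-16"),
     ("Asha", 45.00, "2023-02-01")],
)

# Because the schema is fixed, aggregation and filtering are one query away.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)  # [('Asha', 165.5), ('Ravi', 89.99)]
```

The same query pattern carries over directly to the large-scale relational and SQL-on-Hadoop systems discussed later in the book.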
Unstructured data
Data without a predetermined structure is referred to as unstructured data. It exhibits heterogeneity and is typically larger than structured data. The results of a Google search serve as a prime example of unstructured data. It includes various sizes of text, images, videos, webpages, and other data formats. Unstructured data in big data refers to data that lacks a predefined structure or does not fit into a traditional tabular format. It is essentially any form of data that does not conform to a rigid schema or model. Unstructured data is typically more complex, diverse, and challenging to process compared to structured data. Examples of unstructured data include text documents, social media posts, emails, audio recordings, images, videos, webpages, and sensor data.
Key characteristics of unstructured data in big data include:
Lack of predefined structure: Unstructured data does not adhere to a fixed schema or predefined format. It can have varying lengths, formats, and organization. Each piece of unstructured data may contain different types of information or have different data fields, making it challenging to organize and process.
Diverse data types: Unstructured data encompasses various data types, such as text, multimedia, and sensor data. This diversity requires specialized techniques to handle different formats and extract insights from multiple data sources.
Natural language content: Unstructured data often includes natural language content, such as text documents, emails, or social media posts. Extracting meaning from such content calls for techniques like sentiment analysis, text mining, and entity recognition.
Rich media content: Unstructured data also includes media files, such as images, videos, and audio recordings. Analyzing and extracting insights from these media files may involve computer vision techniques, video/image analysis, audio processing, and pattern recognition.
Semi-structured elements: Unstructured data can contain semi-structured elements, which exhibit some level of organization but lack a strict schema. For example, webpages may have HTML tags, XML files may have tags and attributes, or social media posts may have hashtags and mentions. Handling these semi-structured elements requires techniques that can capture the underlying structure while accommodating the variations in data organization.
Large volume: Unstructured data can contribute significantly to the volume of big data. Text documents, social media feeds, and multimedia files can accumulate rapidly, resulting in a massive amount of unstructured data that needs to be processed and analyzed.
Dealing with unstructured data in big data requires advanced technologies and techniques. These include text mining, natural language processing, machine learning, image and video analysis, and deep learning algorithms. By leveraging these methods, organizations can unlock valuable insights hidden within unstructured data and gain a more comprehensive understanding of their business processes, market trends, customer sentiments, and more.
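As a minimal illustration of how structure must be extracted from free-form content, the following Python sketch tokenizes a hypothetical social media post and counts word frequencies. Real pipelines would use full NLP libraries; this only shows the basic idea of deriving structure (counts, hashtags) from text that has none:

```python
import re
from collections import Counter

# A hypothetical social media post: free-form text with no schema.
post = "Loving the new phone! Battery life is great, camera is great too. #newphone"

# Derive structure from the text: tokenize, count, and pull out hashtags.
tokens = re.findall(r"[a-z']+", post.lower())
word_counts = Counter(tokens)
hashtags = re.findall(r"#\w+", post)

print(word_counts.most_common(2))  # [('is', 2), ('great', 2)]
print(hashtags)                    # ['#newphone']
```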
Semi-structured data
As the name suggests, semi-structured data is a combination of structured and unstructured data. It refers to data that is not organized in a formal database but carries tags that identify its various components. An XML document, with its user-defined tags, is a prime example of semi-structured data. Semi-structured data in big data refers to data that has some level of structure but does not conform to a rigid, predefined schema like structured data. It lies between structured and unstructured data, combining elements of both. Semi-structured data possesses some organizational patterns or tags that provide a basic structure, but it allows flexibility in terms of data fields and formats. It is commonly encountered in various domains, including web data, log files, JSON documents, XML files, and NoSQL databases.
Here are the key characteristics of semi-structured data in big data:
Flexible schema: Semi-structured data does not require a predefined, fixed schema like structured data. It allows for variations in data fields and formats, enabling greater flexibility when capturing and storing data. Each record or document can have different attributes or elements, and new attributes can be added over time without disrupting the existing data structure.
Tags or markers: Semi-structured data often includes tags, markers, or metadata that provide some level of organization or structure. These tags provide hints about the data elements and their relationships but do not enforce a strict schema. Examples include XML tags, JSON key-value pairs, or attributes in NoSQL databases.
Hierarchical structure: Semi-structured data can exhibit a hierarchical structure, where data elements are organized in a nested or tree-like fashion. This structure enables capturing complex relationships between data elements and supports efficient querying and navigation through the data.
Limited data integrity: In contrast to structured data, semi-structured data does not enforce strict data integrity constraints. It may contain inconsistencies or incomplete information. Data quality control and validation mechanisms need to be applied during data processing to ensure accuracy and reliability.
Diverse formats: Semi-structured data can be represented in various formats, including XML, JSON, YAML, HTML, or key-value pairs. These formats provide flexibility in representing complex data structures and enable interoperability between different systems and platforms.
Processing challenges: Analyzing and processing semi-structured data requires specialized tools and techniques. Techniques such as XML parsing, JSON parsing, XPath querying, or schema-on-read approaches are commonly employed to extract information, navigate through the data hierarchy, and handle the flexible structure of semi-structured data.
Semi-structured data presents unique challenges and opportunities in big data analytics. It allows for the storage and analysis of diverse and dynamic data types while providing some organizational structure. Leveraging semi-structured data requires data integration techniques, schema discovery, and flexible data processing approaches that can adapt to the evolving nature of the data.
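The flexible schema, hierarchical structure, and limited integrity described above can all be seen in a small JSON example. The user documents below are hypothetical, and Python's standard json module stands in for a full document store:

```python
import json

# Two records in one collection. Note the flexible schema: the second
# record adds a "tags" field and omits "age" entirely.
docs = [
    '{"name": "Asha", "age": 31, "address": {"city": "Delhi"}}',
    '{"name": "Ravi", "tags": ["vip", "beta"], "address": {"city": "Pune"}}',
]

results = []
for raw in docs:
    record = json.loads(raw)
    # Hierarchical navigation: address -> city.
    city = record["address"]["city"]
    # Limited data integrity: missing fields must be handled explicitly.
    age = record.get("age", "unknown")
    results.append((record["name"], city, age))

print(results)  # [('Asha', 'Delhi', 31), ('Ravi', 'Pune', 'unknown')]
```

This schema-on-read style, where structure is interpreted at query time rather than enforced at write time, is exactly what NoSQL document stores such as MongoDB build on.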
Evolution of big data
The history of big data can be traced back to the initial days of computing and the evolution of data storage and processing technologies. Here are some key milestones in the history of big data:
Early data processing (1950s-1970s): In the early days of computing, data processing was limited to structured data stored in databases. Mainframe computers were used to process huge volumes of data, primarily in batch mode.
Relational databases (1970s-1980s): The invention of relational databases introduced a structured and organized approach to data storage and management. Structured Query Language (SQL) was developed as a standard language for interacting with relational databases.
Data warehousing (1980s-1990s): The idea of data warehousing emerged, focusing on collecting and storing large volumes of data from multiple sources for analysis and reporting. Data warehouses allow organizations to consolidate data and perform complex queries.
Internet and Web (1990s): The advent of the Internet and the World Wide Web led to an explosion of digital data. Websites, online transactions, and digital content generated vast amounts of data, including text, images, videos, and user interactions.
Emergence of Hadoop (2000s): Google's papers on the Google File System (GFS) (2003) and MapReduce (2004) inspired the development of Apache Hadoop, an open-source framework for the distributed storage and processing of big data. Hadoop allows organizations to store and process large volumes of data across clusters of commodity hardware.
NoSQL and new database technologies (2000s-2010s): As the volume and variety of data grew, new database technologies emerged to handle unstructured and semi-structured data. NoSQL databases, such as MongoDB and Cassandra, offered scalable and flexible solutions for managing big data.
Cloud computing (2000s-2010s): Cloud computing platforms like Amazon Web Services (AWS) and Microsoft Azure offered scalable and affordable infrastructure for storing and processing big data. Cloud services could
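The MapReduce model mentioned in the Hadoop milestone above can be illustrated without a cluster. The following single-process Python sketch (an illustration of the programming model only, not Hadoop itself) counts words through the three phases the model defines: map emits key-value pairs, shuffle groups them by key, and reduce aggregates each group:

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input line.
    for word in line.lower().split():
        yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: aggregate the grouped values for one key.
    return (key, sum(values))

lines = ["big data is big", "data is everywhere"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

In real Hadoop, the map and reduce functions run in parallel on many machines and the shuffle moves data across the network; the logic per phase, however, is the same.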