
Big Data Seminar


COLLEGE OF COMPUTING AND INFORMATICS
DEPARTMENT OF INFORMATION TECHNOLOGY

Title: Seminar report on Big Data


Slide contents
 Introduction

 General motivation of the topic

 General objectives of the paper

 Detailed description of big data

 Big Data Technologies

 How does big data work

 Architecture of big data

 Advantages and drawbacks of big data

 Big data application areas

 Conclusions

 Recommendations
Introduction

 Big Data refers to data volumes in the range of exabytes and beyond. Such volumes exceed the capacity of current online storage and processing systems. With characteristics like volume, velocity and variety, big data poses challenges to traditional IT establishments. Computer-assisted innovation, real-time data analytics, customer-centric business intelligence, industry-wide decision making and transparency are possible advantages, to mention a few, of Big Data.
 In simple terms, it can be defined as a vast amount of data so complex and unorganized that it cannot be handled with traditional database management systems (i.e. relational database management systems). It is so complex and huge that we cannot store and process it with traditional database management tools or data processing applications. As more and more data is generated, it has become a big challenge for traditional architectures and infrastructures to process such large amounts of data within acceptable time and resources.
In order to efficiently extract value from these data, organizations need to find new tools and methods specialized for big data processing. For this reason, big data analytics has become a key factor for companies to reveal hidden information and achieve competitive advantage in the market.
 Hence Big Data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.
 
Motivation
 Traditional data is structured data, maintained by all types of businesses from very small firms to big organizations. In a traditional database system, a centralized database architecture is used to store and maintain the data in a fixed format or fields in a file. Structured Query Language (SQL) is used to manage and access the data.

 With traditional data, it is difficult to maintain accuracy and confidentiality as data quality requirements are high, and storing such massive quantities of data is expensive. This affects data analysis, which in turn reduces the accuracy and confidentiality of the end result.

 For all these problems, big data is the solution. Hence, gaining a deep understanding of big data lets us use it to solve the problems of relational databases.

 Big data is about handling data that would not be possible to handle with traditional database systems. Many world-class problems are being addressed with big data, because big data processing can handle a lot of different types of data very quickly.
General Objectives
 

The general objective of this paper (seminar project) is to identify the problems with traditional data, and to introduce big data to people who have no information about it, so that they can get detailed information about this technology, and not only acquire that information but also put it to use.
Specific objectives

 To help organizations create new growth opportunities and entirely new categories of companies that can combine and analyze industry data.

 To organize data that comes from a variety of sources, such as social media pages, customer logs, financial reports, e-mails, presentations and reports created by employees, so that reports can be prepared by combining all this data.

 To secure huge sets of data by employing more cybersecurity professionals to protect the data.

 To handle these large data sets (the amount of data being stored in the data centers and databases of companies is increasing rapidly) by using modern techniques.
Detailed description of big data

• Big Data is a collection of data that is huge in volume, yet growing exponentially with time. It is data of such large size and complexity that none of the traditional data management tools can store or process it efficiently. In short, big data is still data, but of enormous size.

• It is a term that describes the large volume of data both structured and unstructured that
inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s
what organizations do with the data that matters. Big data can be analyzed for insights that lead
to better decisions and strategic business moves.

• The use of Big Data is becoming common these days as companies seek to outperform their peers. In most industries, existing competitors and new entrants alike will use strategies resulting from analyzed data to compete, innovate and capture value.
• Big data refers to massive, complex, structured and unstructured data sets that are rapidly generated and transmitted from a wide variety of sources. These attributes make up the three Vs of big data:
Volume: The huge amounts of data being stored.
Velocity: The lightning speed at which data streams must be processed and analyzed.
Variety: The different sources and forms from which data is collected, such as numbers, text, video, images and audio.
These are massive collections of valuable information that companies and organizations need to manage, store, visualize and analyze. Traditional data tools are not equipped to handle this kind of complexity and volume.
Big Data Technologies
 

Big Data technologies are software utilities designed for analyzing, processing, and extracting information from unstructured large data that cannot be handled with traditional data processing software. Companies require big data processing technologies to analyze massive amounts of real-time data, and they use these technologies to make predictions that reduce the risk of failure. There are many technologies that address the problem of big data storage and processing; among them are Apache Hadoop and Apache Spark.
 Apache Hadoop
It is the topmost big data tool. Apache Hadoop is an open-source software framework developed by the Apache Software Foundation for storing and processing Big Data. Hadoop stores and processes data in a distributed computing environment across a cluster of commodity hardware. Hadoop is an inexpensive, fault-tolerant and highly available framework that can process data of any size and format.
 Apache Spark
Apache Spark is another popular open-source big data tool, designed with the goal of speeding up Hadoop big data processing. The main objective of the Apache Spark project was to keep the advantages of MapReduce's distributed, scalable, fault-tolerant processing framework and make it more efficient and easier to use. It provides in-memory computing capabilities to deliver speed. Spark supports both real-time and batch processing, and provides high-level APIs in Java, Scala and Python. Hence Spark enables in-memory computation.
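The MapReduce model that Hadoop popularized and that Spark builds on can be illustrated with a minimal sketch in plain Python. This is not actual Hadoop or Spark code; it is a single-process toy showing only the map, shuffle and reduce phases that those frameworks distribute across a cluster:

```python
from collections import defaultdict

# Minimal sketch of the MapReduce word-count pattern in plain Python.
# Real Hadoop/Spark jobs distribute these phases across a cluster;
# this single-process version only illustrates the data flow.

def map_phase(lines):
    """Map: emit (word, 1) pairs for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle_phase(pairs):
    """Shuffle: group all emitted counts by key (word)."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data needs big tools", "spark speeds up big data"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts["big"])   # 3
print(counts["data"])  # 2
```

Spark's speedup comes largely from keeping the intermediate shuffle data in memory instead of writing it to disk between phases, as classic Hadoop MapReduce does.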
Big data Examples
Social media: Statistics show that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated by photo and video uploads, message exchanges, posting comments, etc.
A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, data generation reaches many petabytes.
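A quick back-of-the-envelope calculation shows how the jet-engine figure scales to petabytes. The flight count below is an illustrative assumption, not a sourced statistic:

```python
# Back-of-the-envelope check of the jet-engine example above.
# FLIGHTS_PER_DAY is a hypothetical round number, not a sourced figure.
TB_PER_SEGMENT = 10               # ~10 TB per 30-minute flight segment
FLIGHTS_PER_DAY = 25_000          # illustrative assumption

tb_per_day = TB_PER_SEGMENT * FLIGHTS_PER_DAY
pb_per_day = tb_per_day / 1_000   # 1 PB = 1,000 TB (decimal units)
print(pb_per_day)                 # 250.0 petabytes per day
```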
How does big data work

The need to handle so much data requires a really stable and well-structured infrastructure. The system needs to quickly process huge volumes and different types of data, and this can overload a single server or cluster. This is why we need a well-thought-out system behind Big Data.

All the processes should be considered according to the capacity of the system, and this can potentially demand hundreds or thousands of servers for larger companies.

Therefore, to understand how big data works, it helps to know the following concepts:
Integration
Big Data is always collected from many sources, and as we are speaking of enormous loads of information, new strategies and technologies to handle it need to be discovered. In some cases we are talking about petabytes of information flowing into our system, so it will be a challenge to integrate such a volume of information. We have to receive the data, process it and format it into the right form that our business needs and that our customers can understand.

Management
We will need a place to store the data. The storage solution can be in the cloud, on-premises, or both. We can also choose the form in which the data will be stored, so that it is available in real time, on demand.

Analysis
Once we have the data received and stored, we need to analyze it so we can use it. We explore our data and use it to make important decisions, such as knowing which features are most researched by our customers, or use it to share research. Having made a big investment to set up this infrastructure, we need to put the data to work.
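The integration, management and analysis steps above can be sketched as a toy pipeline. Everything here, the function names, the record shape and the sources, is an illustrative assumption, not a real big data stack:

```python
# Toy sketch of the integrate -> manage -> analyze flow described above.
# All names and record shapes are illustrative, not a real system.

def integrate(sources):
    """Integration: pull records from several sources into one stream,
    formatting each into a common record shape."""
    for i, source in enumerate(sources):
        for record in source:
            yield {"value": record, "origin": i}

def manage(records):
    """Management: 'store' the formatted records (here, just a list;
    in practice a cloud or on-premises store)."""
    return list(records)

def analyze(store):
    """Analysis: derive a simple insight from the stored data."""
    values = [r["value"] for r in store]
    return {"count": len(values), "max": max(values)}

social_feed = [3, 7, 2]   # hypothetical source 1
server_logs = [9, 1]      # hypothetical source 2
store = manage(integrate([social_feed, server_logs]))
print(analyze(store))     # {'count': 5, 'max': 9}
```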
Architecture of big data

Big data architecture refers to the logical and physical structure that dictates how high volumes
of data are ingested, processed, stored, managed, and accessed.
A big data architecture is designed to handle the ingestion, processing, and analysis of data that
is too large or complex for traditional database systems.
Big Data Architecture Layers

Data sources: All big data solutions start with one or more data sources. Examples include:
Application data stores, such as relational databases.
Static files produced by applications, such as web server log files.

Data storage: Data for batch processing operations is typically stored in a distributed file store that can
hold high volumes of large files in various formats. This kind of store is often called a data lake. Options
for implementing this storage include Azure Data Lake Store or blob containers in Azure Storage.

Batch processing: Because the data sets are so large, often a big data solution must process data files
using long-running batch jobs to filter, aggregate, and otherwise prepare the data for analysis. Usually,
these jobs involve reading source files, processing them, and writing the output to new files.
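The read, filter, aggregate and write pattern of such a batch job can be sketched with local files standing in for a distributed file store. The "URL STATUS" log layout and file names are assumptions for illustration only:

```python
# Minimal sketch of a batch job: read source files, filter and
# aggregate records, write the output to a new file. Local files
# stand in for a distributed file store / data lake.
from collections import Counter
from pathlib import Path

def batch_job(input_path, output_path, min_status=400):
    """Count error responses per URL from a web-server-style log
    (assumed 'URL STATUS' layout, one request per line)."""
    counts = Counter()
    for line in Path(input_path).read_text().splitlines():
        url, status = line.split()
        if int(status) >= min_status:   # filter: keep only errors
            counts[url] += 1            # aggregate per URL
    Path(output_path).write_text(
        "\n".join(f"{url} {n}" for url, n in sorted(counts.items()))
    )
    return counts

Path("access.log").write_text("/home 200\n/api 500\n/api 404\n/home 301")
result = batch_job("access.log", "errors.txt")
print(result["/api"])  # 2
```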
Real-time message ingestion: If the solution includes real-time sources, the architecture must
include a way to capture and store real-time messages for stream processing. This might be a
simple data store, where incoming messages are dropped into a folder for processing.
Stream processing: After capturing real-time messages, the solution must process them by
filtering, aggregating, and otherwise preparing the data for analysis. The processed stream data
is then written to an output sink. Azure Stream Analytics provides a managed stream processing
service based on perpetually running SQL queries that operate on unbounded streams.
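The stream-processing step can be illustrated with a tiny windowed aggregation. A list stands in for the output sink, and the fixed window size is an assumption; real engines such as Azure Stream Analytics run continuously over unbounded streams with far richer windowing:

```python
# Sketch of stream processing: consume messages, aggregate each
# fixed-size window, and emit the result to an output "sink".
# Window size and data are illustrative assumptions.

def process_stream(messages, window_size=3):
    sink = []     # stands in for the output sink
    window = []
    for msg in messages:
        window.append(msg)
        if len(window) == window_size:   # window full: aggregate, emit
            sink.append(sum(window) / window_size)
            window = []
    return sink

readings = [10, 20, 30, 5, 15, 25]       # hypothetical sensor stream
print(process_stream(readings))          # [20.0, 15.0]
```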
Analytical data store: Many big data solutions prepare data for analysis and then serve the
processed data in a structured format that can be queried using analytical tools.

Analysis and reporting: The goal of most big data solutions is to provide insights into the data through
analysis and reporting. Analysis and reporting can also take the form of interactive data exploration by
data scientists or data analysts.

Orchestration: Most big data solutions consist of repeated data processing operations, encapsulated in
workflows, that transform source data, move data between multiple sources and sinks, load the processed
data into an analytical data store, or push the results straight to a report or dashboard.
Advantages and Disadvantages of Big Data

Advantages of big data


The biggest advantage of Big Data is the fact that it opens up new possibilities for organizations. Improved
operational efficiency, improved customer satisfaction, drive for innovation, and maximizing profits are only a
few among the many, many benefits of Big Data.

Enhanced productivity: big data impacts productivity to a great extent. There are several tools, like Hadoop and Spark, for raising business productivity. Analysts use these tools to analyze high-volume data, and the tools help them analyze it quickly.
Better customer service: Data analyses can help an analyst to know more about customer behavior. There are
multiple social media platforms from where analysts gather information about customers. By doing so, many
companies serve their customers by gaining knowledge of their behavior. This is the benefit of big data.
Better decision-making: big data helps businesses make good decisions that support their business, building a strong foundation by strengthening their roots. Big data supports businesses in taking decisions: customer engagement is improved as real-time customer data is used, and business operations are performed effectively.

Minimization of cost: Big data analytics is helping companies reduce their expenses. Moreover, big data tools assist in increasing operational efficiency and thereby reducing cost.
Increased revenue: big data helps businesses make relevant decisions. These decisions lead to better customer service, and if customers are happy, revenue ultimately increases. Different big data tools help analyze customer behavior to increase productivity and accelerate growth.

Fraud detection: big data analytics systems use machine learning to detect anomalies and patterns. Using these, banks and credit card companies can easily spot fraudulent activity; stolen credit cards and unusual purchases are also detected this way.

Increased agility: Many big organizations are deploying big data for better alignment of their business efforts. They are making use of analytics to support frequent changes that enhance their business strategies and tactics.
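One simple way such anomaly detection works can be sketched with a z-score rule: flag any transaction far from a customer's typical amount. The data, the 2-sigma threshold and the function name are illustrative assumptions; production fraud systems use much richer machine-learned models:

```python
# Sketch of fraud-style anomaly detection: flag transactions more
# than `threshold` standard deviations from the customer's mean.
# Data and threshold are illustrative assumptions.
from statistics import mean, stdev

def flag_anomalies(amounts, threshold=2.0):
    mu, sigma = mean(amounts), stdev(amounts)
    return [a for a in amounts if abs(a - mu) > threshold * sigma]

history = [20, 25, 22, 30, 18, 24, 21, 5000]  # one unusual purchase
print(flag_anomalies(history))                # [5000]
```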
Disadvantages of big data

 The quality of data: Big data is usually semi-structured and unstructured, and its quality is not always up to the mark.
 Rapid change: Technology improves every month, and many big companies cannot keep up with the requirements of deploying these tools. Sometimes this rapid change can lead to a mess in the business.
 Lack of professionals: the people who analyze big data to find valuable insights for increasing the productivity of a business are called big data analysts, but people who possess these skills are not always available.
 Cost factor: Big data analytics is an expensive process with many associated costs, including hardware, technology, storage and maintenance, tool deployment and hiring talented staff. Working on the analysis of big data requires high investment.
Big data application areas

Big Data in Healthcare Industry

Healthcare is another industry bound to generate a huge amount of data. The following are some of the ways in which big data has contributed to healthcare:
Big data reduces the cost of treatment since there is less chance of having to perform unnecessary diagnoses.
Big Data in Banking Sector
We keep our valuable property in the bank to ensure its security, but a bank has to follow many strategies to keep that wealth safe and well maintained. Big data has been used in banks for many years. From cash collection to financial management, big data is making banks more efficient in every sector. Big data applications in the banking sector have lessened customers' hassle and generated revenue for the banks.
Big Data in Academia
Big Data is also helping enhance education today. Education is no longer limited to the physical bounds of the classroom; there are numerous online educational courses to learn from. Academic institutions are investing in digital courses powered by Big Data technologies to aid the all-round development of budding learners.
Conclusions

Big data is a term that describes the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis. It is what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves. Nowadays companies use Big Data to make business more informative and to enable business decisions by letting data scientists, analytical modelers and other professionals analyze large volumes of transactional data. Big data is the valuable and powerful fuel that drives the large IT industries of the 21st century, and it is a spreading technology used in every business sector.
We recommend four points that will allow organizations to exploit the available opportunity:

 First, use big data to deep-dive with advanced analytics.

 Second, integrate new big data with legacy enterprise data to extend views of customers and other business entities.

 Third, enable the ability to capture, ingest, analyze and act on streaming big data in real time within business processes.

 Fourth, our last recommendation is strong collaboration among big data management organizations, so that big data can be fully utilized, accessed and leveraged by multiple business units and users.
Thank you
