Big Data Question Bank

Question Bank on Big Data

1. What Is Big Data?


a. Huge volume of Data
b. Huge volume of high speed data
c. Huge volume of high speed, multimedia data
d. Collection of unstructured data.

Answer: C

2. Big Data frameworks do not include:


a. Apache Hadoop
b. Apache Spark
c. DBMS
d. Apache Kafka

Ans. C

3. The Job of a business analyst does not include


a. Investigating and analysing business situations
b. Identifying and evaluating options for improving business
opportunities
c. To take vital decisions to run a business
d. Elaborating and defining requirements to run a business efficiently

Answer: C

4. Business analysis means


a. Describe and define a business
b. Define and Plan a business
c. Plan and Build a business
d. Build, Test and Implement a Business

Answer: A

5. A business analyst's work also includes liaising between:


a. Owner/sponsor and project Manager
b. Project Manager and Experts/Users
c. Owner/ Sponsor, Project Manager, Solution Developers and
Expert/Users
d. Project Manager and Solution Developers

Answer: C

6. A limitation of a data warehouse is:


a. Works on small sample of data
b. Cannot be used for analysis.
c. Contains unstructured Data
d. Data are not reliable.

Answer: a

7. Big Data is applied for:


a. Machine learning.
b. Deep learning.
c. Coding
d. Warehousing learning.

Answer: b

8. The role of a business analyst in a business is:


a. Advisory.
b. Operational
c. Quality Control
d. Statutory.

Answer: a

9. The job of a project implementer is to:

a. Describe and define a business
b. Define and Plan a business
c. Plan and Build a business
d. Plan, Build, Test and Implement a Business

Answer: d

10. A business analyst works as a liaison between stakeholders
to elicit and analyse:
a. Business Process
b. Business Policies
c. Business information systems
d. Business Process, Policies and Information Systems

Answer: d

11. The challenges of Big Data include:
a. Capture, storage, search, sharing, transfer, analysis, and
visualization.
b. Search, sharing, transfer, analysis, and visualization.
c. Capture, storage, search, sharing, transfer and analysis.
d. Capture, storage, search, and visualization.

Ans: a

12. MapReduce is used for:


a. Performing computations on huge data sets on multiple
systems in a parallel fashion.
b. Google Maps.
c. Reduce Map size.
d. Mapping of natural resources.

Ans: a
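
Illustration for question 12: the map, shuffle and reduce steps can be sketched in plain Python as a word count. This is only a single-process sketch of the idea, not Hadoop's API; the function names and the sample input are hypothetical, and in a real cluster each input split would be mapped on a different machine in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # Each document stands in for a split that a separate node would map.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input splits.
splits = ["big data big clusters", "data on clusters"]
counts = word_count(splits)
```

Because every map call is independent of the others, the work can be spread across many systems, which is the property option (a) describes.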

13. A business analyst role is that of:

a. An auditor.
b. An advisor
c. An Employee
d. A Franchise

Answer: b

14. To set up a successful business you need to have:


a. Business analyst
b. Project Manager
c. Both Business analyst and Project Manager
d. A Data Analyst.

Answer: c

15. A business requirement should be:

a. Clear and Complete

b. Clear, complete, correct and consistent

c. Complete and consistent.

d. Complete and Correct.

Answer: b

16. Elicitation Means:


a. To Draw forth or bring out.
b. To describe
c. To plan
d. To monitor

Answer: a

17. Which is not an application of Data Mining?


a. Creating Data
b. Data analysis and decision support
c. Risk analysis and management
d. Fraud detection and detection of unusual patterns (outliers)

Answer: a

18. The Big Data analytics lifecycle can be divided into ------ stages:
a. 3
b. 6
c. 9
d. 12

Answer : C

19. ‘Data Visualization’ is the:


a. First Stage of Analytics lifecycle
b. Fourth Stage of Analytics lifecycle
c. Sixth Stage of Analytics lifecycle
d. Eighth Stage of Analytics lifecycle

Answer: d

20. For which of the following situation(s) is the market research
method of forecasting suitable?

a. When a firm is working with stable technology


b. When a firm is planning moderate changes on product
innovations
c. When a firm is market testing one of its new offerings
d. When a firm is working with stable technology, planning
moderate changes on product innovations or market testing one
of its new offerings.
Answer: d

21. How can we elicit requirements?


a. Through business analysis
b. Through System Analysis
c. Through IT analysis
d. Through Interviews, meetings, surveys, observations and
Prototyping

Answer: d.

22. The difference between a Business Analyst and a Data Scientist is:


a. A Business Analyst's focus is business, while a Data Scientist's focus is
data.
b. A Business Analyst analyses data, while a Data Scientist creates data.
c. A Business Analyst analyses data and assesses requirements from a
business perspective, whereas the main tasks of data
analysts are to collect and manipulate data.
d. There is no difference between them.

Answer: c

23. ‘Business Case Evaluation’ is the:


a. First Stage of Analytics lifecycle
b. Fourth Stage of Analytics lifecycle
c. Sixth Stage of Analytics lifecycle
d. Ninth Stage of Analytics lifecycle

Answer: a

24. Cloud computing technology allows one to:


a. Own a data centre.
b. Compute over multiple servers and thereby compute faster.
c. Avail of IT services such as hardware, software and storage
from a service provider.
d. Access software applications only from other servers
hosting the applications.

Answer: C

25. Diagnostic analysis means:


a. A set of techniques for reviewing and examining the data set(s) to
understand the data and analyze business performance.
b. A set of techniques for determining what has happened and why
c. A set of techniques that analyze current and historical data to
determine what is most likely to (not) happen

d. A set of techniques for computationally developing and analyzing
alternatives that can become courses of action – either tactical or
strategic – that may discover the unexpected

Answer: b.

26. Which of the following should a business analyst not be
doing?

a. Identify Stakeholders

b. Block issues raised by technical team.

c. Prepare Prototypes
d. Gather requirements from Stakeholders

Answer: b.

27. Data analytics encompasses ------------ phases:

a. 2

b. 4

c. 6

d. 8

Answer: c

28. ‘Data Acquisition & Filtering’ is the:

a. First Stage of Analytics lifecycle
b. Fourth Stage of Analytics lifecycle
c. Third Stage of Analytics lifecycle
d. Ninth Stage of Analytics lifecycle

Answer: c
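
Illustration for questions 18, 19, 23 and 28: the stage positions tested above can be looked up from an ordered list. This question bank itself names only three stages (first, third and eighth of nine); the remaining names below come from the widely cited nine-stage Big Data analytics lifecycle and are an assumption about which model the questions follow. The helper `stage_number` is hypothetical.

```python
# Nine-stage lifecycle in order. Only "Business Case Evaluation",
# "Data Acquisition & Filtering" and "Data Visualization" are named in the
# question bank; the other stage names are assumed from the common model.
LIFECYCLE_STAGES = (
    "Business Case Evaluation",
    "Data Identification",
    "Data Acquisition & Filtering",
    "Data Extraction",
    "Data Validation & Cleansing",
    "Data Aggregation & Representation",
    "Data Analysis",
    "Data Visualization",
    "Utilization of Analysis Results",
)


def stage_number(name):
    """Return the 1-based position of a stage in the lifecycle."""
    return LIFECYCLE_STAGES.index(name) + 1


acquisition_stage = stage_number("Data Acquisition & Filtering")
```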

29. Cloud computing Provides

a. Data Analysis as a service.

b. System analysis as a service.

c. Business Analysis as a service.

d. IT facilities as a Service.

Answer: d

30. Predictive analysis means:

a. A set of techniques for reviewing and examining the data set(s) to
understand the data and analyze business performance.

b. A set of techniques for determining what has happened and why

c. A set of techniques that analyse current and historical data to
determine what is most likely to (not) happen.

d. A set of techniques for computationally developing and analyzing
alternatives that can become courses of action – either tactical or
strategic – that may discover the unexpected

Answer: C

31. Descriptive analysis means:

a. A set of techniques for reviewing and examining the data
set(s) to understand the data and analyze business
performance.
b. A set of techniques for determining what has happened and
why
c. A set of techniques that analyse current and historical data
to determine what is most likely to (not) happen
d. A set of techniques for computationally developing and
analyzing alternatives that can become courses of action –
either tactical or strategic – that may discover the
unexpected.

Answer: a
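
Illustration for questions 25, 30 and 31: the three kinds of analysis can be contrasted on a toy series. This is a minimal sketch under stated assumptions, not a standard method: the sales figures and function names are hypothetical, period-over-period change is used here only as a very simple stand-in for diagnostic work, and the prediction is a naive extrapolation of the average change.

```python
from statistics import mean

# Hypothetical monthly sales figures.
monthly_sales = [100, 110, 120, 130]


def describe(values):
    """Descriptive: summarise what the data looks like."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def diagnose(values):
    """Diagnostic (simplified): period-over-period changes, a first step
    toward explaining what happened and why."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def predict_next(values):
    """Predictive (naive): extend the series by its average change."""
    return values[-1] + mean(diagnose(values))


summary = describe(monthly_sales)
changes = diagnose(monthly_sales)
next_month = predict_next(monthly_sales)
```

Note that `predict_next` builds on `diagnose`, which in turn works from the same data that `describe` summarises; each level of analysis rests on the one before it.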

32. The Diagnostic Analytics process starts with:


a. Identifying the attributes, then assessing/evaluating the attributes
b. Begin with descriptive analytics
c. Begin with descriptive AND diagnostic analytics
d. Begin with predictive analytics

Answer: b

33. Cloud computing Provides

a. Data Analysis as a service.

b. System analysis as a service.

c. Business Analysis as a service.

d. IT facilities as a Service.

Answer: d

34. What triggered Big Data technology?


a. Failure of hardware in distributed architecture
b. Failure of software in distributed architecture
c. To process huge volume of Data swiftly
d. To collect Data fast.

Answer: C

35. As companies move past the experimental phase with Hadoop,
many cite the need for additional capabilities, including
_______________
a) Improved data storage and information retrieval
b) Improved extract, transform and load features for data integration
c) Improved data warehousing functionality
d) Improved security, workload management, and SQL support

Answer: d

36. Predictive Analytics starts with:

a. Identifying the attributes, then assess/evaluate the attributes


b. Begin with descriptive analytics
c. Begin with descriptive AND diagnostic analytics
d. Begin with predictive analytics

Answer: c

37. Cloud computing Provides

a. Data Analysis as a service.

b. System analysis as a service.

c. Business Analysis as a service.

d. IT facilities as a Service.

Answer: d

38. As companies move past the experimental phase with Hadoop,
many cite the need for additional capabilities, including
_______________
a) Improved data storage and information retrieval
b) Improved extract, transform and load features for data integration
c) Improved data warehousing functionality
d) Improved security, workload management, and SQL support

Answer: d

39. Point out the correct statement.


a) Hadoop does need specialized hardware to process the data
b) Hadoop 2.0 allows live stream processing of real-time data
c) In Hadoop programming framework output files are divided into
lines or records
d) Hadoop is another name for Big Data

Answer: b

40. Point out the wrong statement.


a) Hadoop processing capabilities are huge and its real advantage lies
in the ability to process terabytes & petabytes of data
b) Hadoop uses a programming model called “MapReduce”; all
programs should conform to this model in order to work on the Hadoop
platform
c) The programming model, MapReduce, used by Hadoop is difficult
to write and test
d) MapReduce is a programming model or pattern within the Hadoop
framework that is used to access big data stored in the Hadoop File
System (HDFS).

Answer: c

41. Prescriptive Analytics starts with

a. Identifying the attributes, then assess/evaluate the attributes


b. Begin with descriptive analytics
c. Begin with descriptive AND diagnostic analytics
d. Begin with predictive analytics

Answer: d.

42. How can we elicit requirements?

a. Through business analysis
b. Through System Analysis
c. Through IT analysis
d. Through Interviews, meetings, surveys, observations and
Prototyping

Answer: d

43. As companies move past the experimental phase with Hadoop,
many cite the need for additional capabilities, including
_______________
a) Improved data storage and information retrieval
b) Improved extract, transform and load features for data integration
c) Improved data warehousing functionality
d) Improved security, workload management, and SQL support

Answer: d

44. Which of the following statements is incorrect?

a. Apache Hadoop is an open source framework that is used to
efficiently store and process large datasets ranging in size from
gigabytes to petabytes of data.

b. Hadoop allows clustering multiple computers to analyse massive
datasets in parallel more quickly

c. Hadoop is an application for data mining

d. Hadoop makes it easier to use all the storage and processing capacity
in cluster servers, and to execute distributed processes against huge
amounts of data.

Answer : C

45. Hadoop is a framework that works with a variety of related tools.
Common cohorts include ____________
a) MapReduce, Hive and HBase
b) MapReduce, MySQL and Google Apps
c) MapReduce, Hummer and Iguana
d) MapReduce, Heron and Trumpet

Answer: a

46. Which of the following statements is incorrect?

a. Apache Hadoop is an open source framework that is used to
efficiently store and process large datasets ranging in size from
gigabytes to petabytes of data.

b. Hadoop allows clustering multiple computers to analyse massive
datasets in parallel more quickly

c. Hadoop is an application for data mining

d. Hadoop makes it easier to use all the storage and processing capacity
in cluster servers, and to execute distributed processes against huge
amounts of data.

Answer : C

47. Hadoop is a framework that works with a variety of related tools.
Common cohorts include ____________
a) MapReduce, Hive and HBase
b) MapReduce, MySQL and Google Apps
c) MapReduce, Hummer and Iguana
d) MapReduce, Heron and Trumpet

Answer: a

48. Point out the wrong statement.


a) Hadoop processing capabilities are huge and its real advantage lies
in the ability to process terabytes & petabytes of data
b) Hadoop uses a programming model called “MapReduce”; all
programs should conform to this model in order to work on the Hadoop
platform
c) The programming model, MapReduce, used by Hadoop is difficult
to write and test
d) MapReduce is a programming model or pattern within the Hadoop
framework that is used to access big data stored in the Hadoop File
System (HDFS).

Answer: c

49. What was Hadoop named after?

a) Creator Doug Cutting’s favorite circus act


b) Cutting’s high school rock band
c) The toy elephant of Cutting’s son
d) A sound Cutting’s laptop made during Hadoop development

Answer: c

50. All of the following accurately describe Hadoop, EXCEPT
____________
a) Open-source
b) Real-time
c) Java-based
d) Distributed computing approach

Answer: b

51. What was Hadoop named after?

a) Creator Doug Cutting’s favourite circus act


b) Cutting’s high school rock band
c) The toy elephant of Cutting’s son
d) A sound Cutting’s laptop made during Hadoop development

Answer: c

52. __________ can best be described as a programming model
used to develop Hadoop-based applications that can process massive
amounts of data.
a) MapReduce
b) Mahout
c) Oozie
d) Apache

Answer: a

53. __________ has the world’s largest Hadoop cluster.


a) Apple
b) Datamatics
c) Facebook
d) Google

Answer: c

54. What ties so many small and reasonably priced machines
together into a single cost effective computer system?

a. MapReduce

b. Hadoop distributed file system

c. Grid computer system

d. cloud computing

Answer: b.

55. A common practice among investigators is to defer
the selection of statistical analytic procedures to
____________:

a. Statisticians

b. Graduate assistants

c. IRB committees

d. Funding agency

Answer: a

56. What was Hadoop named after?

a) Creator Doug Cutting’s favorite circus act


b) Cutting’s high school rock band
c) The toy elephant of Cutting’s son
d) A sound Cutting’s laptop made during Hadoop development

Answer: c

57. All of the following accurately describe Hadoop, EXCEPT
____________
a) Open-source
b) Real-time
c) Java-based
d) Distributed computing approach

Answer: b

58. ------------- is a programming model for processing and generating
large data sets with a parallel, distributed algorithm on a cluster.

a. MapReduce

b. Hadoop distributed file system

c. Grid computer system

d. cloud computing

Answer: a

59. Hadoop lacks

a. Computing Power

b. Security

c. Flexibility

d. Fault tolerance

Answer: b

60. Data analysis is defined as the:

a. process to ensure that research data, digital and
traditional, is stored in a secure manner to ensure that
procedural controls are in place and adhered to in
order to protect the integrity of data.
b. convention whereby research findings are prepared
and disseminated to the scientific community.

c. process of gathering and measuring information on
variables of interest, in an established systematic
fashion that enables one to answer stated research
questions, test hypotheses, and evaluate outcomes.
d. process of systematically applying statistical and/or
logical techniques to describe and illustrate, condense,
recap, and evaluate data.

Answer: d

61. __________ can best be described as a programming model
used to develop Hadoop-based applications that can process massive
amounts of data.
a) MapReduce
b) Mahout
c) Oozie
d) Apache

Answer: a

62. __________ has the world’s largest Hadoop cluster.


a) Apple
b) Datamatics
c) Facebook
d) Google

Answer: c

63. What ties so many small and reasonably priced machines
together into a single cost effective computer system?

a. MapReduce

b. Hadoop distributed file system

c. Grid computer system

d. cloud computing

Answer: b

64. ------------- is a programming model for processing and generating
large data sets with a parallel, distributed algorithm on a cluster.

a. MapReduce

b. Hadoop distributed file system

c. Grid computer system

d. cloud computing

Answer: a

65. Hadoop lacks

a. Computing Power

b. Security

c. Flexibility

d. Fault tolerance

Answer: b

66. Which of the following genres does Hadoop produce?

(A) Distributed file system

(B) JAX-RS

(C) Java Message Service

(D) JSP

Answer: a

67. As companies move past the experimental phase with
Hadoop, many cite the need for additional capabilities, including
__________ .
(A) Improved data storage and information retrieval

(B) Improved extract, transform and load features for data integration

(C) Improved data warehousing functionality

(D) Improved security, workload management and SQL support

Answer: d

68. Point out the correct statement.


a) Raw data is original source of data
b) Pre-processed data is original source of data
c) Raw data is the data obtained after processing steps
d) All data are Raw data

Answer :a

69. Which of the following approaches should be used to ask a Data
Analysis question?

a) Find only one solution for a particular problem
b) Find out the question which is to be answered
c) Find out answer from dataset without asking question
d) Find many solutions for particular problem.

Answer: b

70. Which of the following is a characteristic of Processed Data?


a) Data is not ready for analysis
b) All steps should be noted
c) Hard to use for data analysis
d) Merging, summarising and subsetting of data

Answer :d
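
Illustration for question 70: the three operations named in option (d) can be shown on a tiny data set. This is a minimal sketch in plain Python; the records, field names and the threshold of 20 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw records from two sources.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 40},
    {"order_id": 2, "customer_id": "c2", "amount": 15},
    {"order_id": 3, "customer_id": "c1", "amount": 25},
]
customers = {"c1": "North", "c2": "South"}

# Merge: attach each customer's region to their orders.
merged = [
    {**order, "region": customers[order["customer_id"]]} for order in orders
]

# Subset: keep only orders with an amount of at least 20.
subset = [row for row in merged if row["amount"] >= 20]

# Summarise: total amount per region over the subset.
totals_by_region = defaultdict(int)
for row in subset:
    totals_by_region[row["region"]] += row["amount"]
totals = dict(totals_by_region)
```

The lists `merged`, `subset` and `totals` are processed data in the sense of the question; `orders` and `customers` play the role of raw data.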

71. Data analysis is defined as the:

a. process to ensure that research data, digital and
traditional, is stored in a secure manner to ensure that
procedural controls are in place and adhered to in
order to protect the integrity of data.

b. convention whereby research findings are prepared
and disseminated to the scientific community.

c. process of gathering and measuring information on
variables of interest, in an established systematic
fashion that enables one to answer stated research
questions, test hypotheses, and evaluate outcomes.
d. process of systematically applying statistical and/or
logical techniques to describe and illustrate, condense,
recap, and evaluate data.

Answer: d

72. A common practice among investigators is to defer
the selection of statistical analytic procedures to
____________:

a. Statisticians

b. Graduate assistants

c. IRB committees

d. Funding agency

Answer: a.

73. Statistical analysis advice should be obtained at the stage of initial
planning in a study:
a. so that attribution of authorship can be decided
b. so that conflicts of interest could be identified
c. to better coordinate the selection of appropriate sampling methods and
data collection instruments
d. so that how data will be archived can be planned

Answer: c

74. Point out the correct statement.
a) Raw data is original source of data
b) Pre-processed data is original source of data
c) Raw data is the data obtained after processing steps
d) All data are Raw data

Answer : a

75. Which of the following is one of the key data science skills?
a) Statistics
b) Machine Learning
c) Data Visualization
d) Statistics, Machine Learning, Data Visualization

Answer: d

76. How many V's of Big Data are there?

A. 2
B. 3
C. 4
D. 5

Answer: C

77. The examination of large amounts of data to see what patterns or
other useful information can be found is known as:

A. Data examination
B. Information analysis
C. Big data analytics
D. Data analysis

Answer: C

78. Data analysis is defined as the:

a. process to ensure that research data, digital and
traditional, is stored in a secure manner to ensure that
procedural controls are in place and adhered to in
order to protect the integrity of data.

b. convention whereby research findings are prepared
and disseminated to the scientific community.

c. process of gathering and measuring information on
variables of interest, in an established systematic
fashion that enables one to answer stated research
questions, test hypotheses, and evaluate outcomes.
d. process of systematically applying statistical and/or
logical techniques to describe and illustrate, condense,
recap, and evaluate data.

Answer: d

79. The chief aim of analysis is to distinguish between:

a. an event reflecting a true effect
versus one occurring by chance

b. right from wrong

c. quality control and quality assurance

d. Misrepresentation and plagiarism

Answer: a

80. Statistical analysis advice should be obtained at the stage
of initial planning in a study:
a. so that attribution of authorship can be decided
b. so that conflicts of interest could be identified
c. to better coordinate the selection of appropriate sampling
methods and data collection instruments
d. so that how data will be archived can be planned

Answer: c

81. Which of the following genres does Hadoop produce?

(A) Distributed file system

(B) JAX-RS

(C) Java Message Service

(D) JSP

Answer: a

82. As companies move past the experimental phase with
Hadoop, many cite the need for additional capabilities,
including __________ .

(A) Improved data storage and information retrieval

(B) Improved extract, transform and load features for data
integration

(C) Improved data warehousing functionality

(D) Improved security, workload management and SQL
support

Answer: d

83. How many V's of Big Data are there?

A. 2
B. 3
C. 4
D. 5

Answer: C

84. All of the following accurately describe Hadoop, EXCEPT
____________

A. Open-source
B. Real-time
C. Java-based
D. Distributed computing approach

Answer: B

85. The examination of large amounts of data to see what patterns or
other useful information can be found is known as Big Data analytics.
Big Data analysis does the following except:

A. Collects data
B. Spreads data
C. Organizes data
D. Analyzes data

Ans : B

86. The process of using present and past conditions for
analysing future aspects is classified as:

A. Forecasting
B. Term analysis
C. Expectation analysis
D. Long Term analysis
Ans: A
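
Illustration for question 86: forecasting uses past and present observations to estimate a future value. One of the simplest techniques is a moving average, sketched below. The function name, the window size of 3 and the demand figures are hypothetical, and the moving average is chosen here only as an easy example; the question bank does not name a specific forecasting technique.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` observations")
    recent = history[-window:]
    return sum(recent) / window


# Hypothetical weekly demand, oldest first.
demand = [20, 22, 24, 26, 28]
forecast = moving_average_forecast(demand)
```

A short window reacts quickly to recent changes, which suits short-term decisions such as the production scheduling in question 87; a longer window smooths out noise at the cost of responsiveness.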

87. Decisions relating to production scheduling involve:


a. Short-term forecasting.
b. Medium-term forecasting.
c. Long-term forecasting.
d. both short-term as well as medium-term forecasting

Answer : a

88. Which of the following is performed by a Data Scientist?

a) Define the question
b) Create reproducible code
c) Challenge results
d) Analysing data, particularly large amounts of data.

Answer :d

89. The new source of big data that will trigger a Big Data
revolution in the years to come is?

A. Business transactions
B. Social media
C. Transactional data and sensor data
D. RDBMS

Ans : C

90. The process of using present and past conditions for
analysing future aspects is classified as:

A. Forecasting
B. Term analysis
C. Expectation analysis
D. Long Term analysis

Ans: A

91. Which of the following is performed by a Data Scientist?

a) Define the question
b) Create reproducible code
c) Challenge results
d) Analysing data, particularly large amounts of data.

Answer :d

92. Salesforce wants to host its CRM application in a cloud.
What sort of service does Salesforce need to hire from the
service provider?

a. Platform as a service

b. Hardware as a service

c. Software as a service

d. Infrastructure as a Service

Answer: a

93. The job of a business analyst does not include:

a. Investigating and analysing business situations
b. Identifying and evaluating options for improving
business opportunities
c. To take vital decisions to run a business
d. Elaborating and defining requirements to run a
business efficiently

Answer: C
