Studies in Big Data 137

Vinay Rishiwal
Pramod Kumar
Anuradha Tomar
Priyan Malarvizhi Kumar Editors

Towards the Integration of IoT, Cloud and Big Data
Services, Applications and Standards
Studies in Big Data

Volume 137

Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Big Data” (SBD) publishes new developments and advances in the various areas of Big Data – quickly and with high quality. The intent is to cover the theory, research, development, and applications of Big Data as embedded in the fields of engineering, computer science, physics, economics, and life sciences. The books of the series refer to the analysis and understanding of large, complex, and/or distributed data sets generated from recent digital sources, coming from sensors or other physical instruments as well as simulations, crowdsourcing, social networks, or other internet transactions such as emails or video click streams, and others. The series contains monographs, lecture notes, and edited volumes in Big Data spanning the areas of computational intelligence, including neural networks, evolutionary computation, soft computing, fuzzy systems, as well as artificial intelligence, data mining, modern statistics, and operations research, as well as self-organizing systems. Of particular value to both the contributors and the readership are the short publication timeframe and the worldwide distribution, which enable both wide and rapid dissemination of research output.
The books of this series are reviewed in a single-blind peer-review process.
Indexed by SCOPUS, EI Compendex, SCIMAGO and zbMATH.
All books published in the series are submitted for consideration in Web of Science.
Vinay Rishiwal · Pramod Kumar ·
Anuradha Tomar · Priyan Malarvizhi Kumar
Editors

Towards the Integration of IoT, Cloud and Big Data
Services, Applications and Standards
Editors

Vinay Rishiwal
Department of CSIT, Faculty of Engineering and Technology, M.J.P. Rohilkhand University, Bareilly, India

Pramod Kumar
Glocal University, Saharanpur, Uttar Pradesh, India

Anuradha Tomar
Netaji Subhas University of Technology, New Delhi, India

Priyan Malarvizhi Kumar
Department of Data Science, University of North Texas, Denton, TX, USA

ISSN 2197-6503    ISSN 2197-6511 (electronic)
Studies in Big Data
ISBN 978-981-99-6033-0    ISBN 978-981-99-6034-7 (eBook)
https://doi.org/10.1007/978-981-99-6034-7

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore

Paper in this product is recyclable.


Preface

The rapid advancement of technology has led to the emergence of the Internet of
Things (IoT), Cloud Computing, and Big Data as transformative forces in various
industries. As these technologies continue to evolve, there is a growing need for
their integration to unlock their full potential and enable the development of inno-
vative services, applications, and standards. The integration of these three domains
presents numerous challenges and opportunities. One of the key challenges is the
efficient and secure management of the massive data generated by IoT devices, as
well as the seamless integration of IoT devices with cloud-based infrastructure. This
requires the development of scalable and robust architectures, protocols, and stan-
dards that enable interoperability, data sharing, and resource allocation across hetero-
geneous systems. Moreover, the integration of IoT, Cloud, and Big Data enables the
creation of innovative services and applications. To achieve successful integration,
the establishment of common standards is crucial.
To summarise, it is the right time to explore the integration of IoT, Cloud, and
Big Data, which holds immense potential to transform industries, enhance services,
and enable data-driven decision-making. However, addressing the challenges related
to data management, interoperability, and security is vital for successful integration.
Moreover, the establishment of standards is crucial to facilitate seamless commu-
nication and collaboration between different systems. By leveraging the combined
power of IoT, Cloud, and Big Data, organizations can unlock new possibilities and
drive digital transformation in the era of interconnected and data-driven ecosystems.
This book consists of eight chapters. The first chapter covers an introduction to Big Data analysis and its need, the skills required for Big Data analysis, the characteristics of Big Data analysis, an overview of the Hadoop ecosystem, and some use cases of Big Data analysis. The aim of the second chapter is to study and compare three of the most common classification methods, Support Vector Machines, K-Nearest Neighbours, and Artificial Neural Networks, for heart disease prediction using the ensemble of standard Cleveland cardiology data. The objective of the third chapter is to reduce the energy consumption of an ECG machine. In chapter four, the authors propose a system that implements automatic water supply to farms based on their crops; the system measures the water level of the soil and helps decide whether to turn the water supply on or off. Chapter five uses deep convolutional network algorithms for leaf image classification to provide accurate results. The concept of Blockchain is used in chapter six to ensure the security of patients' medical records. Chapter seven offers SHA-PSO, a PSO-based meta-heuristic technique that schedules workloads among Virtual Machines (VMs) to minimize energy consumption. In chapter eight, the authors propose the design of a field-monitoring device using IoT in agriculture.

Bareilly, India    Vinay Rishiwal
Saharanpur, India    Pramod Kumar
New Delhi, India    Anuradha Tomar
University of North Texas, USA    Priyan Malarvizhi Kumar
Contents

Introduction to Big Data Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


Nitin Arora, Anupam Singh, Vivek Shahare, and Goutam Datta
DCD_PREDICT: Using Big Data on Prediction for Chest Diseases
by Applying Machine Learning Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Umesh Kulkarni, Sushopti Gawade, Hemant Palivela,
and Vikrant Agaskar
Design of Energy Efficient IoMT Electrocardiogram (ECG)
Machine on 28 nm FPGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Pankaj Singh, Bishwajeet Pandey, Neema Bhandari, Shilpi Bisht,
Neeraj Bisht, and Sandeep K. Budhani
Automatic Smart Irrigation Method for Agriculture Data . . . . . . . . . . . . . 57
Rashmi Chaudhry, Vinay Rishiwal, Preeti Yadav,
Kaustubh Ranjan Singh, and Mano Yadav
Artificial Intelligence Based Plant Disease Detection . . . . . . . . . . . . . . . . . . 75
Vinay Rishiwal, Rashmi Chaudhry, Mano Yadav,
Kaustubh Ranjan Singh, and Preeti Yadav
IoT Equipped Intelligent Distributed Framework for Smart
Healthcare Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Sita Rani, Meetali Chauhan, Aman Kataria, and Alex Khang
Adaptive Particle Swarm Optimization for Energy Minimization
in Cloud: A Success History Based Approach . . . . . . . . . . . . . . . . . . . . . . . . 115
Vijay Kumar Sharma, Swati Sharma, Mukesh Rawat, and Ravi Prakash
Field Monitoring and Automation in Agriculture Using Internet
of Things (IoT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Ashendra Kumar Saxena, Rakesh Kumar Dwivedi, and Danilla Parygin

Editors and Contributors

About the Editors

Dr. Vinay Rishiwal, Ph.D., is working as a Professor in the Department of Computer Science and Information Technology, Faculty of Engineering and Technology, MJP Rohilkhand University, Bareilly, Uttar Pradesh, India. He obtained his B.Tech. degree in Computer Science and Engineering in 2000 from M.J.P. Rohilkhand University (SRMSCET), India, and received his Ph.D. in Computer Science and Engineering from Gautam Buddha Technical University, Lucknow, India, in 2011. He has 23 years of experience in academia. He is a senior member of IEEE and ACM and has worked as Convener of the Student Activities Committee, IEEE Uttar Pradesh Section, India. He has published more than 90 research papers in various journals and conferences of international repute and has 20 patents to his credit. He has served as General/Conference Chair of four international conferences, namely ICACCA, IoT-SIU, MARC 2020, and ICAREMIT, and has received many best paper/research/orator awards on various platforms. Dr. Rishiwal has visited many countries for academic purposes and has worked on many projects for CST, the UP Government, MHRD, and UGC. His current research interests include Wireless Sensor Networks, IoT, Cloud Computing, Social Networks, and Blockchain Technology.

Prof. (Dr.) Pramod Kumar is an accomplished academic leader with over 24 years of experience in the field. He currently serves as the Dean of Academics at Glocal University in Saharanpur, UP, where he has been since September 2022. Prior to this, he held the position of Dean of Computer Science and Engineering at Krishna Engineering College in Ghaziabad and served as the Director of Tula's Institute in Dehradun, Uttarakhand. Prof. Pramod Kumar holds a Ph.D. in Computer Science and Engineering, which he earned in 2011, as well as an M.Tech. in CSE from 2006. He is a Senior Member of IEEE and an ex-Joint Secretary of the IEEE U.P. Section. Through his research, he has made significant contributions to the fields of Computer Networks, IoT, and Machine Learning. He is the author or co-author of more than 70 research papers and has edited four books. He has also supervised and co-supervised several M.Tech. and Ph.D. students.

Dr. Anuradha Tomar is currently working as an Assistant Professor in the Instrumentation & Control Engineering Division of Netaji Subhas University, Delhi, India. Dr. Tomar completed her postdoctoral research in EES at Eindhoven University of Technology (TU/e), the Netherlands. She received her B.E. degree in Electronics Instrumentation & Control with Honours in 2007 from the University of Rajasthan, India. In 2009, she completed her M.Tech. degree with Honours in Power Systems from the National Institute of Technology Hamirpur. She received her Ph.D. in Electrical Engineering from the Indian Institute of Technology Delhi (IITD), India. Dr. Anuradha Tomar has committed her research efforts towards the development of sustainable, energy-efficient solutions for the empowerment of society and humankind. Her areas of research interest are the operation and control of microgrids, photovoltaic systems, renewable-energy-based rural electrification, congestion management in LV distribution systems, artificial intelligence and machine learning applications in power systems, energy conservation, and automation.

Dr. Priyan Malarvizhi Kumar is presently employed as an Assistant Professor at the University of North Texas in the United States. Before joining this role, he served
as an Assistant Professor in the Department of Computer and Information Science
at Gannon University, USA. Prior to his tenure at Gannon University, he held the
position of Assistant Professor in the Computer Science and Engineering Department
at Kyung Hee University in South Korea. Additionally, he gained valuable experience
as a Postdoctoral Research Fellow at Middlesex University in London, UK. Dr.
Kumar earned his Ph.D. degree from Vellore Institute of Technology University.
His academic journey also includes a Bachelor of Engineering degree from Anna
University and a Master of Engineering degree from Vellore Institute of Technology
University.
Dr. Kumar’s current research focuses on areas such as Big Data Analytics, Internet
of Things (IoT), Internet of Everything (IoE), and Internet of Vehicles (IoV) in the
context of healthcare. He has authored and co-authored papers published in inter-
national journals and conferences, including those indexed by the Science Citation
Index (SCI). He maintains a lifetime membership with the International Society for
Infectious Disease, the Computer Society of India, and is an active member of the
Vellore Institute of Technology Alumni Association.

Contributors

Vikrant Agaskar Vidyavardhani College of Engineering and Technology, Vasai-Virar, Maharashtra, India

Nitin Arora Electronics and Computer Discipline, Indian Institute of Technology, Roorkee, India
Neema Bhandari Birla Institute of Applied Sciences, Bhimtal, Uttarakhand, India
Neeraj Bisht Birla Institute of Applied Sciences, Bhimtal, Uttarakhand, India
Shilpi Bisht Birla Institute of Applied Sciences, Bhimtal, Uttarakhand, India
Sandeep K. Budhani Graphic Era Hill University, Bhimtal, Uttarakhand, India
Rashmi Chaudhry Netaji Subhas University of Technology, Delhi, India
Meetali Chauhan Department of Computer Science and Engineering, Guru Nanak
Dev Engineering College, Ludhiana, Punjab, India
Goutam Datta School of Computer Science, University of Petroleum and Energy
Studies, Dehradun, India
Rakesh Kumar Dwivedi CCSIT, Teerthanker Mahaveer University, Moradabad,
UP, India
Sushopti Gawade Pillai College of Engineering, Panvel, India
Aman Kataria Amity Institute of Defence Technology, Amity University, Noida,
India
Alex Khang GRITEx and VUST, Ho Chi Minh City, Vietnam
Umesh Kulkarni Vidyalankar Institute of Technology Wadala, Mumbai, Maha-
rashtra, India
Hemant Palivela Manager-AI, Accenture Solutions, Mumbai, Maharashtra, India
Bishwajeet Pandey Gyancity Lab, Gurgaon, India
Danila Parygin Volgograd State Technical University, Volgograd, Russia
Ravi Prakash CSED, Motilal Nehru National Institute of Technology, Allahabad,
India
Sita Rani Department of Computer Science and Engineering, Guru Nanak Dev
Engineering College, Ludhiana, Punjab, India
Mukesh Rawat CSED, Meerut Institute of Engineering and Technology, Meerut,
India
Vinay Rishiwal MJP Rohilkhand University, Bareilly, India
Ashendra Kumar Saxena CCSIT, Teerthanker Mahaveer University, Moradabad,
UP, India
Vivek Shahare Department of Computer Science and Engineering, Indian Institute
of Technology, Dharwad, India
Swati Sharma IT, Meerut Institute of Engineering and Technology, Meerut, India
Vijay Kumar Sharma CSED, Meerut Institute of Engineering and Technology, Meerut, India
Anupam Singh Department of Computer Science and Engineering, Graphic Era
Deemed to be University, Dehradun, India
Kaustubh Ranjan Singh Delhi Technological University, Delhi, India
Pankaj Singh Birla Institute of Applied Sciences, Bhimtal, Uttarakhand, India
Mano Yadav Bareilly College, Bareilly, India
Preeti Yadav MJP Rohilkhand University, Bareilly, India
Introduction to Big Data Analytics

Nitin Arora , Anupam Singh , Vivek Shahare , and Goutam Datta

Abstract Nowadays, a high volume of data (tabular data, text files, images, videos, audio, logs, etc.) is generated at high velocity by social media and networks, scientific instruments, mobile devices, and sensor technologies. The quality of such data is usually not guaranteed. This data can be structured or unstructured, necessitating cost-effective, innovative methods of data processing to improve understanding and decision-making. This chapter covers an introduction to Big Data analysis and its need, the skills required for Big Data analysis, the characteristics of Big Data analysis, an overview of the Hadoop ecosystem, and some use cases of Big Data analysis.

Keywords Big data · Hadoop ecosystem · Big data analysis · Business intelligence analysis · Big data domain · Big data quality · Dimensions

N. Arora (B)
Electronics and Computer Discipline, Indian Institute of Technology, Roorkee, India
e-mail: nitinarora.iitr@gmail.com
A. Singh
Department of Computer Science and Engineering, Graphic Era Deemed to be University,
Dehradun, India
e-mail: anupamsingh.cse@geu.ac.in
V. Shahare
Department of Computer Science and Engineering, Indian Institute of Technology,
Dharwad, India
e-mail: vivek.shahare27@gmail.com
G. Datta
School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India
e-mail: gdatta@ddn.upes.ac.in


Table 1 Characteristics of small data versus big data

Volume: small data is less than 1 TB; big data is greater than 1 TB.
Velocity: small data has a controlled and steady data flow; big data has enormous amounts of data flowing within short time frames.
Variety: small data is structured or semi-structured (e.g., Excel tables, JSON); big data comes in a wide variety, i.e., tabular data, text files, images, videos, audio, logs, etc.
Veracity: small data mostly contains quality data; the quality of big data is rarely guaranteed.
Value: small data serves business intelligence, analysis, and reporting; big data serves complex data mining, prediction, pattern finding, etc.
Time variance: small data represents the business value for historical as well as incremental data; for big data, historical data at times becomes irrelevant for analyzing the business.
Infrastructure: small data has a more defined resource allocation; with big data, the load on the system varies a lot.

1 Introduction to Big Data

Big Data is a phrase that refers to collections of vast and complex data sets that are challenging to store and analyze using standard data-processing methods. Big data denotes data assets of high volume, high velocity, and high variety that necessitate cost-effective, innovative data processing to improve insight and decision-making [1].

2 The Distinction Between Small and Big Data

There are several distinctions between small data and big data, including volume, velocity, variety, veracity, value, time variance, and infrastructure [2]. Table 1 summarizes the differences.

3 Classification of Big Data

Big data is classified as follows [3]:

– Structured Data: Structured data has a well-defined format. It can be readily stored in tabular form in relational databases such as MySQL and Oracle.
– Semi-Structured Data: Semi-structured data has some structure but cannot be recorded in tabular format in relational databases. XML files, JSON documents, e-mail messages, and so forth are examples.
– Unstructured Data: Unstructured data has no predefined structure and cannot be saved in tabular form in relational databases. Examples include video, audio, free text, and machine-generated data.
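
To make the distinction concrete, the following minimal Python sketch shows the same record in each of the three forms; the field names and values are purely illustrative, not taken from any specific dataset.

    import json

    # Structured: fixed schema, maps directly to a relational table row.
    structured_row = ("C001", "Asha Verma", 34, "Bareilly")  # (id, name, age, city)

    # Semi-structured: self-describing JSON; fields may vary between records.
    semi_structured = json.loads(
        '{"id": "C001", "name": "Asha Verma", "contacts": {"email": "asha@example.com"}}'
    )

    # Unstructured: free text with no schema; querying it needs NLP or search techniques.
    unstructured = "Asha wrote: the delivery was late but the product quality was great."

    print(structured_row[1], semi_structured["contacts"]["email"], len(unstructured.split()))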

4 Characteristics of Big Data

Big data has many characteristics; some of them are as follows [4]:

– Volume: The term “volume” refers to the quantity of data, which increases rapidly every day. Humans, technology, and their interactions on social media create enormous amounts of data.
– Variety: Because so many sources contribute to Big Data, the types of data they generate are diverse. Data might be structured, semi-structured, or unstructured, and many different forms of data can be generated or collected by a single application. All of these forms of data must be connected to extract knowledge.
– Velocity: Velocity refers to the stream of data that arrives continuously from different social media sites, with the repository being populated with new data at the same rate. Capturing this stream of data promptly for further processing is a challenge.
– Veracity: The term “veracity” refers to the unreliability of data. Data inconsistency and incompleteness create uncertainty in the data supplied. Many big data sources, such as Twitter posts with hashtags, abbreviations, typos, and colloquial speech, have less controlled quality and accuracy.
– Value: Access to massive data is great, but it delivers no benefit unless we can turn it into value in practice.
– Variability: Variety and variability are not the same thing. A coffee shop may offer six different coffee blends (variety), but if the same blend tastes different every time you order it, that is variability. The same is true for data: if the meaning of the data changes frequently, it can significantly affect the homogeneity of the data.
– Visualization: Using charts and graphs to represent vast quantities of data is far more effective than using spreadsheets.

5 Who’s Generating Big Data?

The capacity to acquire data no longer limits development and creativity. What is critical now is the capacity to organize, analyze, summarize, visualize, and discover knowledge from the acquired data in a timely and scalable manner.

6 Why Is Big Data Important?

Companies obtain a more complete understanding of their business, consumers, products, and rivals when big data is gathered, processed, and analyzed properly and efficiently. This results in enhanced efficiency, more sales, lower costs, better customer service, and better goods and services. Sensors embedded in manufactured items provide a stream of telemetry. Retailers frequently know who buys their products; to determine who did not buy and why, businesses can leverage social media and blog data from their e-commerce sites, which is knowledge they do not have today. Using extensive historical call-center data more rapidly enhances customer engagement and satisfaction. Social media content can be used to assess consumer sentiment about a company and its customers more quickly and accurately, and to enhance goods, services, and customer interactions [5].

7 Challenges in Big-Data

Big data is enormous in size, and it can be structured or unstructured. The main challenges are discussed below [6].
– Volume: Thanks to newly developing data sources, the volume of data, particularly machine-generated data, is expanding, as is the rate at which it grows each year. For example, the world's data storage capacity was 800,000 petabytes (PB) in 2000 and was anticipated to reach 35 zettabytes by 2020.
– Variety and the use of many data sets: Unstructured data makes up more than 80% of today's data, and most of it is too vast to manage effectively.
– Velocity: As organizations realize the benefits of analytics, they face a problem: they want the data sooner; in other words, they want real-time analytics.
– Veracity, data quality, and data availability.
– Data discovery: Finding high-quality data among the massive amounts of data available on the Internet is a significant problem.
– Relevance and quality: It is difficult to determine the quality of data sets and their relevance to specific requirements.
– Personally identifiable information: Much of this data is about people. Handling it demands, in part, efficient industry processes; in part, effective government oversight; and perhaps above all, a serious rethinking of what privacy truly entails.
– Process challenges: Finding the appropriate analysis model may take a lot of time and effort; thus, the ability to iterate quickly and 'fail fast' through many (perhaps throwaway) models is crucial.
– Management challenges: Sensitive data, such as personal information, is found in many warehouses. Accessing such data raises legal and ethical problems. As a result, the data must be secured, access restricted, and use audited.

8 Big Data Applications

There is a lot of data in today's environment, and big businesses use these data to expand their operations [7] in a variety of circumstances, such as those outlined below:
– Customer Spending Habits and Shopping Patterns: Management teams at large retail stores track customer spending habits, purchasing behavior, and customers' most loved products. Based on which product is most searched for or sold, that product's production/procurement rate is set. Banking companies use information about their customers' purchasing habits to offer customers who want to buy a particular product a discount or cashback through the bank's credit or debit card, sending the appropriate offer to the right individual at the right time [8].
– Recommendation: Large retailers provide custom recommendations based on spending and buying patterns. E-commerce platforms offer product suggestions: they keep track of the products customers are interested in and propose them based on that data [9].
– Smart Traffic System: Data on traffic conditions on various roads is obtained from cameras stationed alongside the road and at the city's entry and exit points, and from GPS devices installed in vehicles. This information is analyzed, and the least time-consuming or jam-free routes are suggested. Big data analysis can thus create an intelligent traffic system in the city; another advantage is that fuel usage may be lowered [10].
– Auto-Driving Cars: Thanks to big data analysis, a car can be driven without human intervention. Sensors installed in various places around the vehicle gather information on the size of neighbouring cars, obstacles, distances from them, and other factors. Numerous computations are made based on these data, including what steering angle to use, what speed to employ, when to halt, etc. These calculations facilitate the automatic performance of driving activities [11].
– Media and Entertainment Sector: Companies that offer media and entertainment services, including Spotify, Amazon Prime, and Netflix, analyze subscriber data. To develop the next business strategy, information is acquired and assessed about videos, music, and the amount of time users spend on the platform.
– Education Sector: Online education is highly impacted by the usage of Big Data. An online or offline course provider can market its course online to someone looking for a YouTube tutorial video on a related topic [12].
– IoT: Manufacturing companies install IoT sensors in equipment to collect operational data. By analyzing this data, it is possible to anticipate how long a machine will run without issue before it has to be repaired, allowing the firm to take action before the equipment develops serious problems or fails. As a result, the cost of replacing entire pieces of equipment can be reduced. Big data is also making a significant impact in healthcare [13]. Patient experiences are collected using a big data platform and used by clinicians to improve treatment. An IoT gadget can detect signs of a potentially fatal disease in the human body early, so that treatment can begin in advance. IoT sensors installed near patients and newborn infants continuously monitor various health conditions, such as heart rate and blood pressure. When any parameter exceeds the safe limit, an alarm is transmitted to a doctor, who can take action remotely.
– Energy Sector: Every 15 min, a smart electric meter reads the power used and sends it to a server, where the data is evaluated and the time of day when the city's power load is lowest can be determined. Using this technology, a manufacturing company or a householder may be advised to run heavy machines at night when the power load is lower, resulting in lower electricity bills.
– Secure Air Traffic System: Numerous locations along the flight route have sensors (on the propellers). These sensors track environmental variables such as temperature, humidity, and flying speed. Based on analysis of this data, the environmental parameters are set up and adjusted while in flight. Studying the flight's machine-generated data also makes it possible to calculate how long a machine will perform flawlessly after being replaced or repaired [14].

9 How Big Data Analysis Differs from Business Intelligence Analysis

9.1 Business Intelligence

Business intelligence (BI) is the analysis of data to improve decision-making and gain a competitive advantage. It refers to a group of tools that offer quick access to data-driven insights into an organization's growth and development. Open-source BI tools include BIRT, JasperReports, KNIME, etc.

9.2 Big Data

Big data refers to massive, varied amounts of structured and unstructured data generated and transmitted rapidly from various sources and increasing at high speed. Big data rests on three fundamental pillars: the volume of data, the velocity at which it is created, and the variety or scope of the data points. The data may be structured, semi-structured, or unstructured. Tools such as Hadoop, Apache Spark, Cassandra, etc., are available to deal with all these types of data.

9.3 Differences Between Business Intelligence (BI) and Big Data

– BI aims to help firms make better decisions. Business intelligence supports the delivery of credible information by extracting data directly from the data source. In contrast, Big Data's main aim is to capture, process, and analyze structured and unstructured data to improve customer outcomes.
– Location intelligence and what-if analysis are some applications of BI. Variety, Volume, Variability, Veracity, and Velocity, on the other hand, are the characteristics that better describe big data.
– Big Data systems can handle both historical data and data generated in real time, whereas Business Intelligence handles only historical data sets.

10 The Analytical Life Cycle of Big Data

The cycle is iterative, reflecting the back-and-forth nature of a real project. Figure 1 shows the different phases involved in the analytical life cycle of Big Data. A step-by-step approach is needed to organize the activities and procedures involved in repurposing, collecting, analyzing, and processing data to address the specific requirements of a Big Data analysis.

10.1 Phase 1: Discovery

– The data science team researches and learns about the problem.
– It builds context and understanding.
– It identifies which data sources necessary for the project will be available.
– The team frames an initial hypothesis, which is tested later with data.

10.2 Phase 2: Data Preparation

– Data must be examined, pre-processed, and conditioned before modeling and analysis.
– Data preparation requires transforming data and loading it into an analytical sandbox for execution.
– Data preparation tasks may be repeated frequently and not necessarily in a fixed order.
– Several technologies are used at this stage, including Hadoop, Alpine Miner, OpenRefine, and others.
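
As a small illustration of the kind of conditioning done in this phase, the following Python sketch (using pandas; the file name and column names are hypothetical) removes duplicates, fills missing values, and normalizes a numeric field before analysis.

    import pandas as pd

    # Hypothetical raw extract loaded into the analytical sandbox.
    df = pd.read_csv("sensor_readings.csv")  # columns assumed: device_id, temp_c, ts

    df = df.drop_duplicates()                                   # remove duplicate rows
    df["temp_c"] = df["temp_c"].fillna(df["temp_c"].median())   # impute missing values
    df["ts"] = pd.to_datetime(df["ts"], errors="coerce")        # bad timestamps -> NaT
    df = df.dropna(subset=["ts"])                               # drop unparseable rows

    # Min-max normalize the temperature so downstream models see a [0, 1] range.
    tmin, tmax = df["temp_c"].min(), df["temp_c"].max()
    df["temp_norm"] = (df["temp_c"] - tmin) / (tmax - tmin)

    df.to_csv("sensor_readings_clean.csv", index=False)         # analytics-ready output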

Fig. 1 Analytical life cycle of big data

10.3 Phase 3: Model Planning

– The team determines the methods, techniques, and workflow it intends to follow in the subsequent model-building phase.
– It explores the data to learn about the relationships between variables and selects the key variables and the most suitable candidate models.
– The models built and executed in the next phase are based on the work done here.

10.4 Phase 4: Model Building

– Data sets are created for testing, training, and production by the team.
– The team also determines whether its present tools are adequate for running the models or whether a more robust environment is necessary.
– Open-source tools include R, PL/R, and WEKA.
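
A minimal sketch of this model-building step, using scikit-learn on a synthetic data set (the features and split ratio are illustrative, not prescribed by the life cycle):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score

    # Synthetic stand-in for the prepared data from Phase 2.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

    # Training and testing splits, as produced in this phase.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)        # build the model on training data

    preds = model.predict(X_test)      # execute the model on held-out data
    print(f"Test accuracy: {accuracy_score(y_test, preds):.3f}")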

10.5 Phase 5: Communicate Results

– After executing the model, the team must assess the findings against the established success and failure criteria.
– The team assesses the best methods for informing various team members and stakeholders of the results and conclusions, taking into account caveats and assumptions.
– The business value should be quantified, and a narrative should be established to summarize and convey the findings to stakeholders.

10.6 Phase 6: Operationalize

– The team conveys the benefits of the project to a broader audience.
– It creates a pilot project to deploy the work in a controlled fashion before extending it to the whole organization.
– With this approach, the team can test the model's capabilities and constraints in a real-world setting before full deployment.
– The team delivers final reports, briefings, and code.
– Octave and WEKA are examples of open-source software used here, along with SQL.

11 Big Data Analysis Necessitates a Set of Skills

– Problem-solving abilities go a long way in the age of Big Data. Because much of it is unstructured, Big Data is considered challenging, and someone who enjoys solving problems is the best candidate for working with it. Ingenuity and originality help in developing better solutions to the problems discovered.
– SQL serves as a foundation in the Big Data era. SQL is a data-centric programming language, and even while working with Big Data technologies such as NoSQL stores, knowing SQL helps a programmer deal with high-dimensional data sets.
– Familiarity with as many big data tools and technologies as possible, including R, SAS, Scala, Hadoop, Linux, MATLAB, SQL, Excel, SPSS, etc., is often preferred. The demand for professionals with strong programming and statistical knowledge has surged.
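
To illustrate the second point, here is a tiny, self-contained example of SQL used from Python via the standard-library sqlite3 module; the table and values are made up for illustration.

    import sqlite3

    # In-memory database; in practice this would be a warehouse or a SQL-on-Hadoop engine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", "laptop", 1200.0), ("North", "phone", 650.0), ("South", "laptop", 990.0)],
    )

    # Aggregate query: total sales per region, highest first.
    query = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY total DESC"
    for region, total in conn.execute(query):
        print(region, total)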

12 Big Data Domain

Connected things that constantly deliver data to a system generate data that may be semi-structured, structured, or unstructured. The best examples are mobile devices, from which telecom operators receive massive amounts of data through each cellular network and analyze it. Bioinformatics, the Internet of Things, Cyber-Physical Systems, and Social Media are just a few fields that use Big Data to study trends and behavior for their own purposes. Modern search engines, such as Google, are built on Big Data and use information-retrieval techniques and logic to obtain information. One may even argue that the World Wide Web is the most important realm of Big Data.

13 Introduction to Big Data Analytics

Big Data analytics has become a part of daily life. It involves a process of continual discovery, using practical analytic tools to find correlations, hidden patterns, and various other insights in big data of any source, structure, and size. Insights can be discovered more quickly and efficiently, enabling the immediate business decisions that decide a winner [15].
The rise of big data, which began in the 1990s, prompted the development of big data analytics. At the advent of the computer age, corporations employed enormous spreadsheets to analyze information and look for trends. New data sources boosted the volume of data generated in the late 1990s and early 2000s: due to the widespread use of mobile devices and search engines, more data was generated than any organization could handle. Speed was another factor to consider: the more data was generated, the faster it had to be processed. In 2005, Gartner defined this phenomenon as the “3Vs” of data: volume, velocity, and variety. Anyone willing to wade through the vast amounts of raw, unstructured data could unlock a trove of unseen facts about business operations, consumer behavior, population changes, and natural phenomena. Conventional data warehouses and relational databases were incapable of completing the task, so innovation was required. Hence, Hadoop came into existence: Yahoo engineers created it in 2006 and released it as an Apache open-source project in 2007. Thanks to its distributed processing framework, big data applications could now run on a clustered platform. Distributed processing is the critical distinction between traditional and big data analytics.
At first, only big corporations such as Facebook and Google undertook extensive data analysis. But then, in the 2010s, banks, retailers, healthcare, and manufacturing organizations saw the value in big data analytics. Initially, big organizations with on-premises data stores were best suited to gathering and analyzing large data sets. However, Amazon Web Services (AWS), Microsoft Azure, and many other cloud platform providers now make it easy for any company to utilize a big data analytics platform. The option to set up Hadoop clusters in the cloud allows any company, irrespective of its size, to start and run just what it needs on demand, providing flexibility in the usage of clusters. Such adaptability, of which a big data analytics environment is a critical component, is required for today's businesses to succeed [16].

14 Overview of the Hadoop Ecosystem

The Hadoop ecosystem is a platform or framework for addressing big data problems; it can be considered a package containing various services, including storing, ingesting, analyzing, and maintaining data. Hadoop stores Big Data in a distributed ecosystem where it can be analyzed in parallel. Hadoop consists primarily of two parts. The first is the Hadoop Distributed File System (HDFS), which allows you to store data in several formats across nodes. The second is Yet Another Resource Negotiator (YARN), which Hadoop uses to manage resources; it allows the concurrent processing of data stored in HDFS [17].
Figure 2 shows the Hadoop ecosystem [18], whose various components combine to form the overall platform.

14.1 HDFS

HDFS creates an abstraction: logically, HDFS is a single unit for storing Big Data while, similar to virtualization, the actual data is distributed among numerous nodes. HDFS has a master-slave architecture. In HDFS, the master node is the Name-node, while the slave nodes are the Data-nodes. The Name-node holds metadata about the data stored in
Data-nodes, like which data block is saved in which data node, how many replications
of the data block are retained, etc. Data nodes are where the actual data is kept.
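
The sketch below is a toy Python model of this arrangement, not Hadoop code: a dictionary plays the Name-node's metadata (which blocks make up a file and which Data-nodes hold each replica); the file path, block names, and node names are invented for illustration.

    # Toy model of the Name-node's metadata: it stores only *where* blocks live,
    # never the block contents themselves (those stay on the Data-nodes).
    namenode_metadata = {
        "/logs/2023-01-01.txt": {
            "block_0": ["datanode1", "datanode3", "datanode4"],  # 3 replicas
            "block_1": ["datanode2", "datanode3", "datanode5"],
        }
    }

    def locate(path: str) -> None:
        """Answer a client's read request by listing which Data-nodes hold each block."""
        for block, nodes in namenode_metadata[path].items():
            print(f"{path} {block}: stored on {', '.join(nodes)}")

    locate("/logs/2023-01-01.txt")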

14.2 YARN

Yet Another Resource Negotiator (YARN) handles all data-processing duties, mainly resource allocation and task scheduling. The Resource Manager and the Node Manager are the two primary components of YARN. The Resource Manager plays the role of the controller node: it accepts processing requests and forwards them to the corresponding Node Managers, which are responsible for the actual processing that takes place. A Node Manager is installed on every Data-node and is in charge of executing the task on that Data-node.

14.3 MapReduce

A MapReduce job separates the input data into fragments processed by the map tasks in parallel. The framework sorts the output of the map tasks before it is given to the reduce tasks. HDFS stores the job's input and output data. The framework handles task monitoring, scheduling, and re-execution. The MR framework and HDFS run on the same nodes; hence the compute and storage nodes are usually the same. This configuration enables the framework to schedule jobs efficiently on the data nodes, resulting in high aggregate bandwidth throughout the cluster. A Resource Manager (master), a Node Manager (worker) on each cluster node, and an MR AppMaster per application make up the MapReduce framework. A MapReduce job is composed of four steps: map, shuffle, sort, and reduce.
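
The canonical word-count example below sketches the map and reduce steps in Python, in the style of Hadoop Streaming, where the mapper and reducer would normally be two separate scripts, each reading from stdin and writing tab-separated key-value pairs to stdout; the file layout is illustrative.

    import sys

    def mapper():
        # mapper.py: emit ("word", 1) for every word on stdin.
        for line in sys.stdin:
            for word in line.strip().split():
                print(f"{word}\t1")

    def reducer():
        # reducer.py: Hadoop delivers the mapper output sorted by key,
        # so counts for the same word arrive contiguously.
        current, count = None, 0
        for line in sys.stdin:
            word, value = line.rsplit("\t", 1)
            if word != current:
                if current is not None:
                    print(f"{current}\t{count}")
                current, count = word, 0
            count += int(value)
        if current is not None:
            print(f"{current}\t{count}")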

14.4 Spark

Spark is a general-purpose cluster-computing platform that is also fast. It can handle a wide range of data types and, importantly, is a free and open-source data-processing engine. It exposes development APIs that let analytics professionals run streaming, machine-learning, or SQL workloads requiring frequent access to data sets, including real-time data. Spark can handle both stream and batch processing.
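
A minimal PySpark sketch of batch processing, counting word frequencies in a text file (the HDFS input path is hypothetical):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("WordCount").getOrCreate()

    # Read a text file (path is illustrative) into a DataFrame of lines.
    lines = spark.read.text("hdfs:///data/input.txt")

    # Split lines into words, then count occurrences of each word in parallel.
    counts = (
        lines.select(F.explode(F.split(F.col("value"), r"\s+")).alias("word"))
             .where(F.col("word") != "")
             .groupBy("word")
             .count()
             .orderBy(F.col("count").desc())
    )

    counts.show(10)   # print the ten most frequent words
    spark.stop()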

15 Overview of Big Data Analysis and Its Need

Big data analytics is the efficient processing of large amounts of data using appropriate technologies. It is mainly used for decision-making, which requires both individual intellectual capabilities and collective knowledge. Businesses usually look to store their historical business data to derive meaningful results and new insights that grow the business. As a result, big data analysis needs technical innovation and data-science expertise. Models for big data analysis have been investigated and used to design a general conceptual architecture to make things more transparent.
The following are examples of the need for Big Data Analytics:
1. Business decisions: Online retail companies like Amazon make decisions based on past 'Prime Day' sales and consider the best-selling items to repeat in the next sale.
2. Insight into data and business: A company located in multiple places can use its sales data to learn which location had the maximum sales in the last financial year.
3. Interpretation of outcomes: Data can be estimated for the nearest time range based on pattern-based analysis.
4. Descriptive analytics: Graphical representation of data can show business behavior.
5. Predictive analytics: Using mathematical and statistical techniques applied to historical data, future data can be predicted with appropriate variables to a certain confidence level; a small sketch follows this list.
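
As a sketch of point 5, the following Python example fits a simple linear trend to hypothetical monthly sales and extrapolates one step ahead; the data are invented for illustration, and a real forecast would use richer models and confidence intervals.

    import numpy as np

    # Hypothetical monthly sales for the past eight months.
    months = np.arange(1, 9)
    sales = np.array([120, 132, 129, 145, 152, 161, 158, 170], dtype=float)

    # Fit a least-squares line: sales ~ slope * month + intercept.
    slope, intercept = np.polyfit(months, sales, deg=1)

    next_month = 9
    forecast = slope * next_month + intercept
    print(f"Trend: {slope:.1f} units/month; forecast for month {next_month}: {forecast:.0f}")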

16 Use Cases of Big Data Analytics

As per industry standards, big data broadly rests on three Vs, which are as follows:
Volume: The term “volume” refers to the quantity of data, which increases rapidly every day. Humans, technology, and their interactions on social media create enormous amounts of data.
Velocity: Velocity refers to the stream of data that arrives continuously from different social media sites, with the repository being populated with new data at the same rate. Capturing this stream of data promptly for further processing is a challenge.
Variety: A variety of data arrives from various sources, and the repository stores this data in different file formats: spreadsheets, text files, e-mails, image files, video files, etc.
Some of the use cases of Big Data are as follows:
1. Fraud detection in financial organizations: Credit and debit card fraud involving millions of people has recently made headlines, with several consumers discovering fraudulent activity associated with their accounts. With big data and machine learning, this could have been minimized: based on machine-learning analysis, banks can learn a customer's typical activities and transactions, and if they notice any suspicious conduct, they can quickly block the customer's card or account and notify them (a small sketch of this idea follows this list). Banks have begun to use Big Data to study market and consumer behavior, but more work remains to be done.
2. Big data in health care: Big data is being used by healthcare businesses to enhance profitability and save lives. Healthcare firms, hospitals, and researchers collect massive volumes of data; however, none of this information is helpful on its own. Once evaluated, the data becomes critical for highlighting trends and threats in patterns and for constructing prediction models. This data can also be used for classification purposes, for example, COVID-19 data as presented in [19, 20].
3. Big data in the telecom sector: Telecom operators use big data analytics to
gain a more comprehensive perspective of their operations and consumers and
accelerate innovation initiatives.
4. Big data in the oil and gas sector: This sector has been using big data to find new ways to innovate for the last few years. Data sensors have long been used in the oil and gas industry to track and monitor the performance of wells, gear, and activities. Oil and gas corporations have used this information to track well activity, develop Earth models to discover new oil sources, and perform other value-added operations.
5. Log data analytics in business: Many commercial big data applications rely on log data as a foundation. Log management and analysis tools existed long before big data; however, as business activity and transactions rise exponentially, storing, processing, and presenting log data efficiently and cost-effectively can become a significant burden. In this context, big data analytics plays a significant role because industries have found synergy between log-data search and big data analytics.
6. Big Data Analytics in Recruitment: In the rush to place applicants as rapidly as possible in a competitive climate, recruiters frequently believe they lack the (proper) tools. Recruiters nowadays use a new technique that mines internal databases for candidates' overall profiles: educational background, certifications completed, job title applied for, skill sets, years of experience, and so forth. The mined results are then matched and compared with previous candidates' performance, salaries, and overall past recruitment experience. The traditional approach of matching keywords with the job description is no longer efficient in today's scenario, where big data analytics has significantly changed the paradigm across industry verticals. Figure 3 shows the steps involved in the recruitment process using Big Data analytics.
7. Big Data Analytics in Natural Language Processing (NLP): In NLP, the computer processes language before feeding it to the model for training [21]. Various linguistic features are considered during processing, and there are many important use cases of NLP in different industry verticals. Sentiment analysis of customers is one essential application of natural language processing used by several companies: they analyze customers' sentiment by capturing continuous streaming data and determining whether feedback on a particular product is positive, negative, or neutral. The company subsequently analyzes these textual sentiment documents to improve its product further. An essential use case in the banking sector is the chatbot, which largely takes over the customer-service officer's job: a chatbot processes textual data in real time, matches it against a huge existing NLP database (corpus), and then tries to respond to the user's query. Another critical use case of NLP is the Machine Translation (MT) system, which translates a source language (e.g., English) into a target language (e.g., Hindi); a system that translates from one language to another is called a bilingual MT system.
Neural-based translation is known as Neural Machine Translation (NMT), and the latest NLP models are used in its language model. Since NMT uses a deep neural network, a massive parallel corpus is needed to train the model. Model performance can be measured with automatic metrics such as BLEU, METEOR, etc. Researchers have been studying the performance evaluation of MT/NMT systems with various automatic metrics, comparing the outcomes computed by different metrics.
8. Blockchains are not efficient for storing large files: Large files are inefficiently stored on blockchains [22]. Storing vast volumes of data on a public blockchain is expensive and time-consuming, so storing data on-chain is not a scalable or efficient option for anything other than primary ledger data and associated hashes. Each transaction may add up to thousands of dollars per terabyte on the chain, plus costs each time you wish to access that data. It also consumes time, on the order of minutes per megabyte, that SLAs cannot afford. As a result, blockchains rely almost entirely on off-chain storage.
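
Returning to use case 1, here is a minimal sketch of anomaly-based fraud screening with scikit-learn's IsolationForest; the transaction features, values, and contamination rate are invented for illustration, and a production fraud model would be far richer.

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(7)

    # Hypothetical features per transaction: [amount, hour_of_day].
    normal = np.column_stack([rng.normal(60, 20, 500), rng.normal(14, 3, 500)])
    suspect = np.array([[2500.0, 3.0], [1800.0, 4.0]])   # large amounts at odd hours

    # Learn the customer's "typical" behavior from historical transactions.
    model = IsolationForest(contamination=0.01, random_state=0).fit(normal)

    # predict() returns -1 for anomalies, 1 for inliers.
    for tx, label in zip(suspect, model.predict(suspect)):
        status = "FLAG for review" if label == -1 else "ok"
        print(f"amount={tx[0]:.0f}, hour={tx[1]:.0f} -> {status}")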

17 Challenges in Analyzing Big Data

The fundamental issue is that most firms cannot keep up with the available data and data sources. Big data has created several challenges in collecting and storing the many streaming data sources correctly for analysis. Much of the big data technology in use is obsolete, and sometimes even the tools cannot provide a satisfactory solution, so organizations need to upgrade or replace their existing systems. Some of the significant challenges in analyzing Big Data are as follows:
– Lack of data-science skills: There is a substantial skill shortage in the data-scientist community, and minimizing this gap is a considerable challenge. Educating people on using big data analytics is also an issue, and many other technical issues require addressing, so closing this gap will take time.
– Lack of proper data visualization: Interesting and relevant data is often disregarded when it is mixed with ordinary or irrelevant discoveries. In other cases, team members, and even seasoned data scientists, fail to present data in a meaningful and visually appealing manner due to a lack of skill; consequently, they may ignore or miss the most relevant and meaningful data.
– Lack of proper data transformation: Extracting insight or value from data demands proper transformation. Since data sizes are very large and data formats are not fixed, proper and correct transformation is a big challenge for data engineers, who are responsible for converting the data into an analytics-ready form that the analytics team can use. Data engineers often must depend on rudimentary and code-heavy technologies during this transformation process, so transforming data as required can become a significant challenge.

18 Big Data Quality Dimensions

The study of data quality in big data systems is still in its infancy. Most research on big data quality acknowledges the relevance of standard dimensions in measuring it. Some critical quality dimensions of big data are accessibility, confidentiality, redundancy, volume, etc. Table 2 presents the most critical quality dimensions of Big Data and their purpose.

Table 2 Critical quality dimensions of Big Data and their purpose

Accessibility: Accessibility and availability denote a person's ability to obtain the data, given his or her physical status and the available technology.
Confidentiality: This quality factor determines whether the correct data is in the hands of the correct people. Is the information safe?
Pedigree: This dimension aids in determining the data's source, allowing any inconsistencies to be rectified at the source rather than elsewhere.
Readability: Also known as clarity, simplicity, ease of understanding, interpretability, and comprehensibility, this dimension relates to the consumers' ability to grasp and understand the data.
Redundancy: Redundancy, minimality, compactness, and conciseness refer to the capacity to portray a reality of interest with the least amount of information resources.
Volume: This quality dimension gives the proportion of values present in the examined data object with respect to the source from which it is derived.
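
As a small illustration of how such dimensions can be operationalized, the following Python sketch scores the volume (completeness) dimension of a hypothetical data set as the proportion of non-missing values; the metric and data are illustrative, not a standard prescribed by the chapter's sources.

    import pandas as pd
    import numpy as np

    # Hypothetical extract with some missing readings.
    df = pd.DataFrame({
        "device_id": ["d1", "d2", "d3", "d4"],
        "temp_c":    [21.5, np.nan, 19.8, np.nan],
        "humidity":  [40.0, 38.5, np.nan, 41.2],
    })

    # Volume/completeness per column: share of values actually present.
    print(df.notna().mean())   # e.g., temp_c -> 0.50, humidity -> 0.75

    # An overall score across the whole data object.
    print(f"overall completeness: {df.notna().to_numpy().mean():.2f}")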

19 Conclusion

Big Data Analytics plays a vital role in today's world. All businesses carry vast amounts of data, which can be used to uplift their future growth with the help of Big Data Analytics and its tools. Big Data Analytics helps a company predict future trends from past data using the Hadoop ecosystem, which eventually enhances the organization's profitability. The discussion presented in this chapter gives a clear insight into Big Data analysis and the critical differences between Big
data analysis and business intelligence analysis. It explains the analytical life cycle
of big data. The skills required to deal with big data analysis are highlighted, and the big data domain is depicted. We have also discussed how big data analytics can be exploited in decision-making by different industries, such as company recruitment/HR, the oil and gas sector, healthcare, and sentiment analysis. Some significant challenges in big data analytics, such as the lack of proper skills, issues during the data transformation process, and big data quality dimensions, are also discussed.

References

1. Lazer, D., Radford, J.: Data ex machina: introduction to big data. Ann. Rev. Sociol. 43, 19–39
(2017)
2. Kitchin, R., Lauriault, T.P.: Small data in the era of big data. GeoJournal 80(4), 463–475 (2015)
3. Fernández, A., del Río, S., Chawla, N.V., Herrera, F.: An insight into imbalanced big data
classification: outcomes and challenges. Complex Intell. Syst. 3(2), 105–120 (2017)
4. Géczy, P.: Big data characteristics. Macro Theme Rev. 3(6), 94–104 (2014)
5. Ansari, S., Mohanlal, R., Poncela, J., Ansari, A., Mohanlal, K.: Importance of big data. In:
Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence,
pp. 1–19. IGI Global (2015)
6. Fan, J., Han, F., Liu, H.: Challenges of big data analysis. Natl. Sci. Rev. 1(2), 293–314 (2014)
7. Al Nuaimi, E., Al Neyadi, H., Mohamed, N., Al-Jaroodi, J.: Applications of big data to smart
cities. J. Internet Serv. Appl. 6(1), 1–15 (2015)
8. Aloysius, J.A., Hoehle, H., Goodarzi, S., Venkatesh, V.: Big data initiatives in retail environ-
ments: linking service process perceptions to shopping outcomes. Ann. Oper. Res. 270(1),
25–51 (2018)

9. Verma, J.P., Patel, B., Patel, A.: Big data analysis: recommendation system with Hadoop
framework. In: 2015 IEEE International Conference on Computational Intelligence &
Communication Technology, pp. 92–97. IEEE (2015)
10. Rizwan, P., Suresh, K., Babu, M.R.: Real-time smart traffic management system for smart
cities by using Internet of Things and big data. In: 2016 International Conference on Emerging
Technological Trends (ICETT), pp. 1–7. IEEE (2016)
11. Fathi, F., Abghour, N., Ouzzif, M.: From big data to better behavior in self-driving cars. In:
Proceedings of the 2018 2nd International Conference on Cloud and Big Data Computing,
pp. 42–46 (2018)
12. Daniel, B.K.: Big data in higher education: the big picture. In: Big Data and Learning Analytics
in Higher Education, pp. 19–28. Springer (2017)
13. Mahapatra, S., Singh, A.: Application of IoT-based smart devices in health care using fog
computing. In: Fog Data Analytics for IoT Applications, pp. 263–278. Springer (2020)
14. Singh, A., Mahapatra, S.: Network-based applications of multimedia big data computing in IoT
environment. In: Multimedia Big Data Computing for IoT Applications, pp. 435–452. Springer
(2020)
15. Kannan, S., Karuppusamy, S., Nedunchezhian, A., Venkateshan, P., Wang, P., Bojja, N.,
Kejariwal, A.: Chapter 3 - Big data analytics for social media. In: Buyya, R., Calheiros, R.N.,
Dastjerdi, A.V. (eds.) Big Data, pp. 63–94. Morgan Kaufmann (2016)
16. Tsai, C.W., Lai, C.F., Chao, H.C., Vasilakos, A.V.: Big data analytics: a survey. J. Big Data
2(1), 1–32 (2015)
17. Landset, S., Khoshgoftaar, T.M., Richter, A.N., Hasanin, T.: A survey of open source tools for
machine learning with big data in the Hadoop ecosystem. J. Big Data 2(1), 1–36 (2015)
18. Monteith, J.Y., McGregor, J.D., Ingram, J.E.: Hadoop and its evolving ecosystem. In: 5th
International Workshop on Software Ecosystems (IWSECO 2013), vol. 50, p. 74. Citeseer
(2013)
19. Goyal, L., Arora, N.: Deep transfer learning approach for detection of COVID-19 from chest
X-ray images. Int. J. Comput. Appl. 975, 8887 (2020)
20. Kakde, A., Sharma, D., Arora, N.: Optimal classification of COVID-19: a transfer learning
approach. Int. J. Comput. Appl. 176(20), 25–31 (2020)
21. Datta, G., Joshi, N., Gupta, K.: Empirical analysis of performance of MT systems and its metrics
for English to Bengali: a black box-based approach. In: Intelligent Systems, Technologies and
Applications, pp. 357–371. Springer (2021)
22. Sharma, A., Tiwari, S., Arora, N., Sharma, S.C.: Introduction to blockchain. In: Blockchain
Applications in IoT Ecosystem, pp. 1–14. Springer (2021)
DCD_PREDICT: Using Big Data
on Prediction for Chest Diseases
by Applying Machine Learning
Algorithms

Umesh Kulkarni, Sushopti Gawade, Hemant Palivela, and Vikrant Agaskar

Abstract Machine learning algorithms are frequently utilized in the medical
domain to predict disorders. By providing reference guidelines, they assist in the
real-world diagnosis of diseases. The DCD-PREDICT system employs machine
learning to make predictive diagnoses of diseases of the chest, including lung
cancer, asthma, COPD, pneumonia, and tuberculosis. A questionnaire will be
provided to each participant (self-administered and physician-administered).
Sensitivity, specificity, and positive and negative predictive values will be
computed for each question, and the combined patient scores will be contrasted
with those of controls. The agreement between the physician-administered and
self-administered questionnaires will also be determined. This enables medical
professionals to make a better differential diagnosis earlier, lowering errors and
delivering timely treatment. Heart disease is one of the leading causes of death.
Because real-world practitioners may lack the necessary knowledge, expertise, or
experience regarding the signs of heart failure, it is challenging to diagnose the
disease. Therefore, computer-based prediction of cardiac illness may be crucial
both as an early diagnosis prompting the appropriate actions and as a perspective
on recovery. By choosing the right data mining classification algorithm, the early
stages of the disease and its recurrence can be accurately predicted. The aim of
this study was to compare three of the most common classification methods,
Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Artificial
Neural Networks (ANN), for heart disease prediction using the standard Cleveland
cardiology dataset.

U. Kulkarni (B)
Vidyalankar Institute of Technology Wadala, Mumbai, Maharashtra, India
e-mail: umesh.kulkarni@vit.edu.in
S. Gawade
Pillai College of Engineering, Panvel, India
e-mail: sgawade@mes.ac.in
H. Palivela
Manager-AI, Accenture Solutions, Mumbai, Maharashtra, India
V. Agaskar
Vidyavardhani College of Engineering and Technology, Vasai Road, Vasai-Virar, Maharashtra,
India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
V. Rishiwal et al. (eds.), Towards the Integration of IoT, Cloud and Big Data,
Studies in Big Data 137, https://doi.org/10.1007/978-981-99-6034-7_2

Keywords Prediction · Classification (SVM · KNN) · Machine learning · Artificial neural network · Heart diseases

1 Introduction

1.1 Introduction

Numerous disorders related to the chest affect people. Diseases like asthma, COPD,
pneumonia, and tuberculosis have symptoms that indicate their presence. These
symptoms, which can occur in a number of settings while people are going about
their regular lives, include shortness of breath, chest congestion, and throat and
chest coughs, among others. In order to identify which chest ailment a person is
experiencing, we plan to use these symptoms and how they present in various human
contexts, such as running, waking up, and other situations. We achieve this by using
a symptom-based questionnaire. The purpose of this activity is to help with the
initial diagnosis of chest problems and to help distinguish between the various
diseases. In our methodology, we employ the idea of symptom-based questionnaires
together with weighted scores for the questions. The initiative is designed to fit
seamlessly into the regular schedule of any nearby doctor’s office, nursing home,
or hospital, programming the computer to recognize and predict the illness the
patient is suffering from. Training is carried out using example datasets that include
survey-style inputs. Test datasets are available in the UCI repository, the California
Health and Human Services (CHHS) dataset, and data from the esteemed National
Institute of Tuberculosis and Respiratory Diseases.

1.2 Background

Heart disease has a high worldwide mortality rate. Prediction and diagnosis of
heart disease have become a difficult task for specialists and hospitals both in
India and abroad. The heart disease prediction system is a system that aids in the
prediction of heart disease, specifically cardiovascular conditions such as
myocardial infarction. In this field, data mining and machine learning algorithms
are critical. Researchers are accelerating their work to develop a graphical user
interface and machine learning algorithms that can assist doctors in making
decisions regarding the prediction and diagnosis of heart disease. This project’s
main output is predicting a patient’s heart disease using machine learning
algorithms. A comparative study is carried out, with the performance calculated
using machine learning procedures.

1.3 Objective

The primary objective of this project is to utilize machine learning algorithms
to predict the presence of heart disease in a person. Using the existing data,
analysis is done to determine the presence of characteristics in various types
of individuals which indicate vulnerability to heart disease. ML algorithms are
applied to the data to calculate the probability of a person having heart disease in
the future. This data is centered on the various functioning parameters of the heart.
Our project focuses on reducing effort and time while increasing efficiency and
accuracy in prediction.

2 Literature Survey

Machine learning is a fast-growing field, and we aim to utilize its potential to
create this artificial intelligence system. Having a vast range of applications, this
system will be used by doctors and patients once all the elements are implemented.
In practice, doctors diagnose disease through a large number of tests, which require
long processing times and depend on skilled expertise that may be lacking [1]. It is
difficult to extract important data in the form of knowledge; hence, it is crucial to
use techniques such as data mining and machine learning methods. Further, extracting
important data from such medical data repositories becomes feasible when using
methods like classification, clustering, regression, and prediction [2]. The primary
focus of the paper is to examine data mining classification techniques for heart
disease prediction in its early stages. Likewise, by using computer-based prediction,
it will not be difficult to predict heart disease at an early stage [3]. KNN (k-nearest
neighbors), ANN (artificial neural network), and SVM (support vector machine) are
some of the techniques typically employed, and a comparative study for our
proposed project and its predictions is carried out using the Cleveland heart
disease dataset [4].

2.1 Summary

Early detection and treatment options exist for heart disorders. Using the method
described above, we may determine whether a patient has heart disease based on their
numerous symptoms. In this instance, SVM and random forest classifiers provide
the most accurate predictions. We are unable to anticipate the many types of heart
disorders with a high degree of accuracy due to the lack of abundant data, but we can
identify heart disease with a respectable accuracy of roughly 80 to 85%. When
sufficient data is available, it will be possible to design methods for disease
diagnosis that are more accurate. Three data mining modeling strategies are used to
construct a model of the prediction system for chest diseases. The process retrieves
hidden information from historical records of chest disease. The models are built
and accessed using the DMX query language and functions. A test dataset is used
to train and validate the models. Methods like the lift chart and classification
matrix are used to measure how well the models work. As a consequence of the
predicted state, each of the three models is capable of extracting patterns.
Neural networks and decision trees appear to be the best models for predicting
individuals with chest disease. The objectives are assessed in comparison with the
trained models. Each of the three models has its own benefits concerning ease of
model interpretation, availability of detailed information, and precision in
answering complex queries. This framework can be improved and extended further.
It may also incorporate additional data mining strategies, for example, association
rules and time series. The use of continuous data is an alternative to categorical
data. Another subject is to mine the enormous amount of unstructured information
present in health care databases using text mining.
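
As a minimal illustration of the kind of model evaluation mentioned above, the
Python sketch below computes a classification (confusion) matrix and accuracy for
a classifier’s predictions using scikit-learn; the label arrays are hypothetical
placeholders, not data from the studies discussed here.

from sklearn.metrics import confusion_matrix, accuracy_score

# Hypothetical ground-truth labels and model predictions
# (1 = disease present, 0 = disease absent).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print("Classification matrix:\n", cm)
print("Accuracy:", accuracy_score(y_true, y_pred))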

3 System Design

3.1 Existing System

A large number of people suffer from chest-related diseases, and several die from
chest conditions. This is often because they are diagnosed much later, after the
conditions have progressed and it becomes difficult to solve the problem. In
addition, chest diseases are often misdiagnosed for one another: a patient with
asthma may be told he has COPD and vice versa. This leads to adverse effects, as
the wrong treatment is given to the patient. Therefore, there is a need to build an
easy system to aid doctors in preliminary decision making, and to empower the
patient with a tool that helps him understand his condition better and take
appropriate measures by talking to the correct doctor.
The existing approach is mainly focused on Knowledge Discovery in Databases (KDD),
which is the primary proposal from which mashup candidates are identified by
addressing a repository of open services. In this methodology, there is a
personalized development of software, which can be used to produce new software
based on service integration methods. KDS defines service integration qualification
by discovering different phases of web service specifications.
The process that is being used here intersects the fields of data mashup and service
mashup. This idea of obtaining information from web service offerings is comparable
to the well-established KDD approaches. The representations of data integration and
service mashup are discussed in this work, along with cutting-edge techniques for
the fundamental KDS domains of comparison processing, grouping, filtering, and
so on.

3.2 Identification of Common Risks

Heart disease risk factors include:

1. High blood pressure.
2. Abnormal blood lipids.
3. Use of tobacco.
4. Obesity.
5. Physical inactivity.
6. Diabetes.
7. Age.
8. Gender.
9. Family history.

Data mining can be one of the methods to automatically extract knowledge from
huge data sets [3].

3.3 Types of Heart Diseases

The heart diseases identified are [2]:

1. Coronary heart disease.
2. Cardiomyopathy.
3. Cardiovascular disease.
4. Ischemic heart disease.
5. Heart failure.
6. Hypertensive heart disease.
7. Inflammatory heart disease.
8. Valvular heart disease.
Therefore, one of the simplest ways to empower patients as well as doctors for
early diagnosis is through a simple symptom-based questionnaire. This
questionnaire is a simple tool that covers the different symptoms faced by the
patient, such as chest congestion, wheezing, symptoms from the throat and chest,
shortness of breath, etc. Using these symptoms across a wide variety of scenarios
in the regular lives of patients, compared with the regular lives of people with no
chest conditions and no related symptoms, we are able to estimate the probability
of a particular chest disease occurring and are able to tell which of the many chest
diseases the patient might be suffering from out of a set of diseases. To be able to
build such a scalable tool, we need indicators that have been well researched by
medical researchers.
There are a number of reputed published questionnaires for the identification
of asthma, diagnosis of COPD, etc. We aim to combine these questionnaires into a
single tool and adjust the weights assigned to their questions by training the
machine with both patient and control data. These questions have yes or no inputs,
or they have a spectrum of inputs that indicate the extent of the symptom from 1
to 4. While the patient enters the answers to the questionnaire, there is an initial
weight assigned to each question that determines and calculates the percentage
chance of the disease. This weight changes as we train our machine with more
patient and control data for both Western and Indian conditions. As we get more
training data, the diagnosis of the system becomes more and more precise. To test
the working of the system, there will be extensive use of UCI datasets and CHHS
datasets.
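
As a rough, minimal sketch of how such question weights might be adjusted from
labeled data, the Python/TensorFlow snippet below fits a single sigmoid unit whose
learned coefficients play the role of question weights; the tiny answer matrix and
labels are hypothetical placeholders, not real patient or control records.

import tensorflow as tf

# Hypothetical toy data: each row is one questionnaire with
# answers rescaled to 0-1; label 1 = patient, 0 = healthy control.
X = tf.constant([[1.0, 0.25, 0.75],
                 [0.9, 0.50, 1.00],
                 [0.1, 0.00, 0.25],
                 [0.0, 0.25, 0.00]], dtype=tf.float32)
y = tf.constant([[1.0], [1.0], [0.0], [0.0]])

# One sigmoid unit: its kernel weights act as question weights.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=200, verbose=0)

print("Adjusted question weights:",
      model.layers[0].get_weights()[0].flatten())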

3.4 Problem Statement

Current systems utilize a large amount of medical data taken from tests that
determine the nature of the chest disease. These tests are costly, not scalable in
nature, and require advanced medical professionals. To overcome the problems of
existing systems, in the proposed system users are not required to search for data
in various repositories with special features. Users need only provide the
information that is required to be collected. Users can simply type a combination
of queries, and based on user behavior analysis, the exact data will be predicted.
Moreover, over the years, medical researchers have compiled this medical data into
symptom-based questionnaires which can be used to determine these conditions.

3.5 Scope

The objective of the project is to recognize the primary symptoms of chest
diseases, the distinguishing features of these ailments. Our project utilizes the
idea of symptom-based forms and weighted scores for their entries. The project is
planned to be incorporated into the everyday activities of any nearby doctor,
nursing home, or medical clinic.

3.6 Proposed System

Currently, systems utilize a large amount of medical data taken from tests that
determine the nature of the chest disease. These tests are high-priced, not scalable
in nature, and require advanced medical professionals. To overcome the problems of
the existing system, in the proposed system such data is stored in various
repositories with special features. The user needs to provide only the information
which is required to be collected. Users can simply type a combination of queries
and, based on the user’s behavior analysis, the exact data will be predicted. Over
time, medical researchers have synthesized this medical data to provide us with
symptom-based questionnaires that people can use to detect these diseases.
However, when used in small clinical studies with sparse patient and control
information, these questionnaires have drawbacks. To validate and deploy these
symptom-based questionnaires for the general public, a machine learning system
that makes use of large amounts of patient and control data is needed. In order to
reliably and quickly identify which chest condition the patient has, we intend to
combine a number of these symptom-based questionnaires with information from
actual case studies. Two categories of data are required: data from patients
(patients with chest ailments and their symptoms) and control data from healthy
groups without chest issues. Consolidating these datasets will produce weighted
scores for each question on the form, which will permit us to recognize which kind
of chest illness the patient has. To train the machine, our new system intends to
use supervised machine learning algorithms and built-in Python libraries.
1. Questionnaire Generation and Machine Training
At the beginning, we generate a global questionnaire based on the different
questionnaires from medical researchers. These questions will have standard
weightage scores assigned at the beginning. Once the questionnaire has been
established with its standardized scores, the machine will be trained using
TensorFlow, taking into consideration patient data and control data associated
with these symptoms, which will adjust the weightage of the scores.
2. Patient Input
The patient inputs the answers to the questions using simple yes or no, or
multiple-choice values ranging from 1 to 4 indicating the extent of the symptom,
and also indicates the daily situations in which the symptoms occur.
3. Disease Probability Calculation
Based on the entered answers, the percentage chance of the illness is calculated
and displayed for the user to see; a minimal sketch of this scoring step follows
after this list.
4. Graph Generation
Based on the inputs and the probability, the system will also generate comparison
graphs with respect to other diseases and other patients.
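
To make the scoring step concrete, here is a minimal Python sketch of how weighted
questionnaire answers could be turned into a percentage chance; the weights and
answers below are hypothetical placeholders, since the real weights would come
from training.

# Minimal sketch of the weighted-questionnaire scoring step.
def disease_probability(answers, weights):
    """Combine answers (yes/no as 0/1, or 1-4 extents rescaled
    to the 0-1 range) with per-question weights into a
    percentage chance of the disease."""
    score = sum(w * a for w, a in zip(weights, answers))
    max_score = sum(weights)  # score if every symptom is maximal
    return 100.0 * score / max_score

# Example: three questions with hypothetical initial weights
# and a patient's rescaled responses.
weights = [0.5, 0.3, 0.2]
answers = [1.0, 0.25, 0.75]
print("Chance of disease: %.1f%%" % disease_probability(answers, weights))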

4 Methodology

4.1 Supervised Learning

There are various methods used for classification; the main categories are as
follows:

(i) Supervised Learning Model.
(ii) Unsupervised Learning Model.

Here we are going to focus on the supervised methodology, mainly on the following
models:
(i) Support Vector Machine (SVM)
(ii) K-Nearest Neighbors (KNN)
(iii) Artificial Neural Network (ANN).

(a) Support Vector Machine (SVM) [16]

The Support Vector Machine (SVM) is a supervised learning model in which data are
represented as points in finite-dimensional vector spaces, where each dimension
signifies a specific property of an object; it has been shown that SVM works
admirably for tackling high-dimensional problems. Due to its computational ability
on large datasets, SVM is often used in document classification, sentiment
analysis, and prediction-based tasks [16].
(b) K-Nearest Neighbors (KNN) [16]
K-Nearest Neighbors (KNN) is another supervised learning model in which test data
are quickly classified using the training samples. In KNN, the majority vote of an
item’s nearest neighbors decides its classification. Distance metrics, which can be
as simple as Euclidean distance, are used to predict the class of a new sample. In
the working steps of KNN, k (the number of nearest neighbors) is first determined.
The test data is then given a class label based on the result of the majority
voting [16].
(c) Artificial Neural Network (ANN)
The Artificial Neural Network (ANN) is a supervised learning technique that
contains three layers: input, hidden, and output. The connections between the
input units, the hidden units, and the output are determined by the relevance of
the weight assigned to each particular input unit. In general, significance
increases with increasing weight. ANN can utilize both linear and sigmoid transfer
(activation) functions. ANNs may be trained to deal with huge volumes of data.
The most popular learning algorithm for multi-layer feed-forward ANNs is
backpropagation. For ANN, the data records should be split into three sub-datasets
for training, validation, and testing.
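
As a sketch of how such a three-way comparison might be run in Python with
scikit-learn, the snippet below trains and scores all three models; the file name
cleveland.csv and the "target" column are hypothetical assumptions about how the
Cleveland data has been exported, and MLPClassifier stands in for the ANN.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Hypothetical loading step: a CSV export of the Cleveland
# dataset with a binary "target" column (1 = disease).
df = pd.read_csv("cleveland.csv")
X, y = df.drop(columns=["target"]), df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Feature scaling matters for SVM, KNN, and ANN alike.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "SVM": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "ANN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, model.predict(X_test)))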

4.2 Symptom-Based Questionnaire

Symptom-based questionnaires are required for the following chest-related diseases:
. Asthma.
. COPD.
. Pneumonia.
. Tuberculosis.
DCD_PREDICT: Using Big Data on Prediction for Chest Diseases … 27

4.3 Dataset Training and Testing

. Datasets required for this purpose can be obtained from the UCI database, the
CHHS database, and the datasets obtained from the National Institute of
Tuberculosis and Respiratory Diseases (India).
. An ML training service like TensorFlow may be used to train the system based
on the dataset selected.
. A cloud ML service like Azure ML or Amazon ML may be used to verify and
double-check the training.

Working of the system:


. The user chooses one of the diseases listed.
. Data is collected based on the survey.
. Based on the input, a percentage chance of the illness occurring is computed.
. Graphs are generated indicating the relationship with other diseases, as
sketched below.
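
The graph step at the end of this flow could be produced along the following lines
with matplotlib; the disease names match the questionnaire targets, while the
percentages are hypothetical outputs.

import matplotlib.pyplot as plt

# Hypothetical per-disease probabilities from the questionnaire.
diseases = ["Asthma", "COPD", "Pneumonia", "Tuberculosis"]
chances = [62.0, 21.0, 9.0, 8.0]

plt.bar(diseases, chances)
plt.ylabel("Chance of disease (%)")
plt.title("DCD_PREDICT comparison graph")
plt.show()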

5 Process and Analysis

5.1 General Process

The Agile process model was employed. Agile is a practice from software
engineering that attempts to speed up delivery and increase transparency. Agile is
ordinarily a time-boxed, iterative approach to software delivery that builds
software incrementally from the start, as opposed to holding on until the end to
deliver the project as a whole. Agile methodologies frequently work by separating
projects into small pieces of user functionality known as user stories,
prioritizing them, and then consistently delivering them in short iterations of
about two weeks.
(i) Probability Generation: Here, as per the input given by the user, this block
gives the probability of chest disease and the category in which it falls, as
shown in Fig. 1.
(ii) Graph Calculation: Here, depending upon the category, a definite path for
working out the course of treatment is expected, as shown in Fig. 1.

Fig. 1 General Process Diagram [self-prepared as required for the project]

5.2 Use Case Diagram

The general process will be as given below:

(i) User Choice: Here the system is expected to capture the choice of the user,
from which a classification of the chest diseases can be identified for
analysis, as shown in Fig. 2.
(ii) User Input Data: Here the user is expected to give data which may be used for
further calculation, as shown at data flow level 0 in Fig. 3.
(iii) User Input Data: Here the user is expected to give data which may be used for
further calculation, as shown at data flow level 1 in Fig. 4, where a
processing and training phase is worked out, which in turn helps in preparing
a probability output for the graph calculation model.

5.3 Data Flow Diagram

See Figs. 3 and 4.

5.4 System Flow

The working of the flow is shown in Fig. 5.


1. Start.
2. Collect general information of the patient.
Another random document with
no related content on Scribd:
secured. It is expected that in time the national bank notes will go out
of existence altogether, their place being taken by these federal
reserve bank notes.
Since their establishment in 1913 the work of Value of the Federal
these federal reserve banks has been of great Reserve system.
value. They have enabled the banking operations of the country to
expand and contract in accordance with changes in business
conditions, thus obviating serious danger of financial panics. In
helping the government to float the various Liberty Loans they
rendered great service. There is no doubt that the system has
improved and strengthened the banking facilities of the country.[206]
This will appear more clearly when the relations of banking and
credit are discussed a few pages further on.
The Practical Operations of Banking.— Commercial and
There are some elementary things connected savings banks
with the practical operations of banking which distinguished.
everyone ought to know. Generally speaking, there are two kinds of
banks, commercial banks and savings banks; or, in some cases the
same bank may have two departments, a commercial department
and a savings department. Both commercial and savings banks
receive deposits; the former may or may not pay interest according
to the amount of the deposit and the length of time it is left in the
bank; the latter always pay interest if money is left on deposit a
prescribed length of time. When money is deposited in a commercial
bank the depositor is said to have an “account” and he may issue
checks up to the amount of his deposit. A check Bank checks.
is an order, addressed to the bank, and calling
for the payment of a designated sum. This check may be cashed at
the bank on which it is drawn, or the person who receives it may
have it cashed at the bank where he has his account. Banks cash
checks for their own customers no matter what bank the checks
happen to be drawn upon.
One result of this is that every bank at the close of each day’s
business will have on hand a large number of checks drawn against
other banks. It receives payment on these The clearing house
checks through the medium of the “clearing system.
house”, an institution which is maintained by the banks in every large
city. To the nearest clearing house a clerk takes each morning all the
checks on other banks that have come in during the previous day.
These are sorted out and exchanged for checks drawn on the bank
itself which are held by other banks. Whatever difference there
happens to be is paid in cash.
When any person desires to borrow money How bank loans are
from a bank he gives his note, which is a made.
promise to repay the bank at a designated time. The bank may ask
the borrower to obtain an endorsement upon his note, that is, to
have some responsible person put his name on the back of it, which
means that the endorser assumes liability for the amount of the note
if it is not paid by the maker on time. Or the bank may ask the
borrower to deposit “collateral” as security for the payment of the
note. This collateral may be in the form of bonds, stocks, mortgages,
or any other intangible property that has sufficient value. The bank
holds this collateral until the loan is repaid.
When a bank lends money and takes a man’s The process of
note, with or without collateral, it is said to “discounting.”
discount the note. It gives the borrower the face value of his note
less the interest, whatever it is, calculated at the current rate. Thus if
the rate is six per cent and the person gives his note for one
thousand dollars payable in six months, the bank would hand him
$970 in money. Business men obtain large sums of money from the
banks by getting their notes discounted; they borrow money in this
way to buy goods and then pay off their notes when the goods are
sold. Such notes are called “commercial paper”.
Now the federal reserve banks help the “Rediscounting.”
member banks by “rediscounting” this
commercial paper for them. Suppose a small bank has loaned on
notes all the money it has to spare. Then it receives applications
from its customers for more loans. What does it do? It takes a bundle
of business men’s notes, or commercial paper, from its vaults and
sends this to the nearest federal reserve bank. The latter does just
what the member bank did in the first instance; it deducts the
discount at current rates and gives the balance to the member bank
in money, that is, in federal reserve notes. The member banks are
enabled, in this way, to loan a great deal more money than would be
the case if there were no way of getting their commercial paper
“rediscounted”.
Drafts or bills of exchange are used to make How the banks
payments at distant points. If a person lives in transfer funds.
San Francisco and wishes to pay a small bill in New York, he will
probably go to the post office and buy a postal money order; but if
the amount is large, he may find it more convenient and cheaper to
go to a bank in San Francisco and buy a draft on some New York
bank. This draft he then sends to New York in payment of his bill. A
draft payable in a foreign country is usually called a bill of exchange.
From any American bank one can buy a bill of exchange payable in
Paris, Madras, Hong Kong, or elsewhere. When the money of one
country is worth more than that of another, as is the case throughout
the world at the present time, allowance is made for this difference.
Bills of exchange are “cleared” through the great clearing houses in
London or New York, and any balances are paid by the shipment of
gold.
The Credit System
What is Credit?—Credit is simply the giving The five chief
and taking of promises in place of money. The instruments of
most common form is “book credit”, which credit.
means that wholesalers and retail merchants give out goods with
nothing but charge accounts on their books to show for it. These
accounts are merely the records of credit which has been extended
to customers. But in many transactions something more than a book
record is desired, in which case the person giving the credit may ask
for a “promissory note”. This is a written promise to pay a designated
sum either on demand or at a definite date. Bank checks are also
instruments of credit; so are drafts and bills of exchange. Anything
that expresses or implies a promise to pay a sum of money is an
evidence of credit.

THE RELATION OF MONEY AND PRICES


The general relation between the amount of money
in circulation and the course of prices is shown by the
two statistical diagrams on the other side of this page.
It will be noticed that per capita circulation began to
decline in 1921. Prices also commenced to fall during
that year, and if the table of prices were extended to
cover the last year or two it would show the price-lines
moving downward. The data for continuing the lines of
the lower diagram may be found in the publications of
the United States Bureau of Labor Statistics.
MONEY IN CIRCULATION PER CAPITA (Figures for first day
of month)
COURSE OF WHOLESALE AND RETAIL PRICES[207] IN
THE UNITED STATES
JANUARY, 1913, TO MAY, 1920

[Average Prices, 1913 = 100]

The Relation of Credit to Money.—A large part of the world’s


business is done on credit. If all debts had to be paid tomorrow, there
would not be enough money in the world to pay one cent on the
dollar. But all debts do not fall due at once, and a huge credit system
is able to stand with comparative safety upon a relatively small
amount of gold. There is a limit, however, to the There is a limit to
expansion of credit and this limit is roughly the expansion of
determined by the amount of gold available to credit.
be held as a reserve. Hence it is that when the volume of gold
increases, credit usually expands also. With their reserves full to
overflowing the banks are more ready to lend money on notes, and
the rate of discount goes down. Conversely, as the volume of gold
declines, credit usually contracts. The rate of discount then goes up
and business men find it harder to borrow money upon commercial
paper. In the one case we speak of an inflation or expansion in
money and credit; in the other we speak of a contraction or deflation.
Credit and Prices.—The general level of prices depends upon the
value of money. The price of a thing is merely its value expressed in
terms of money. To say that prices have gone up is to say exactly the
same thing as that the value of money has gone down.[208] The
general level of prices, to put the matter in another way, is
determined by the demand for goods on the one hand and the
supply of goods on the other. The demand for goods, however, is
represented by the amount of gold currency available plus the
amount of credit which is built upon this gold. The credit, as has
been seen, bears a definite relation to the gold. How the general
Hence it can fairly be said that the amount of level of prices is
gold is an index of demand for goods or determined.
services. So, if the supply of goods remains approximately the same,
any large increase in the available amount of gold would send prices
up; and conversely, if the supply of goods is greatly increased, while
the available amount of gold remains approximately the same, prices
would go down.
This is the so-called quantity theory of the The quantity theory
relation between money, credit, and prices and it of money and
holds good in a general way although it does not prices.
work out as simply as it reads. The adjustment of supply and
demand sometimes takes place very slowly. The volume of credit
which can be built upon a given reserve of gold is not absolutely
fixed, moreover; in some circumstances it may be more extensive
than in others. During the World War, for Defects of the
example, credit ran away from the gold reserve quantity theory as
in all the European countries. Enormous shown by recent
amounts of paper money were issued with very experience.
little gold in reserve to protect them. Due to reduced production, the
supply of ordinary goods sharply declined. A combination of these
two things, inflation of credit (i. e., potential demand for goods) and
decreased production, sent prices sky-high.[209] In the United States
credit was also inflated during the war and prices went up, though
not to the same extent as in Europe. Since 1920 the process of
“deflating” credit has been going on. This process of deflation is
guided by the federal reserve banks, which are able to contract the
volume of credit by charging higher rates for rediscount.
The Advantages and Dangers of Credit.—It is probable that at
least two-thirds of the buying and selling in the world is done on
credit. Nearly all large transactions are put through by the use of
credit for short or long terms. Credit affords many advantages to
modern industry and commerce; without it, indeed, our whole
economic system would break down. A few of Four functions
these advantages may be mentioned: (a) It which credit
economizes the use of gold and silver, by doing permits.
away with the necessity of passing gold and silver coin from hand to
hand at every transaction. (b) It enables large payments to be made
at distant points without an actual shipment of metallic money. (c) It
permits men to engage in business operations beyond their own
means by borrowing capital and using it productively. (d) It enables
people to invest their savings (by depositing in savings banks,
lending money on mortgages, buying bonds or stocks, etc.) so as to
secure a profitable rate of interest without great risk.
But there are also some disadvantages. The Credit may also
credit system often encourages extravagance in harm.
that people are tempted to buy goods which they eventually find it
hard to pay for; it tends to encourage speculation which frequently
results in heavy losses; and it sometimes enables promoters to
obtain capital when there is little or no chance of their being
successful. By strict governmental supervision, however, the
advantages can be retained and most of the dangers eliminated.
The Stock Exchange.—A word should be said about the place
where instruments of credit are most commonly bought and sold,
namely, the stock exchange. As its name implies, this is a market in
which men buy and sell stocks, bonds, and other securities.[210]
There is a stock exchange in every large city. The buying and selling
is done through brokers, who are members of the exchange and who
receive a small commission for their work, this commission being
paid by the persons for whom they buy or sell. A broker, at your
request, will buy or sell on the exchange any security that is listed
there. The amount of the purchase may be paid in full, or, if the
buyer desires, a partial payment of five, ten, or twenty per cent may
be made. This is called “buying on margin”. The Trading on margin.
current prices of all securities are kept posted on
the exchange; they go up and down from day to day in keeping with
market conditions. Shrewd investors try to buy when prices are low
and to sell when prices are high, but in this they are not always
successful. Many fortunes have been made—and lost—on the stock
exchange.
General References
F. W. Taussig, Principles of Economics, Vol. I, pp. 227-235 (Coinage); 236-251
(Quantity of Money and Prices); 265-273 (Bimetallism); 348-359 (Banking
Operations); 375-385 (The Banking System of the United States);
Isaac Lippincott, Economic Development of the United States, pp. 550-580;
H. R. Burch, American Economic Life, pp. 336-371;
W. A. Scott, Money and Banking, pp. 1-116;
W. S. Jevons, Money and the Mechanism of Exchange, pp. 3-41;
Marshall, Wright, and Field, Materials for the Study of Elementary
Economics, pp. 443-546;
C. J. Bullock, Introduction to the Study of Economics, pp. 224-246;
D. R. Dewey, Financial History of the United States, especially pp. 383-413;
Everett Kimball, National Government of the United States, pp. 460-479;
Horace White, Money and Banking (5th edition), passim;
F. A. Fetter, Modern Economic Problems, pp. 31-163.
Group Problems
1. How money and credit are related to prices. The meaning of “prices”. The
quantity theory of money. Relation of money to credit. Reserves for paper money.
Bank reserves. Discounting and rediscounting. How inflation of money and credit
affects prices. Index numbers. American experience during the years 1914-1921.
References: F. W. Taussig, Principles of Economics, Vol. I, pp. 427-445; C. J.
Bullock, Introduction to the Study of Economics, pp. 242-278; Irving Fisher,
The Purchasing Power of Money, pp. 8-32; Ibid., Stabilizing the Dollar, pp. 1-12; J.
A. Hobson, Gold, Prices, and Wages, passim; David Kinley, Money, pp. 199-223.
2. The American banking system: how it is organized and how it functions. D.
R. Dewey, Financial History of the United States, pp. 320-328; 383-390; F. W.
Taussig, Principles of Economics, Vol. I, pp. 375-399; C. F. Dunbar, Theory and
History of Banking, pp. 132-153; C. A. Conant, A History of Modern Banks of
Issue, pp. 396-447; E. W. Kemmerer, The A, B, C of the Federal Reserve System,
pp. 28-65; H. P. Willis, The Federal Reserve System, passim; A. B. Hepburn,
History of the Currency and Coinage of the United States, pp. 411-418; 511-544.
3. The controversy over free silver and its lessons for the future. D. R.
Dewey, Financial History of the United States, pp. 101-104; 210-212; 403-413;
436-437; 468; F. W. Taussig, Principles of Economics, Vol. I, pp. 265-273; J. L.
Laughlin, History of Bimetallism in the United States, especially pp. 266-280.
Short Studies
1. The early history of money. W. S. Jevons, Money and the Mechanism of
Exchange, pp. 19-30; David Kinley, Money, pp. 14-26.
2. The quantity theory of money. F. W. Taussig, Principles of Economics, Vol.
I, pp. 236-251.
3. American and foreign banking systems compared. E. R. A. Seligman,
Principles of Economics, pp. 524-550; or F. W. Taussig, Principles of Economics,
Vol. I, pp. 360-385.
4. Can the dollar be stabilized? Marshall, Wright, and Field, Materials for
the Study of Elementary Economics, pp. 474-483; Irving Fisher, Stabilizing the
Dollar, especially pp. 12-30.
5. The free-silver campaign of 1896. C. A. Beard, Contemporary American
History, pp. 164-198; D. R. Dewey, National Problems, pp. 220-237; 314-328.
6. Banking operations and accounts. C. F. Dunbar, History and Theory of
Banking, pp. 20-38.
7. American institutions for saving and investment. F. A. Fetter, Modern
Economic Problems, pp. 146-166.
8. Financial panics. F. W. Taussig, Principles of Economics, Vol. I, pp. 400-
426.
9. The high cost of living. J. H. Hammond and J. W. Jenks, Great American
Issues, pp. 143-159.
10. Economic crises. T. N. Carver, Principles of National Economy, pp. 427-
442.
Questions
1. Are the qualities of money given in this book in the order of their importance?
If not, rearrange them so. Can you think of any other essential qualities? What
objections would there be to the use of platinum as money? Pearls? Porcelain?
2. Gold dollars are not coined in the United States at all. How is it, then, that the
gold dollar can be the legal standard of value?
3. Name all the different kinds of money that are circulated in the United States
(including paper money) and tell when the issue of each kind was first authorized.
Examine the money you have with you. Tell where each coin was minted. In the
case of bills what is the security behind each? Can you detect counterfeit bills?
How?
4. Why was the action of Congress in demonetizing silver called “the crime of
1873”?
5. At the Democratic National Convention of 1890 Mr. Bryan said: “You shall not
crucify mankind upon a cross of gold.” Explain in full what he meant. Was there
any good reason for believing that the free coinage of silver at a ratio of sixteen to
one would (a) increase prices; (b) give relief to the debtor class; (c) benefit the
wage earner?
6. Explain the process by which, under a dual system of coinage, the metal
which is over-valued at the mint will drive the other out of circulation. Is it correct to
say that “cheap money drives out dear money”?
7. If you were engaged in business as a manufacturer, name all the different
dealings that you might have with a bank.
8. Explain what is meant by each of the following terms: demand note; endorser;
trustee; commercial paper; rate of discount; rediscounting; collateral; deflation;
coupon bond; preferred stock; broker; buying stock on margin.
9. Show how the volume of credit helps to determine prices and how the volume
of credit is related to the amount of gold coin in hand. Why does the quantity
theory of money not work out with mathematical accuracy in practice?
10. Does the argument in McCulloch vs. Maryland impress you as logical? Does
the decision mean that officials of national banks and of federal reserve banks are
exempt from state taxes? Does it mean that when a national bank occupies a
leased building the landlord pays no taxes to the city?
Topics for Debate
1. All banking institutions should be brought under the supervision of the federal
government.
2. A “compensated” dollar (adjusted to the general level of prices) should be
established as a measure of deferred payments.
3. The national and state governments should guarantee depositors against loss
in all banks chartered by the nation and the states, respectively.
CHAPTER XXIII
TAXATION AND PUBLIC FINANCE

The purpose of this chapter is to explain what taxes are, how they are
levied, and how they are spent.

The Cost of Government.—The cost of Taxation per capita.


maintaining the national government and all its
activities is now about four billion dollars per year, in other words
about forty dollars per annum for every man, woman, and child in the
country. The cost of maintaining state and local government varies in
different parts of the country, but it would be safe enough to put it
down as three billion dollars more, or thirty dollars per head. In round
figures, therefore, the average tax payment every year for each
individual in the United States is at least seventy dollars.[211]
Bear in mind, however, that only a small part The extent of the
of the whole population is earning the income burden upon the
which enables these taxes to be paid. When we income-earner.
eliminate all the children, all the women who are not employed in any
income-earning occupation, all the public officials who are paid out of
taxes, all the delinquents, cripples, paupers, unemployed, and so on
—when we subtract all these from the total it will be found that only
one person in five is an actual income-earner. From the earnings of
these twenty million people the entire seven billions in taxes must be
paid; there is no other source from which the taxes can come. A little
mental arithmetic will readily demonstrate, therefore, that every
income-earner in the United States pays, on the average, at least
$350 per year in taxes of one sort or another, in other words about a
dollar a day.[212]
Who Pays the Taxes?—“Oh yes”, someone Everyone is a
will say, “but most people earn small incomes taxpayer, directly or
and pay no taxes at all, or almost none. The indirectly.
heavy taxes are paid by wealthy men and women who own property
and have large incomes.” That is misleading. People who own
property and earn large incomes are the ones who actually hand the
collector his tax-money, to be sure; but they merely give him, for the
most part, money which they have collected from others. The owner
of an apartment house collects taxes from his tenants in the form of
rent; the storekeeper collects taxes in the price of his goods; the
lawyer and the doctor collect taxes when they charge fees. Taxes
are an element in the cost of everything, an element just as certain
as interest, wages, or profit. Everyone who rents a house, buys
goods, or hires any form of service pays taxes. If you analyze the
various items which make up the price of a suit of clothes, for
example, you will find that they usually come in this order of
importance; wages, cost of materials, taxes, profits, interest.[213] The
chief factors which make up the rent of a house are interest, taxes,
and profits in the order named. Hence it is that while landlords,
merchants, manufacturers, and others make the direct payment of
taxes to the government, they in turn pass the burden to tenants and
consumers.[214]
The Incidence of Taxation.—Taxes, The way in which
therefore, do not usually stay where they are taxes are shifted.
levied. They are shifted from one shoulder to another until they
finally reach someone, usually the ultimate consumer, who cannot
unload the burden upon anybody else. This ultimate resting-place of
a tax is called its incidence, and an important thing about any tax is
to discover just what its incidence is; for the justice or injustice of
taxation depends upon the ability of the actual taxpayer to bear the
burden and not upon the wealth of the ostensible taxpayer. If the
government were to levy a tax of one cent per loaf upon bread, there
would be a storm of protest because everybody would recognize it
as a direct tax upon one of the necessities of life. But a tariff duty on
wheat, or a property tax on flour mills or bakeries, is just as certainly
a tax on bread and is paid ultimately by those who buy it. The chief
difference is that in the latter case the payment is made by the
consumer without his knowing it.
Most people pay taxes unknowingly. Their Relation of taxes to
taxes are concealed in rents or prices, and they rents and prices.
complain bitterly that these things are high. It does not occur to the
average American wage-earner that if taxes were lower, rents and
prices would be lower, and that if there were no taxes, it would be
exactly the equivalent to finding every morning, on coming down to
breakfast, a crisp, new dollar-bill on his plate. Demagogues tell us
that trusts, and profiteers, and other forms of organized avarice are
responsible for high prices; but one of the biggest factors in the high-
cost-of-living is the high-cost-of-government.
If this enormous flow from the nation’s If waste were
earnings into the public coffers were wholly, or avoided the tax
even largely, used to promote and encourage burden would be
production, it would not be so bad. Much of it is diminished.
wasted, or spent without adequate return. This takes place because
the people do not keep close watch on the officials whom they elect
to public office and do not hold them to a strict accountability when
public money is squandered. More than a hundred years ago the
most eminent of American jurists, Chief Justice John Marshall,
pointed out that “the power to tax involves the power to destroy”. He
was right; the power to tax is the most far-reaching power that any
government can possess. By the use of the taxing power a
government can take from the people what they would otherwise
save, thus preventing the increase of the nation’s wealth and
ultimately breaking down its prosperity.
How Taxes Differ from other Payments.— Taxes are:
Taxes differ from most other payments in two
respects. First, they are compulsory. No one (a) compulsory.
need pay interest, rent, wages, or prices unless
he bargains to do so; but the payment of taxes is not the result of
any bargain. Taxes are levied without any reference to the initiative
or wishes of the individuals upon whom they may fall, except, of
course, in so far as these individuals by their votes may have an
influence in determining the general taxing policy of the government.
Second, taxes are not payments made to the government by
individuals and corporations in return for (b) levied without
services rendered. The man who rides a reference to service
hundred miles on a railroad pays twice as much rendered.
as one who goes half that distance, because he gets twice as much
for his money. But the man who pays a thousand dollars in taxes
does not get twice as much in benefits from the government as the
one who pays only five hundred dollars.
Nearly all payments that we make are in the The basis of
form of a quid pro quo; they are in proportion to taxation is ability to
the benefits which we receive. This is the case pay.
in payments for all forms of goods or services—the one great
exception is the payment of taxes. Taxes have no direct relation to
benefit; those who pay very little in taxes, either directly or indirectly,
sometimes receive a large return in the form of public services. Take
for example the taxes that support the public schools. The fact that a
wealthy man has no children, or prefers to send his children to a
private school, does not relieve him of the obligation to pay his full
share of what public education costs the community. On the other
hand, a man whose contribution in taxes is very small may send a
dozen children, one after another, through the public schools without
any extra cost.
It would not be possible to base taxation upon Why taxes cannot
service, because there is no way of knowing be adjusted to
how much benefit each individual receives from service.
the government’s work. Do some individuals, for example, obtain
more benefit than others from the maintenance of law and order or
do all derive benefit alike? Who gets the greater benefit from clean
streets, the rich man who drives his motor car over them, or the poor
man whose children use the streets as a playground? Taxes could
not be adjusted to benefit. Even if they could be so proportioned, it
would be unwise to do so. The general interest requires that
everyone should enjoy the benefits of police protection, the public
schools, the parks, the playgrounds whether they are able to pay for
them or not.[215] So taxes are levied in order to pay for these things,
not on a basis of individual benefit, but simply by putting the heaviest
burden in the first instance upon those who are best able to pay it,
letting them shift it if they can.
Principles upon which Taxes are Levied.—How is the ability of
individuals to pay taxes estimated? It is done by taking some such
thing as property or income as the basis. Those who have more
property or income are called upon to contribute more than those
who have less. About a hundred and fifty years The basic principles
ago a famous writer on economics, Adam Smith, of taxation
laid down four principles to which all taxation according to Adam
should conform. These maxims of taxation are Smith.
now everywhere recognized as valid and are worth remembering.
Briefly stated, they are as follows: People should be taxed according
to their ability to pay; all taxes should be definite and not uncertain or
arbitrary; they ought to be levied at the time and in the manner which
causes the least inconvenience to the people; and they should be so
contrived as to take out of the pockets of the people as little as
possible over what is needed by the public treasury. Those who
make the tax laws do not always heed these maxims, and taxes are
sometimes levied on the principle of getting the most money with the
least trouble.[216]
Local Taxes.—The greater portion of the Taxes on property.
taxation levied by cities, counties, towns, and
villages is in the form of taxes on property. This is a direct tax and as
a rule it is levied on all private property, of whatever sort, at a uniform
rate of so much per thousand dollars of valuation. A tax levied in this
uniform way on all private property is called a The general
general property tax. In some states, however, property tax.
provision has been made for classifying the various kinds of property
and taxing each kind at a different rate. Property is first classified into
two divisions, real property, and personal property.[217] Real property
(or real estate) consists of land, buildings, and other fixtures
established on the land; personal property consists of, first, tangible
things of a movable nature such as household furniture, machinery,
merchandise; and second, intangibles such as bonds, mortgages,
and bank deposits. Where there is a classified The classified
property tax, each of these three forms (real property tax.
property, tangibles, and intangibles) is taxed at a different rate. One
reason for taxing them at different rates is that real estate requires a
great deal more in the way of public services (for example, in paved
streets, water supply, sewerage, etc.); another reason is that while
real property cannot evade taxation intangibles can usually do so
when the tax is too heavy.[218] If the rate of taxation on intangibles is
lowered, the temptation to evade is not so great. It will usually be
found that more money will come into the public treasury from a
moderate rate of taxes on stocks and bonds than from an
oppressively high rate.
A few communities also obtain some revenue Other local taxes.
from another direct tax, the poll tax, which
amounts to one or two dollars per year on each adult. In some cities
franchise taxes are laid upon public service companies (such as gas,
electric lighting, and street railway companies). The proceeds from
these sources do not form any large proportion of the total revenue.
All collecting of taxes is preceded by a formal Assessments for
step known as assessment. No tax can be purposes of
legally collected unless it has been assessed in taxation.
ways prescribed by law. Property of all kinds is valued for taxation by
officials known as assessors. Usually they are county or city officials,
sometimes appointed, sometimes elected. They re-value property at
stated intervals and set their assessment at what they believe to be
the market value (unless they are instructed to assess at a
percentage of the market value as is the case in some states).
Income taxes, corporation taxes, and inheritance taxes are assessed
by the tax officials on the basis of sworn statements made to them
by the taxpayers.
In the case of such public improvements as Special
sewers, street pavements, and sidewalks it is assessments.
the custom in many cities to levy a special assessment upon the
owners of the property that is benefited. These special assessments
are levied in proportion to the benefit received; they are not taxes in
the ordinary sense. When the nation or state or city requires land
for public improvements it has
the right to acquire it from the owner, even though he be unwilling to
sell. The public authorities, by their right of eminent domain, can take
land or other property for public use at any time, but must give the
owner just compensation. If the amount of compensation cannot be
agreed upon between the government and the private owner, it is
fixed by the courts.
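
Because a special assessment is levied in proportion to benefit, the
cost of an improvement is divided among the abutting owners by some
measure of the benefit each receives. In the sketch below that
measure is street frontage, a common yardstick but only an
assumption here, and all the figures are invented.

# Apportioning a special assessment in proportion to benefit received.
# Benefit is measured here by street frontage; that yardstick and all
# the figures are assumptions for illustration only.

def apportion_assessment(total_cost, frontages):
    """Divide the cost of an improvement among owners by frontage."""
    total_frontage = sum(frontages.values())
    return {owner: total_cost * feet / total_frontage
            for owner, feet in frontages.items()}

# A $6,000 sidewalk fronting three lots of 50, 100, and 150 feet:
print(apportion_assessment(6_000, {"A": 50, "B": 100, "C": 150}))
# {'A': 1000.0, 'B': 2000.0, 'C': 3000.0}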
State Taxes.—The states obtain their revenue in various ways. One
common method is by
requiring the cities, counties, or towns to pay over to the state a
certain fraction of the sums which they collect on property. Thus,
when the citizen gets his bill for local taxes he finds it itemized—so
much for state taxes, so much for county taxes, and so much for city
or town taxes. Most of the states also levy taxes on corporations,
including railways, telephone companies, insurance companies, and
banks. These taxes may be calculated upon capital or net earnings
or deposits or upon some other basis. A few states tax inheritances
and a few levy a state income tax. Taxes on inheritances are usually
progressive, that is, the rate is higher in the case of large inherited
fortunes. State income taxes are levied upon the net earnings of
individuals or partnerships, a certain minimum income being left
exempt. Most of the states have other miscellaneous sources of
revenue, some of them important, as, for example, the annual
license fees imposed upon all owners of motor vehicles.
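
The itemized bill just described is, in effect, one levy divided
among several governments. A brief sketch of that division follows;
the per-thousand rates are invented examples, not those of any
actual state.

# One local tax bill itemized among state, county, and city, as
# described above. The per-thousand rates are invented examples.

RATES_PER_THOUSAND = {"state": 3, "county": 5, "city": 12}

def itemized_bill(valuation):
    """Break a single property-tax bill into its governmental shares."""
    return {unit: valuation / 1000 * rate
            for unit, rate in RATES_PER_THOUSAND.items()}

# On a $10,000 valuation: $30 to the state, $50 to the county,
# and $120 to the city.
print(itemized_bill(10_000))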
National Taxes.—The national government, by reason of its need for
larger revenues in
recent years, has resorted to many forms of taxation. At the present
time the principal sources of national revenue are the taxes on the
incomes of corporations and individuals, the customs duties, the
excises, and the inheritance taxes. The national income taxes are
levied upon the net earnings of all individuals, partnerships, and
corporations above a certain minimum. The rate of taxation, in the
case of individual incomes, is progressive—a normal tax is laid upon
all incomes up to a certain figure and surtaxes are levied upon
incomes above this amount.[219] Under the original provisions of the
constitution, the national government could not levy direct taxes
unless it apportioned them among the several states according to
their population, and according to a decision of the Supreme Court in
1894 an income tax is a direct tax.[220] But the Sixteenth Amendment,
adopted in 1913, now gives the national government authority to tax
incomes “from whatever source derived” without the necessity of
apportionment among the states. Once a year every person or
corporation earning a net income above the prescribed minimum
must make a sworn statement setting forth the exact amount of such
earnings, and upon this “income tax return” the legal rate is
assessed.
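
The normal-tax-plus-surtax scheme can be stated as a short
computation: the normal rate falls on all taxable income above the
exemption, and each surtax falls only on the part of the income
above its bracket. The exemption, rates, and brackets below are
hypothetical stand-ins, not the statutory figures of any year.

# A sketch of the normal-tax-plus-surtax scheme described above.
# The exemption, brackets, and rates are hypothetical stand-ins.

EXEMPTION = 2_000          # income below this figure is untaxed
NORMAL_RATE = 0.04         # normal tax on all taxable income
SURTAX_BRACKETS = [        # (floor of slice, surtax rate on that slice)
    (10_000, 0.02),
    (50_000, 0.05),
]

def income_tax(income):
    """Normal tax on everything above the exemption, plus surtaxes
    on the slices of income above each surtax floor."""
    taxable = max(income - EXEMPTION, 0)
    tax = taxable * NORMAL_RATE
    for floor, rate in SURTAX_BRACKETS:
        tax += max(income - floor, 0) * rate
    return tax

print(income_tax(5_000))    # small income: normal tax only -> 120.0
print(income_tax(100_000))  # large income: normal tax plus surtaxes -> 8220.0

Because each surtax touches only the slice above its floor, the
effective rate rises with income, which is what makes the tax
progressive.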
Duties on imports still yield a large revenue,
as they have done every year since 1790. No
duties may be laid upon exports, such duties being forbidden by the
constitution. This is in some respects unfortunate, because duties on
exports go into the price of the exported goods and thus fall upon the
foreign consumer. A tariff on exports (such as lumber, coal, and ore)
would not only yield a considerable revenue but would help to
conserve the natural wealth of the United States. The excises are
levied upon tobacco, theatre tickets, and other things which are rated
as luxuries.[221] The national government also levies an inheritance
tax, the rate of the tax depending upon the value of the property
inherited. These various taxes bring in between three and four billion
dollars per year.
The Two Purposes of Taxation.—The main object of all taxation is to
produce a revenue. But this is not the only object. Taxation may
also
be used to bring about such social reforms as the nation or the
community may deem desirable, and taxes are sometimes adjusted
to this end. For example, the manufacture of goods by the use of
child labor can be checked by placing a heavy excise tax upon such
products.[222] It is believed that the growth of large fortunes can be
checked by the imposition of heavy surtaxes on large incomes and
on inheritances; the present national taxes on incomes and
inheritances have been framed with this end in view to some extent.
In other words the system of taxation can be used and is being used
in some measure to secure such economic and social readjustments
as Congress and the state legislatures think desirable. The question
is: How far should the law-making bodies go in this direction? Many
people believe that “swollen fortunes” are an evil in a democratic
society and that all earnings above a certain point should belong to
the community. Others feel that heavy surtaxes place a damper upon
ambition, that they lessen the amount of money saved by the whole
people, thus reducing the amount of capital available for industry,
and that they give the government large sums which are spent
wastefully.[223]
Tax Exemptions and Extravagance.—When taxation is regarded as a
means not only of
raising a revenue but also of redistributing wealth it takes on grave
possibilities of abuse. The majority among the voters can always find
reasons for increasing the burdens on the minority; the wage-
earners urge that more taxes ought to be placed on the rich and
insist that they themselves be exempted from taxation upon their
incomes. The chief evil in all this is not the injustice to the rich, for
they usually manage to shift the burden down the line till it comes
back upon the wage-earner; the unfortunate part of it is that the
masses of the people, proceeding under the delusion that they pay
none of the taxes, are quite unconcerned when they see large sums
of money being collected by the government and spent wastefully.
They do not realize that it is their money; that they earned every cent
of it before the government obtained it to spend. If they could be
induced to see matters in this light, they would never permit their
representatives in Congress, in the state legislatures, and in city
councils to throw money around with such a lavish hand. Tax
exemptions and extravagance are twin brothers.
Proposed Reforms in Taxation.—Various
new forms of taxation are proposed from time to
time. Many years ago a well-known American social reformer, Henry
George, advocated the placing of all property taxes on land alone,
allowing buildings and personal property to go untaxed altogether.
His argument was that the high value of land in cities and towns is
created by the community, not by the owner. Vacant land in the
downtown portion of a large city is sometimes worth many hundred
dollars per foot. What gives it this high value? Not the owner, for he
has done nothing to improve it. The growth of the city round about
this land has made it valuable. This “unearned increment” of value,
therefore, Henry George proposed that the community should take
by levying a very heavy tax upon it.[224]
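
Reduced to arithmetic, George's plan removes buildings and personal
property from the tax base and raises the rate on land alone until
it yields the desired revenue. The figures in this sketch are
invented for illustration.

# Henry George's proposal, reduced to arithmetic: tax the land value
# only, exempting buildings and personal property, at a rate heavy
# enough to yield the same revenue. All figures are invented.

def general_tax(land, buildings, personalty, rate_per_thousand):
    """The ordinary general property tax on everything."""
    return (land + buildings + personalty) / 1000 * rate_per_thousand

def single_tax(land, rate_per_thousand):
    """The single tax: land value alone bears the whole levy."""
    return land / 1000 * rate_per_thousand

# A lot worth $8,000 with a $12,000 building and $2,000 of personalty:
print(general_tax(8_000, 12_000, 2_000, 20))  # 440.0 on everything at $20
print(single_tax(8_000, 55))                  # 440.0 from the land alone at $55

Under the single tax the same revenue falls wholly on the land, so
the owner of a vacant lot pays as much as his neighbor who has built
upon an identical one.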
The single tax proposition, as above outlined, has many earnest
advocates; but it has made
very little progress as a practical policy in this country. The objection