2016 3rd MEC International Conference on Big Data and Smart City

Proposed Application of Big Data Analytics in Healthcare at Maharaja Yeshwantrao Hospital

Mimoh Ojha [1]
International Institute of Professional Studies
Devi Ahilya University
Indore, India

Dr. Kirti Mathur [2]
International Institute of Professional Studies
Devi Ahilya University
Indore, India

Abstract— This paper gives an insight into how healthcare data, such as patients’ records, can be stored digitally as an Electronic Health Record (EHR), and how useful information can be generated from these records using analytics techniques and tools, which will help in saving the time and money of patients as well as doctors. This paper focuses on the Maharaja Yeshwantrao Hospital (M.Y.) located in Indore, Madhya Pradesh, India. M.Y. hospital is central India’s largest government hospital. It generates a large amount of heterogeneous data from different sources such as patients’ health records, laboratory test results, electronic medical equipment, health insurance data, social media, drug research, genome research, clinical outcomes, transactions, and from Mahatma Gandhi Memorial medical college, which is under M.Y. hospital. To manage this data, data analytics may be used to make it useful for retrieval. Hence the concept of “big data” can be applied. Big data is characterized as extremely large data sets that can be analysed computationally to find patterns, trends, and associations, together with visualization, querying, information privacy and predictive analytics on large, widespread collections of data. Big data analytics can be done using Hadoop, which plays an effective role in performing meaningful real-time analysis on this large volume of data to predict emergency situations before they happen. This paper also discusses the EHR and big data usage and its analytics at M.Y. hospital.

Keywords— Big data, Healthcare, EHR, Analytics, Hadoop.

I. INTRODUCTION

The healthcare industry today generates a large amount of data from record keeping of patient related data [11], health and medical devices related data, drug research data, health insurance data, clinical outcome data, laboratory data, images with graphic, audio and video data, health policy data and patients’ feedback data. This generated data is both structured and unstructured. In today’s digital era, it is mandatory that these data are digitized [4]. The digitization of healthcare data will in turn help in providing enhanced quality of care at reduced healthcare cost. With information in digital form, healthcare organizations can use available tools and technologies to analyze that information and generate valuable insights for treatment [3]. Big data can be defined as a very large volume of both structured data (like relational databases) and unstructured data (like text, multimedia, web pages) which is very difficult to process using traditional database and software techniques (DBMS, SQL). Big data is a technology that includes both the tools and the processes that an organization requires to handle large amounts of data and storage facilities [16]. To improve the quality of healthcare it is necessary to effectively figure out the hidden facts and figures from the large volume of collected data, to answer new challenges by reducing the cost of healthcare. Similarly, government hospitals like Maharaja Yeshwantrao hospital (M.Y.H.) located in Indore, which is considered to be central India’s largest government hospital, generate large volumes of data, which can be considered “big data”, as thousands of people are treated daily. Most of them live below the poverty line and are usually daily wagers. If they visit the hospital they have to wait in a long queue for treatment, their day is wasted, and they lose their wages for that day, which can even leave them without food. To overcome this situation we can store patients’ information in electronic media as an Electronic Health Record (EHR), which will save the time and money of patients as well as of the healthcare industry and government. An EHR is a systematic collection of a patient’s electronic health information which can be shared across different units of the hospital through a connected network. EHRs may hold data on demographics, medical history, medication and allergies, previous laboratory test results, radiology related information, vital organ status and personal information [17].

Big data analytics helps in reaching valuable decisions by understanding data patterns and the relationships among them with the help of clustering, classification, decision tree, association, sequence analysis, segmentation, regression and web mining algorithms. Big data analytics can be done with Hadoop, an open-source software framework for storing data; it provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs [17]. With the help of EHR, big data and its analytics we can store M.Y. healthcare information digitally, and doctors can find hidden knowledge and understand patterns and trends to improve treatment, increasing life expectancy and lowering the costs involved through proper diagnosis in the early and emergency stages of a disease, so that proper treatment is given at the right time at M.Y.H.
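The EHR contents listed in this introduction (demographics, medical history, medication and allergies, laboratory results, radiology information, vital status) can be pictured as a simple record type. The following is only a minimal sketch in Python: the class name, field names, and the identifier format `MYH-000001` are assumptions made for illustration and are not taken from any existing M.Y.H. system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ElectronicHealthRecord:
    """One patient's EHR. All field names here are illustrative assumptions."""
    patient_id: str
    demographics: Dict[str, str] = field(default_factory=dict)
    medical_history: List[str] = field(default_factory=list)
    medications: List[str] = field(default_factory=list)
    allergies: List[str] = field(default_factory=list)
    lab_results: Dict[str, str] = field(default_factory=dict)
    radiology_reports: List[str] = field(default_factory=list)
    vital_status: Dict[str, str] = field(default_factory=dict)

    def summary(self) -> str:
        """Return a one-line overview that a doctor could read at a glance."""
        return (f"{self.patient_id}: history={len(self.medical_history)}, "
                f"medications={len(self.medications)}, "
                f"allergies={len(self.allergies)}, "
                f"lab_results={len(self.lab_results)}")


# Hypothetical example record; the values are made up for illustration.
record = ElectronicHealthRecord(patient_id="MYH-000001")
record.medical_history.append("hypertension")
record.allergies.append("penicillin")
record.lab_results["haemoglobin"] = "13.2 g/dL"
print(record.summary())
```

Keeping each category in its own field means a unit such as the laboratory could update only `lab_results` while other units read the same shared record.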

II. LITERATURE REVIEW

It is realized that digitization is much needed in health care organizations, as a large amount of data is generated, ranging from a patient’s health record to its genome analysis; to keep track of this information, its efficient storage using big data and its analytics is desired.

R. Sathiyavathi in 2015 stated that with the digitization of health information, doctors can generate deep insights that can streamline clinical workflows, optimize care, strengthen doctor-patient relationships, cut costs, and improve outcomes [1].

Wullianallur Raghupathi et al in 2014 elaborated in their paper that big data analytics has the potential of transforming the way healthcare providers use sophisticated technologies for gaining insight into clinical and other data repositories and to make decisions.

Muni Kumar and Manjula R in 2014 said that achieving better outcomes at lower cost is important for health care, and that this can be achieved through the implementation of Hadoop HDFS and MapReduce to uncover the information lying in big health data sets [5].

McKinsey & Company reported in 2013 that big data analytics might reduce cost in the US by $300 billion per year, whereas in clinical operations it saves $165 billion, and billions more in waste in R & D. India spends around 4.2 % of its GDP on healthcare; this necessitates the use of technology like big data analytics to offer better healthcare facilities to its people.

M.Y.H. is a government hospital in Indore city in India. Indore is a centre of health care and also a pharmaceutical hub of central India. M.Y. hospital is the largest hospital of Madhya Pradesh among both private and government hospitals. Since 1955, it has been Asia's and central India’s first largest government hospital that was computerized. It includes 1200 beds with all the major medical departments and facilities, such as an 18 bedded MICU, an 8 bedded ICCU, 5 haemodialysis machines, an endoscopy unit, ventilators etc. It is an 8 storied government hospital surrounded by sub-hospitals in its campus, namely the 300 bedded Chacha Nehru Children hospital, the 100 bedded M.R. TB hospital and the 100 bedded cancer hospital. This hospital also has a medical college associated with it. It provides special privileges to the poor under a central government scheme. Thousands of people from all around central India come to this hospital on a regular basis. It is also the emergency centre for most of the rural and district hospitals of Madhya Pradesh. All modern healthcare facilities are available in this hospital. It also has a Mobile Blood Bank, which encourages youth to donate blood and which, in case of emergency, can be used as a relief and can help to save lives [8].

The rural public health care system in India has three different levels of health care access: primary, secondary and tertiary health care. At the lowest level are the primary health centers (PHC), which are basic units with minimum facilities serving rural India. Each PHC supervises 6 sub-centers; sub-centers are the most basic units of health in villages and the first point of contact between villagers and public health care. Secondary healthcare is the second tier of the health system, where patients from primary health care are referred for specialized treatment [5]. The health centres for secondary health care are District hospitals and Community Health Centres. Tertiary health care is the third level of the health system, wherein specialized consultation is provided on referral from primary and secondary health care. Specialized Intensive Care Units, advanced diagnostic support services and specialized doctors are its key features. In India, tertiary care is provided by either medical colleges or advanced medical research centres [5]. M.Y. hospital provides healthcare facilities at all three levels of the healthcare system of India. However, it is mostly a form of tertiary healthcare, as it acts as a referral hospital for all of central India’s hospitals. M.Y. hospital now comprises a group of hospitals, also known as the M.Y. campus.

Fig.1. Levels of healthcare in India

Procedure for Treatment at M.Y.H.:
Patients entering the hospital are required to follow a series of tasks to get treatment:
1. At first, patients have to register themselves at the OPD by paying a minimal charge of Rs 10 to get an appointment.
2. After registration at the OPD, patients are directed to the specific department/unit or sub-hospital according to the health related information they gave at the OPD.
3. Here specialized doctors treat the patients and accordingly prescribe them medicines or send them to the laboratory for tests.
4. Patients then go to the laboratory for tests and get their test reports.
5. After going through the test results, doctors treat them accordingly.
6. Most medicines are freely available, and medicines for diseases like cancer and Ebola are available at cheaper rates in this hospital.
7. Once registered at the OPD, patients can come for routine checkups without paying any extra amount.
8. In case of emergency, a team of specialized doctors is always available.
9. Here patients get healthcare facilities at cheaper rates for advanced surgery like joint replacement, heart etc., which is otherwise very costly at private hospitals. Even ICCU beds are very cheap.
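The registration and routing steps of the treatment procedure at M.Y.H. (paying the OPD charge, then being directed to a department or sub-hospital) could be expressed in software roughly as follows. This is a hedged sketch only: the keyword rules, the `General OPD` fallback, and the function name `route_patient` are hypothetical, since the actual routing criteria used at the OPD are not described here.

```python
from typing import Dict

# Hypothetical routing table: complaint keyword -> unit on the M.Y. campus.
# The real criteria applied at the OPD desk are not specified in this paper.
ROUTING_RULES: Dict[str, str] = {
    "child": "Chacha Nehru Children Hospital",
    "tuberculosis": "M.R. TB Hospital",
    "cancer": "Cancer Hospital",
}
DEFAULT_UNIT = "General OPD"  # assumed fallback when no rule matches
REGISTRATION_FEE_RS = 10      # the minimal OPD charge mentioned in step 1


def route_patient(complaint: str) -> str:
    """Pick the unit for a complaint given at registration (step 2)."""
    text = complaint.lower()
    for keyword, unit in ROUTING_RULES.items():
        if keyword in text:
            return unit
    return DEFAULT_UNIT


unit = route_patient("Persistent cough, suspected tuberculosis")
print(f"Fee paid: Rs {REGISTRATION_FEE_RS}; proceed to: {unit}")
```

With an EHR in place, the complaint text and the chosen unit would be written to the patient's record once, so that later steps need not ask for the same information again.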
To see how the current healthcare system in M.Y. hospital works, what facilities are given to the patients and what problems the patients face, we conducted a survey of 150 people at M.Y. hospital. We have tried to relate the current working of M.Y. hospital to the technologies currently available in the market for healthcare, and to show how, with the help of big data analytics, we can improve the healthcare facilities at this hospital.

In our survey we asked many questions, such as name, age, city, whether they have health insurance or not, whether they have to wait in a long queue to get a doctor’s appointment, what problems they face when they visit the hospital, their current health problem or disease, their past health issues, their annual income and their annual medical expenses. In addition, we asked whether they use a smart phone or the internet, and whether they would like to store their health records in the form of an Electronic Health Record (EHR). We also asked them about our “future research work”, which will be an extension of this research work: whether they would like to have a “one touch” facility, meaning an application through which a patient could access all his present and past medical records and details anytime and anywhere.

The results derived from this survey done at M.Y.H. highlight the various problems faced by both patients and doctors, such as:

1. Patients have to wait for a long time in different queues at the OPD, doctors’ sitting area, laboratory, medicine department etc.
2. Sometimes patients lose their prescription slips, which makes it difficult for doctors to treat them.
3. As thousands of people come to this hospital, there is sometimes a mismatch between their laboratory and health records, which can be dangerous.
4. Doctors have to treat thousands of people regularly without any past health record of an individual, which makes it difficult to diagnose their problems; doing so without error is a tedious job.
5. The patient treatment process is quite time consuming, which in turn discourages daily wagers from taking treatment on time.

The solution to all the above mentioned problems can be digitization of the current system using big data analytics.

In this survey we asked people whether they have to wait in long queues for a doctor’s appointment and to give samples at the laboratory. The majority of them said YES, they have to wait in long queues and it is a time consuming process.

Fig.2. Result for how many people agree that they have to wait in long queue.

This problem can be solved by the adoption of the electronic health record (EHR) at M.Y.H., which will help in storing patients’ records electronically and will facilitate retrieval, eliminating the waiting time of the patients.

Our next query concerned the facilities offered at M.Y.H., which most people said are average, emphasizing the need for improvement in this area. According to people, improvement is needed most in patient management, because at times they are moved from one area to the other extreme end of the hospital, which consumes a lot of time unnecessarily, and in cases of emergency patients can suffer catastrophe, losing their life. The pie chart below represents the above problem and also shows the survey report.

Fig.3. Survey result for facilities offered at M.Y hospital

The third aspect of our survey concerned the financial background of the patients coming to M.Y.H. We asked them about their annual medical expenses. This survey made us realize that people who visit M.Y. hospital are poor and cannot afford private hospitals because of their background. Patients and their attendants are usually daily wagers who cannot even afford staying and eating for days together; owing to their poor financial background, spending time at the hospital is a constraint for them. We feel the solution to this problem can be big data analytics, because when a patient comes for follow up he will have to spend the minimum possible time on treatment. The graph below represents the relationship between the annual income and medical expenses of individuals.
Fig.4. Survey result for income and annual expenses

Another financial aspect relates to health insurance: we found that very few had health insurance, and most were also unaware of several existing schemes like “Pradhan Mantri Suraksha Yojna”, making us realize that government policies are still not reaching poor and needy people due to lack of awareness in this population. The pie chart shows our survey result on insurance.

Fig.5. Number of people who have health insurance

V. PROPOSED IMPLEMENTATION OF EHR AND BIG DATA IN HEALTHCARE AT M.Y.H.

M.Y.H. generates a large amount of data, and storing this data electronically with the help of computers, EHR and other technology will help to perform analytics over it. With EHR we can store a patient’s data and generate a unique number specific to him, and through this unique number the patient’s information can be retrieved easily. The data can be stored in data warehouses, also known as EDH, and we can perform data mining, data classification and data clustering on this voluminous data to obtain useful information. Problems highlighted in the survey can be solved with the use of computers in hospitals by doctors and patients.

Technology has changed the working of hospitals, and has allowed nurses and doctors to be more efficient with patients [13]. Big data analytics methods and technology can be implemented at M.Y. hospital because, as per the health department of India, Madhya Pradesh, where M.Y. hospital is located, will be the first in the country to have fully computerized health services, and with this future plan M.Y. hospital can be digitalized. Another feather in the cap is that the government of India has said it will computerize public hospitals, 20 district hospitals, and three medical colleges by linking them with the State Wide Area Network (SWAN) in the coming years. SWAN is one of the key infrastructure components under the national e-governance plan of the government of India. The Health Minister said patients’ registration slips, medical history, medicines, test reports, and X-rays would be made available online on computers. “This will help the patients in getting treatment in any other hospital of the state without carrying documents. Not only this, patients visiting government hospitals would be given one Unique ID number in which his/her Electronic Health Record (EHR) would be kept. This would help the doctor in getting complete medical history of the patient and would give treatment accordingly”.

Big data in health-care refers to patient data such as physician notes, lab reports, x-ray reports, case history, diet regime, the list of doctors and nurses in a particular hospital [14], drug analysis, social media, genome research, transactional data etc. For a long time, healthcare organizations have been storing ample amounts of transactional data in different databases, formats, and systems. Apart from this, other information-driving data, such as non-traditional, less structured data from web blogs, social media, call centers, chat sessions, video chats, email, equipment sensors, photographs, and digital images, can be mined for decision making. Reductions in the cost of both storage and computing power have made data collection feasible. Big data in healthcare incorporates the following [6]:

Traditional enterprise data: Such as transactional data from insurance, patient admissions, laboratory test records etc. [6].
Health and medical devices data: Such as Computerized Tomography (CT) digital scanners, life support systems, telemetric and telemedicine data [6].
Social media data: Related to patients’ feedback streams and blogs on Twitter, Facebook, and MySpace.
Medical science research: Of drug analysis, genome research, clinical and transactional research.
Miscellaneous data: Like health policy, clinical trials of treatments etc.
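The unique patient number proposed in this section can be sketched as a small registry that issues an identifier at registration and returns the stored record on lookup. The sketch below is an assumption-laden illustration in Python: the class `EHRRegistry`, the `MYH-` prefix, and the sequential numbering scheme are hypothetical, and an in-memory dictionary stands in for the data warehouse.

```python
import itertools
from typing import Dict, Optional


class EHRRegistry:
    """In-memory stand-in for an EHR store keyed by a unique patient number."""

    def __init__(self, prefix: str = "MYH") -> None:
        self._prefix = prefix                 # hypothetical hospital prefix
        self._counter = itertools.count(1)    # sequential numbering (assumed)
        self._records: Dict[str, dict] = {}

    def register(self, patient_data: dict) -> str:
        """Store a new patient's data and return the generated unique number."""
        patient_id = f"{self._prefix}-{next(self._counter):06d}"
        self._records[patient_id] = dict(patient_data)
        return patient_id

    def add_visit(self, patient_id: str, note: str) -> None:
        """Append a visit note so later doctors can see the past history."""
        self._records[patient_id].setdefault("visits", []).append(note)

    def lookup(self, patient_id: str) -> Optional[dict]:
        """Retrieve a record by its unique number, or None if unknown."""
        return self._records.get(patient_id)


registry = EHRRegistry()
pid = registry.register({"name": "Example Patient", "city": "Indore"})
registry.add_visit(pid, "OPD registration; referred to laboratory")
print(pid, registry.lookup(pid))
```

Because every later visit is attached to the same number, a lost prescription slip or a mismatched laboratory report, two of the problems raised in the survey, would no longer depend on paper carried by the patient.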
Fig.6. Sources of health care data

Characteristics of Big Data:
Big data has six characteristics:

Volume: Healthcare organizations generate enormous data from the different sources of healthcare data described above. The storage capacity provided by big data can help to store the cell sequence structure of an enormous number of individuals, which in turn can help us detect future genetic diseases like diabetes, cancer etc., so that they can be handled proactively and may be prevented from occurring.

Velocity: Different units in a hospital simultaneously generate large amounts of data. Tons of opinions and relationships are generated on social media. This is valuable information for tracking patient feedback for hospitals, payments, adverse effects in drug trials, and well-being discussions. The prompt availability of data can be used to take opinions from doctors at remote locations for treating a patient at any location.

Variety: It refers to the many sources of data, both structured and unstructured, such as spreadsheets, databases, emails, photos, videos, monitoring devices, telemedicine, sensors, EHR etc. Some of these are structured data types, whereas unstructured data creates problems for storage, mining and analysis. All past experiences in treating patients and lessons learned can be stored and referred to later.

Veracity: It refers to noise and incorrectness in healthcare data. All obsolete data related to medicines and drugs should be cleaned by removing this ‘dirty data’ periodically.

Validity: It refers to time dependent data in healthcare; in telemedicine, for instance, a patient’s information at each second is important during treatment. If data arrives late, the patient’s life is at stake. For example, there are certain tests which cannot be done locally, so samples are sent to big cities and their reports take three to seven days in India. Big data analytics can play an important role by making reports or results available in seconds.

Volatility: It refers to how long data is valid and should be stored, and when obsolete data should be deleted. For small health problems like normal fever and pain, information may be eliminated periodically, because it becomes obsolete with time and does not contribute to long run decisions about major diseases.

Fig.7. Six V’s of Big data

Advantages of Big Data at M.Y hospital:
By digitizing, combining and effectively using big data [11], healthcare services can be improved at M.Y. hospital. With the help of EHR, patients’ records can be stored electronically, which will help in speedy accessibility of health records. The use of EHRs will greatly increase healthcare data [2]. Potential benefits of big data include detecting diseases at earlier stages when they can be treated more easily and effectively, managing specific individual and population health, and detecting health care fraud more quickly and efficiently. Numerous questions can be answered with big data analytics. Certain developments or outcomes may be predicted and/or estimated based on vast amounts of historical data, such as length of stay in hospital; patients who will choose elective surgery; patients who likely will not benefit from surgery; patients at risk for medical complications; patients at risk for other hospital-acquired illness; illness/disease progression; patients at risk for advancement in disease states; causal factors of illness/disease progression [11]; monitoring the hospital quality; and improving the treatment methods at M.Y. hospital. Big data and its analytics can help in the following areas at M.Y.H.:

Clinical Treatments: Big data will allow efficient storage of both structured and unstructured healthcare data, and by performing analytics on these data efficient and proper treatment can be given to patients at reduced cost.
Administration: Big data will help to maintain all the transactional records and financials, will keep track of patients’ EHR data, feedback and the schedules of doctors and nurses, and will help the administration to make decisions.
Health Policy: M.Y. hospital is central India’s largest government hospital, and thousands of people come regularly for treatment. Big data will help to obtain hidden knowledge from large data sets, which also allows the government to make health policy and decide the health budget according to the results obtained from these data sets.
Clinical research and development: M.Y.H. will be a research ground for its associated medical college. Big data analytics helps medical students to do research in treatment, genomics, semantics and drug analysis.
Public health: Using big data will benefit in tracking disease outbreaks, their pattern analysis and transmission, for improving public health surveillance and response. Quick, accurately targeted vaccines may be developed [11]. For example,
finding out new dangerous diseases, and their treatment and prevention, benefits people by allowing a proactive cure.
Device/remote monitoring: Big data has the capability of capturing and analyzing large volumes of fast-moving real-time data, such as that from in-hospital and in-home devices, for safety monitoring [11].
Clinical outcome and safety: Big data helps in easy accessibility of health records in case of emergency with full
Fraud Detection: In healthcare organizations, insurance claim related fraud can be detected and eliminated using big data.

VI. BIG DATA ANALYTICS TOOL AND TECHNIQUE

To perform analytics on the big data generated by M.Y.H., Hadoop and HDFS (Hadoop Distributed File System) tools may be used for implementation. For data storage and processing, Hadoop may be used, being scalable, open source and fault tolerant. It runs on commodity hardware using HDFS, which has features like fault tolerance, high bandwidth and a clustered storage architecture. For distributed processing of structured and unstructured data it runs MapReduce. Figure 8 illustrates the layers found in the software architecture of a Hadoop stack [10].

facility. Our survey is also done using Google analytics. Cloud computing is a cheaper technology.
OpenRefine: It is used for cleaning a database to improve veracity in big data. Example: Google Refine, which allows us to get everything ready for analysis.
Other technologies like machine learning, A/B testing, visualization and search based applications can be used for analytics of healthcare data at M.Y.H.

VII. CONCLUSION

M.Y. hospital generates enormous data which can be termed big data. As M.Y.H. works on a traditional system, it is necessary to convert paper work into paperless work using digitization tools like computers and EHR, which will allow storage of healthcare data electronically. Big data analytics at this hospital will help to provide modern healthcare facilities with reduced cost and time, which will be beneficial for all the people of central India, as most of the people who visit are below the poverty line. This paper addresses the problems faced by patients and doctors and also provides solutions to these problems. It also discusses how big data analytics can transform people’s healthcare by gaining insight for making clinical decisions. It also presents big data analytics tools and techniques like HDFS (Hadoop Distributed File System) for huge data storage, Hadoop, cloud and OpenRefine. The use of big data analytics at M.Y.H. will help to shorten the waiting time of patients in different queues and for the doctors, and will provide patients’ details through a unique number anytime and anywhere. The major advantage would be in emergency situations, where it would save more lives, increasing the satisfaction of patients and doctors at M.Y.H.


Our research can be extended by making use of cloud computing for portability of EHRs all over the country or the world, for better treatment anywhere, anytime, without carrying the past treatment records of an individual, making it more cost effective, timely, efficient and paperless.
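The MapReduce processing mentioned in Section VI can be illustrated with a minimal single-process sketch in Python. It is not Hadoop code and does not use any Hadoop API; it only imitates the three logical phases (map, shuffle, reduce) on a handful of visit records. The record layout and the `department` field are assumptions, and the sample visits are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: dict) -> Iterator[Tuple[str, int]]:
    """Map: emit (department, 1) for every visit record."""
    yield (record["department"], 1)


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts collected for one department."""
    return key, sum(values)


def run_job(records: Iterable[dict]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence within one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical visit records, for illustration only.
visits = [
    {"patient_id": "MYH-000001", "department": "OPD"},
    {"patient_id": "MYH-000002", "department": "Laboratory"},
    {"patient_id": "MYH-000003", "department": "OPD"},
]
print(run_job(visits))
```

In a real Hadoop deployment the map and reduce functions would run in parallel on many nodes over data held in HDFS; the per-department counts produced here are the kind of aggregate that could show where queues build up.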
