Geetha Polaboina - Data Analyst - CV
Data Analyst
Email: Location: London, UK Phone: +44
A Data Analyst with extensive knowledge of data analysis processes, software engineering practices, and database
skills, familiar with Agile project delivery. Proficient in using data analysis tools, technologies,
algorithms, and processes to validate data. Experienced in fast-paced environments with a consistent focus on
quality delivery.
PROFESSIONAL SUMMARY:
● Highly efficient in Data Analysis, Machine Learning, and Data Mining with large sets of structured and
unstructured data, Data Acquisition, Data Validation, Predictive Modeling, Data Visualization, and Web
Scraping. Adept in Python programming.
● Proficient in managing the entire data analysis project life cycle and actively involved in all its phases,
including data acquisition, data cleaning, data engineering, feature scaling, feature
engineering, statistical modeling (decision trees, regression models, clustering), dimensionality reduction
using Principal Component Analysis and Factor Analysis, testing, validation, and data visualization.
● Deep understanding of statistical modeling, multivariate analysis, model testing, problem
analysis, model comparison, and validation.
● Expertise in transforming business requirements into analytical models, designing algorithms, building
models, developing data mining and reporting solutions that scale across a massive volume of structured
and unstructured data.
● Skilled in data manipulation and preparation, including describing data contents, computing
descriptive statistics, regex operations, split and combine, remap, merge, subset, reindex,
melt, and reshape.
● Practiced Agile delivery using the Scrum methodology, with solid experience participating in
Scrum calls, sprint planning, and backlog refinement.
● Experience with packages such as Pandas, NumPy, Seaborn, SciPy, Matplotlib, and Scikit-learn.
● Extensive experience in Text Analytics, generating data visualizations using Python and creating
dashboards using tools like Power BI.
● Good knowledge of Proofs of Concept (PoCs) and gap analysis; gathered the necessary data for analysis from
different sources and prepared it for exploration.
● Good industry knowledge, analytical and problem-solving skills, and the ability to work well within a team as
well as individually.
● Experience designing visualizations in Power BI and publishing and presenting dashboards on web and
desktop platforms.
● Experience with data analytics, data reporting, graphs, scales, and PivotTables.
● Extracted data from various database sources such as SQL Server, regularly using JIRA and
other internal issue trackers during project development.
● Highly creative, innovative, committed, intellectually curious, business savvy with good communication and
interpersonal skills.
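The data preparation methods listed above (describing data contents, descriptive statistics, merge, melt, and reshape) can be sketched in pandas; the dataset and column names below are purely illustrative and not taken from any client project.

```python
import numpy as np
import pandas as pd

# Hypothetical quarterly figures; column names are illustrative only.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "q1": [100.0, 80.0, np.nan, 90.0],
    "q2": [110.0, 85.0, 95.0, 88.0],
})

# Describe data contents / compute descriptive statistics.
stats = df.describe()

# Melt wide quarterly columns into long form for analysis.
long_df = df.melt(id_vars="region", var_name="quarter", value_name="revenue")

# Subset and merge against a hypothetical lookup table.
lookup = pd.DataFrame({"region": ["North", "South"], "manager": ["A", "B"]})
merged = long_df.merge(lookup, on="region", how="left")

print(stats.loc["mean", "q2"])  # 94.5
print(merged.shape)             # (8, 4)
```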
EDUCATION QUALIFICATION
Post-Graduation – Computer Science, University of East London
Bachelor’s degree – Electronics and Communications Engineering, Jawaharlal Nehru Institute of Technology,
Telangana, India
TECHNICAL SKILLS
Statistical Methods: Distributions, Central Tendency, Dispersion, Random
Variables and Correlation
Database: SQL Server, MongoDB
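The statistical methods listed above (central tendency, dispersion, correlation) can be illustrated with NumPy and SciPy; the sample values below are made up for demonstration.

```python
import numpy as np
from scipy import stats as sps

# Made-up samples for illustration.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
y = 2 * x + np.array([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.0, -0.3])

mean = x.mean()              # central tendency
std = x.std()                # dispersion (population standard deviation)
r, p = sps.pearsonr(x, y)    # correlation between two variables

print(mean, std)  # 5.0 2.0
```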
PROJECTS
PROJECT: PATIENT HEALTH COST CLIENT: EMIS HEALTH Sep 2020 – Present
Project Description: A UK nationwide survey of hospital costs conducted by the UK Agency for Healthcare
consists of hospital records of inpatient samples. The given data is restricted to the city of London and relates to
patients aged 40–60 years. The agency wants to analyze the data to research healthcare costs
and their utilization.
The goals of this project are:
To record patient statistics, the agency wants to find the age category of people who most frequently availed the
facility and account for the maximum expenditure.
To rank diagnoses and treatments by severity and identify the most expensive treatments, the agency wants
to find the diagnosis-related group with the maximum hospitalization and expenditure.
To make sure that there is no malpractice, the agency needs to analyze whether the race of the patient is related
to the hospitalization costs.
To utilize costs properly, the agency has to analyze hospital costs by age and gender for the
proper allocation of resources.
Responsibilities:
● Evaluated business requirements and prepared specifications that follow project
guidelines required to develop written programs.
● Built Data Analysis model using Python SciPy to classify customers into different
target groups.
● Participated in Data Acquisition with Data Engineer team to extract historical and
real-time data by using SQL Server and Excel.
● Performed data enrichment jobs to deal with missing values, normalize data, and
select features.
● Created metadata and a data dictionary for future data use/refresh for the
same client.
● Ran SQL scripts and created indexes and stored procedures for data analysis.
● Extracted data from SQL Server and prepared data for exploratory analysis
● Built models using machine learning classification algorithms such as Random Forest.
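A Random Forest classification model of the kind used in this project can be sketched with Scikit-learn; since the patient records are not public, the example below trains on synthetic data as a stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for patient records (real features are confidential).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a Random Forest classifier and evaluate on the held-out split.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(round(acc, 2))
```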
CLIENT: PROJECT:
Project Description:
● Extensively used Python's multiple data science packages like Pandas, NumPy,
Matplotlib, Seaborn, SciPy, Scikit-learn.
● Built models using techniques like Regression, Time Series forecasting, and
Clustering.
● Worked on data that was a combination of unstructured and structured data from
multiple sources and automated the cleaning using Python scripts.
● Extensively performed large data reads/writes to and from CSV and Excel files using
pandas.
● Worked on the data validation with the help of Univariate, Multivariate analysis.
● Iteratively rebuilt models to deal with changes in data, refining them over
time.
● Created and published multiple dashboards and reports using Power BI.
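A clustering workflow like the one mentioned above can be sketched with Scikit-learn's KMeans; the two-group feature matrix below is synthetic (e.g. standing in for customer spend and visit frequency).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Two synthetic, well-separated groups of 2-D points.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2)),
])

# Standardize features before distance-based clustering.
X_scaled = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X_scaled)
print(len(set(labels)))  # 2
```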