
ABOUT
Self-motivated individual with a solid background in economics, mathematics, and statistics. Good problem solver, quick learner, proactive thinker, cross-cultural communicator, and data enthusiast.
Advanced skills in relational database management, data visualization, descriptive and predictive analysis, modeling with machine learning algorithms, and dashboard development. Proficient in R and SQL, with hands-on experience in Python (learning in progress), Tableau, and other tools.
Experience with big data software and platforms: Hadoop, Hive, Spark, AWS, etc.
MY SKILLS
MACHINE LEARNING
Hands-on experience:
Supervised Learning: OLS, SVM, kNN, Decision Trees, Neural Networks, XGBoost, Naive Bayes, Time Series
Clustering: k-means
NLP: TF-IDF, Bag of Words, Word2Vec, Sentiment Analysis
PROGRAMMING
R: tidyverse (dplyr, ggplot2, lubridate, etc.), forecast, caret, glmnet, wordcloud, RSQLite, RPostgreSQL, etc.
Python: NumPy, pandas, Matplotlib, Seaborn, scikit-learn, Keras, XGBoost, Scrapy, Selenium, Beautiful Soup, urllib, etc.
SQL: Advanced queries (subqueries, window functions)
HTML/CSS, C++, MATLAB/Octave, Git (version control)
EDUCATION
MASTER OF SCIENCE IN BUSINESS ANALYTICS @UCSD
August 2017 - December 2018
Staffing Supporting Tool for Rady Children’s Hospital, Capstone Project 04-07/2018
Incorporate public population data via API and an automated extract/feed; write SQL to query data from the database
Visualize data using Qlik/Tableau to discover patterns and insights; find drivers for demand uncertainties
Develop time-series models and other machine-learning models for predicting patient volumes and analyzing capacities to determine ideal staffing levels for the Emergency Department
Build an R Shiny dashboard and front-end decision-support tool deployable within the organization
Pricing Health and Beauty Products, Mini Case Study 04/2018
Visualize the weekly sales data and discover relationships between variables to explore underlying patterns
Run linear regression to estimate price elasticity, then use the estimate to calculate the optimal monopoly markup price
We’ve got the best Credit Card for You, Customer Analytics 03/2018
Retrieved data from AWS using SQL to collect necessary large datasets for analysis
Built visualizations and ran descriptive analysis to extract patterns and insights, predicted customer churn rate, and calculated customer lifetime value for each credit card combination
Designed partial factorial experiments and used predictive models (logistic regression, naïve Bayes, neural networks) and tree-based models (random forests, boosted trees) to predict the best offer for each target group
Applied the recommended strategy and improved profitability by 39% compared to the pre-existing strategy
Phoenix: Where? Indian Cuisine? Big Data Tech. & Business App 03/2018
Collected, cleaned, and prepared data from the Yelp API and other public datasets using SQL, R, and other tools
Visualized the data to discover hot zones and potential patterns for success; conducted text mining and sentiment analysis to identify common features driving positive customer reviews
Built predictive models to identify features of a successful restaurant based on Yelp star ratings
BACHELOR OF BUSINESS ADMINISTRATION IN MARKETING @UM
September 2013 - June 2017
Courses: Business Programming, Linear Algebra, Statistics, Calculus, Economics, Accounting, Integrated Marketing Communication, Research Methods, B2B Marketing
IELTS: 7.5
GMAT: 750
DATA SCIENCE NANODEGREE @UDACITY
June 2018 - Present