I combine a background in program management with hands-on data analytics skills — using Python, SQL, R, and Tableau to explore complex datasets, build visualizations, and generate insights. My projects range from global inequality analysis to predictive modeling, reflecting both my curiosity about data and my commitment to work that advances social impact.
Using World Bank and World Inequality Database data, this project investigates the structural drivers of global wealth inequality between 2020 and 2024. By distinguishing elite asset concentration in emerging markets from household indebtedness in financialized economies, it challenges the assumption that economic growth alone reduces exclusion. The findings are designed to help policymakers and program teams understand why standard growth metrics can mask deepening inequality on the ground, supporting more targeted intervention design, resource allocation, and evidence-based development financing strategies.
This interactive Tableau dashboard makes complex U.S. cancer incidence, mortality, and survival data accessible to researchers, program staff, and policymakers alike. By mapping mortality-to-incidence ratios across states, cancer types, and demographic groups, it surfaces systemic healthcare disparities that persist despite national progress. Built to serve both technical and non-technical audiences, the dashboard models how data visualization can drive equity-focused health advocacy and help organizations identify underserved populations for more effective resource prioritization.
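For readers unfamiliar with the metric, a mortality-to-incidence ratio is the number of deaths divided by the number of new cases for the same population and period; a higher value suggests worse outcomes relative to how often the disease occurs. The sketch below is a minimal Python illustration only: the helper name and the regional counts are made up for the example and do not come from the dashboard or its data.

```python
def mortality_to_incidence_ratio(deaths: float, new_cases: float) -> float:
    """Return deaths divided by new cases for the same population and period."""
    if new_cases <= 0:
        raise ValueError("new_cases must be positive")
    return deaths / new_cases


# Hypothetical counts for two made-up regions (not taken from the dashboard).
regions = {
    "Region A": {"deaths": 180, "new_cases": 600},
    "Region B": {"deaths": 150, "new_cases": 300},
}

# Compute the ratio per region, then find the region with the highest ratio.
mir_by_region = {
    name: mortality_to_incidence_ratio(counts["deaths"], counts["new_cases"])
    for name, counts in regions.items()
}
highest = max(mir_by_region, key=mir_by_region.get)
```

In this toy example Region B has fewer deaths in absolute terms but a higher ratio, which is the kind of disparity a raw mortality count alone would hide.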
This project analyzes 550 Amazon bestselling books (2009–2019) using SQL to identify trends in ratings, pricing, genre shifts, and author performance. It highlights what drives commercial success in the bestseller market, including the structural shift toward Non-Fiction and volume-driven growth.
Image by Peter Mooney via Flickr (CC BY-SA 2.0)
This project uses Olympic marathon data to investigate whether high-altitude environments are a causal driver of elite endurance running performance. It combines between-country and within-country analyses to distinguish the physiological effects of altitude from cultural and socioeconomic factors.
This project seeks to predict both heart disease and diabetes, filling a gap left by existing models that focus on a single disease. Using 2022 CDC survey data from more than 400,000 adults, it uncovers insights on physical status, regional health disparities, and correlations with chronic diseases. The project also includes a user-friendly website for generating predictions.
This project aims to identify the distinct vocalizations my cat makes, with a particular focus on meow sounds. After audio feature extraction and PCA, I applied three clustering methods for unsupervised learning. A final voting step identified 11 distinct sounds on which all three methods agreed.
This project builds an image classifier that distinguishes images of cats from various breeds from personal photos of my cat, Mellow. Because data was limited, I used transfer learning to boost performance; the model achieved 100% training accuracy and 95% validation accuracy.
This project conducts Exploratory Data Analysis (EDA) on animal shelter data to build a binary classification model that predicts cat adoption. By identifying the factors that influence adoption, it aims to improve care and support for animals. The final voting classifier achieved 77% accuracy and an 81% F1 score for adoption prediction.
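An F1 score can sit above accuracy when the positive class (here, adopted) is the majority and the model finds most of it. The sketch below is a minimal Python illustration of how both metrics follow from confusion-matrix counts. The function name and the counts are invented for the example; they are not the project's actual confusion matrix.

```python
def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Compute accuracy and F1 for the positive class from confusion counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts (assumption): 60 true positives, 20 false positives,
# 10 false negatives, 10 true negatives.
accuracy, f1 = accuracy_and_f1(tp=60, fp=20, fn=10, tn=10)
```

With these made-up counts the accuracy is 0.70 while F1 is 0.80, because F1 ignores true negatives and rewards recovering most of the positive class.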