Himani Gadve
Apr 12, 2023 · 10 min read
Data Engineer - Read about the project here
Developed and automated a workflow for optimizing BigQuery table partitioning and clustering with Google Cloud Composer (Apache Airflow), improving query performance by 70% and reducing data storage costs (a minimal sketch of this pattern follows the tools list below).
Designed and implemented a scalable FinTech anomaly detection framework (FTDQ) that increased true-positive accuracy by 50%, reduced false positives by 55%, and saved over 8 engineering hours per week (an illustrative anomaly check also appears below the tools list).
Led migration and adoption of the framework, transitioning 100+ jobs from legacy systems, onboarding 70+ anomaly detection checks, and driving adoption across 8 teams.
Tools: BigQuery, Google Cloud Composer, Apache Airflow, Python, Dataflow, Pub/Sub.
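To make the partitioning and clustering workflow above concrete, here is a minimal sketch of the pattern, assuming a daily Composer/Airflow DAG that rebuilds a table with BigQueryInsertJobOperator; the project, dataset, table, and column names are placeholders, not the actual production job.

```python
# Illustrative only: a minimal Airflow DAG that rebuilds a BigQuery table with
# partitioning and clustering. All identifiers below are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

PROJECT = "my-project"    # hypothetical project
DATASET = "analytics"     # hypothetical dataset
TABLE = "events"          # hypothetical table

# CREATE OR REPLACE TABLE ... PARTITION BY ... CLUSTER BY ... AS SELECT rewrites the
# table so queries that filter on the partition column and cluster keys scan less data.
OPTIMIZE_SQL = f"""
CREATE OR REPLACE TABLE `{PROJECT}.{DATASET}.{TABLE}_optimized`
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_name
AS SELECT * FROM `{PROJECT}.{DATASET}.{TABLE}`
"""

with DAG(
    dag_id="bq_partition_cluster_optimizer",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",   # re-run the optimization once a day
    catchup=False,
) as dag:
    optimize_table = BigQueryInsertJobOperator(
        task_id="rebuild_partitioned_clustered_table",
        configuration={"query": {"query": OPTIMIZE_SQL, "useLegacySql": False}},
    )
```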
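The FTDQ framework itself is internal, so the following is only an illustration of the kind of threshold-based check such a framework might run on a daily metric, assuming a rolling mean and standard deviation rule in pandas; the function name, window, and threshold are assumptions, not taken from the actual system.

```python
# Illustrative only: a generic rolling z-score anomaly check, not the actual FTDQ code.
import pandas as pd


def flag_anomalies(daily_metric: pd.Series, window: int = 28, z_threshold: float = 3.0) -> pd.Series:
    """Flag days where the metric deviates more than z_threshold standard
    deviations from its trailing rolling mean."""
    rolling_mean = daily_metric.rolling(window, min_periods=window).mean()
    rolling_std = daily_metric.rolling(window, min_periods=window).std()
    z_scores = (daily_metric - rolling_mean) / rolling_std
    return z_scores.abs() > z_threshold


# Example usage with synthetic data: the sudden drop on the last day is flagged.
if __name__ == "__main__":
    values = [100] * 30 + [40]
    metric = pd.Series(values, index=pd.date_range("2023-01-01", periods=31))
    print(flag_anomalies(metric).tail())
```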
Data Engineer - Read about the project here
Designed, implemented, and automated deployment of a distributed system for collecting and processing log events from multiple sources.
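The tooling for this role is not listed above, so the sketch below assumes a Pub/Sub-style log pipeline like the one in the previous role; the project and subscription names are placeholders, and downstream processing is reduced to a print.

```python
# Illustrative only: a minimal Pub/Sub subscriber that consumes log events;
# project and subscription names are placeholders.
import json

from google.cloud import pubsub_v1


def handle_message(message: pubsub_v1.subscriber.message.Message) -> None:
    """Parse one log event and acknowledge it; failed parses are nacked for retry."""
    try:
        event = json.loads(message.data.decode("utf-8"))
        # Real processing (enrichment, routing, loading) would go here.
        print(event.get("severity"), event.get("source"), event.get("message"))
        message.ack()
    except ValueError:
        message.nack()


if __name__ == "__main__":
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path("my-project", "log-events-sub")
    # result() blocks while messages stream in; cancel() stops consumption cleanly.
    streaming_pull_future = subscriber.subscribe(subscription_path, callback=handle_message)
    try:
        streaming_pull_future.result()
    except KeyboardInterrupt:
        streaming_pull_future.cancel()
```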
Master of Science in Business Analytics
San Francisco State University (GPA 3.8)
Coursework: Data Warehousing, Machine Learning, Statistics, Data Management, Data Visualization.
Leadership: Writing Tutor, mentoring students on structuring and developing their written work.
Senior Data Analyst
Senior Data Analyst
Senior Data Analyst
Electronics and Telecommunication
Coursework: Computer Programming, Differential Equations, Linear Algebra, Calculus, Microcontrollers, Circuit Design
PYTHON
BIG DATA (HADOOP, PRESTO, SPARK)
DATA MINING
DATA VIZ (TABLEAU, LOOKER, POWER BI)
AWS CLOUD
APACHE AIRFLOW
DATABASES (SQL, BIGQUERY, POSTGRESQL, HIVE)
VERSION CONTROL SYSTEM (GIT & GITHUB)
MACHINE LEARNING
GOOGLE CLOUD PLATFORM
DATA ARCHITECTURE
Transforming Big Data into Big Results: Meet Amazon's Data Engineer
01. Transactional solution to Apache Spark's overwrite behavior
02. Dynamic data file compaction in Apache Spark using Delta Lake (see the sketch after this list)
03. Orchestration using Apache Airflow
04. Real-time data streaming using a Kafka cluster and data transformation using Apache Flink
05. Orchestrating a pipeline using AWS Step Functions to perform Consumer Engagement Funnel Analysis & User Acquisition Optimization
06. Orchestrating a data engineering workflow to improve Lifetime Value (LTV) efficiency
07. Revolutionizing Headcount Reporting: A One-Stop Solution for Metrics Management
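As referenced from item 02, here is a minimal sketch of Delta Lake small-file compaction using the delta-spark Python API (Delta Lake 2.0+); the table path and session configuration are illustrative assumptions, not the project's actual setup.

```python
# Illustrative only: compacting small files in a Delta table with the delta-spark
# Python API; the table path is a placeholder.
from delta import configure_spark_with_delta_pip
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

builder = (
    SparkSession.builder.appName("delta-compaction")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# OPTIMIZE rewrites many small files into fewer large ones without changing the data,
# which speeds up subsequent reads.
delta_table = DeltaTable.forPath(spark, "/data/events_delta")
delta_table.optimize().executeCompaction()
```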
01. Shopping - Customer Behavior Data Visualization
02. Automattic - User Engagement Analysis