My Expedition
Exabeam Inc.
Jul '23 - Present
Data Engineer
- Developed and automated a workflow for optimizing BigQuery table partitioning and clustering using Google Cloud Composer (Apache Airflow), leading to a 70% improvement in query performance and reduced data storage costs.
- Designed and implemented a scalable FinTech anomaly detection framework (FTDQ) that increased true positive accuracy by 50% and reduced false positives by 55%, saving over 8 engineering hours per week.
- Led migration and adoption of the FinTech data quality framework, moving 100+ jobs off legacy systems and onboarding 70+ anomaly detection checks, facilitating adoption by 8 teams.
Tools: BigQuery, Google Cloud Composer, Apache Airflow, Python, Dataflow, Pub/Sub.
Amazon.com Inc.
Jun '21 - Aug '23
Data Engineer
Designed, implemented, and automated deployment of a distributed system for collecting and processing log events from multiple sources.
Developed and implemented a GDPR-compliant data engineering workflow for EU employee data governance using Airflow, S3, SQS, Glue, Redshift, and Tableau.
Designed and implemented a data pipeline solution that improved data processing speed by ~30% and reduced errors by 83%, resulting in significant cost savings and increased operational efficiency for the organization.
Designed and implemented a scalable, fault-tolerant data warehousing solution on Amazon Redshift, leveraging AWS services for an IQR-based anomaly detection framework to ensure data quality and consistency.
Tools: Python, SQL, Apache Spark, Scala, PySpark, AWS QuickSight, Apache Airflow, AWS DynamoDB, AWS Redshift Spectrum, AWS S3, AWS EMR, AWS Glue.
San Francisco State University (GPA 3.8)
Jan '20 - Dec '21
Master of Science in Business Analytics
Coursework: Data Warehousing, Machine Learning, Statistics, Data Management, Data Visualization.
Leadership: Writing Tutor, mentoring and teaching students how to structure their written work.
Credit Suisse - Wipro Technologies
Jan '19 - Dec '19
Senior Data Analyst
Senior Data Analyst in the Credit Suisse investment banking domain.
In-Scope Products: Investment (all types), Deposits (Savings and DDA), Time (CD), Insurance
Analysis: Churn Prediction, Cohort Analysis, Survival Analysis.
Developed REST APIs from data sources and service connectors to expose aggregated functionality to consumers. Performance-tuned Informatica Cloud mappings and ICRT services using the data access services. Identified and resolved performance bottlenecks at various levels, including sources, targets, mappings, and sessions. Performed unit and system testing to verify that data loads and services returned results within SLA.
IBM Inc.
Sep '17 - Jan '19
Senior Data Analyst
Achieved 92% system automation using automated business processing (ABP) in Informatica ETL. Enhanced insurance policy creation workflows in PCS for IAR, RIA, and brokers, reducing production run time by 7 hours per week.
Analyzed DB2 utilization in production jobs and enhanced SQL queries for impacted jobs. Implemented critical ABP flows for health insurance products in PCS. Handled large data volumes using IBM DB2, created weekly reports of total compensation paid versus the amount that failed, and was responsible for Balancing and Check.
American Express (Amex) - Syntel
Oct '14 - Sep '17
Senior Data Analyst
Analyst programmer in the American Express credit card domain.
Experience in data warehousing, data analysis, data migration, development, ETL, maintenance, unit testing, and documentation.
Data warehouse experience using Informatica PowerCenter 9.x (Source Analyzer, Repository Manager, Mapping Designer, Transformation Designer, Workflow Manager, Monitor) as the ETL tool on DB2, Oracle, and Microsoft SQL Server databases.
Bachelor of Engineering
May '10 - Aug '14
Electronics and Telecommunication
Coursework: Computer Programming, Differential equations, Linear Algebra, Calculus, Microcontrollers, Circuit design
SKILLS
1-Basic 2-Novice 3-Intermediate 4-Advanced 5-Expert
PYTHON
BIG DATA (HADOOP, PRESTO, SPARK)
DATA MINING
DATA VIZ (TABLEAU, LOOKER, POWER BI)
AWS CLOUD
APACHE AIRFLOW
4
DATABASE (SQL, BIGQUERY, POSTGRESQL, HIVE)
5
VERSION CONTROL SYSTEM (GIT & GITHUB)
4.5
MACHINE LEARNING
5
APACHE AIRFLOW
4.5
4
3.5
5
4.5
GOOGLE CLOUD PLATFORM
4.5
4.5
4.5
DATA ARCHITECTURE
Transforming Big Data into Big Results: Meet Amazon's Data Engineer
01
Transactional solution to Apache Spark's overwrite behavior
02
Dynamic data file compaction in Apache Spark using Delta Lake
03
Orchestration using Apache Airflow
04
Real-time data streaming using a Kafka cluster and data transformation using Apache Flink
05
Orchestrate a pipeline using AWS Step Function to perform Consumer Engagement Funnel Analysis & User Acquisition Optimization
06
Orchestrating a Data Engineering workflow to improve Lifetime Value (LTV) efficiency
07
Revolutionizing Headcount Reporting: One-Stop Solution for Metrics Management
Data Visualization
01
Shopping - Customer Behavior Data Visualization
02
Automattic - User Engagement Analysis