Sangam Patil


About

My Introduction

Data Scientist & Analyst, AI-ML Developer & Researcher, AI Enthusiast


Hey, I'm Sangam!!

I didn't get into data science because it was trending. I got into it because I genuinely enjoy the puzzle: taking a messy dataset and finding something in it that actually changes a decision.

I'm a Data Scientist and ML Developer wrapping up my Master's in Data Science at UMass Amherst (3.9 GPA). My background sits at a crossroads. On one side, I build and deploy machine learning models, run statistical experiments, and dig through large-scale data. On the other, I co-founded XResilient, a B2B data analytics venture where I worked with 10+ clients, built dashboards, defined KPIs, and helped drive 150% revenue growth through data-driven decision-making.

That founder experience changed how I think about data work. It's not enough to build a great model; it has to solve something real, communicate clearly to stakeholders, and actually get used. That's the bar I hold myself to now.

On the technical side, I've worked across a pretty wide stack: time-series forecasting, LLM fine-tuning, NLP pipelines, geospatial analysis, and behavioral experiments with real survey data. Python, R, and SQL are my daily drivers, and I'm always chasing the next interesting problem.

Welcome to my portfolio! Feel free to look around at my recent projects!

Education

My Education to Date

Master of Science in Data Science & Analytics

University of Massachusetts Amherst, MA

Sep. 2024 - Present

Subjects Taken: Advanced Natural Language Processing | Advanced Machine Learning | Data Engineering | Data Science with R | Python for Machine Learning | Exploratory Data Analytics | Spatial Data Analysis | Introduction to Quantitative Analysis | Advanced Data-Driven Storytelling

Bachelor of Technology in Artificial Intelligence & Data Science

Vishwakarma Institute of Technology, Pune, MH

Sep. 2021 - May 2024

Subjects Taken: Data Science | Data Analytics | Machine Learning | Deep Learning | Artificial Intelligence | Natural Language Processing | Statistical Inference | Algorithm Complexity | Business Intelligence and Analytics | Operating Systems | Database Management | Introduction to Python | Causal Inference

Diploma in Computer Science & Engineering

Sharad Institute of Technology, Ichalkaranji, MH

Sep. 2018 - May 2021

Subjects Taken: Programming in C | Data Structures | Object-Oriented Programming | DBMS | Java Programming | Software Engineering | Software Testing | Operating Systems

Professional Experience

My Professional Experience to Date

Graduate Research & Teaching Assistant

University of Massachusetts Amherst, MA

Feb. 2025 - Present
  • Assisted in course delivery by leading teaching and tutoring sessions on LLM architectures, statistical modeling, and research design for 200+ students, bridging the gap between mathematical theory and practical Python implementation.
  • Improved student learning outcomes by providing targeted technical support in machine learning pipelines and quantitative methods, reinforcing data science ethics and fundamentals.
  • Conducted statistical analysis on social media and political engagement data using regression models and hypothesis testing in R.
  • Designed and analyzed survey-based experiments (N = 500+) to evaluate behavioral patterns and infer causal relationships.
  • Performed data preprocessing and exploratory data analysis to extract meaningful insights from both datasets, and created visualizations and reports to communicate findings to academic stakeholders.

Data Analyst - Deployment Team

Cartell, Shippensburg, PA

Jan. 2025 - April 2025
  • Architected and deployed an end-to-end AI orchestration pipeline in n8n, integrating PostgreSQL, AWS (S3, EC2), and Kubernetes to replace a legacy software system with a scalable, production-grade workflow.
  • Engineered LLM-powered detection features, including theft detection, anomaly flagging, and real-time risk assessment, via OpenAI's GPT-4o, reducing reliance on manual rule-based logic.
  • Integrated Twilio and SendGrid APIs for automated real-time alerting (SMS and email) on threat events, and the Slack API for internal incident notification pipelines.
  • Coordinated cross-functional deployment across cloud infrastructure (AWS), container orchestration (Kubernetes), and workflow automation (n8n) using Python and JavaScript throughout the stack.
  • Built data ingestion and quality validation pipelines in Python, resolving corrupted files, metadata failures, and schema inconsistencies across a large-scale image dataset to ensure high-fidelity, model-ready inputs for downstream training pipelines.
  • Analyzed class distributions and statistical feature profiles across the ingestion dataset using Pandas and Matplotlib, surfacing labeling gaps and imbalances that informed the data resampling strategy and upstream collection decisions.

Co-Founder

XResilient - A Venture by SXT IT Solution, Pune, MH

Nov. 2023 - Present
  • Co-founded a services company; forged B2B partnerships and launched AI-driven product lines that delivered a 150% revenue increase within 8 months, scaling to 10+ enterprise clients.
  • Collaborated with stakeholders to translate business requirements into analytical reports and visualizations.
  • Performed exploratory data analysis and feature engineering on client datasets to identify trends and improve model performance.

E2E Hotel Management System for Deccan Pavilion Restaurant Chain

  • Architected and shipped a full-stack management platform from zero to production for a multi-location restaurant chain, owning both backend and UI end-to-end under tight client deadlines.
  • Built scalable RESTful APIs in Flask for real-time inventory syncing, automated billing, and reservation routing, reducing manual front-of-house overhead significantly.
  • Designed an interactive MATLAB-based UI leveraging advanced data structures and file processing to visualize live restaurant operations, enabling non-technical staff to monitor real-time data without engineering support.
  • Produced customer-facing operational graphs and charts using visualization tools, customized to stakeholder needs and business workflow requirements.

Data Scientist

YBI Foundation, Pune, MH

July 2023 - Sep. 2023
  • Automated data preprocessing pipelines for large datasets, reducing anomalies by 20% to ensure high-fidelity inputs for production ML systems.
  • Managed the end-to-end model lifecycle for predictive models (XGBoost, Random Forest), optimizing performance and deployment through rigorous hyperparameter tuning.
  • Defined key performance indicators (KPIs) and built dashboards to track business performance and support decision-making.

Data Scientist

Exposys Data Labs, Pune, MH

April 2023 - June 2023
  • Architected a scalable ML solution in Python, extracting complex features from 2B+ records and delivering Matplotlib visualizations of seasonal, inventory, and market price trends as customer-facing analytical reports to stakeholders.
  • Analyzed a dataset of 2B+ records using Python and SQL to identify seasonal trends, pricing patterns, and inventory behavior.
  • Conducted exploratory data analysis to uncover seasonal demand fluctuations and market behavior.
  • Developed visual reports using Matplotlib to communicate findings to stakeholders and support decision-making.
  • Built time-series forecasting models (ARIMA, ensemble methods) to improve demand prediction accuracy by 14%.

Machine Learning Engineer

RacksonsIT Developers Pvt. Ltd., Pune, MH

Oct. 2022 - March 2023
  • Operationalized Modern Portfolio Theory (MPT) algorithms to generate automated, highly diversified asset recommendations mathematically optimized for user risk tolerance and specific investment horizons.
  • Engineered a real-time investment recommendation system using Node.js and AWS, integrating APIs to deliver dynamic, scalable portfolio allocation.
  • Processed financial datasets of 250M+ records using NumPy and Pandas; developed data visualizations communicating mathematically optimized asset recommendations and risk tolerance strategies to stakeholders.
  • Collaborated with the data engineering team to build data pipelines and integrate APIs, enabling real-time portfolio analysis.

Data Scientist

RacksonsIT Developers Pvt. Ltd., Pune, MH

June 2020 - Aug. 2020
  • Conducted end-to-end sentiment analysis on real-world text data using NLP techniques, including text preprocessing, feature extraction (TF-IDF/word embeddings), and classification models to derive customer opinion insights.
  • Engineered interactive Tableau dashboards with graphs and charts tailored to customer needs, transforming complex data trends into actionable insights for external stakeholders, directly paralleling phone receiver data visualization requirements.
  • Built and evaluated machine learning models for fraud detection, applying techniques such as logistic regression and ensemble methods to identify anomalous patterns and reduce false positive rates on imbalanced datasets.

Projects & Publications

My Most Recent Projects & Publications

Domain-Alpaca: Instruction-Tuned Adaptation of Large Language Models

Parameter-Efficient Fine-Tuning | LoRA/QLoRA | Hugging Face Transformers | PEFT | Python | Tensor

Graduate Project with Live Deployment

BufferZone: Gas Station Relocation Analysis

GIS Data Analysis | ETL Pipeline | Tableau Visualizations | Python | SQL | Vector Database

Personal Graduate Project

SuperPredict: Superconductor Critical Temperature Estimation

Machine Learning Pipeline | Principal Component Analysis | ExtraTrees ensemble | Python | NumPy | Pandas | Data Visualizations

Personal Graduate Project

The Tip-Off: How Gender Influences Tipping Behavior in Restaurants

Factorial Survey Experiment | t-tests | Statistical Modeling | R | Data Preprocessing | Data Analysis | Visualizations

Personal Graduate Project

Smart Contract Vulnerability Detection using Natural Language Processing

Code Analysis | Semantic Analysis | Natural Language Processing | Deep Learning

Published in IEEE Xplore

Wildfire Smoke Detection using Faster R-CNN

Faster R-CNN | GIS Analysis | Satellite Datasets | PyTorch

Published in Springer

Face Emotion Detection using OpenCV

OpenCV | Real-Time | Computer Vision | ETL pipeline in Python

Published in Springer

Investigating the Impact of Digital Detoxification on Mental and Physical Well-Being

Statistical Modeling | R | Data Preprocessing | Data Analysis | Visualizations

Personal Graduate Project

Real-Time Financial News Analytics Pipeline with Advanced ETL

Financial News Data | Data Analysis | ETL Pipeline | News Sentiment | Python | SQL

Personal Graduate Project

Crypto-Trace: Tracking Cryptocurrency Transactions with Blockchain Technology

Data Analytics | Machine Learning | Blockchain | Tracking

Published in Book - Advances in Networks, Intelligence and Computing

Blockchain Based Land Record System

Blockchain | Web3 | Integrity

Published in ITM Web of Conferences

Powering Progress: A Hospital’s Journey towards Renewable Energy

Techno-Economic Analysis | Time-of-Day tariff scheduling | Renewable | Case Study

Published in E3S Web of Conferences