Portfolio

Tirth Patel

Data Scientist & ML Engineer
MSBA @ UCLA Anderson

About Me

I grew up fascinated by how systems work — which led me to pursue a B.Tech in Engineering at IIT Madras, one of India's premier technical institutions. There, I built a deep foundation in mathematics, statistics, and algorithms — and developed a taste for turning complex, messy data into decisions that actually matter.

During my undergrad, I joined Seat of Joy — a child safety startup incubated at IIT Madras — as a Business & Strategy Analyst. I built a probabilistic market-sizing model from Indian Census data (100+ tables, 200K+ rows each) that estimated 55M target customers with 5% YoY growth, developed a supply-chain optimization model using operations research principles, led full competitor and pricing analysis across the category, and represented the startup at Shark Tank India Auditions — pitching data-backed market and business strategy to investors.

I'm now pursuing my Master of Science in Business Analytics (MSBA) at UCLA Anderson, deepening my expertise in machine learning, data engineering, and optimization. I'm drawn to problems where rigorous analysis drives real-world impact — from production agentic RAG systems to large-scale data pipelines to deep reinforcement learning.

Experience

Business & Strategy Analyst

Seat of Joy (Incubated at IIT Madras)

Child safety startup developing a full-body protective seat for two-wheelers — addressing the 2 children lost daily in India to two-wheeler accidents.

2022 – 2024

India · During Undergrad

  • Probabilistic Market-Sizing Model (Census Data): Sifted through 100+ Indian Census 2011 tables (200K+ rows each) to extract birth-order frequency matrices and inter-birth age-gap distributions. Built a joint-probability model that combines conditional age-gap probabilities with birth-order likelihoods to estimate, for any target year, how many Indian families have a child aged 3–6. Layered linear regression on historical cohorts to project YoY growth. The resulting estimate of 55M addressable customers with 5% annual growth was 45% more accurate than the startup's prior figures and became the anchoring market-size number in every investor deck.
  • Supply-Chain Optimization Model (Operations Research): Formulated a profit-maximising distribution model in Gurobi / Excel Solver. The objective maximised margin across state-level shipping routes, accounting for manufacturing costs, per-unit shipping rates, and selling price. Added an elastic-net-style penalty to discourage over-concentration in any single state, and used the market-sizing model's state-level demand estimates as allocation caps. The model produces ready-to-execute distribution recommendations that can scale directly into production operations.
  • Competitive Intelligence & Pricing Strategy: Conducted an end-to-end competitive analysis across three child-safety product categories: built detailed SWOT profiles, cold-called manufacturers to source actual production costs, and computed competitor margins from first principles. Used margin benchmarking to derive a defensible pricing band, quantify competitive moat, and inform go-to-market sequencing. All findings fed directly into investor pitch materials and the product launch strategy.
  • Shark Tank India Auditions (Investor Pitch & Presentation): Led a team of 3 to build the business & marketing pitch deck for Shark Tank India Auditions. Synthesised the market-sizing model, supply-chain analysis, competitive intelligence, and pricing strategy into a data-backed narrative covering total addressable market, competitive landscape, unit economics, and launch plan. Presented live to investor judges on behalf of Seat of Joy and fielded quantitative Q&A, with every claim anchored to the models and analyses above.
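
The allocation idea behind the supply-chain model can be sketched in a few lines. This is a simplified illustration, not the original Gurobi formulation: it drops the concentration penalty, assumes a single supply limit, and uses made-up state names and numbers. Under those assumptions the problem is a fractional knapsack, so filling the highest-margin states first is optimal.

```python
def unit_margin(state):
    """Per-unit margin: selling price minus manufacturing and shipping cost."""
    return state["price"] - state["unit_cost"] - state["ship_cost"]


def allocate_units(units_available, states):
    """Fill the highest-margin states first, never exceeding a demand cap."""
    plan = {}
    remaining = units_available
    for state in sorted(states, key=unit_margin, reverse=True):
        if remaining <= 0 or unit_margin(state) <= 0:
            break  # out of stock, or every remaining state loses money
        qty = min(state["demand_cap"], remaining)
        plan[state["name"]] = qty
        remaining -= qty
    return plan


def total_profit(plan, states):
    """Profit of an allocation plan under the same per-unit margins."""
    margins = {s["name"]: unit_margin(s) for s in states}
    return sum(qty * margins[name] for name, qty in plan.items())


# Hypothetical inputs: three states sharing one price and unit cost.
STATES = [
    {"name": "A", "price": 100, "unit_cost": 60, "ship_cost": 5, "demand_cap": 300},
    {"name": "B", "price": 100, "unit_cost": 60, "ship_cost": 20, "demand_cap": 500},
    {"name": "C", "price": 100, "unit_cost": 60, "ship_cost": 50, "demand_cap": 1000},
]

plan = allocate_units(600, STATES)
print(plan, total_profit(plan, STATES))
```

State C is skipped because its shipping cost makes the margin negative. Adding a penalty on concentration makes the objective non-linear, which is where a solver such as Gurobi becomes necessary.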

Projects

A selection of data science and ML engineering work.

Course RAG Pipeline

Production-deployed agentic RAG system for UCLA MSBA students to query course materials — lecture slides, transcripts, and PDFs — using natural language. Live at tirth-courserag.duckdns.org.

LangGraph · FastAPI · ChromaDB · Claude Haiku · OpenAI · Google Drive API · Docker · Python · SQLite · WebSocket

Motivation
The UCLA MSBA program runs 4 simultaneous courses, each with its own slides, transcripts, homework deadlines, and deliverables spread across a shared Google Drive. Students constantly lose time hunting for information manually. I built a fully agentic system that classifies every query, self-verifies deadline answers, supports human-approved file uploads, and can explain exactly which source chunks drove any answer. It is deployed at effectively zero infrastructure cost on Oracle Cloud Free Tier.

Achievements
  • Built a 13-node LangGraph agent with conditional routing across 5 query types: deadline, summary, upload, general Q&A, and source explanation
  • Implemented self-verifying deadline extraction: LLM extracts date → re-queries ChromaDB with rephrased search → cross-references results → surfaces conflicts with confidence indicator
  • Designed human-in-the-loop upload approval using LangGraph interrupt-before + SQLite checkpointer: LLM proposes a Drive folder path, user approves/edits before embedding — preventing vector store pollution
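
The cross-referencing step of the deadline check can be illustrated without any LLM or vector store. The sketch below is an assumption about how such a comparison might look, not the deployed code: `cross_check_deadline` and its confidence labels are hypothetical names, and the dates are placeholders standing in for what the two retrieval passes would return.

```python
from collections import Counter
from datetime import date


def cross_check_deadline(first_pass, second_pass):
    """Compare the first extracted date with dates found by a rephrased re-query.

    first_pass:  the date extracted from the initial retrieval.
    second_pass: a list of dates extracted from the second retrieval.
    """
    votes = Counter(second_pass)
    agreeing = votes[first_pass]  # Counter returns 0 for missing keys
    conflicts = sorted(d for d in votes if d != first_pass)

    if agreeing and not conflicts:
        confidence = "high"    # second pass fully confirms the date
    elif agreeing:
        confidence = "medium"  # confirmed, but other dates also surfaced
    else:
        confidence = "low"     # nothing in the second pass backs it up

    return {"deadline": first_pass, "confidence": confidence, "conflicts": conflicts}


# Placeholder dates for illustration only.
result = cross_check_deadline(
    date(2026, 3, 1),
    [date(2026, 3, 1), date(2026, 3, 8)],
)
print(result["confidence"], result["conflicts"])
```

Returning the conflicting dates alongside the answer lets the interface show the user why confidence was downgraded instead of silently picking one.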

Yelp & Weather Intelligence Pipeline

End-to-end data engineering pipeline correlating weather patterns with Yelp restaurant sentiment using PySpark, Snowflake, Airflow, and Tableau — processing 10M+ records.

PySpark · Snowflake · Airflow · Tableau · VADER NLP · Python

Motivation
Curious whether weather drives restaurant ratings and business patterns, I built a production-grade data pipeline that ingests the full Yelp Academic Dataset and OpenWeatherMap API data, performs distributed ETL at scale, scores review sentiment with NLP, and surfaces insights through an executive Tableau dashboard.

Achievements
  • Discovered the "Cold Weather Sentiment Paradox": Freezing weather drops volume to 101/day but yields the highest average sentiment index (0.71)
  • Identified Extreme Heat as the major deterrent to dining out, dropping review volume to ~30/day with the lowest sentiment (0.65)
  • Found that Rainy/Snowy weather causes a 50.7% drop in volume (143/day vs 290/day) but retains a resilient sentiment index identical to pleasant days (0.69)
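
For reference, the volume drop quoted above follows from the two daily averages. The helper name `pct_drop` is illustrative; the inputs are the figures stated in the bullet.

```python
def pct_drop(baseline, observed):
    """Percentage decrease from a baseline value to an observed value."""
    return (baseline - observed) / baseline * 100


# 290 reviews/day on pleasant days vs 143 reviews/day on rainy or snowy days.
rain_snow_drop = round(pct_drop(290, 143), 1)
print(rain_snow_drop)  # 50.7
```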

DeepCubeA Maltese Gear Cube Solver

Deep reinforcement learning agent that solves the Maltese Gear Cube using a CUDA-accelerated neural heuristic.

PyTorch · CUDA · Deep RL · Python · NumPy

Motivation
To apply the DeepCubeA algorithm to a novel, higher-complexity puzzle and validate whether deep RL can generalize to unseen combinatorial state spaces.

Achievements
  • Trained a value network on 50M+ self-generated cube states using PyTorch + CUDA
  • Implemented batched A* search guided by the learned heuristic to find short solution paths
  • Achieved 100% solve rate on test set within optimal or near-optimal move counts
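
The search loop can be sketched on a toy puzzle. This is a minimal illustration under stated assumptions, not the project's solver: the puzzle is sorting a tuple with adjacent swaps rather than the Maltese Gear Cube, the learned value network is replaced by a hand-written inversion count, and `batched_weighted_astar` is a hypothetical name. In the real setting, popping several nodes per iteration lets the network score all their children in one GPU batch.

```python
import heapq
from itertools import count


def batched_weighted_astar(start, is_goal, neighbors, heuristic,
                           batch_size=4, weight=1.0):
    """Best-first search that pops up to batch_size nodes per iteration."""
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(heuristic(start), next(tie), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        n = min(batch_size, len(frontier))
        batch = [heapq.heappop(frontier) for _ in range(n)]
        for _, _, _, state, path in batch:
            if is_goal(state):
                return path
        for _, _, cost, state, path in batch:
            for nxt in neighbors(state):
                new_cost = cost + 1  # every move costs 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    priority = weight * new_cost + heuristic(nxt)
                    heapq.heappush(
                        frontier, (priority, next(tie), new_cost, nxt, path + [nxt])
                    )
    return None


def adjacent_swaps(state):
    """All states reachable by swapping one pair of neighbouring elements."""
    for i in range(len(state) - 1):
        swapped = list(state)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield tuple(swapped)


def inversions(state):
    """Stand-in heuristic: number of out-of-order pairs."""
    return sum(
        1
        for i in range(len(state))
        for j in range(i + 1, len(state))
        if state[i] > state[j]
    )


path = batched_weighted_astar(
    (3, 1, 2), lambda s: s == (1, 2, 3), adjacent_swaps, inversions
)
print(path)
```

Because a whole batch is expanded before the frontier is re-sorted, this style of search trades a strict optimality guarantee for throughput, which is consistent with reporting optimal or near-optimal move counts.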

COVID-19 Impact Analysis

Interactive Tableau visualization analyzing COVID-19's impact on India — tracking vaccination rollout, weekly case surges, and death rate trajectories with demographic context.

Tableau · Data Visualization · SQL

Motivation
To quantify the pandemic's real-world effects on India through vaccination rates, case trajectories, and death rates, contextualised by demographic indicators like HDI, median age, and population density, and to surface actionable patterns through interactive Tableau dashboards built entirely from joined public datasets.

Achievements
  • Visualized the impact of vaccination on death rates: after the vaccination rollout, the death rate declined sharply even as the number of cases rose sharply
  • Captured India's devastating second wave: weekly new cases peaked at 2.2M+ around May 2021
  • Tracked vaccination milestone of 140M+ total people vaccinated with clear inflection points

E-Commerce Seller & Logistics Analytics

Multi-dashboard Tableau analytics dissecting seller performance, shipping logistics, and product characteristics across 89K+ e-commerce orders — revealing revenue concentration, delivery patterns, and cost structures.

Tableau · Data Visualization · SQL

Motivation
To provide a comprehensive analytical view of e-commerce operations by building interconnected Tableau dashboards that dissect seller performance, shipping efficiency, and product characteristics, enabling data-driven decisions on seller management, logistics optimization, and pricing strategy.

Achievements
  • Built 3 interconnected dashboards covering seller revenue, shipping logistics, and product analysis across 89,316 orders
  • Identified extreme revenue concentration — top sellers drive the majority of $2.44M total revenue with average order value of $340.9
  • Discovered shipping cost disparity — median shipping cost is 22.18% of price vs. average of 62.91%, revealing heavy-product outliers inflating costs
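
The gap between the median and the average shipping ratio is the usual signature of a few extreme outliers. The toy orders below are invented purely to show the effect and are not drawn from the dashboard data.

```python
from statistics import mean, median

# Hypothetical (price, shipping_cost) pairs; the last one is a cheap, heavy item.
orders = [(100, 20), (50, 11), (200, 50), (10, 50)]

# Shipping cost as a fraction of item price for each order.
ratios = [shipping / price for price, shipping in orders]

median_ratio = median(ratios)  # barely moved by the single outlier
mean_ratio = mean(ratios)      # pulled far upward by the 500% outlier

print(f"median {median_ratio:.1%}, mean {mean_ratio:.1%}")
```

One order whose shipping costs five times its price is enough to push the average well above the median, which is why the median is the better summary of a typical order here.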

Skills

The tools and technologies I work with.

Languages

Python · SQL · R · JavaScript · TypeScript

ML / AI

PyTorch · scikit-learn · LangGraph · Hugging Face · LLMs & RAG

Data Engineering

PySpark · Snowflake · Airflow · FastAPI · ChromaDB

Analytics & Visualization

Tableau · Pandas · NumPy · Matplotlib · VADER NLP

Optimization & Modeling

Gurobi · Linear Regression · Probability & Statistics · Operations Research · Mathematical Modeling

Tools

Git · Docker · Next.js · Oracle Cloud · Jupyter

Get In Touch

Whether you're recruiting, collaborating, or just want to talk data — my inbox is always open.

© 2026 Tirth Patel. Built with Next.js & Tailwind CSS.