Portfolio

Tirth Patel

Data Scientist & ML Engineer
MSBA @ UCLA Anderson

About Me

I grew up fascinated by how systems work — which led me to pursue a B.Tech in Engineering at IIT Madras, one of India's premier technical institutions. There, I built a deep foundation in mathematics, statistics, and algorithms — and developed a taste for turning complex, messy data into decisions that actually matter.

During my undergrad, I joined Seat of Joy — a child safety startup incubated at IIT Madras — as a Business & Strategy Analyst. I built a probabilistic market-sizing model from Indian Census data (100+ tables, 200K+ rows each) that estimated 55M target customers with 5% YoY growth, developed a supply-chain optimization model using operations research principles, led full competitor and pricing analysis across the category, and represented the startup at Shark Tank India Auditions — pitching data-backed market and business strategy to investors.

I'm now pursuing my Master of Science in Business Analytics (MSBA) at UCLA Anderson, deepening my expertise in machine learning, data engineering, and optimization. I'm drawn to problems where rigorous analysis drives real-world impact — from production agentic RAG systems to large-scale data pipelines to deep reinforcement learning.

Experience

Business & Strategy Analyst

Seat of Joy (Incubated at IIT Madras)

Child safety startup developing a full-body protective seat for two-wheelers — addressing the 2 children lost daily in India to two-wheeler accidents.

2022 – 2024

India · During Undergrad

  • Probabilistic Market-Sizing Model (Census Data): Sifted through 100+ Indian Census 2011 tables (200K+ rows each) to extract birth-order frequency matrices and inter-birth age-gap distributions. Built a joint-probability model that combined conditional age-gap probabilities with birth-order likelihoods to estimate, for any target year, how many Indian families have a child aged 3–6. Layered linear regression on historical cohorts to project YoY growth. Delivered an estimate of 55M addressable customers with 5% annual growth, 45% more accurate than the startup's prior figures. That estimate became the anchoring market-size number in every investor deck.
  • Supply-Chain Optimization Model (Operations Research): Formulated a profit-maximising distribution model in Gurobi / Excel Solver. The objective maximised margin across state-level shipping routes, accounting for manufacturing costs, per-unit shipping rates, and selling price. Added an elastic-net-style penalty to discourage over-concentration in any single state, and used the market-sizing model's state-level demand estimates as allocation caps. The model produces ready-to-execute distribution recommendations that can scale directly into production operations.
  • Competitive Intelligence & Pricing Strategy: Conducted an end-to-end competitive analysis across three child-safety product categories: built detailed SWOT profiles, cold-called manufacturers to source actual production costs, and computed competitor margins from first principles. Used margin benchmarking to derive a defensible pricing band, quantify competitive moat, and inform go-to-market sequencing. All findings fed directly into investor pitch materials and the product launch strategy.
  • Shark Tank India Auditions (Investor Pitch & Presentation): Led a team of 3 to build the business & marketing pitch deck for Shark Tank India Auditions. Synthesised the market-sizing model, supply-chain analysis, competitive intelligence, and pricing strategy into a data-backed narrative covering total addressable market, competitive landscape, unit economics, and launch plan. Represented Seat of Joy at the auditions, presenting live to investor judges and fielding quantitative Q&A with every claim anchored to the models and analyses above.
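
As a rough illustration of the supply-chain formulation described above, here is a minimal sketch in plain Python. It is not the Gurobi / Excel Solver model: it omits the concentration penalty and swaps in a greedy allocation, which is sufficient only for this simplified linear case (one shared supply limit plus per-state demand caps). The function name, state labels, and all numbers are hypothetical.

```python
def allocate(price, unit_cost, ship_rate, demand_cap, supply):
    """Greedy plan for: max sum(margin[s] * x[s])
    subject to 0 <= x[s] <= demand_cap[s] and sum(x) <= supply.
    Simplification: no concentration penalty (assumption for this sketch)."""
    # Per-unit margin on each state-level route.
    margin = {s: price - unit_cost - ship_rate[s] for s in ship_rate}
    plan = {s: 0 for s in ship_rate}
    remaining = supply
    # Fill the most profitable routes first; skip routes that lose money.
    for s in sorted(margin, key=margin.get, reverse=True):
        if remaining <= 0 or margin[s] <= 0:
            break
        plan[s] = min(demand_cap[s], remaining)
        remaining -= plan[s]
    profit = sum(margin[s] * plan[s] for s in plan)
    return plan, profit


# Toy inputs (hypothetical): three states, 600 units available.
plan, profit = allocate(
    price=100,
    unit_cost=60,
    ship_rate={"A": 5, "B": 15, "C": 45},
    demand_cap={"A": 300, "B": 500, "C": 400},
    supply=600,
)
print(plan, profit)
```

Once a penalty on per-state share is added, the objective is no longer linear in this simple way, which is where a solver such as Gurobi becomes the natural tool.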

Projects

A selection of data science and ML engineering work.

Course RAG Pipeline

Production-deployed agentic RAG system for UCLA MSBA students to query course materials — lecture slides, transcripts, and PDFs — using natural language. Live at tirth-courserag.duckdns.org.

LangGraph · FastAPI · ChromaDB · Claude Haiku · OpenAI · Google Drive API · Docker · Python · SQLite · WebSocket

Motivation
The UCLA MSBA program runs 4 simultaneous courses, each with its own slides, transcripts, homework deadlines, and deliverables spread across a shared Google Drive. Students constantly lose time hunting for information manually. I built a fully agentic system that classifies every query, self-verifies deadline answers, supports human-approved file uploads, and can explain exactly which source chunks drove any answer. It is deployed at effectively zero infrastructure cost on Oracle Cloud Free Tier.

Achievements
  • Built a 13-node LangGraph agent with conditional routing across 5 query types: deadline, summary, upload, general Q&A, and source explanation
  • Implemented self-verifying deadline extraction: LLM extracts date → re-queries ChromaDB with rephrased search → cross-references results → surfaces conflicts with confidence indicator
  • Designed human-in-the-loop upload approval using LangGraph interrupt-before + SQLite checkpointer: LLM proposes a Drive folder path, user approves/edits before embedding — preventing vector store pollution
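
The cross-referencing step of the self-verifying deadline check above can be sketched as follows. This is a simplified illustration, not the deployed node: the LLM extraction and the ChromaDB re-query are assumed to have already produced dates, and the function name, confidence labels, and thresholds are hypothetical.

```python
from collections import Counter
from datetime import date


def verify_deadline(extracted: date, cross_check: list[date]) -> dict:
    """Compare a first-pass deadline against dates found by a rephrased re-query."""
    votes = Counter(cross_check)
    agreeing = votes[extracted]  # Counter returns 0 for a missing key
    conflicts = sorted(d for d in votes if d != extracted)

    if not cross_check:
        confidence = "unverified"  # the second search surfaced no dates
    elif not conflicts:
        confidence = "high"        # every re-queried date matches
    elif agreeing * 2 >= len(cross_check):
        confidence = "medium"      # at least half of the dates match
    else:
        confidence = "low"         # most re-queried dates disagree

    return {"deadline": extracted, "confidence": confidence, "conflicts": conflicts}


# Hypothetical example: one matching date and two conflicting ones.
result = verify_deadline(
    date(2026, 3, 1),
    [date(2026, 3, 1), date(2026, 3, 8), date(2026, 3, 8)],
)
print(result["confidence"], result["conflicts"])
```

Returning the conflicting dates alongside the confidence label lets the interface show the user why an answer is uncertain instead of silently picking one date.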

Yelp & Weather Intelligence Pipeline

End-to-end data engineering pipeline correlating weather patterns with Yelp restaurant sentiment using PySpark, Snowflake, Airflow, and Tableau — processing 10M+ records.

PySpark · Snowflake · Airflow · Tableau · VADER NLP · Python

Motivation
Curious whether weather drives restaurant ratings and business patterns, I built a production-grade data pipeline that ingests the full Yelp Academic Dataset and OpenWeatherMap API data, performs distributed ETL at scale, scores review sentiment with NLP, and surfaces insights through an executive Tableau dashboard.

Achievements
  • Discovered the "Cold Weather Sentiment Paradox": Freezing weather drops volume to 101/day but yields the highest average sentiment index (0.71)
  • Identified Extreme Heat as the major deterrent to dining out, dropping review volume to ~30/day with the lowest sentiment (0.65)
  • Found that Rainy/Snowy weather causes a 50.7% drop in volume (143/day vs 290/day) but retains a resilient sentiment index identical to pleasant days (0.69)
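
For reference, the 50.7% figure above is a plain relative drop between the two daily volumes quoted. A minimal sketch of that arithmetic (the helper name is hypothetical):

```python
def pct_drop(baseline: float, observed: float) -> float:
    """Relative drop from baseline to observed, as a percentage rounded to 1 decimal."""
    return round(100 * (baseline - observed) / baseline, 1)


# Daily review volumes quoted above: 290/day baseline vs 143/day in rain or snow.
print(pct_drop(290, 143))  # 50.7
```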

DeepCubeA Maltese Gear Cube Solver

Deep reinforcement learning agent that solves the Maltese Gear Cube using a CUDA-accelerated neural heuristic.

PyTorch · CUDA · Deep RL · Python · NumPy

Motivation
To apply the DeepCubeA algorithm to a novel, higher-complexity puzzle and validate whether deep RL can generalize to unseen combinatorial state spaces.

Achievements
  • Trained a value network on 50M+ self-generated cube states using PyTorch + CUDA
  • Implemented batched A* search guided by the learned heuristic, finding optimal or near-optimal solutions
  • Achieved 100% solve rate on test set within optimal or near-optimal move counts
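
The batched search idea above can be sketched on a toy problem. This is an illustrative sketch, not the project's solver: the puzzle is a 12-position ring rather than the Maltese Gear Cube, the heuristic is a hand-written stand-in for the trained value network, and all names are hypothetical. Each iteration pops up to `batch_size` nodes and scores all their children in one heuristic call, which is what makes GPU evaluation efficient; because of that batching, the returned path is not guaranteed to be the shortest in general.

```python
import heapq
from math import inf


def batched_astar(start, is_goal, neighbors, heuristic_batch, batch_size=4, weight=1.0):
    """Batched weighted A*: priority = weight * path_cost + heuristic. Returns a path or None."""
    g = {start: 0}            # best known path cost per state
    parent = {start: None}    # back-pointers for path reconstruction
    tie = 0                   # tie-breaker so the heap never compares states
    frontier = [(heuristic_batch([start])[0], tie, start)]

    while frontier:
        batch = []
        while frontier and len(batch) < batch_size:
            _, _, state = heapq.heappop(frontier)
            if is_goal(state):
                path = []
                while state is not None:
                    path.append(state)
                    state = parent[state]
                return path[::-1]
            batch.append(state)

        # Expand the whole batch, keeping only children reached more cheaply.
        children = []
        for state in batch:
            for child in neighbors(state):
                if g[state] + 1 < g.get(child, inf):
                    g[child] = g[state] + 1
                    parent[child] = state
                    children.append(child)

        # One heuristic call for all children (a network forward pass in practice).
        if children:
            for child, h in zip(children, heuristic_batch(children)):
                tie += 1
                heapq.heappush(frontier, (weight * g[child] + h, tie, child))
    return None


RING = 12  # toy state space: positions 0..11 on a ring, goal is position 0


def ring_neighbors(s):
    return [(s + 1) % RING, (s - 1) % RING]


def ring_heuristic(states):
    # Stand-in for the learned value network: exact ring distance to 0.
    return [min(s, RING - s) for s in states]


path = batched_astar(5, lambda s: s == 0, ring_neighbors, ring_heuristic)
print(path)
```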

COVID-19 Impact Analysis

Multi-dataset SQL analysis and Tableau visualization of COVID-19's economic and social impact across sectors.

MySQL · Tableau · SQL · Python · Pandas

Motivation
To quantify the pandemic's real-world effects on employment, GDP, and healthcare using publicly available government datasets.

Achievements
  • Joined and cleaned 5+ public datasets (2M+ rows) in MySQL with complex CTEs and window functions
  • Built a Tableau story with 12 interactive dashboards across demographics and sectors
  • Identified sector recovery patterns correlating with specific policy intervention timelines

Skills

The tools and technologies I work with.

Languages

Python · SQL · R · JavaScript · TypeScript

ML / AI

PyTorch · scikit-learn · LangGraph · Hugging Face · LLMs & RAG

Data Engineering

PySpark · Snowflake · Airflow · FastAPI · ChromaDB

Analytics & Visualization

Tableau · Pandas · NumPy · Matplotlib · VADER NLP

Optimization & Modeling

Gurobi · Linear Regression · Probability & Statistics · Operations Research · Mathematical Modeling

Tools

Git · Docker · Next.js · Oracle Cloud · Jupyter

Get In Touch

Whether you're recruiting, collaborating, or just want to talk data — my inbox is always open.

© 2026 Tirth Patel. Built with Next.js & Tailwind CSS.