~ mansit_suman
Open to Analytics roles · New York

Mansit Suman

~/analytics_engineer # data · engineering · ai · self-service

Doing context & data quality engineering before it was cool.

Engagement Manager · Data & AI Consultant based in New York since June 2025. Six years at MathCo designing canonical data models and self-service foundations across Fortune 10 and Fortune 100 clients — in commercial, supply chain, and People & Operations. Currently architecting an agentic ETL system that automates engineering work end-to-end.

6 yrs analytics & data eng. · 30-person team led · F10 & F100 clients · NY · Jun '25 → · DES'25 speaker
— 01 / work

Selected projects, filter to taste.

Latest first.

summary: A multi-agent ETL system that automates data engineering end-to-end, from STTM (source-to-target mapping) authoring through transformation generation to customer-ready datasets aligned to pre-defined canonical models.
agents: Source profiler · STTM author · transform generator · model validator · doc & lineage writer. Human-in-the-loop checkpoints between stages.
why: Engineers should architect; agents should bolt the parts together. Compress weeks of plumbing into hours, so consultants spend time on stakeholders and design.
status: Active development
stack: Python · LLM orchestration · dbt · Snowflake · DuckDB

summary: Three-year analytics roadmap for the South America business group of a Fortune 10 client. Interviewed leaders across every business function, mapped current state to north star, prioritised the build sequence.
stakeholders: C-suite + heads of supply chain, commercial, finance, digital, and R&D across the SAM region.
approach: Cross-functional discovery → maturity assessment → capability roadmap → 3-year investment plan.
artifacts: Capability heat-map · prioritisation matrix · sequenced delivery roadmap · investment ask.
impact: Aligned executive sponsors on a 3-year analytics investment direction across multiple lines of business.

summary: A centralised, governed context layer connecting enterprise data sources to Text2SQL, DQ, Profiler, STTM, and Agentic ETL agents.
design: 5 context-object types · LLM auto-tagging with steward review · semantic + hybrid retrieval · RBAC · versioning with rollback · usage tracking.
judgement: Steered scope away from a sprawling "do-everything" draft → a focused POC the stakeholders could actually fund.
impact: Reusable middle layer enabling multi-agent trust across the analytics stack.
stack: FastAPI · Postgres · PGVector · Azure Blob · React

summary: Designed and shipped the A360 Common Data Model for the entire hire-to-retire lifecycle for a Fortune 100 client's People & Operations function. Engineered to be agnostic to future acquisitions: new entities plug in without breaking the schema.
modelling: People data done right: slowly-changing dimensions for employee attributes, transactional fact tables for HR events, hire-to-rehire continuity preserved across the lifecycle.
lifecycle: Conceptual diagrams → logical model → physical deployment in production. Done end-to-end, multiple times, including across supply-chain and commercial domains.
products: Associate profile views, journey mapping, and AI-insight enablement layered on top of the canonical model.
practice: Trained junior engineers on the modelling discipline across HR, supply chain, and commercial data; codified the mistakes I'd made so they wouldn't have to.
stack: Snowflake · dbt · Power BI · Azure
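
For readers unfamiliar with the slowly-changing-dimension pattern named in the modelling note, here is a minimal Python sketch of a Type 2 update. It is an illustration under assumptions, not the A360 implementation: `EmployeeVersion`, `apply_change`, and `as_of` are hypothetical names, and a single `department` attribute stands in for the full set of employee attributes.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class EmployeeVersion:
    """One row of a hypothetical employee dimension (SCD Type 2)."""
    employee_id: str
    department: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_change(history: List[EmployeeVersion], employee_id: str,
                 department: str, effective: date) -> List[EmployeeVersion]:
    """Close the current row if the attribute changed, then open a new one."""
    updated = []
    changed = False
    for row in history:
        is_current = row.employee_id == employee_id and row.valid_to is None
        if is_current and row.department != department:
            updated.append(replace(row, valid_to=effective))  # end-date old version
            changed = True
        else:
            updated.append(row)
    has_current = any(
        r.employee_id == employee_id and r.valid_to is None for r in updated
    )
    # New hires and rehires have no open row, so they also get a fresh version.
    if changed or not has_current:
        updated.append(EmployeeVersion(employee_id, department, effective))
    return updated


def as_of(history: List[EmployeeVersion], employee_id: str,
          day: date) -> Optional[EmployeeVersion]:
    """Return the version that was valid on the given day, if any."""
    for row in history:
        if row.employee_id != employee_id:
            continue
        if row.valid_from <= day and (row.valid_to is None or day < row.valid_to):
            return row
    return None


history = apply_change([], "E1", "Sales", date(2023, 1, 1))
history = apply_change(history, "E1", "Finance", date(2024, 6, 1))
print(as_of(history, "E1", date(2023, 12, 31)).department)  # Sales
print(as_of(history, "E1", date(2024, 6, 1)).department)    # Finance
```

Because old rows are end-dated rather than overwritten, a fact table of HR events can join to the version that was true on the event date.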

summary: A unified data model spanning sales orders, purchase orders, and supply orders, the first time the commercial and supply-chain teams at a Fortune 100 CPG spoke the same language. Sat at the heart of the SSOT.
modelling: One canonical order grain. One calendar. One product hierarchy. Joinable across LOBs without translation layers, supporting every downstream report the business wanted.
stakeholders: VP IT · Analytics Manager · supply-chain & commercial leads. Translated their separate vocabularies into a single one the data could speak.
slas: Warehouse monitoring surfacing workload anomalies + cost drift; +30% data reliability on the SSOT.
impact: $650K annual savings · decisions on the same numbers, finally.
stack: Snowflake · Azure · Power BI · Python

summary: SSOT for a battery distribution company. Led a 5-person team to build the data foundation that put the right numbers in the right hands at the right time.
product: Truck loaders received their morning load list at 8:00 AM: which batteries to load, in what quantities, for what routes. Calibrated against actual demand rather than yesterday's guess.
impact: Real money saved on inventory and over-ordering. Data has more power than we usually imagine; sometimes it's the 8 AM list, not the dashboard.
practice: Mentored the 5-person team end-to-end through the modelling and delivery lifecycle.
stack: Snowflake · Azure · Python · Power BI

summary: Reusable Python module that bootstraps DQ rules from metadata + lineage via LLMs: semantic checks, anomaly detection, and explanations clear enough that stewards actually approve the rules.
why: Helpful, honest, harmless AI starts with helpful, honest, harmless data.
impact: Adopted in 6 projects, secured 4 new engagements.
stack: Python · GenAI · Snowflake
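
To make "bootstrapping DQ rules from metadata" concrete, here is a small Python sketch. It is only an assumption-laden illustration: the module described above uses LLMs and lineage, whereas this version swaps in fixed heuristics over column metadata. `Column`, `bootstrap_rules`, and the `_qty` / `_amount` naming convention are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Column:
    """Hypothetical column metadata, e.g. read from a warehouse catalog."""
    name: str
    dtype: str
    nullable: bool = True
    is_key: bool = False


def bootstrap_rules(table: str, columns: List[Column]) -> List[str]:
    """Propose candidate DQ rules as readable strings for steward review."""
    rules = []
    for col in columns:
        if not col.nullable:
            rules.append(f"{table}.{col.name} IS NOT NULL")
        if col.is_key:
            rules.append(f"{table}.{col.name} IS UNIQUE")
        # Assumed naming convention: quantities and amounts are never negative.
        if col.dtype in {"int", "float"} and col.name.endswith(("_qty", "_amount")):
            rules.append(f"{table}.{col.name} >= 0")
    return rules


columns = [
    Column("order_id", "int", nullable=False, is_key=True),
    Column("order_qty", "int"),
    Column("note", "str"),
]
for rule in bootstrap_rules("orders", columns):
    print(rule)
```

Emitting rules as plain, human-readable statements keeps the approval step cheap: a steward can accept or reject each proposal without reading code.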
— 02 / principles

How I work.

The implicit contract behind everything above.
01

Ship before certain.

Bias for action over architectural perfectionism. Pulled the Context Store POC back from a six-month sprawl into something fundable in a quarter. The metric is shipped, not designed.

02

Full-stack on purpose.

Stakeholder discovery → schemas → pipelines → dashboards → adoption. Not because the title says so — because each layer informs the next. The dashboard fights you when the model is wrong.

03

Honest data over clever data.

Helpful, honest, harmless AI starts with helpful, honest, harmless data. The only thing worse than no metric is a metric you can't trust. Built a GenAI-powered DQ accelerator on that premise.

04

Translate, don't transcribe.

"We need a dashboard" means something different to the VP than to the PM. Six years of consulting was a masterclass in figuring out which one matters — and saying so out loud.

— 03 / stack

Tools I reach for without thinking.

Shipped with, repeatedly. Adjacent ones called out separately.

languages

  • SQL
  • Python
  • Scala / Spark
  • Bash

pipelines & orch.

  • dbt
  • Airflow
  • Azure Data Factory
  • Databricks · GitHub CI/CD

warehousing

  • Snowflake
  • DuckDB
  • Lakehouse · Medallion
  • Kimball · Data Vault

bi & genai

  • Power BI · Tableau
  • RAG · hybrid retrieval
  • Multi-agent design
  • LLM evals

adjacent: Hex · MotherDuck · Looker # comfortable picking up fast
— 04 / cv

Six years at one company.

$ git log --oneline

HEAD →
Engagement Manager · Data & AI Consultant
MathCo · New York · 30-person team · F10 + F100 clients
Jun 2025 – Now

9af6e31
Engineering Manager · Delivery Lead
MathCo · Bengaluru · $1.2M ARR · 15-person team
Jul 2024 – May 2025

b3f5c20
Senior Associate
MathCo · Bengaluru
Jul 2023 – Jun 2024

4e7d1a8
Associate · Engineering Horizontal
MathCo · Bengaluru
Jul 2022 – Jun 2023

c91a2b7
Analyst
MathCo · Bengaluru
Sep 2020 – Jul 2022

Fast-tracked to EM in 4 years · Zero escalations across $1.2M ARR account · B.Tech, VNIT Nagpur · SnowPro Core · AWS SAA
— 05 / on stage

Data Engineering Summit, 2025.

Agentic systems are stochastic, not deterministic — so even a 0.1% gain in accuracy matters. Enriching context the right way isn't optional; it's essential. — Architecting Contextual Layers for GenAI · DES'25
— 06 / said about

Client notes, verbatim.

"Mansit brings a rare blend of technical depth and business insight. Clarity in communication, proactiveness, genuine commitment — the kind of high-performance collaboration every individual should strive for."

VP, IT · U.S.-based Fortune 100 CPG

"They understand things faster than most other contractors I've worked with. Open to feedback, always positive and helpful. Mansit's drive for excellence continues to raise the bar."

Senior Director, Analytics · U.S.-based Fortune 100 CPG
— 07 / off-clock

Books I keep close, and what I tinker with.

currently on the shelf

  • Designing Data-Intensive Applications — Martin Kleppmann
  • AI Engineering — Chip Huyen
  • The Data Warehouse Toolkit — Ralph Kimball

interests & practice work

  • community: Co-founded Tech Minds at MathCo · 250–300 engineers/session
  • knowledge: Built Engineering Hub, a practice repo · 50–60 active users/week
  • framework: Problem-assessment framework adopted in 100+ projects · +20% CSAT
  • mentoring: Co-designed curriculum for 190 freshers
  • side project: Data engineering co-pilot · RAG + prompt-engineering pipelines
— 08 / contact

Let's build the context layer together.

email: msplmansit@gmail.com
phone: +91 77580 11732
linkedin: /in/mansit-suman
based: New York · since Jun '25