Open to Analytics roles · New York

Mansit Suman

~/analytics_engineer # data · engineering · ai · self-service

Doing context & data quality engineering before it was cool.

Engagement Manager · Data & AI Consultant based in New York since June 2025. Six years at MathCo designing canonical data models and self-service foundations across Fortune 10 and Fortune 100 clients — in commercial, supply chain, and People & Operations. Currently architecting an agentic ETL system that automates engineering work end-to-end.

6 yrs analytics & data eng.
30-person team led
F10 & F100 clients
NY · since Jun '25
DES'25 speaker

— 01 / work

Selected projects, filter to taste.

Latest first.

summary A multi-agent ETL system that automates data engineering end-to-end, from STTM (source-to-target mapping) authoring through transformation generation to customer-ready datasets aligned to pre-defined canonical models.

agents Source profiler · STTM author · transform generator · model validator · doc & lineage writer. Human-in-the-loop checkpoints between stages.

why Engineers should architect; agents should bolt the parts together. Compress weeks of plumbing into hours, so consultants spend time on stakeholders and design.

status Active development · demo →

stack Python · LLM orchestration · dbt · Snowflake · DuckDB
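For readers who want the shape of "checkpoints between stages", here is a minimal sketch in Python. It is not the system itself: the three stage functions are deterministic stand-ins for the agents, and `run_pipeline`, `approve`, and the sample state are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from typing import Callable

Stage = Callable[[dict], dict]          # takes pipeline state, returns new state
Approver = Callable[[str, dict], bool]  # human reviewer: approve this stage's output?


def profile_source(state: dict) -> dict:
    # Stand-in for the source profiler: list the columns seen in a sample row.
    return {**state, "profile": sorted(state["sample_row"].keys())}


def author_sttm(state: dict) -> dict:
    # Stand-in for the STTM author: map each source column to an upper-cased target.
    return {**state, "sttm": {col: col.upper() for col in state["profile"]}}


def generate_transform(state: dict) -> dict:
    # Stand-in for the transform generator: render the mapping as a SELECT.
    select = ", ".join(f"{src} AS {tgt}" for src, tgt in state["sttm"].items())
    return {**state, "sql": f"SELECT {select} FROM {state['source']}"}


def run_pipeline(stages: list[tuple[str, Stage]], state: dict, approve: Approver) -> dict:
    """Run stages in order, pausing for approval after each one.

    A rejected stage halts the run, so later stages never build on
    output that a human has not signed off.
    """
    for name, stage in stages:
        state = stage(state)
        if not approve(name, state):
            return {**state, "halted_at": name}
        state = {**state, "approved": [*state.get("approved", []), name]}
    return state


STAGES = [
    ("profile_source", profile_source),
    ("author_sttm", author_sttm),
    ("generate_transform", generate_transform),
]
initial = {"source": "raw.orders", "sample_row": {"qty": 3, "cust_id": "C1"}}

approved_run = run_pipeline(STAGES, initial, approve=lambda name, state: True)
halted_run = run_pipeline(STAGES, initial, approve=lambda name, state: name != "author_sttm")
```

In `halted_run` the reviewer rejects the mapping, so no SQL is ever generated. That is the point of the checkpoint: a bad mapping stops at the gate instead of flowing downstream.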

summary Three-year analytics roadmap for the South America business group of a Fortune 10 client. Interviewed leaders across every business function, mapped current state to north star, prioritised the build sequence.

stakeholders C-suite + heads of supply chain, commercial, finance, digital, and R&D across the SAM region.

approach Cross-functional discovery → maturity assessment → capability roadmap → 3-year investment plan.

artifacts Capability heat-map · prioritisation matrix · sequenced delivery roadmap · investment ask.

impact Aligned executive sponsors on a 3-year analytics investment direction across multiple lines of business.

summary A centralised, governed context layer connecting enterprise data sources to Text2SQL, DQ, Profiler, STTM, and Agentic ETL agents.

design 5 context-object types · LLM auto-tagging with steward review · semantic + hybrid retrieval · RBAC · versioning with rollback · usage tracking.

judgement Steered scope away from a sprawling "do-everything" draft toward a focused POC the stakeholders could actually fund.

impact Reusable middle layer enabling multi-agent trust across the analytics stack.

stack FastAPI · Postgres · PGVector · Azure Blob · React
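As an illustration of what "hybrid retrieval" means here, the sketch below blends a keyword-overlap score with vector similarity. It is a toy under stated assumptions, not the Context Store code: the document ids, the three-dimensional vectors, the `alpha` weight, and the function names are all hypothetical, and a real deployment would more likely push this scoring into Postgres with PGVector than run it in memory.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query tokens that appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, query_vec, docs, alpha=0.5, top_k=2):
    """Rank docs by alpha * semantic score + (1 - alpha) * keyword score."""
    scored = []
    for doc in docs:
        score = (alpha * cosine(query_vec, doc["vec"])
                 + (1 - alpha) * keyword_score(query, doc["text"]))
        scored.append((score, doc["id"]))
    # Highest score first; ties broken by id so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical context objects with made-up embeddings.
docs = [
    {"id": "glossary:net_sales", "text": "net sales definition by region",
     "vec": [0.9, 0.1, 0.0]},
    {"id": "table:orders", "text": "orders fact table grain one row per order line",
     "vec": [0.2, 0.8, 0.1]},
    {"id": "rule:dq_orders", "text": "orders must have a non-null order id",
     "vec": [0.1, 0.7, 0.6]},
]

results = hybrid_search("orders table grain", [0.1, 0.9, 0.1], docs)
```

The keyword term rewards exact vocabulary matches such as table and column names, while the vector term catches paraphrases. The `alpha` weight is the knob that trades one against the other.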

summary Designed and shipped the A360 Common Data Model for the entire hire-to-retire lifecycle for a Fortune 100 client's People & Operations function. Engineered to be agnostic to future acquisitions: new entities plug in without breaking the schema.

modelling People data done right: slowly-changing dimensions for employee attributes, transactional fact tables for HR events, hire-to-rehire continuity preserved across the lifecycle.

lifecycle Conceptual diagrams → logical model → physical deployment in production, end-to-end, multiple times, also across supply-chain and commercial domains.

products Associate profile views, journey mapping, and AI-insight enablement layered on top of the canonical model.

practice Trained junior engineers on the modelling discipline across HR, supply chain, and commercial data, and codified the mistakes I'd made so they wouldn't have to repeat them.

stack Snowflake · dbt · Power BI · Azure
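For readers new to slowly-changing dimensions, the idea can be sketched in a few lines of Python. This is an illustrative Type 2 example, not the A360 implementation: the `DimRow` fields, the `apply_change` and `as_of` helpers, and the sample employee are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    """One version of an employee's attributes (hypothetical schema)."""
    employee_id: str
    department: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_change(history: list[DimRow], employee_id: str,
                 department: str, effective: date) -> list[DimRow]:
    """Return a new history with a Type 2 change applied.

    The current row is closed at `effective` and a new current row is
    appended. If the attribute did not change, nothing is added.
    """
    out: list[DimRow] = []
    changed = True
    for row in history:
        if row.employee_id == employee_id and row.is_current:
            if row.department == department:
                changed = False  # no-op: attribute is unchanged
                out.append(row)
            else:
                out.append(replace(row, valid_to=effective))  # close old version
        else:
            out.append(row)
    if changed:
        out.append(DimRow(employee_id, department, effective))
    return out


def as_of(history: list[DimRow], employee_id: str, when: date) -> Optional[DimRow]:
    """Look up the version that was valid on a given date."""
    for row in history:
        if (row.employee_id == employee_id and row.valid_from <= when
                and (row.valid_to is None or when < row.valid_to)):
            return row
    return None


history = [DimRow("E1", "Supply Chain", date(2022, 1, 1))]
history = apply_change(history, "E1", "Commercial", date(2023, 6, 1))
```

Because old versions are closed rather than overwritten, `as_of` can answer "which department was this person in on a given date", which is what lets HR event facts join to the attributes that were true at the time of the event.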

summary A unified data model spanning sales orders, purchase orders, and supply orders: the first time the commercial and supply-chain teams at a Fortune 100 CPG spoke the same language. Sat at the heart of the SSOT (single source of truth).

modelling One canonical order grain. One calendar. One product hierarchy. Joinable across LOBs without translation layers, supporting every downstream report the business wanted.

stakeholders VP IT · Analytics Manager · supply-chain & commercial leads. Translated their separate vocabularies into a single one the data could speak.

slas Warehouse monitoring surfacing workload anomalies + cost drift; +30% data reliability on the SSOT.

impact $650K annual savings · decisions on the same numbers, finally.

stack Snowflake · Azure · Power BI · Python

summary SSOT for a battery distribution company. Led a 5-person team to build the data foundation that put the right numbers in the right hands at the right time.

product Truck loaders received their morning load list at 8:00 AM: which batteries to load, in what quantities, for what routes. Calibrated against actual demand rather than yesterday's guess.

impact Real money saved on inventory and over-ordering. Data has more power than we usually imagine. Sometimes it's the 8 AM list, not the dashboard.

practice Mentored the 5-person team end-to-end through the modelling and delivery lifecycle.

stack Snowflake · Azure · Python · Power BI

summary Reusable Python module that bootstraps DQ rules from metadata + lineage via LLMs, with semantic checks, anomaly detection, and explanations clear enough that stewards actually approve the rules.

why Helpful, honest, harmless AI starts with helpful, honest, harmless data.

impact Adopted in 6 projects, secured 4 new engagements.

stack Python · GenAI · Snowflake
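A minimal sketch of the bootstrapping step, with the LLM swapped for deterministic heuristics so it runs offline. Nothing here is the accelerator's actual code: the metadata shape, the rule names, and the `propose_rules` and `violations` helpers are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    column: str
    check: str      # "not_null", "unique", or "non_negative"
    rationale: str  # plain-language reason a steward can review


def propose_rules(columns: list[dict]) -> list[Rule]:
    """Derive candidate rules from column metadata (heuristics stand in for the LLM)."""
    rules: list[Rule] = []
    for col in columns:
        name = col["name"]
        if not col.get("nullable", True):
            rules.append(Rule(name, "not_null", "declared NOT NULL in metadata"))
        if col.get("primary_key", False):
            rules.append(Rule(name, "unique", "declared as primary key"))
        if col.get("type") in {"int", "float"} and name.endswith(("_qty", "_amount")):
            rules.append(Rule(name, "non_negative", "name suggests a quantity or amount"))
    return rules


def violations(rule: Rule, rows: list[dict]) -> int:
    """Count rows that break a rule."""
    values = [row.get(rule.column) for row in rows]
    if rule.check == "not_null":
        return sum(v is None for v in values)
    if rule.check == "unique":
        non_null = [v for v in values if v is not None]
        return len(non_null) - len(set(non_null))
    if rule.check == "non_negative":
        return sum(v is not None and v < 0 for v in values)
    raise ValueError(f"unknown check: {rule.check}")


# Hypothetical metadata and sample data.
metadata = [
    {"name": "order_id", "type": "int", "nullable": False, "primary_key": True},
    {"name": "order_qty", "type": "int", "nullable": True},
]
rows = [
    {"order_id": 1, "order_qty": 5},
    {"order_id": 1, "order_qty": -2},
    {"order_id": None, "order_qty": None},
]

rules = propose_rules(metadata)
report = {(r.column, r.check): violations(r, rows) for r in rules}
```

Every proposed rule carries a `rationale`, which is the part that matters for adoption: a steward approves a rule they can read the reason for, whether that reason came from a heuristic or a model.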

— 02 / principles

How I work.

The implicit contract behind everything above.

Ship before certain.

Bias for action over architectural perfectionism. Pulled the Context Store POC back from a six-month sprawl into something fundable in a quarter. The metric is shipped, not designed.

Full-stack on purpose.

Stakeholder discovery → schemas → pipelines → dashboards → adoption. Not because the title says so — because each layer informs the next. The dashboard fights you when the model is wrong.

Honest data over clever data.

Helpful, honest, harmless AI starts with helpful, honest, harmless data. The only thing worse than no metric is a metric you can't trust. Built a GenAI-powered DQ accelerator on that premise.

Translate, don't transcribe.

"We need a dashboard" means something different to the VP than to the PM. Six years of consulting was a masterclass in figuring out which one matters — and saying so out loud.

— 03 / stack

Tools I reach for without thinking.

Shipped with, repeatedly. Adjacent ones called out separately.

languages

SQL
Python
Scala / Spark
Bash

pipelines & orch.

dbt
Airflow
Azure Data Factory
Databricks · GitHub CI/CD

warehousing

Snowflake
DuckDB
Lakehouse · Medallion
Kimball · Data Vault

bi & genai

Power BI · Tableau
RAG · hybrid retrieval
Multi-agent design
LLM evals

adjacent Hex · MotherDuck · Looker # comfortable picking up fast

— 04 / cv

Six years at one company.

$ git log --oneline

HEAD →

Engagement Manager · Data & AI Consultant

MathCo · New York · 30-person team · F10 + F100 clients

Jun 2025 — Now

9af6e31

Engineering Manager · Delivery Lead

MathCo · Bengaluru · $1.2M ARR · 15-person team

Jul 2024 — May 2025

b3f5c20

Senior Associate

MathCo · Bengaluru

Jul 2023 — Jun 2024

4e7d1a8

Associate · Engineering Horizontal

MathCo · Bengaluru

Jul 2022 — Jun 2023

c91a2b7

Analyst

MathCo · Bengaluru

Sep 2020 — Jul 2022

→ Fast-tracked to EM in 4 years
→ Zero escalations across $1.2M ARR account
→ B.Tech, VNIT Nagpur
→ SnowPro Core · AWS SAA

— 05 / on stage

Data Engineering Summit, 2025.

"Agentic systems are stochastic, not deterministic, so even a 0.1% gain in accuracy matters. Enriching context the right way isn't optional; it's essential."

Architecting Contextual Layers for GenAI · DES'25

Watch

— 06 / said about

Client notes, verbatim.

"Mansit brings a rare blend of technical depth and business insight. Clarity in communication, proactiveness, genuine commitment — the kind of high-performance collaboration every individual should strive for."

VP, IT · U.S.-based Fortune 100 CPG

"They understand things faster than most other contractors I've worked with. Open to feedback, always positive and helpful. Mansit's drive for excellence continues to raise the bar."

Senior Director, Analytics · U.S.-based Fortune 100 CPG

— 07 / off-clock

Books I keep close, and what I tinker with.

currently on the shelf

Designing Data-Intensive Applications — Martin Kleppmann
AI Engineering — Chip Huyen
The Data Warehouse Toolkit — Ralph Kimball

interests & practice work

community Co-founded Tech Minds at MathCo · 250–300 engineers/session
knowledge Built Engineering Hub — practice repo · 50–60 active users/week
framework Problem-assessment framework adopted in 100+ projects · +20% CSAT
mentoring Co-designed curriculum for 190 freshers
side project Data engineering co-pilot · RAG + prompt-engineering pipelines

— 08 / contact

Let's build the context layer together. say hello ↗

msplmansit@gmail.com

phone

+91 77580 11732

/in/mansit-suman ↗

based

New York · since Jun '25