Data engineer.
Building pipelines teams rely on.
I build pipelines, orchestration systems, and data infrastructure that organizations can trust. I care about data quality, reliability, and code the next engineer can follow. Right now I'm shipping open source contributions to repos at Astronomer and Tesla, working alongside the engineers who built the tools I use every day.
Open Source Contributions
Projects
Data Lineage Claude Skill
Change one column. This finds everything that breaks. Traces all downstream dbt models, dashboards, notebooks, and pipelines before you push.
LLM · ETL
LLM-Augmented Metadata Pipeline
LLaMA reads your table metadata, writes the SQL. Medallion ETL on PySpark + AWS Glue. 20% faster ingestion, 30% faster analyst turnaround.
LLM · Retrieval
RAG Retrieval Optimization
Bad chunking kills recall. Benchmarked 3 hybrid strategies on Milvus. Recall@5 up 14%, search latency down 35%.
Dev Tool
cc-catalyst
Same Claude, lower bill. A local proxy between Claude Code and Anthropic's API that cuts token spend without touching your workflow.
Claude Code
claude-vibecheck
Stop shipping code you don't understand. Narrates non-obvious logic in plain English the moment you write it.
Multi-Agent
Agentic Data Engineering Platform
Drop in your schema, walk away. Multi-agent system on Google ADK that handles prep, scheduling, and BigQuery loading. Cuts ETL development from 5+ hours to minutes.
Contact
Let's work together.
Open to full-time data engineering roles. If your team is building something serious with data, I want to hear about it.
Say hello ↗