Work
Production-grade reference architectures and open-source tools for the modern data & AI stack. 9 projects.
A production-oriented reference architecture for real-time fraud detection using Redpanda, dbt, RisingWave, and Grafana.
A production-ready CI/CD pipeline for dbt projects using GitHub Actions and Slim CI — running only modified models and their downstream dependencies to cut build times and warehouse costs.
A locally-running AI agent powered by Ollama and the dbt MCP server — answering questions about your semantic layer without sending data to the cloud.
A framework for defining, enforcing, and monitoring data contracts across a dbt project, checking at build time that data producers keep their promises to consumers.
A containerised dbt + Airflow setup using Docker Compose — orchestrating dbt runs as Airflow DAGs with isolated environments, centralised logging, and easy local development.
A local-first Modern Data Stack reference using dlt for ingestion, dbt for transformation, and DuckDB as the engine — with a MetricFlow semantic layer and Rill dashboards.
A production-grade Modern Data Stack reference on Databricks Unity Catalog: a Medallion architecture with dlt ingestion, dbt transformation, MetricFlow metrics, and GitHub Actions CI/CD.
An automated star-schema data generator for dbt and DuckDB — producing realistic, referentially-intact dimension and fact tables for testing and development.
An automated pipeline that ingests ASX mining stock data via Yahoo Finance, transforms it with dbt, generates 45+ ML features, and surfaces trading signals through Streamlit dashboards.