kai2055/README.md

Nikhil Adhikari

MSc Data Science, AI & Digital Business @ GISMA University of Applied Sciences (Berlin). Headed for ML Engineering / MLOps / ML Reliability Engineering.

I build ML systems that don't just work — they stay working.

Most ML portfolios show a model that runs once. Mine show systems that keep running: data validated before it enters, models watched in production, and failures turned into measurable, deployment-gating feedback. One principle, three layers.

The three layers

ML systems fail at three points: bad data gets in, the model drifts in production, or the same failure repeats because nobody learned from it. Each project below hardens one of those points.

1. Data layer — Data Quality Checker

Catches bad input before it ever reaches a model. What breaks without it: silent garbage-in-garbage-out — the model trains or scores on corrupt data and no one notices until the numbers are wrong downstream. Deployed on GCP Cloud Run.
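As a rough illustration of what "catching bad input" can look like, the sketch below runs a few structural checks on CSV text using only the standard library. It is not the Data Quality Checker's actual code: the function name `check_csv` and the report keys are hypothetical, chosen for this example.

```python
import csv
import io
from collections import Counter


def check_csv(text: str) -> dict:
    """Run basic structural checks on CSV text (hypothetical helper, illustration only)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {"empty_file": True, "bad_headers": [], "missing_cells": 0,
                "ragged_rows": 0, "duplicate_rows": 0}

    header, body = rows[0], rows[1:]

    # A header is suspicious if it is blank or appears more than once.
    header_counts = Counter(name.strip() for name in header)
    bad_headers = sorted(name for name, n in header_counts.items()
                         if name == "" or n > 1)

    # Cells that are empty or whitespace-only count as missing.
    missing_cells = sum(1 for row in body for cell in row if cell.strip() == "")

    # Rows whose width differs from the header usually indicate a broken export.
    ragged_rows = sum(1 for row in body if len(row) != len(header))

    # Count every repeat of a row beyond its first occurrence.
    row_counts = Counter(tuple(row) for row in body)
    duplicate_rows = sum(n - 1 for n in row_counts.values() if n > 1)

    return {"empty_file": False, "bad_headers": bad_headers,
            "missing_cells": missing_cells, "ragged_rows": ragged_rows,
            "duplicate_rows": duplicate_rows}


if __name__ == "__main__":
    sample = "id,name,name\n1,a,b\n1,a,b\n2,,c\n3,x\n"
    print(check_csv(sample))
```

A caller could reject the file, or just log a warning, whenever any counter in the report is non-zero. Where to draw that line is a policy choice the sketch leaves open.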

2. Model layer — ML Reliability Pipeline

Keeps a model working after it ships, not just on the day it's deployed. What breaks without it: a model that passed every test on launch day quietly degrades in production, and the first sign is an angry user, not a metric.
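One way to notice that kind of quiet degradation is a drift statistic such as the Population Stability Index (PSI), which the pipeline's description lists alongside Wasserstein distance. The sketch below is a minimal pure-Python PSI, assuming equal-width bins taken from the reference sample and a small `eps` floor for empty bins. The function name, bin count, and `eps` value are assumptions for illustration, not the pipeline's implementation.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples (illustrative sketch).

    Bins are equal-width over the reference range; values outside that
    range are clamped into the first or last bin.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has zero range")
    width = (hi - lo) / bins

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    cur = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


if __name__ == "__main__":
    training = [i / 100 for i in range(100)]       # roughly uniform on [0, 1)
    live = [0.5 + i / 200 for i in range(100)]     # shifted toward the upper half
    print(f"PSI vs itself:  {psi(training, training):.4f}")
    print(f"PSI vs shifted: {psi(training, live):.4f}")
```

In a monitoring job, the score would typically be compared against an alert threshold per feature. The right threshold depends on the feature and the bin count, so it is left out here.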

3. Feedback layer — Incident Postmortem Assistant

Turns past failures into searchable, deployment-gating feedback — a three-layer RAG system that retrieves, diagnoses, and evaluates engineering post-mortems. What breaks without it: the same outage happens twice because the lesson from the first one is buried in a wiki. Built on LangGraph, ChromaDB, and local LLMs via Ollama, with a RAGAS-backed evaluation framework. Layer 1 complete; diagnostic agent in progress.
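To make "deployment-gating" concrete, here is a minimal sketch of a gate that compares evaluation scores against per-metric minimums and reports which ones fail. It does not call RAGAS or any project code. The function name `evaluation_gate` and the threshold values are hypothetical, and the scores are assumed to arrive as a plain dictionary from whatever evaluation step ran earlier.

```python
def evaluation_gate(scores: dict[str, float],
                    minimums: dict[str, float]) -> tuple[bool, list[str]]:
    """Return (passed, failures) for a set of evaluation scores (illustrative sketch)."""
    failures = []
    for metric, minimum in minimums.items():
        score = scores.get(metric)
        if score is None:
            # A metric that was never computed should block the release too.
            failures.append(f"{metric}: missing")
        elif score < minimum:
            failures.append(f"{metric}: {score:.2f} < {minimum:.2f}")
    return (not failures, failures)


if __name__ == "__main__":
    # Hypothetical scores and thresholds, for illustration only.
    scores = {"faithfulness": 0.90, "answer_relevancy": 0.60}
    minimums = {"faithfulness": 0.80, "answer_relevancy": 0.70}
    passed, failures = evaluation_gate(scores, minimums)
    print("PASS" if passed else "FAIL", failures)
```

In a CI job, a failing gate would normally translate into a non-zero exit code so the deployment step never runs.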


How I build the foundation: python-llm-guided-practice · ml-study-lab · sql-practice — daily practice, retyped from spec, not skimmed.

Pinned

  1. ml-reliability-pipeline

    Production ML pipeline with drift monitoring (PSI + Wasserstein), FastAPI serving, and CI/CD on GCP Cloud Run. 110 tests, 26 ADRs.

    Python

  2. incident-postmortem-assistant

    Three-layer RAG system that retrieves, diagnoses, and evaluates engineering post-mortems — with the evaluation framework as the core, not an afterthought.

    Python

  3. csv-health-tracker

    Python tool for validating CSV files before data processing. Detects missing values, duplicates, malformed headers, and data quality issues. Learning project: v1 (simple script) → v2 (modular archi…

    Python