John de Graft-Johnson

AI/ML Engineer SME

Production AI for NHS clinical decision support, AI governance, and agentic document pipelines — six shipped projects.

Healthcare AI · Governance · Agentic Pipelines

Impact at a Glance

6 Production Projects

10K+ CPRD Patients Modelled

7 Governance Frameworks Applied

Experience

Eight years across data engineering, applied ML, and AI platform delivery — recently focused on evaluation frameworks, fine-tuning, and responsible-AI tooling.

  1. AI Engineering Lead · Swain Solutions LLC

    2025 – 2026

    Washington, D.C.

    • Built LLM evaluation harnesses end-to-end in Python — deterministic judges, paired LLM auditors, weighted composite scorecards — covering fairness, factuality, and compliance dimensions across regulated proposal and clinical pipelines.
    • Fine-tuned open-weights models (Gemma family) with LoRA and preference-optimization workflows; instrumented training with red-teaming harnesses (inspect_petri), Fairlearn fairness audits, and LangSmith observability.
    • Owned the medallion data architecture (dbt-duckdb, bronze → silver → gold) and GPU inference path (CUDA, sub-1ms latency) for a nine-service Azure-native AI platform, including federated learning across 217 tickers and a live calibration model over 19K+ rows.
    • Embedded a thirteen-point responsible-AI framework (HIPAA, NIST, CMMC, MHRA GMLP) into CI gates; every deployment ships with an evidence pack rather than a checklist.
    Python · LLM eval · LoRA / SFT · Red-teaming · dbt · Azure
  2. Product Analytics Manager · W.R. Grace

    2024 – 2025

    Columbia, MD

    • Designed evaluation and drift-detection frameworks for enterprise forecasting models; reduced forecast-to-actual variance by 18% through systematic empirical iteration rather than ad-hoc tuning.
    • Owned the medallion data architecture (bronze / silver / gold) and feature engineering layer; coached analysts on model evaluation, feature selection, and production deployment standards.
    • Translated ambiguous business questions into reproducible Python experiments and decision-ready artefacts for Finance, Operations, and the executive team.
    Python · Model eval · Drift detection · dbt · Forecasting
  3. Senior Data Analyst · W.R. Grace

    2022 – 2024

    New Orleans, LA

    • Directed an enterprise revenue-intelligence program covering $600MM+ in monthly performance; built statistical forecasting models that reduced planning cycle time by 25%.
    • Engineered production data pipelines (Python, SQL, dbt, Airflow) with data-quality gates, partitioning strategy, and CI/CD for analytics infrastructure.
    • Applied experimentation frameworks for customer segmentation and cohort analysis; used model-driven evidence to influence go-to-market prioritization.
    Python · SQL · Airflow · Forecasting · Experimentation
  4. Data Engineer · W.R. Grace

    2019 – 2022

    Baltimore, MD

    • Established the data-integrity, governance, and compliance programs later scaled across the enterprise; built audit frameworks for regulated reporting environments.
    • Designed Power BI dashboards with automated anomaly detection; data-driven interventions cut system downtime by 28% and lifted SLA compliance.
    • Raised data accuracy by 14% through predictive modelling, outlier remediation, and systematic root-cause analysis.
    Python · Anomaly detection · Governance · SQL · Power BI

Selected Projects

Each project is shipped end-to-end: data pipeline, model, governance, and deployed UI.

LLM Evaluation & Oversight (2)

LLM Eval · Oversight · Open Source

AI Proposal Intelligence

Production LLM Evaluation Harness · Scalable-Oversight Pattern

Summary
LLM-as-judge evaluation harness with paired auditors and a weighted composite scorecard — gates AI outputs before release.
Tech
Python · FastAPI · pydantic · pytest · CI/CD
Data & AI
Claude · LLM evaluation · LLM-as-judge · paired auditors · scalable oversight
Use Cases
Government / FSI proposal QA · AI eval pipelines · Scalable-oversight tooling
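
The weighted composite scorecard and release gate described in this card can be illustrated with a minimal sketch. It is not the project's actual code: the `DimensionScore` type, the `threshold` and `floor` parameters, and the example weights are all assumptions made for illustration, and the paired-auditor step is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScore:
    """One evaluation dimension (hypothetical structure)."""
    name: str      # e.g. "factuality"
    score: float   # judge score in [0, 1]
    weight: float  # relative importance; should be positive


def composite_score(scores: list[DimensionScore]) -> float:
    """Weighted mean of the per-dimension scores."""
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s.score * s.weight for s in scores) / total_weight


def release_gate(scores: list[DimensionScore],
                 threshold: float = 0.8,
                 floor: float = 0.5) -> bool:
    """Pass only if no dimension falls below `floor` and the
    weighted composite reaches `threshold` (both values assumed)."""
    if any(s.score < floor for s in scores):
        return False
    return composite_score(scores) >= threshold


scores = [
    DimensionScore("factuality", 0.9, 2.0),
    DimensionScore("fairness", 0.8, 1.0),
    DimensionScore("compliance", 0.7, 1.0),
]
passed = release_gate(scores)
```

The per-dimension floor is one possible design choice: it stops a strong score on one dimension from masking a failure on another.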

AI Eval Harness · Open Source

Healthcare Dashboard Ops

LLM-as-Judge platform · Power BI · GIS · Forecast models

Summary
One spec produces a Power BI dashboard, GIS choropleth, and 12-month forecast — gated by 16 deterministic + LLM evaluators on 31M CMS Medicaid rows.
Tech
Python · Leaflet · Azure Container Apps
Data & AI
DuckDB · dbt · Power BI · Claude · LLM evaluation · SARIMA · Prophet
Standards
CMS Medicaid (T-MSIS)
Use Cases
Healthcare AI platforms · BI release gating · Medicaid / payer analytics

RAG / LLM Systems (1)

Clinical RAG · Live

Clinical Decision Support RAG Assistant

Evidence-grounded answers · DOI-cited

Summary
DOI-anchored RAG over peer-reviewed biomedical evidence with knowledge-graph entity resolution and low-evidence fallback.
Tech
Python · TypeScript · Next.js · FastAPI
Data & AI
RAG · LangChain · pgvector · Neo4j (knowledge graph)
Standards
DOI citations · peer-reviewed sources
Use Cases
Biomedical evidence assembly · Member triage · Drug-target dossiers · Rare-disease cohorts
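
One way a low-evidence fallback like the one in this card could work is sketched below. The `Passage` type, the `min_similarity` and `min_passages` thresholds, and the response shape are all hypothetical; the card does not say how the deployed assistant decides that evidence is insufficient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A retrieved chunk of peer-reviewed text (hypothetical structure)."""
    doi: str           # DOI of the source article
    text: str
    similarity: float  # retrieval similarity in [0, 1]


def select_evidence(passages: list[Passage],
                    min_similarity: float = 0.75,
                    min_passages: int = 2) -> list[Passage]:
    """Keep passages above the similarity cut-off, strongest first.
    Returns an empty list when too few passages qualify."""
    strong = [p for p in passages if p.similarity >= min_similarity]
    strong.sort(key=lambda p: p.similarity, reverse=True)
    return strong if len(strong) >= min_passages else []


def answer_or_fallback(passages: list[Passage]) -> dict:
    """Return DOI citations when evidence is sufficient,
    otherwise an explicit low-evidence status with no citations."""
    evidence = select_evidence(passages)
    if not evidence:
        return {"status": "low_evidence", "citations": []}
    return {"status": "grounded", "citations": [p.doi for p in evidence]}
```

In this sketch the generation step is left out on purpose: the model is only called once the evidence check has passed.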

Machine Learning & Algorithms (1)

Clinical AI · Live

Patient Disengagement Prediction

NHS Primary Care · AI Decision Support

Summary
XGBoost early-warning model (AUC 0.94) for GP disengagement, with SHAP explainability, IMD fairness audit, and UK GDPR Art. 22 compliance.
Tech
Python · FastAPI · Next.js
Data & AI
XGBoost · SHAP · Neo4j · Responsible AI · fairness audit
Standards
OMOP CDM · SNOMED CT · QOF · UK GDPR Art. 22
Use Cases
NHS GP practices · ICB risk stratification · Equity-audited clinical AI

Geospatial Analytics (1)

Geospatial AI · Live

UK Health Map

NHS ICB Risk Visualisation

Summary
Drill-down NHS choropleth (national → ICB → practice) layered with disengagement risk, IMD, and CQC ratings on a Delta Lake silver layer.
Tech
Next.js · TypeScript · Leaflet · Python
Data & AI
Delta Lake · GeoJSON
Standards
NHS ICB · IMD quintiles · CQC ratings
Use Cases
ICB commissioning intelligence · NHS England planning · Population-health overlays

Responsible AI & Governance (1)

Responsible AI · Pilot

AI Health Equity Audit Tool

Bias Detection · NICE ESF Tier B

Summary
Automated fairness pipeline producing NICE ESF Tier B / Core20PLUS5-aligned equity reports as PDF + machine-readable JSON.
Tech
Python · FastAPI · ReportLab
Data & AI
fairlearn · Responsible AI · equalized-odds difference
Standards
NICE ESF Tier B · NHS Core20PLUS5 · UK GDPR Art. 22
Use Cases
Clinical AI governance · NHS equity audits · Model-monitoring artifacts
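
The equalized-odds difference named in this card can be shown with a small standard-library sketch. It follows the common definition (the larger of the between-group gaps in true-positive rate and false-positive rate) and is not the tool's own code, which lists fairlearn. The group labels and the choice to return a rate of 0.0 for an empty denominator are assumptions.

```python
from collections import defaultdict


def _tpr_fpr(y_true: list[int], y_pred: list[int]) -> tuple[float, float]:
    """True-positive and false-positive rates for one group."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Assumption: a group with no positives (or no negatives) gets rate 0.0.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def equalized_odds_difference(y_true: list[int],
                              y_pred: list[int],
                              groups: list[str]) -> float:
    """Larger of the TPR gap and the FPR gap across groups."""
    by_group: dict[str, tuple[list[int], list[int]]] = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    rates = [_tpr_fpr(ts, ps) for ts, ps in by_group.values()]
    tprs = [r[0] for r in rates]
    fprs = [r[1] for r in rates]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Toy data: two deprivation quintiles as the sensitive attribute (illustrative only).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["Q1"] * 4 + ["Q5"] * 4
gap = equalized_odds_difference(y_true, y_pred, groups)
```

A value of 0 means every group has the same error rates; larger values mean the model's errors fall unevenly across groups.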

Capital Markets & Quant (2)

Algo Trading · Open Source

propfirmbot

Futures Strategy Framework · IBKR Adapter

Summary
MIT-licensed futures-trading framework with a DXY-confluence ORB strategy, broker-agnostic adapter boundary, and HTML backtest reports.
Tech
Python · pandas · ib_insync · pytest
Data & AI
backtest harness · DXY confluence gate · ORB / VCP / liquidity-sweep
Standards
MIT OSS · Interactive Brokers paper account
Use Cases
Prop-firm evaluation runs · Micro-gold ORB trading · Broker-portable strategy research
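
A generic opening-range breakout (ORB) signal with a confluence gate might look like the sketch below. It is not propfirmbot's implementation. The `Bar` type, the `n_opening` window, and the gating rule (take longs only when DXY is falling and shorts only when it is rising) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bar:
    """One price bar (hypothetical minimal structure)."""
    high: float
    low: float
    close: float


def opening_range(bars: list[Bar], n_opening: int) -> tuple[float, float]:
    """High and low of the first `n_opening` bars of the session."""
    window = bars[:n_opening]
    return max(b.high for b in window), min(b.low for b in window)


def orb_signal(bars: list[Bar], n_opening: int,
               dxy_change: float) -> Optional[str]:
    """Return "long", "short", or None for the first breakout bar.

    Assumed gate: a long needs a falling DXY (dxy_change < 0),
    a short needs a rising DXY (dxy_change > 0).
    """
    if n_opening <= 0 or len(bars) <= n_opening:
        return None  # no opening range, or no bars after it
    range_high, range_low = opening_range(bars, n_opening)
    for bar in bars[n_opening:]:
        if bar.close > range_high:
            return "long" if dxy_change < 0 else None
        if bar.close < range_low:
            return "short" if dxy_change > 0 else None
    return None


session = [Bar(10.0, 9.0, 9.5), Bar(10.5, 9.2, 10.0), Bar(11.0, 10.0, 10.8)]
signal = orb_signal(session, n_opening=2, dxy_change=-0.3)
```

Only the first breakout of the session is evaluated here; a breakout that fails the gate returns None and no later bar is considered.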

Capital Markets · Live

StockHub

Equity Research · Macro Signals

Summary
Live equity-research workspace with macro overlays, cached fundamentals, and portfolio risk analytics on a real-time market-data pipeline.
Tech
Next.js · TypeScript · Python · FastAPI · WebSockets
Data & AI
real-time market data · technical indicators · portfolio analytics
Use Cases
Retail equity research · Macro-overlay screening · Portfolio risk monitoring