

LinkedIn arXiv PyPI Email X


πŸ“ Trier, Germany Β |Β  πŸŽ“ MSc AI @ Hochschule Trier (May 2026) Β |Β  🏒 ML Engineer @ One75 Labs, Berlin


Who I Am

I build evaluation infrastructure for language models: not dashboards that look good in demos, but pipelines that surface what metrics actually measure versus what they claim to measure.

My core thesis: confidence scores are lying to you. I demonstrated it with a near-zero correlation (r = 0.009) between model confidence and internal reasoning faithfulness. That finding came from combining activation patching, causal circuit analysis, and a reproducible benchmarking framework I built from scratch.
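The correlation check itself is just a Pearson r over paired per-example scores. A minimal sketch with synthetic stand-in data (the real scores come from the benchmarking framework, which is assumed here, not shown):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: per-example model confidence (e.g. max softmax
# probability) and a faithfulness score from causal circuit analysis.
confidence = rng.uniform(0.5, 1.0, size=200)
faithfulness = rng.uniform(0.0, 1.0, size=200)  # independent by construction

# Pearson r between the two signals; independent signals land near zero.
r = np.corrcoef(confidence, faithfulness)[0, 1]
print(f"r = {r:.3f}")
```

A near-zero r on real data is exactly the failure mode claimed above: knowing the model's confidence tells you almost nothing about whether its stated reasoning is faithful.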

Currently writing my MSc thesis on explainable AI for LLMs with causally grounded natural language explanations, while working as an ML Engineer at One75 Labs in Berlin on production LLM evaluation systems.


Highlights That Matter

| What | Result |
| --- | --- |
| 🔬 Causal circuit discovery speed | 1.2 s on CPU vs. 43.2 s baseline; 37× faster than ACDC (Conmy et al. 2023) |
| 📊 Confidence vs. faithfulness correlation | r = 0.009, near zero; confidence-based eval signals are unreliable |
| ✅ LLM explanation quality | 99% quality via ERASER metrics vs. 60% template baseline |
| 🧪 CI reliability | 12/12 passing tests; reproducible, auditable evaluation framework |
| 📦 Open-source reach | Published on arXiv, deployed on Hugging Face, packaged on PyPI with 76 automated tests |
| 📝 Research output | Submitted to ICML 2026 Workshop on Mechanistic Interpretability |

Featured Projects

Python PyTorch TransformerLens arXiv PyPI Hugging Face

This project grew out of a direct question: can we tell, causally, which parts of GPT-2 drove a specific prediction?

  • Built a causal circuit discovery engine that answers that question in 1.2 s on CPU using 3 forward passes, 37× faster than the ACDC baseline
  • Quantified a near-zero correlation (r = 0.009) between model confidence and internal reasoning faithfulness, a result with direct implications for EU AI Act compliance
  • Automated generation of all 9 required EU AI Act Annex IV sections from a single function call β€” structured JSON output ready for GRC system import
  • Published on arXiv (2603.09988), deployed a live Hugging Face demo, and shipped to PyPI with a CLI + 76 automated tests

Compliance teams can audit any model in under a minute with zero infrastructure setup.
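The activation-patching core behind the circuit discovery can be sketched with plain PyTorch forward hooks. This is a toy stand-in (the real engine patches GPT-2 activations via TransformerLens); it shows the three-pass structure: cache a clean activation, get the corrupted baseline, then re-run the corrupted input with the clean activation spliced in.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy two-layer network standing in for a transformer stack.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
clean, corrupt = torch.randn(1, 4), torch.randn(1, 4)

cache = {}

def save_hook(mod, inp, out):
    cache["act"] = out.detach()   # remember the clean activation

def patch_hook(mod, inp, out):
    return cache["act"]           # replace the output with the clean activation

layer = model[0]
handle = layer.register_forward_hook(save_hook)
clean_logits = model(clean)       # pass 1: clean run, cache activation
handle.remove()

corrupt_logits = model(corrupt)   # pass 2: corrupted baseline

handle = layer.register_forward_hook(patch_hook)
patched_logits = model(corrupt)   # pass 3: corrupted input + clean patch
handle.remove()

# If patching this layer restores the clean logits, the layer is causally
# important for the behaviour being studied.
restored = (patched_logits - corrupt_logits).abs().sum().item()
print(f"patch moved logits by {restored:.4f}")
```

Because the whole layer-0 output is replaced here, the patched run reproduces the clean logits exactly; patching narrower components (single heads, single positions) is what turns this into circuit discovery.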


Azure OpenAI Azure AI Search FastAPI Streamlit GPT-4o-mini

Document Q&A system with source citations built on Azure's full AI stack.

  • Hybrid search combining vector embeddings + keyword matching for semantically aware retrieval
  • Document ingestion pipeline with 512-token chunking and text-embedding-3-small embeddings
  • FastAPI backend + Streamlit frontend with streaming responses for real-time answer generation
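The 512-token chunking step can be sketched as a sliding window over a token list. The overlap value and helper name here are illustrative assumptions; the real pipeline tokenizes with the text-embedding-3-small tokenizer rather than accepting arbitrary token lists.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into fixed-size chunks with overlap.

    `overlap` is an assumed parameter: adjacent chunks share a few tokens
    so that sentences straddling a boundary survive in at least one chunk.
    """
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

doc = list(range(1200))  # stand-in for a 1200-token document
chunks = chunk_tokens(doc)
print(len(chunks), len(chunks[0]))  # 3 chunks, the first one 512 tokens long
```

Each chunk is then embedded and indexed separately, which is what makes the hybrid vector + keyword retrieval operate at passage granularity.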

Azure Machine Learning MLflow scikit-learn Azure ML SDK v2

Automated 4-step ML pipeline: data prep → training → evaluation → model registration.

  • 74% test accuracy, 80% F1, 87% AUC-ROC on heart disease prediction (200-record held-out test set)
  • Auto-scaling compute with a minimum of zero nodes; clusters shut down automatically when idle
  • MLflow tracking + Azure ML Model Registry for full experiment reproducibility and version rollback
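The evaluation step boils down to three scikit-learn metric calls on a held-out split. A sketch with a synthetic stand-in dataset (the real heart-disease table lives in the Azure ML workspace and is not reproduced here), mirroring the 200-record held-out test set:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for the heart-disease table.
X, y = make_classification(n_samples=1000, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=200, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]   # probability of the positive class
pred = (proba >= 0.5).astype(int)

acc = accuracy_score(y_te, pred)
f1 = f1_score(y_te, pred)
auc = roc_auc_score(y_te, proba)        # AUC uses probabilities, not labels
print(f"accuracy={acc:.2f}  f1={f1:.2f}  auc={auc:.2f}")
```

In the pipeline these numbers are logged to MLflow, which is what makes the registered model versions comparable after the fact.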

Research

Explainable AI for LLMs: A Causally Grounded Pipeline
Submitted to ICML 2026 Workshop on Mechanistic Interpretability

arXiv

The core finding: traditional attention-based metrics miss 39% of prediction behavior. Ground truth was established via 100% sufficiency scoring using activation patching and causal circuit analysis. The pipeline converts technical circuit data into structured natural-language explanations validated against ERASER metrics.
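Sufficiency in the ERASER sense compares the model's confidence in its prediction on the full input with its confidence on the extracted rationale alone. Schematically, with hypothetical numbers:

```python
# Hypothetical probabilities for one example: the predicted class's
# probability on the full input vs. on the extracted rationale alone.
p_full = 0.92
p_rationale_only = 0.90

# ERASER-style sufficiency gap: a small drop means the rationale alone
# is (nearly) sufficient to reproduce the prediction.
sufficiency_gap = p_full - p_rationale_only
print(f"sufficiency gap: {sufficiency_gap:.2f}")
```

A gap near zero across the dataset is what "100% sufficiency scoring" targets: the explanation carries the evidence the model actually used.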


Stack

Languages: Python, SQL

ML / Research: PyTorch, TransformerLens, HuggingFace, scikit-learn, NumPy, Pandas

Cloud / Infra: Azure Machine Learning, MLflow, Docker, REST APIs, FastAPI, GitHub Actions, CI/CD

Core Expertise: Mechanistic Interpretability · Activation Patching · Transformer Architecture · LLM Evaluation Methodology · Causal Analysis · Python Package Development (PyPI) · Prompt Engineering


Certifications

Azure AI Engineer Azure AI Fundamentals BlueDot Google UX Research


Currently

  • πŸ“ MSc Thesis β€” Mechanistic interpretability of LLMs with causally grounded explanations
  • 🏒 ML Engineer @ One75 Labs β€” Production LLM evaluation infrastructure, Berlin
  • 🎯 Open to β€” ML Engineer / AI Researcher roles in the EU (post-graduation, May 2026)

I don't just run models; I open them up and see what's actually going on inside.


Pinned

  1. Causally-Grounded-Mechanistic-Interpretability-for-LLMs-with-Faithful-Natural-Language-Explanations (Jupyter Notebook)
     MSc thesis: bridging mechanistic interpretability circuits to faithful natural language explanations using ERASER evaluation metrics.

  2. Glassbox-AI-2.0-Mechanistic-Interpretability-tool (Python)
     EU AI Act Annex IV compliance audit platform + mechanistic interpretability toolkit. White-box circuit analysis, black-box audit for any model via API. Open source, MIT-licensed.

  3. azure-ai-rag-system (Python)
     Production RAG system using Azure OpenAI + Azure AI Search + Blob Storage. Hybrid vector search, document chunking, streaming responses.

  4. azure-ml-pipeline (Python)
     End-to-end ML pipeline on Azure Machine Learning for heart disease prediction. Features a 4-step automated workflow (data prep, training, evaluation, registration), MLflow experiment tracking, and ma…

  5. activation-patching-framework (Python)
     Causal intervention framework for mechanistic interpretability research. Implements activation patching methodology for identifying causally important components in transformer language models.

  6. logit-lens-explorer (Python)
     Mechanistic interpretability tool visualizing GPT-2's layer-by-layer predictions using the logit lens technique.