I build and deliver end-to-end analytical projects that solve real business problems by combining machine learning, deep learning and LLMs with clear business storytelling. Currently open to Data Analyst, Data Scientist and Business Intelligence roles.
- Master's in Business Analytics from the University of Leeds
- Background in Data Analytics + Business Development: I speak both technical and commercial
- Currently building: Real-time Fraud Detection with XGBoost, LSTM and GPT-4o-mini
- Learning: LLM integration, MLOps, cloud deployment
- Based in the UK | Open to remote and hybrid roles
Production-grade fraud detection pipeline on 590K real transactions combining three complementary ML approaches with LLM-powered explanations.
- Built ensemble pipeline: Isolation Forest (unsupervised) + XGBoost (AUC 0.924) + LSTM sequence model (AUC 0.891) to catch fraud patterns no single model detects alone
- Engineered 12 domain-specific fraud features including transaction velocity, card spend deviation and time-based anomaly flags
- Integrated SHAP + GPT-4o-mini to auto-generate plain-English fraud analyst reports for every flagged transaction
- Deployed as a live Streamlit dashboard with real-time transaction simulation, fraud alerts and model comparison visualisations
Python XGBoost TensorFlow LSTM SHAP OpenAI Streamlit Plotly
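
To illustrate the kind of transaction velocity and card spend deviation features listed above, here is a minimal sketch using only the Python standard library. The field names (`card_id`, `timestamp`, `amount`), the one-hour window and the function name `add_fraud_features` are assumptions made for illustration; this is not the project's actual feature code.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


def add_fraud_features(transactions, window_seconds=3600):
    """Annotate transactions with per-card velocity and spend deviation.

    Each transaction is a dict with hypothetical keys:
    card_id, timestamp (seconds) and amount.
    """
    recent = defaultdict(deque)  # card_id -> timestamps still inside the window
    history = defaultdict(list)  # card_id -> all earlier amounts for that card
    enriched = []

    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        card, ts, amount = tx["card_id"], tx["timestamp"], tx["amount"]

        # Velocity: number of earlier transactions on this card in the window.
        window = recent[card]
        while window and ts - window[0] > window_seconds:
            window.popleft()

        # Spend deviation: z-score of this amount against the card's history.
        past = history[card]
        spread = pstdev(past) if len(past) >= 2 else 0.0
        deviation = (amount - mean(past)) / spread if spread > 0 else 0.0

        enriched.append({**tx, "velocity": len(window), "spend_deviation": deviation})

        # Update state only after scoring, so a transaction never sees itself.
        window.append(ts)
        past.append(amount)

    return enriched
```

Features are computed from earlier transactions only, which avoids leaking the current transaction into its own score.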
End-to-end churn prediction pipeline for a telecom company, from raw data to deployed ML model to Power BI executive dashboard.
- Cleaned and analysed 7,043 customer records across 21 features, identifying key churn drivers including contract type, tenure and monthly charges
- Built and compared three models: Logistic Regression, Random Forest and XGBoost (AUC 0.91), with full SHAP feature importance analysis
- Wrote advanced SQL queries (window functions, CTEs, CASE WHEN aggregations) to segment customers by risk profile
- Delivered a Power BI dashboard with slicers by contract type, risk segment and internet service for business stakeholder reporting
Python pandas XGBoost SQLite SQL Power BI scikit-learn seaborn
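
The SQL segmentation step above could look roughly like the following sketch, which combines a CTE, a `CASE WHEN` risk label and window functions in SQLite via Python's built-in `sqlite3` module. The table name, column names, thresholds and the function `segment_customers` are assumptions for illustration, not the project's actual schema or rules. Window functions require SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical segmentation query: thresholds and labels are illustrative only.
QUERY = """
WITH scored AS (
    SELECT
        customer_id,
        monthly_charges,
        CASE
            WHEN contract = 'Month-to-month' AND tenure < 12 THEN 'High'
            WHEN tenure < 12 OR monthly_charges > 80 THEN 'Medium'
            ELSE 'Low'
        END AS risk_segment
    FROM customers
)
SELECT
    customer_id,
    risk_segment,
    RANK() OVER (
        PARTITION BY risk_segment ORDER BY monthly_charges DESC
    ) AS charge_rank,
    COUNT(*) OVER (PARTITION BY risk_segment) AS segment_size
FROM scored
ORDER BY customer_id
"""


def segment_customers(rows):
    """Load (customer_id, contract, tenure, monthly_charges) rows and segment them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customers ("
            "customer_id TEXT, contract TEXT, tenure INTEGER, monthly_charges REAL)"
        )
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()
```

Each returned row holds the customer ID, its risk segment, its rank by monthly charges within that segment, and the segment size.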
Quantitative investment model using time series analysis and ML to forecast FTSE 100 stock returns.
- Analysed 500K+ data points using ARIMA, GARCH and ensemble ML models
- Built in R with full statistical validation and backtesting framework
- Delivered actionable investment signals with confidence intervals
R ARIMA GARCH Time Series Python scikit-learn
ML platform for cryptocurrency ICO investment analysis using NLP and sentiment analysis.
- Built predictive models using Naive Bayes classification on ICO whitepaper text data
- Implemented end-to-end sentiment analysis pipeline on social and news data
- Combined NLP signals with financial features for investment scoring
Python NLP Naive Bayes Sentiment Analysis Machine Learning
Interactive executive dashboard analysing £963M revenue across 22 countries for strategic decision-making.
- Identified market expansion opportunities and underperforming regions through geographic revenue analysis
- Built drill-through reports and dynamic slicers for C-suite stakeholder consumption
- Connected Python data pipeline to Power BI for automated refresh
Power BI Python SQL DAX Data Modelling
Open to: Data Analyst · Data Scientist · Business Intelligence · ML Engineer roles
Available for: Full-time · Contract · Remote · Hybrid