k10sj02/README.md

Hi, I’m Stann 👋🏽

I build end-to-end data systems that turn messy real-world data into decisions about where to focus, who to prioritize, and how to allocate limited resources.

Previously built systems used across:
→ $16B+ in modeled fundraising portfolios
→ 500K+ leads scored in production CRM pipelines
→ 150K+ voter outreach interactions across 13 U.S. states and 5 Canadian provinces, spanning 1,200+ communities

Most of my work lives at the intersection of:
→ analytics engineering
→ applied modeling
→ product thinking

Often in environments where the data is incomplete, biased, or operationally messy (fundraising, civic tech, campaigns, early-stage startups).

In practice, much of my work involves building data systems that help organizations decide who to ask, how much to ask, and when to ask, whether the audience is donors, voters, or customers.


🚀 Featured Projects

πŸ—³οΈ Voter Engagement Targeting & Analytics System

End-to-end data pipeline and targeting system for a national voter outreach campaign across 13 U.S. swing states, uncovering previously unknown outreach locations and shaping targeting and deployment strategy across 15,000+ engagement sites.

→ designed and operationalized the geospatial targeting pipeline (Google Places → Census enrichment → BigQuery)
→ partnered on the messaging analytics pipeline (ThruText → S3 → GCS → BigQuery) to enable downstream reporting
→ developed a service density KPI to quantify outreach coverage vs. real-world infrastructure
→ identified 400K+ locations and prioritized high-impact communities for outreach

🔗 https://github.com/k10sj02/barbershop-voter-engagement-analytics


💸 Donor Retention & Propensity System

Donor propensity scoring system for nonprofits, modeled on real fundraising workflows and built to prioritize outreach toward the highest-value donors most likely to give again.

→ engineered 15+ RFM and donor profile features from transactional data to capture giving lifecycle behavior
→ trained Random Forest classifier (ROC-AUC 0.87, 2.5x lift at top 33%) to predict likelihood of repeat giving
→ designed four-tier segmentation framework (High / Medium / Low / Very Low) with actionable outreach guidance per tier
→ deployed as an interactive Streamlit app with CSV ingestion, column mapping, live filtering, and export for fundraising teams
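
For readers unfamiliar with the "lift at top 33%" figure above, here is a minimal sketch of how such a metric is commonly computed: the repeat-giving rate among the top-scored third of donors divided by the overall repeat-giving rate. The function name `lift_at_top` and the sample scores are illustrative assumptions, not code or data from the repository.

```python
def lift_at_top(scores, labels, fraction=0.33):
    """Return the lift among the top `fraction` of records ranked by score.

    scores: model scores (higher = more likely to give again)
    labels: 1 if the donor actually gave again, else 0
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")

    # Rank donors from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    # Size of the top segment (at least one donor).
    k = max(1, round(len(ranked) * fraction))

    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when no donor gave again")
    return top_rate / base_rate


# Hypothetical data: six donors, two of whom gave again.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
example_labels = [1, 1, 0, 0, 0, 0]
example_lift = lift_at_top(example_scores, example_labels)
```

A lift of 2.5x would mean the top-scored third of donors repeat at 2.5 times the base rate, which is what makes the score useful for prioritizing outreach.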

🔗 https://github.com/k10sj02/nonprofit-donor-scoring


πŸ—‚οΈ Voter Registration Data Pipeline

Production-style partner data ingestion pipeline for a voter registration reporting model, handling the full transformation lifecycle from raw extract to unified reporting table.

→ designed a three-layer architecture (raw → staging → mart) with an auditable, non-destructive working copy at each stage
→ built deterministic validation rules for contact data (email, ZIP, NANP phone) and demographic bounds (age 18–105, recency constraints)
→ engineered county enrichment via ZIP lookup using LEFT JOIN to preserve record fidelity over silent data loss
→ implemented window-function deduplication with explicit tie-breaking logic (Complete status → recency → email)
→ produced schema-aligned UNION ALL integration with semantic field mapping and UUID surrogate keys for lineage tracking
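
The deterministic validation rules listed above could look roughly like the following sketch. The field names (`email`, `zip`, `phone`, `age`), the regular expressions, and the choice to return flags rather than drop rows are assumptions made for illustration; the repository's actual rules may differ, and the recency constraints are not shown.

```python
import re

# Deliberately simple patterns; real-world email validation is looser than any regex.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
ZIP_RE = re.compile(r"\d{5}(-\d{4})?")       # 5-digit ZIP or ZIP+4
NANP_RE = re.compile(r"[2-9]\d{2}[2-9]\d{6}")  # area code and exchange start with 2-9


def normalize_phone(raw):
    """Strip formatting and return a 10-digit NANP number, or None if invalid."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits if NANP_RE.fullmatch(digits) else None


def validate_record(rec):
    """Return validity flags for one record without modifying or dropping it."""
    age = rec.get("age")
    return {
        "email_valid": bool(EMAIL_RE.fullmatch(rec.get("email") or "")),
        "zip_valid": bool(ZIP_RE.fullmatch(rec.get("zip") or "")),
        "phone_valid": normalize_phone(rec.get("phone")) is not None,
        "age_valid": isinstance(age, int) and 18 <= age <= 105,
    }


# Hypothetical record for illustration.
flags = validate_record(
    {"email": "a@example.org", "zip": "30303", "phone": "(404) 555-0123", "age": 42}
)
```

Returning flags instead of filtering keeps the step non-destructive, in line with the auditable working copy described for each layer.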

🔗 https://github.com/k10sj02/voter-reg-pipeline
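
The deduplication step in this pipeline keeps one record per person using an explicit ordering (Complete status, then recency, then email). The sketch below is a plain-Python equivalent of a `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) = 1` filter. The field names (`person_key`, `status`, `updated_at`, `email`) are hypothetical, and reading the email tie-break as "prefer a record that has an email" is an assumption.

```python
def dedupe(records, key="person_key"):
    """Keep the best record per key: Complete status, then most recent, then has email."""
    best = {}
    for rec in records:
        # Tuples compare left to right, so earlier elements take priority.
        rank = (
            rec.get("status") == "Complete",  # Complete beats any other status
            rec.get("updated_at") or "",      # ISO dates sort chronologically as strings
            bool(rec.get("email")),           # assumed: a present email wins the final tie
        )
        k = rec[key]
        # Strict '>' means the first-seen record wins a full tie, keeping results stable.
        if k not in best or rank > best[k][0]:
            best[k] = (rank, rec)
    return [rec for _, rec in best.values()]


# Hypothetical input rows.
rows = [
    {"person_key": 1, "status": "Incomplete", "updated_at": "2024-05-01", "email": "x@example.org"},
    {"person_key": 1, "status": "Complete", "updated_at": "2024-03-01", "email": ""},
    {"person_key": 1, "status": "Complete", "updated_at": "2024-04-01", "email": ""},
    {"person_key": 2, "status": "Incomplete", "updated_at": "2024-01-01", "email": ""},
    {"person_key": 2, "status": "Incomplete", "updated_at": "2024-01-01", "email": "y@example.org"},
]
unique_rows = dedupe(rows)
```

Spelling out the ordering as a tuple makes the tie-breaking auditable: the same input always yields the same surviving record.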


🧠 Behavioral Modeling: Gender Norms

Predictive modeling and validation study on how social norms shape behavior, highlighting the limits of prediction in complex, real-world data.

→ trained classification models (Random Forest, Logistic Regression) on attitudinal survey data
→ identified a hard performance ceiling (~0.65 AUC) and investigated underlying causes of limited predictability
→ traced constraints to survey design, measurement limitations, and noise in self-reported attitudes
→ paired modeling with a literature review to contextualize results within social science research
→ deployed an interactive app to surface predictions and feature-level drivers

🔗 https://github.com/k10sj02/gender-norms-predictor


🧪 What I Focus On

  • building systems, not one-off analysis
  • turning data into decisions, not dashboards
  • designing metrics that actually reflect real-world behavior
  • working in imperfect, real-world data environments

βš™οΈ Tech

Core: SQL, Python
Warehousing: BigQuery, Snowflake, PostgreSQL, dbt
Apps & Viz: Streamlit, Tableau, Power BI, Looker
Infra: GCP, AWS, Docker, Git


🌐 Elsewhere

Portfolio: https://stannomarjones.com
LinkedIn: https://linkedin.com/in/stannomarjones

Pinned

  1. barbershop-voter-engagement-analytics (Public)

    Turning barbershops into ballot boxes. Data pipeline and interactive explorer built for Shape Up The Vote's 2024 national voter engagement campaign across 13 swing states.

    Jupyter Notebook

  2. nonprofit-donor-scoring (Public)

    Donor propensity scoring app for nonprofits. Predicts donor retention using Random Forest with RFM, engagement, and demographic features, with an interactive Streamlit dashboard.

    Python

  3. voter-reg-pipeline (Public)

    Interactive Streamlit app demonstrating a production-style voter registration data pipeline. Covers deduplication, contact validation, county enrichment via ZIP lookup, and schema-aligned UNION int…

    Python

  4. gender-norms-predictor (Public)

    Predicts whether men feel expected to make the first move in romantic relationships, using behavioral and attitudinal data from the 2018 WNYC/FiveThirtyEight masculinity survey (n=1,615). Includes …

    Jupyter Notebook