Auditing AI Fairness in Insurance: A Responsible AI Framework Built from Scratch

Summary

I built a complete Responsible AI audit system for health insurance premium prediction — covering fairness, explainability, counterfactual recourse, and regulatory compliance — and deployed it as a live, interactive tool that runs entirely in the browser. Using a dataset of 1,338 US insurance records, I trained a Logistic Regression classifier to predict whether a policyholder's annual charges exceed the median threshold of $9,382. The model achieves 89.9% accuracy and a ROC-AUC of 0.9425. On top of the model, I layered four RAI capabilities: Fairlearn for demographic fairness auditing, SHAP for decision-level explainability, Microsoft's DiCE for counterfactual recourse generation, and a manual mapping to EU AI Act and NIST AI Risk Management Framework requirements. The model is exported to ONNX and runs client-side via WebAssembly — no inference server required. This project demonstrates that responsible AI is not a theoretical aspiration; it is an engineering discipline with concrete, deployable tools.

1. Why I Built This

Let me start with a scenario that is not hypothetical. An insurance company deploys an AI model to help price health insurance premiums. The model is accurate — it predicts high-cost policyholders correctly most of the time. But nobody at the company can explain why it flags a specific 58-year-old woman from the northwest as high-risk. The model just does. The woman's premium goes up. She has no recourse, no explanation, and no way to know whether the decision was fair.

This is the black box problem, and it is not a minor inconvenience. It is a structural failure in how AI is being deployed in high-stakes domains. I built this project because I wanted to demonstrate — concretely, with working code and a live interface — that the black box problem is solvable. Not with philosophy, but with engineering.

The Black Box

Most production AI models produce a prediction with no explanation. A claims adjudicator, an underwriter, or a policyholder cannot interrogate the decision. SHAP (SHapley Additive exPlanations) solves this by computing, for every single prediction, exactly how much each input feature contributed to the outcome. I can tell you not just that the model flagged someone as high-cost, but that age contributed +0.22 to that decision, region contributed +0.11, and smoker status contributed almost nothing.

2. What This Project Does

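I built five interconnected capabilities, each addressing a different dimension of responsible AI: a live prediction engine, a fairness audit, SHAP explainability, counterfactual recourse, and regulatory compliance mapping.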

3. Interactive Responsible AI Dashboard

Explore the end-to-end evaluation of the health insurance premium prediction model, featuring live ONNX inference, fairness metrics, SHAP explainability, and DiCE counterfactuals.

Dashboard Chart Legends & Explanations

Chart 1: Disparate Impact Ratio by Region

Horizontal bar chart. Each bar represents one of four US regions (northeast, northwest, southeast, southwest). The bar length shows that region's disparate impact ratio relative to the highest-scoring region (southwest = 1.0, the baseline). The red dashed vertical line at 0.8 marks the regulatory 4/5ths rule threshold — any bar to the left of this line indicates potential discrimination requiring investigation. Northwest scores 0.838, which is above the threshold but borderline — it warrants monitoring in a production deployment. All four regions pass the threshold in this audit. The chart is interactive on the live site: hovering over a bar shows the exact ratio and a pass/fail indicator.

Chart 2: SHAP Waterfall Chart (Case B)

Horizontal bar chart with one bar per feature. Bars pointing LEFT (shown in green/teal) have negative SHAP values — they push the prediction toward Low Cost. Bars pointing RIGHT (shown in red) have positive SHAP values — they push toward High Cost. The baseline (E[f(x)] = model's average prediction) is shown at the left edge; the final prediction probability is shown at the right. For this 19-year-old male, non-smoker from the northwest: age contributes −0.2333 (the longest bar, pointing strongly left — youth is the dominant reason for the low-cost prediction), region_southeast contributes −0.0816, region_northwest contributes −0.0747, BMI contributes −0.0605, and children contributes −0.0274. Sex_male and smoker_yes are near-zero. The chart is interactive: hovering over each bar shows the feature value and exact SHAP contribution.
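The contributions quoted for Case B can be added up directly. The sketch below is plain arithmetic on the numbers in the description, not output from the project's SHAP code; the baseline E[f(x)] is not quoted in the text, so the `reconstruct_output` helper (a name introduced here for illustration) takes it as a parameter.

```python
# SHAP contributions quoted in the Case B description (negative = toward Low Cost).
# sex_male and smoker_yes are described only as "near-zero", so they are omitted.
contributions = {
    "age": -0.2333,
    "region_southeast": -0.0816,
    "region_northwest": -0.0747,
    "bmi": -0.0605,
    "children": -0.0274,
}

total_shift = sum(contributions.values())
age_share = contributions["age"] / total_shift

print(f"Net shift from the listed features: {total_shift:+.4f}")
print(f"Share of that shift due to age: {age_share:.1%}")


def reconstruct_output(baseline: float, shap_values: dict) -> float:
    """SHAP additivity: baseline plus the sum of contributions gives the model output."""
    return baseline + sum(shap_values.values())
```

On these figures the listed features move the prediction by about −0.4775 in total, and age accounts for a little under half of that shift.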

Chart 3: Counterfactual Recourse Explorer (Case CF2)

Side-by-side card layout. The left card shows the original profile: age 59, female, BMI 27.5, non-smoker, southwest — flagged as High Cost at 83.7% confidence. The three right cards show alternative profiles that would flip the decision to Low Cost. Changed values are highlighted in amber. Alternative 1: only age changes from 59 to 34. Alternative 2: age changes to 34 and BMI changes to 31.8. Alternative 3: age changes to 47. The fact that age alone drives the flip in Alternative 1 is the most important finding from this analysis: the model is so heavily weighted on age that a 25-year age reduction overrides all other factors. This raises a legitimate fairness concern for older policyholders — the model may be encoding age-based structural disadvantage rather than genuine individual risk.

Chart 4: Regulatory Compliance Checklist

Two-panel table. The top panel covers EU AI Act articles; the bottom panel covers NIST AI RMF categories. Each row contains the article or category identifier, the requirement name in plain English, a one-sentence evidence statement showing how this model satisfies the requirement, and a compliance badge. All four mapped items show green "Compliant" badges:

EU AI Act Article 10 (Data Governance): satisfied by the Fairlearn fairness audit.
EU AI Act Article 13 (Transparency and Explainability): satisfied by SHAP local explanations for every prediction.
NIST MAP 1.5 (Impact Characterization): satisfied by the disparate impact ratio monitoring across sex, age, and region.
NIST MEASURE 2.7 (Feedback and Recourse): satisfied by the DiCE counterfactual generation, which provides actionable recourse paths for affected individuals.

The checklist is interactive on the live site: clicking any row expands to show the full evidence statement and a recommendation for strengthening compliance.

4. The Data

Dataset Overview

The dataset is the Medical Cost Personal Dataset — 1,338 records of US health insurance policyholders with six features: age (18–64), sex (male/female), BMI, number of children covered, smoker status (yes/no), and residential region (northeast, northwest, southeast, southwest). The target variable is annual charges billed by the insurer.

I converted this into a binary classification problem by splitting on the median annual charge of $9,382.03. Policyholders above this threshold are labeled high-cost (1); those below are labeled low-cost (0). The resulting dataset is perfectly balanced — 669 high-cost and 669 low-cost cases — which eliminates class imbalance as a confounding factor in the fairness analysis.

Limitation

This dataset is derived from 1994 US Census data. Demographic patterns, healthcare costs, and regional pricing structures have changed substantially since then. The findings here are directionally valid for demonstrating RAI methodology, but a production deployment would require more recent, representative data.

5. Methods — The Python Pipeline

4a. Binary Target Creation

I chose the median split because it produces a balanced dataset and is an interpretable threshold. In production, this threshold would be set by actuaries.

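# Label a policyholder high-cost (1) when charges exceed the dataset median.
# Assumes `df` is the insurance data already loaded as a pandas DataFrame.
median_charges = df['charges'].median()
df['high_cost'] = (df['charges'] > median_charges).astype(int)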

4b. Preprocessing Pipeline

Mixed feature types are handled with scikit-learn's ColumnTransformer: scaling numeric features and one-hot encoding categorical ones.

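from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Scale numeric columns; one-hot encode categoricals, dropping the first level of each
preprocessor = ColumnTransformer(transformers=[
    ('num', StandardScaler(), ['age', 'bmi', 'children']),
    ('cat', OneHotEncoder(drop='first', sparse_output=False),
            ['sex', 'smoker', 'region'])
])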

4c. Model Choice

Logistic Regression was chosen over Random Forest for three reasons: ONNX export precision, exact SHAP interpretability, and a negligible accuracy trade-off (89.9% accuracy, ROC-AUC 0.9425).

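from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Chain preprocessing and the classifier so both are fit and applied together
model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', LogisticRegression(max_iter=1000, random_state=42, C=1.0))
])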

4g. ONNX Export

To run in the browser, the pipeline is split in two. The classifier is exported as ONNX, while the fitted preprocessor state (for example, the scaler means) is saved as JSON so the same preprocessing can be reproduced client-side.

onnx_model = convert_sklearn(classifier, ...)
preprocessor_config = { "scaler_means": ... }
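To make the client-side preprocessing step concrete, here is a rough Python mirror of the logic that would have to run before the ONNX classifier is called. It is a sketch under stated assumptions, not the project's actual code: only `scaler_means` appears in the snippet above, so the `numeric_features`, `scaler_scales`, and `categories` keys are hypothetical, and every number in the config is a placeholder.

```python
import json

# Hypothetical config layout: only "scaler_means" is named in the article;
# the other keys are assumed, and all numbers are placeholders.
CONFIG_JSON = """
{
  "numeric_features": ["age", "bmi", "children"],
  "scaler_means": [39.0, 30.5, 1.1],
  "scaler_scales": [14.0, 6.0, 1.2],
  "categories": {
    "sex": ["female", "male"],
    "smoker": ["no", "yes"],
    "region": ["northeast", "northwest", "southeast", "southwest"]
  }
}
"""


def preprocess(record: dict, config: dict) -> list:
    """Rebuild the feature vector the exported classifier expects."""
    features = []
    # Standard scaling: (x - mean) / scale, in the same column order as training.
    for name, mean, scale in zip(config["numeric_features"],
                                 config["scaler_means"],
                                 config["scaler_scales"]):
        features.append((record[name] - mean) / scale)
    # One-hot encoding with the first level dropped (drop='first'), so
    # "female", "no", and "northeast" act as the implicit reference levels.
    for name, levels in config["categories"].items():
        for level in levels[1:]:
            features.append(1.0 if record[name] == level else 0.0)
    return features


config = json.loads(CONFIG_JSON)
record = {"age": 53, "bmi": 30.5, "children": 0,
          "sex": "male", "smoker": "no", "region": "southeast"}
vector = preprocess(record, config)
print(vector)
```

The resulting vector has eight entries (three scaled numerics, then sex_male, smoker_yes, and three region indicators), which matches the feature names that appear in the SHAP results.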

4d. Fairness Audit with Fairlearn

Fairlearn's `MetricFrame` is the core of the fairness audit. It computes any set of metrics stratified by a sensitive attribute.
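As a dependency-free illustration of what a stratified metric table contains, the sketch below computes per-group accuracy and selection rate by hand, then the between-group difference and ratio. It is not Fairlearn itself; `MetricFrame` exposes comparable results through `by_group`, `difference()`, and `ratio()`. The labels, predictions, and group assignments are toy values invented for this example.

```python
from collections import defaultdict


def by_group(y_true, y_pred, groups):
    """Per-group accuracy and selection rate (share predicted positive)."""
    buckets = defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        buckets[group].append((truth, pred))
    table = {}
    for group, pairs in buckets.items():
        n = len(pairs)
        table[group] = {
            "n": n,
            "accuracy": sum(t == p for t, p in pairs) / n,
            "selection_rate": sum(p for _, p in pairs) / n,
        }
    return table


# Toy labels and predictions, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
region = ["southwest"] * 4 + ["northwest"] * 4

table = by_group(y_true, y_pred, region)
rates = [row["selection_rate"] for row in table.values()]
parity_difference = max(rates) - min(rates)  # gap between highest and lowest rate
parity_ratio = min(rates) / max(rates)       # lowest rate relative to highest

for group, row in table.items():
    print(group, row)
print("difference:", parity_difference, "ratio:", parity_ratio)
```

The ratio computed this way is the quantity compared against the 0.8 four-fifths threshold in the dashboard.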

Key Finding

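The critical metric is the disparate impact ratio: the rate at which one group is flagged high-cost divided by the rate for the reference group. The most significant finding: policyholders under 40 have a disparate impact ratio of 0.295 relative to those 40 and over, well below the 0.8 four-fifths threshold. They are flagged as high-cost at less than a third the rate of older policyholders, with a demographic parity difference (DPD) of 0.5675. Is the model encoding risk, or is it encoding age as a proxy for something else?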
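The two reported numbers pin down the underlying selection rates, if one assumes both were computed from the same pair of group rates (the text does not state this explicitly). With ratio = rate_younger / rate_older and difference = rate_older − rate_younger, a short back-of-the-envelope calculation recovers them:

```python
ratio = 0.295        # disparate impact ratio, under 40 vs 40 and over (from the text)
difference = 0.5675  # demographic parity difference (from the text)

# Substitute rate_younger = ratio * rate_older into the difference and solve:
#   rate_older * (1 - ratio) = difference
rate_older = difference / (1 - ratio)
rate_younger = ratio * rate_older

print(f"Implied high-cost rate, 40 and over: {rate_older:.3f}")
print(f"Implied high-cost rate, under 40:    {rate_younger:.3f}")
print("Passes four-fifths rule:", ratio >= 0.8)
```

Under that assumption, roughly four in five older policyholders are flagged high-cost against roughly one in four younger ones.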

4e. SHAP Explainability

SHAP computes the contribution of each feature to the deviation from the model's average prediction. I used `shap.LinearExplainer` for Logistic Regression, which produces exact SHAP values.
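Why the values can be exact for a linear model: if features are treated as independent, the SHAP value of feature i reduces to w_i · (x_i − E[x_i]), measured on the model's linear output (log-odds for logistic regression). The NumPy sketch below checks that identity on synthetic data. The weights, intercept, and background sample are invented, and this is a hand calculation rather than a call into the shap library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic background data and model parameters, invented for illustration.
X_background = rng.normal(size=(200, 3))
weights = np.array([1.5, -0.4, 0.2])
intercept = -0.1


def log_odds(X):
    """Linear output of a logistic regression (before the sigmoid)."""
    return X @ weights + intercept


x = np.array([0.8, -1.2, 0.3])

# Per-feature contribution relative to the background mean.
shap_values = weights * (x - X_background.mean(axis=0))
# Base value: the model's average linear output over the background data.
base_value = log_odds(X_background).mean()

# Additivity: base value plus contributions recovers the model output for x.
assert np.isclose(base_value + shap_values.sum(), log_odds(x))
print("base:", base_value, "contributions:", shap_values)
```

Because the model is linear, no sampling or approximation is involved; the contributions follow in closed form from the coefficients and the feature means.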

Global Feature Importance

age (0.2544): Dominant driver, roughly 2.2× the next most important feature (region_southeast, 0.1158).
region_southeast (0.1158): Second-tier — southeast policyholders skew higher cost.
region_northwest (0.1041): Third-tier — northwest borderline on disparate impact.
bmi (0.0283): Modest effect — obesity threshold matters but is secondary.
children (0.0243): Small effect — more dependents slightly increases cost tier.
region_southwest (0.0170): Minimal.
sex_male (0.0083): Near-zero — sex has almost no independent effect.
smoker_yes (0.0076): Near-zero. This is surprising given smoking's known impact on cost; a likely explanation is multicollinearity with age.

4f. DiCE Counterfactuals

DiCE (Diverse Counterfactual Explanations) answers the question: what small changes to this person's profile would flip the model's decision? I configured it to vary only age, BMI, children, and smoker status while holding sex and region fixed.
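The idea of a counterfactual search can be shown with a much simpler stand-in than DiCE: for each mutable feature, try candidate values in order of distance from the current value and keep the first one that flips the decision. This brute-force sketch is not DiCE's algorithm, and the logistic weights are placeholders rather than the coefficients of the model in this article. The profile reuses the Case CF2 values from the dashboard; the number of children is not given there, so 0 is assumed.

```python
import math

# Hypothetical logistic model on raw features; weights are placeholders,
# not the coefficients of the model described in this article.
WEIGHTS = {"age": 0.09, "bmi": 0.03, "children": 0.10, "smoker": 0.50}
INTERCEPT = -5.0


def predict_proba(profile):
    """Probability of the high-cost class under the toy model."""
    z = INTERCEPT + sum(WEIGHTS[k] * profile[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def single_feature_counterfactuals(profile, candidates, threshold=0.5):
    """For each mutable feature, find the closest value that flips the decision."""
    original_high = predict_proba(profile) >= threshold
    found = {}
    for feature, values in candidates.items():
        # Try candidate values nearest to the current value first.
        for value in sorted(values, key=lambda v: abs(v - profile[feature])):
            trial = {**profile, feature: value}
            if (predict_proba(trial) >= threshold) != original_high:
                found[feature] = value
                break
    return found


profile = {"age": 59, "bmi": 27.5, "children": 0, "smoker": 0}  # children assumed
candidates = {
    "age": range(18, 65),
    "bmi": [b / 2 for b in range(32, 101)],  # 16.0 to 50.0 in steps of 0.5
    "children": range(0, 6),
}
flips = single_feature_counterfactuals(profile, candidates)
print(flips)
```

Sex and region never appear in `candidates`, which is how "held fixed" is expressed in this sketch. DiCE additionally searches over combinations of features and optimizes for diversity among the alternatives it returns.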

6. How This Can Be Used in Practice

Integration with Policy Systems

The RAI audit layer integrates as a sidecar service. It does not replace the core system's decision, but it runs alongside every decision to audit it in real time, adding less than 50ms of latency.

Standalone Retrospective Audit

For legacy systems, the RAI audit runs as an independent microservice processing decision logs via a message queue to generate periodic compliance reports.

Underwriter Decision Support

The live prediction engine can be embedded directly into an underwriter's workstation, showing SHAP waterfalls and counterfactuals to make reasoning transparent and auditable.

Beyond Insurance

This architecture is domain-agnostic. It applies to loan approvals, hiring, medical triage, or credit scoring. Swap the dataset, and the RAI audit layer works identically.

7. Limitations and What Comes Next

Boundary Conditions

Dataset Currency: The dataset is derived from 1994 data. A production deployment requires recent, representative data.
Intersectional Fairness: The audit examines one sensitive attribute at a time. Intersectional analysis requires larger datasets.
Production Drift: The model is a static artifact. Without monitoring, a model that is fair today may become unfair as populations shift.
Age as a Variable: Age is legally protected in many contexts. A more rigorous implementation would hold age fixed in DiCE along with sex and region.
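The intersectional limitation can be made concrete with a quick count. The sketch below assumes the same groupings used in the audit (two sexes, the under-40 / 40-and-over age split, four regions) and, unrealistically, that records are spread evenly across cells; real cells would be uneven, so some would be smaller still.

```python
import math
from itertools import product

n_records = 1338  # dataset size stated in the article
attributes = {
    "sex": ["female", "male"],
    "age_band": ["under 40", "40 and over"],
    "region": ["northeast", "northwest", "southeast", "southwest"],
}

# Every combination of sex, age band, and region is one intersectional cell.
cells = list(product(*attributes.values()))
average_per_cell = n_records / len(cells)

# Worst-case standard error of a rate estimated from one average-sized cell
# (binomial proportion at p = 0.5).
worst_case_se = math.sqrt(0.5 * 0.5 / average_per_cell)

print(len(cells), "cells,", round(average_per_cell, 1), "records per cell on average")
print("worst-case standard error of a per-cell rate:", round(worst_case_se, 3))
```

With about 84 records per cell on average, a per-cell selection rate carries a standard error of around 0.05, which is large relative to the differences a four-fifths check is trying to detect.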

These are not failures of the project. They are the honest boundary conditions of a portfolio demonstration built on a public dataset. The architecture I built — ONNX inference, Fairlearn auditing, SHAP explanations, DiCE recourse, regulatory mapping — is the right architecture.

All code, data artifacts, and Svelte components for this project are available via the linked Colab notebook. The live inference engine runs at the project URL above.
