Auditing AI Fairness
Published: June 2026
Auditing AI Fairness in Insurance: A Responsible AI Framework Built from Scratch
1. Why I Built This
Let me start with a scenario that is not hypothetical. An insurance company deploys an AI model to help price health insurance premiums. The model is accurate — it predicts high-cost policyholders correctly most of the time. But nobody at the company can explain why it flags a specific 58-year-old woman from the northwest as high-risk. The model just does. The woman's premium goes up. She has no recourse, no explanation, and no way to know whether the decision was fair.
This is the black box problem, and it is not a minor inconvenience. It is a structural failure in how AI is being deployed in high-stakes domains. I built this project because I wanted to demonstrate — concretely, with working code and a live interface — that the black box problem is solvable. Not with philosophy, but with engineering.
- The Black Box: Explainability
- The Bias: Demographic Fairness
- The Recourse: Counterfactuals
- The Regulatory: Compliance
2. What This Project Does
I built five interconnected capabilities, each addressing a different dimension of responsible AI:
Live Prediction Engine
Browser-based Inference
The trained model is exported to ONNX format and runs directly in the browser via WebAssembly. A visitor to my portfolio can enter a policyholder profile — age, sex, BMI, number of children, smoker status, region — and receive a prediction in real time. No server. No API call. The inference happens on the visitor's own machine.
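To make the browser-side flow concrete, here is a minimal Python sketch of the arithmetic the client has to reproduce around the ONNX call: standardize the numeric features, one-hot encode the categoricals with the first level dropped, then score the resulting vector. Every value in `CONFIG`, the `weights` list, and the intercept is a hypothetical placeholder rather than the trained model's parameters, and `predict_proba` is only a stand-in for the exported classifier. The live site performs the equivalent steps in the browser, not in Python.

```python
import math

# Hypothetical preprocessor state, standing in for the exported JSON config.
CONFIG = {
    "numeric": ["age", "bmi", "children"],
    "scaler_means": [39.0, 30.5, 1.1],
    "scaler_scales": [14.0, 6.1, 1.2],
    # With drop='first', the alphabetically first level of each category is the baseline.
    "categories": {
        "sex": ["female", "male"],
        "smoker": ["no", "yes"],
        "region": ["northeast", "northwest", "southeast", "southwest"],
    },
}

def preprocess(profile, config=CONFIG):
    """Turn a raw profile dict into the feature vector the classifier expects."""
    features = []
    for name, mean, scale in zip(config["numeric"],
                                 config["scaler_means"],
                                 config["scaler_scales"]):
        features.append((profile[name] - mean) / scale)
    for name, levels in config["categories"].items():
        # One 0/1 indicator per non-baseline level.
        features.extend(1.0 if profile[name] == level else 0.0 for level in levels[1:])
    return features

def predict_proba(features, coefficients, intercept):
    """Stand-in for the ONNX classifier: logistic regression on the feature vector."""
    margin = intercept + sum(w * x for w, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-margin))

profile = {"age": 58, "sex": "female", "bmi": 27.5, "children": 0,
           "smoker": "no", "region": "northwest"}
x = preprocess(profile)

# Hypothetical weights, one per feature: 3 numeric + sex_male + smoker_yes + 3 regions.
weights = [1.8, 0.2, 0.15, -0.05, 0.1, -0.6, -0.7, -0.1]
p_high_cost = predict_proba(x, weights, intercept=0.0)
```

The point of the sketch is that the client-side preprocessing must match the training-time preprocessing exactly (same means, same scales, same category order); any mismatch silently changes the prediction.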
Fairness Audit
Regulatory Thresholds
Using Fairlearn's MetricFrame, I compute selection rates, false positive rates, true positive rates, and disparate impact ratios across three sensitive dimensions: sex, age group, and geographic region. Every metric is compared against regulatory thresholds.
SHAP Explainability
Feature Contribution
For every prediction — both the three pre-computed case studies and any live prediction — I show a waterfall chart breaking down which features drove the decision and by how much. This is the explanation a policyholder, underwriter, or regulator would need to evaluate the decision.
Counterfactual Recourse
Alternative Profiles
For cases flagged as high-cost, DiCE generates alternative profiles that would flip the decision to low-cost, varying only age, BMI, children, and smoker status. Sex and region are held fixed: they should not be the levers a person is told to change. Age is allowed to vary in this demo even though it is legally protected in many contexts, a choice revisited under Limitations.
Regulatory Compliance Mapping
EU AI Act & NIST
I manually map the model's documented behaviors against four specific requirements from the EU AI Act and NIST AI RMF, producing a compliance checklist with evidence for each item.
3. Interactive Responsible AI Dashboard
Explore the end-to-end evaluation of the health insurance premium prediction model, featuring live ONNX inference, fairness metrics, SHAP explainability, and DiCE counterfactuals.
Dashboard Chart Legends & Explanations
Chart 1: Disparate Impact Ratio by Region
Horizontal bar chart. Each bar represents one of four US regions (northeast, northwest, southeast, southwest). The bar length shows that region's disparate impact ratio relative to the highest-scoring region (southwest = 1.0, the baseline). The red dashed vertical line at 0.8 marks the regulatory 4/5ths rule threshold — any bar to the left of this line indicates potential discrimination requiring investigation. Northwest scores 0.838, which is above the threshold but borderline — it warrants monitoring in a production deployment. All four regions pass the threshold in this audit. The chart is interactive on the live site: hovering over a bar shows the exact ratio and a pass/fail indicator.
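The calculation behind this chart can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the audit code itself: the selection rates in `rates` are hypothetical values, chosen only so that northwest lands near the reported 0.838, and the function names are made up for this example.

```python
def disparate_impact_ratios(selection_rates):
    """Scale each group's selection rate by the highest rate, so the top group is 1.0."""
    reference = max(selection_rates.values())
    return {group: rate / reference for group, rate in selection_rates.items()}

def four_fifths_check(ratios, threshold=0.8):
    """Map each group to True (passes) or False (needs investigation)."""
    return {group: ratio >= threshold for group, ratio in ratios.items()}

# Hypothetical selection rates: share of each region flagged as high-cost.
rates = {"northeast": 0.52, "northwest": 0.44, "southeast": 0.50, "southwest": 0.525}

ratios = disparate_impact_ratios(rates)
passes = four_fifths_check(ratios)

for region, ratio in ratios.items():
    status = "pass" if passes[region] else "investigate"
    print(f"{region:<10} {ratio:.3f}  {status}")
```

Using the highest-rate group as the reference keeps every ratio at or below 1.0, which is what lets a single vertical line at 0.8 serve as the threshold on the chart.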
Chart 2: SHAP Waterfall Chart (Case B)
Horizontal bar chart with one bar per feature. Bars pointing LEFT (shown in green/teal) have negative SHAP values: they push the prediction toward Low Cost. Bars pointing RIGHT (shown in red) have positive SHAP values: they push toward High Cost. The baseline (E[f(x)], the model's average prediction) is shown at the left edge; the final prediction probability is shown at the right. For this 19-year-old male non-smoker from the northwest: age contributes −0.2333 (the longest bar, pointing strongly left; youth is the dominant reason for the low-cost prediction), region_southeast contributes −0.0816, region_northwest contributes −0.0747, BMI contributes −0.0605, and children contributes −0.0274. The sex_male and smoker_yes contributions are near zero. The chart is interactive: hovering over each bar shows the feature value and exact SHAP contribution.
Chart 3: Counterfactual Recourse Explorer (Case CF2)
Side-by-side card layout. The left card shows the original profile: age 59, female, BMI 27.5, non-smoker, southwest — flagged as High Cost at 83.7% confidence. The three right cards show alternative profiles that would flip the decision to Low Cost. Changed values are highlighted in amber. Alternative 1: only age changes from 59 to 34. Alternative 2: age changes to 34 and BMI changes to 31.8. Alternative 3: age changes to 47. The fact that age alone drives the flip in Alternative 1 is the most important finding from this analysis: the model is so heavily weighted on age that a 25-year age reduction overrides all other factors. This raises a legitimate fairness concern for older policyholders — the model may be encoding age-based structural disadvantage rather than genuine individual risk.
Chart 4: Regulatory Compliance Checklist
Two-panel table. Top panel covers EU AI Act articles; bottom panel covers NIST AI RMF categories. Each row contains: the article or category identifier, the requirement name in plain English, a one-sentence evidence statement showing how this model satisfies the requirement, and a compliance badge. All four mapped items show green "Compliant" badges. EU AI Act Article 10 (Data Governance): satisfied by the Fairlearn fairness audit. EU AI Act Article 13 (Transparency and Explainability): satisfied by SHAP local explanations for every prediction. NIST MAP 1.5 (Impact Characterization): satisfied by the disparate impact ratio monitoring across sex, age, and region. NIST MEASURE 2.7 (Feedback and Recourse): satisfied by the DiCE counterfactual generation, which provides actionable recourse paths for affected individuals. The checklist is interactive on the live site: clicking any row expands to show the full evidence statement and a recommendation for strengthening compliance.
4. The Data
Dataset Overview
The dataset is the Medical Cost Personal Dataset — 1,338 records of US health insurance policyholders with six features: age (18–64), sex (male/female), BMI, number of children covered, smoker status (yes/no), and residential region (northeast, northwest, southeast, southwest). The target variable is annual charges billed by the insurer.
I converted this into a binary classification problem by splitting on the median annual charge of $9,382.03. Policyholders above this threshold are labeled high-cost (1); those at or below it are labeled low-cost (0). The resulting dataset is perfectly balanced, with 669 high-cost and 669 low-cost cases, which eliminates class imbalance as a confounding factor in the fairness analysis.
Limitation
This dataset is derived from 1994 US Census data. Demographic patterns, healthcare costs, and regional pricing structures have changed substantially since then. The findings here are directionally valid for demonstrating RAI methodology, but a production deployment would require more recent, representative data.
5. Methods — The Python Pipeline
4a. Binary Target Creation
I chose the median split because it produces a balanced dataset and is an interpretable threshold. In production, this threshold would be set by actuaries.
median_charges = df['charges'].median()
df['high_cost'] = (df['charges'] > median_charges).astype(int)

4b. Preprocessing Pipeline
Mixed feature types are handled with scikit-learn's ColumnTransformer: scaling numeric features and one-hot encoding categorical ones.
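from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Scale numeric columns; one-hot encode categoricals, dropping the first level of each
preprocessor = ColumnTransformer(transformers=[
    ('num', StandardScaler(), ['age', 'bmi', 'children']),
    ('cat', OneHotEncoder(drop='first', sparse_output=False),
     ['sex', 'smoker', 'region'])
])

4c. Model Choice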
Logistic Regression was chosen over Random Forest for three reasons: ONNX export precision, exact SHAP interpretability, and a negligible accuracy trade-off (89.9% accuracy, ROC-AUC 0.9425).
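from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Chain preprocessing and the classifier so both are fit together
model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', LogisticRegression(max_iter=1000, random_state=42, C=1.0))
])

4g. ONNX Export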
To run in the browser, the pipeline is decoupled. The classifier is exported as ONNX, while the preprocessor state is saved as JSON for client-side preprocessing.
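from skl2onnx import convert_sklearn

# Export only the fitted classifier; remaining arguments are elided in the original
onnx_model = convert_sklearn(classifier, ...)
# Preprocessor state saved as JSON so the client can reproduce the scaling and encoding
preprocessor_config = { "scaler_means": ... }

4d. Fairness Audit with Fairlearn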
Fairlearn's `MetricFrame` is the core of the fairness audit. It computes any set of metrics stratified by a sensitive attribute.
Key Finding
The critical metric is the disparate impact ratio. The most significant finding: policyholders under 40 have a disparate impact ratio of 0.295 relative to those 40 and over, far below the 0.8 four-fifths threshold. They are flagged as high-cost at less than a third the rate of older policyholders, a demographic parity difference (DPD) of 0.5675. Is the model encoding risk, or is it encoding age as a proxy for something else?
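The ratio and the DPD quoted above are two views of the same pair of selection rates, so one can be checked against the other. The sketch below assumes exactly two age groups, with the ratio defined as the lower rate divided by the higher rate and the DPD as the higher rate minus the lower one (which, as far as I know, matches how Fairlearn defines `demographic_parity_ratio` and `demographic_parity_difference`). The function name is hypothetical, and the rates it prints are implied by the two reported figures rather than read from the audit output.

```python
def implied_selection_rates(ratio, difference):
    """Recover (lower, higher) selection rates from their ratio and their difference.

    Assumes two groups, ratio = lower / higher, difference = higher - lower.
    """
    if not 0 <= ratio < 1:
        raise ValueError("ratio must be in [0, 1)")
    higher = difference / (1.0 - ratio)
    lower = higher * ratio
    return lower, higher

under_40, forty_plus = implied_selection_rates(ratio=0.295, difference=0.5675)
print(f"under 40: {under_40:.3f}, 40 and over: {forty_plus:.3f}")
# prints "under 40: 0.237, 40 and over: 0.805"
```

In other words, the two figures are mutually consistent and imply that roughly 24% of under-40 policyholders are flagged as high-cost, against roughly 80% of those 40 and over.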
4e. SHAP Explainability
SHAP computes the contribution of each feature to the deviation from the model's average prediction. I used shap.LinearExplainer for Logistic Regression, which produces exact SHAP values.
Global Feature Importance
- age (0.2544): Dominant driver, roughly 2.2× the importance of the next feature.
- region_southeast (0.1158): Second-tier; southeast policyholders skew higher cost.
- region_northwest (0.1041): Third-tier; northwest is borderline on disparate impact.
- bmi (0.0283): Modest effect; the obesity threshold matters but is secondary.
- children (0.0243): Small effect; more dependents slightly increase the cost tier.
- region_southwest (0.0170): Minimal.
- sex_male (0.0083): Near-zero; sex has almost no independent effect.
- smoker_yes (0.0076): Near-zero; surprising given smoking's known cost impact, but explained by multicollinearity with age.
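For a linear model, these values have a closed form when features are treated as independent: each feature contributes its weight times its distance from the background mean, measured on the model's margin (log-odds) scale. The sketch below illustrates that formula and the additivity property that makes waterfall charts possible. It assumes the global importances listed above are mean absolute SHAP values, which is the common convention but is not stated explicitly; all weights and rows are hypothetical, and the helper names are invented for this example.

```python
def linear_shap(weights, x, background_means):
    """Per-feature contribution w_i * (x_i - mean_i) for a linear margin."""
    return [w * (xi - m) for w, xi, m in zip(weights, x, background_means)]

def margin(weights, intercept, x):
    """Linear model output before the sigmoid (log-odds for logistic regression)."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))

def global_importance(weights, rows, background_means):
    """Mean absolute contribution of each feature across all rows."""
    contributions = [linear_shap(weights, row, background_means) for row in rows]
    return [sum(abs(c[j]) for c in contributions) / len(rows)
            for j in range(len(weights))]

# Hypothetical model and data: three already-preprocessed features.
weights = [1.5, 0.3, -0.4]
intercept = -0.2
rows = [[-1.4, 0.5, 1.0], [0.2, -0.3, 0.0], [1.2, -0.2, 0.0]]
means = [sum(column) / len(rows) for column in zip(*rows)]

x = rows[0]
phi = linear_shap(weights, x, means)              # local explanation for one row
base_value = margin(weights, intercept, means)    # margin at the average input
importance = global_importance(weights, rows, means)

# Additivity: baseline plus contributions reproduces the row's margin.
assert abs(base_value + sum(phi) - margin(weights, intercept, x)) < 1e-9
```

Because the contributions sum exactly to the gap between the baseline and the prediction, no approximation or sampling is needed, which is why the explanations for a linear model can be called exact.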
4f. DiCE Counterfactuals
DiCE (Diverse Counterfactual Explanations) answers the question: what is the minimum change to this person's profile that would flip the model's decision? I configured it to vary only non-protected features — age, BMI, children, and smoker status — while holding sex and region fixed.
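To illustrate the idea of constrained recourse, here is a deliberately simplified sketch. It is not DiCE's algorithm: it is a brute-force grid search that ranks candidates by how few features change, whereas DiCE also optimizes for diversity among the counterfactuals it returns. The `score` function, its coefficients, and the `GRID` values are all hypothetical stand-ins for the trained model and the feature ranges. In the `dice_ml` library, the equivalent restriction is expressed through the `features_to_vary` argument of `generate_counterfactuals`.

```python
import math
from itertools import product

def score(profile):
    """Hypothetical stand-in for the trained model's high-cost probability."""
    margin = (0.08 * (profile["age"] - 40)
              + 0.03 * (profile["bmi"] - 30)
              + 0.1 * profile["children"]
              + (0.5 if profile["smoker"] == "yes" else 0.0))
    return 1.0 / (1.0 + math.exp(-margin))

# Candidate values for the features allowed to vary; sex and region are never touched.
GRID = {
    "age": range(18, 65),
    "bmi": [22.0, 25.0, 27.5, 30.0, 33.0],
    "children": [0, 1, 2, 3],
    "smoker": ["no", "yes"],
}

def counterfactuals(original, k=3, threshold=0.5):
    """Return up to k profiles scoring below threshold, fewest changed features first."""
    names = list(GRID)
    found = []
    for values in product(*(GRID[name] for name in names)):
        candidate = {**original, **dict(zip(names, values))}
        if score(candidate) >= threshold:
            continue  # decision does not flip
        changed = [name for name in names if candidate[name] != original[name]]
        # Rank by number of changed features, then by the size of the age change.
        found.append((len(changed), abs(candidate["age"] - original["age"]), candidate))
    found.sort(key=lambda item: item[:2])
    return [candidate for _, _, candidate in found[:k]]

original = {"age": 59, "sex": "female", "bmi": 27.5, "children": 0,
            "smoker": "no", "region": "southwest"}
alternatives = counterfactuals(original)
```

Because `GRID` contains only the mutable features, every returned profile keeps the original sex and region by construction, which mirrors the constraint described above.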
6. How This Can Be Used in Practice
Integration with Policy Systems
The RAI audit layer integrates as a sidecar service. It does not replace the core system's decision, but it runs alongside every decision to audit it in real time, adding less than 50ms of latency.
Standalone Retrospective Audit
For legacy systems, the RAI audit runs as an independent microservice processing decision logs via a message queue to generate periodic compliance reports.
Underwriter Decision Support
The live prediction engine can be embedded directly into an underwriter's workstation, showing SHAP waterfalls and counterfactuals to make reasoning transparent and auditable.
Beyond Insurance
This architecture is domain-agnostic. It applies to loan approvals, hiring, medical triage, or credit scoring. Swap the dataset, and the RAI audit layer works identically.
7. Limitations and What Comes Next
Boundary Conditions
- Dataset Currency: The dataset is derived from 1994 data. A production deployment requires recent, representative data.
- Intersectional Fairness: The audit examines one sensitive attribute at a time. Intersectional analysis requires larger datasets.
- Production Drift: The model is a static artifact. Without monitoring, a model that is fair today may become unfair as populations shift.
- Age as a Variable: Age is legally protected in many contexts. A more rigorous implementation would hold age fixed in DiCE along with sex and region.
These are not failures of the project. They are the honest boundary conditions of a portfolio demonstration built on a public dataset. The architecture I built — ONNX inference, Fairlearn auditing, SHAP explanations, DiCE recourse, regulatory mapping — is the right architecture.
All code, data artifacts, and Svelte components for this project are available via the linked Colab notebook. The live inference engine runs at the project URL above.