Bias and fairness analysis (machine learning systems)

 

Here is a list of 100 command prompts for performing a comprehensive bias and fairness analysis on a machine learning system.

1. Scoping & Problem Formulation

  1. Define the fairness goals (e.g., equality, equity, non-discrimination) for this project.

  2. Identify all legally protected sensitive attributes (e.g., race, gender, age, disability).

  3. Identify any additional context-specific sensitive attributes (e.g., income level, zip code).

  4. Define the "favorable" and "unfavorable" outcomes of the model (e.g., "loan approved" vs. "loan denied").

  5. Formulate hypotheses about potential sources of bias in the data and model.

  6. Establish a "bias and fairness" review board or internal stakeholder group.

  7. Audit the problem formulation itself for inherent bias (e.g., "predicting criminality").

  8. Define "harm" in the context of this application (e.g., allocative vs. representational harm).

  9. Select the primary fairness metric(s) to optimize for (e.g., Demographic Parity, Equalized Odds).

  10. Document the societal, historical, and legal context of the problem the model is solving.

2. Data Collection & Auditing

  1. Analyze the demographic representation of each sensitive attribute in the dataset (see the sketch at the end of this section).

  2. Compare dataset demographics to real-world "ground truth" population data.

  3. Analyze missing data patterns (NaNs) across all defined subgroups.

  4. Audit data labels for historical bias (e.g., are "performance" ratings skewed by gender?).

  5. Test for selection bias in the data collection methodology.

  6. Test for measurement bias (e.g., is "risk" measured differently for different groups?).

  7. Calculate the class imbalance of the target variable *within* each subgroup.

  8. Visualize the distribution of a key feature (e.g., 'income') for each subgroup.

  9. Perform a keyword audit on all text data for biased associations (e.g., "doctor" -> "he").

  10. Audit image datasets for stereotypical representations or under-representation.

  11. Identify all potential proxy variables for the sensitive attributes (e.g., zip code as a proxy for race).

  12. Quantify the historical bias in the target variable itself (e.g., past loan decisions).

  13. Sample and manually review [N] data points from each subgroup to identify annotation bias.

  14. Document the data's full provenance and its known limitations.

  15. Test for temporal bias by comparing data collected at different time periods.

  16. Analyze outliers to see if they disproportionately belong to a specific subgroup.

  17. Check for label quality disparities between groups.
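
A minimal sketch of the representation, missing-data, and within-group class-imbalance checks above (items 1, 3, and 7), assuming a pandas DataFrame `df` with a hypothetical sensitive-attribute column `group` and a binary target column `label`:

```python
import pandas as pd

def representation_report(df: pd.DataFrame, group_col: str = "group",
                          label_col: str = "label") -> pd.DataFrame:
    """Per-group dataset share, positive-label rate, and missing-value rate."""
    return pd.DataFrame({
        # Share of the dataset contributed by each subgroup.
        "share": df[group_col].value_counts(normalize=True),
        # Class balance of the target *within* each subgroup.
        "positive_rate": df.groupby(group_col)[label_col].mean(),
        # Mean fraction of missing cells per row, by subgroup.
        "missing_rate": df.isna().mean(axis=1).groupby(df[group_col]).mean(),
    }).sort_values("share", ascending=False)

# Example usage (names are hypothetical):
# print(representation_report(df))
# Compare the "share" column against census or other ground-truth population
# figures to flag under- or over-represented groups.
```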

3. Data Preprocessing & Feature Engineering

  1. Test the fairness impact of the chosen imputation method (e.g., mean imputation) for missing data.

  2. Analyze if feature scaling (e.g., standardization) affects group-specific distributions differently.

  3. Evaluate if feature selection methods (e.g., RFE) disproportionately remove features important for a minority group.

  4. Analyze the impact of text cleaning (e.g., stopword removal) on text from different dialects (e.g., AAVE).

  5. Test the impact of removing sensitive attributes on both model performance and fairness.

  6. Test the impact of removing known proxy variables on both model performance and fairness.

  7. Analyze word embeddings (e.g., Word2Vec) for biased analogies (e.g., man:doctor::woman:nurse); see the sketch at the end of this section.

  8. Debias word embeddings using a post-processing algorithm (e.g., Hard-Debias, INLP).

  9. Analyze the fairness impact of outlier removal on subgroup representation.

  10. Test synthetic data (e.g., SMOTE) to see if it amplifies or mitigates bias.

  11. Analyze the binning strategy for continuous variables (e.g., 'age') to ensure fair groupings.

  12. Create a "data-bias" report *before* model training begins.
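
The analogy audit in item 7 can be run against pre-trained vectors. A minimal sketch using gensim, assuming a word2vec-format file at a hypothetical path `vectors.bin`; the probe words are illustrative only:

```python
from gensim.models import KeyedVectors

# Hypothetical path to pre-trained word2vec-format vectors.
kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Classic analogy probe: "man is to doctor as woman is to ___".
# A stereotyped completion (e.g., "nurse") is a red flag.
print(kv.most_similar(positive=["woman", "doctor"], negative=["man"], topn=5))

# Compare how strongly occupation words associate with gendered pronouns.
for occupation in ["doctor", "nurse", "engineer", "teacher"]:
    gap = kv.similarity(occupation, "he") - kv.similarity(occupation, "she")
    print(f"{occupation}: he-vs-she similarity gap = {gap:+.3f}")
```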

4. Model Training & Evaluation (Group Fairness)

  1. Establish a "bias baseline" using a simple, interpretable model (e.g., Logistic Regression).

  2. Calculate Demographic Parity (Statistical Parity) to compare selection rates across groups (see the sketch at the end of this section).

  3. Calculate Equalized Odds to compare True Positive Rates (TPR) and False Positive Rates (FPR) across groups.

  4. Calculate Equal Opportunity to compare True Positive Rates (TPR) across groups.

  5. Calculate Conditional Use Accuracy (e.g., Predictive Parity) to compare Positive Predictive Value (PPV) across groups.

  6. Compare the False Negative Rates (FNR) for each subgroup.

  7. Compare the False Positive Rates (FPR) for each subgroup.

  8. Generate and analyze the model's confusion matrix for *each subgroup* side-by-side.

  9. Plot the Receiver Operating Characteristic (ROC) curve for each subgroup on the same graph.

  10. Plot calibration curves for each subgroup to check for miscalibration.

  11. Test for "fairness gerrymandering" by evaluating metrics on fine-grained subgroup combinations, not just top-level groups.

  12. Compare the bias-fairness trade-off of different model architectures (e.g., Tree vs. NN).

  13. Audit the model's objective function (loss function) for any implicit bias.

  14. Evaluate model performance and fairness on "edge cases" from minority groups.

  15. Analyze the model for disparate impact using the "80% rule" (four-fifths rule).

  16. Test for representation bias in generative models (e.g., "a picture of a CEO").

  17. Analyze the language of LLM outputs for biased statements or stereotypes.

  18. Test the model's robustness to perturbations in sensitive attributes (e.g., change "Mrs." to "Mr.").
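
Most of the group metrics in this section reduce to per-group rates read off a confusion matrix. A minimal sketch in pandas/NumPy covering Demographic Parity, the Equalized Odds components, Predictive Parity, and the four-fifths rule; `y_true`, `y_pred`, and `sensitive` are hypothetical, equal-length array-likes with binary labels and predictions:

```python
import numpy as np
import pandas as pd

def group_fairness_report(y_true, y_pred, sensitive) -> pd.DataFrame:
    """Selection rate, TPR, FPR, and PPV per subgroup."""
    frame = pd.DataFrame({"y": np.asarray(y_true),
                          "p": np.asarray(y_pred),
                          "g": np.asarray(sensitive)})
    rows = {}
    for group, sub in frame.groupby("g"):
        tp = int(((sub.y == 1) & (sub.p == 1)).sum())
        fp = int(((sub.y == 0) & (sub.p == 1)).sum())
        fn = int(((sub.y == 1) & (sub.p == 0)).sum())
        tn = int(((sub.y == 0) & (sub.p == 0)).sum())
        rows[group] = {
            "selection_rate": sub.p.mean(),                   # Demographic Parity
            "tpr": tp / (tp + fn) if (tp + fn) else np.nan,   # Equal Opportunity / Equalized Odds
            "fpr": fp / (fp + tn) if (fp + tn) else np.nan,   # Equalized Odds
            "ppv": tp / (tp + fp) if (tp + fp) else np.nan,   # Predictive Parity
        }
    return pd.DataFrame(rows).T

# Example usage (names are hypothetical):
# rep = group_fairness_report(y_test, model.predict(X_test), A_test)
# print(rep)
# di_ratio = rep["selection_rate"].min() / rep["selection_rate"].max()
# print("four-fifths rule ratio:", di_ratio)  # below 0.8 suggests disparate impact
```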

5. Intersectional & Individual Fairness

  1. Calculate all fairness metrics for intersectional groups (e.g., "Black women," "Asian men"); see the sketch at the end of this section.

  2. Identify which intersectional group has the worst-case performance.

  3. Test for individual fairness: "Treat similar individuals similarly."

  4. Develop and run counterfactual fairness checks (e.g., "What if this person's gender was different, holding all else equal?").

  5. Calculate individual fairness metrics (e.g., "consistency") to ensure similar inputs get similar outputs.

  6. Manually review the model's highest-confidence errors (false positives/negatives) for each subgroup.

  7. Analyze the range and distribution of the model's output "scores" (e.g., 'risk_score') across groups.
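
The intersectional checks in items 1-2 are the same per-group computation keyed on a combination of attributes. A minimal sketch, assuming a hypothetical DataFrame `df` with columns `race`, `gender`, `y_true`, and `y_pred`:

```python
import pandas as pd

def intersectional_report(df: pd.DataFrame) -> pd.DataFrame:
    """Sample size, selection rate, and TPR for every (race, gender) combination.

    All column names are hypothetical placeholders.
    """
    return df.groupby(["race", "gender"])[["y_true", "y_pred"]].apply(
        lambda sub: pd.Series({
            "n": len(sub),
            "selection_rate": (sub["y_pred"] == 1).mean(),
            # Recall / TPR within the intersectional group.
            "tpr": (sub.loc[sub["y_true"] == 1, "y_pred"] == 1).mean(),
        })
    )

# Example usage:
# report = intersectional_report(df)
# print(report.sort_values("tpr").head(1))   # worst-case intersectional group
```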

6. Model Interpretation & Explainability (XAI)

  1. Generate local explanations (e.g., LIME, SHAP) for predictions for individuals in each subgroup.

  2. Analyze the explanations: Are they equally coherent or stable across groups?

  3. Generate global feature importance (e.g., a SHAP summary plot) separately for each subgroup (see the sketch at the end of this section).

  4. Identify if the model relies on different features for different groups (e.g., 'income' for men, 'job_title' for women).

  5. Identify which features are the largest contributors to biased predictions.

  6. Use explanations to audit the model for reliance on proxy variables.

  7. Check if the model's "confidence" in its predictions is systematically higher for one group.

  8. Analyze error patterns: Is the model failing in *different ways* for different groups?
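
A minimal sketch of the per-subgroup importance comparison in items 3-5, using the shap package and assuming a fitted tree-based classifier `model`, a feature DataFrame `X`, and an index-aligned Series `group` of sensitive-attribute values (all hypothetical names):

```python
import numpy as np
import pandas as pd
import shap

def shap_importance_by_group(model, X: pd.DataFrame, group: pd.Series) -> pd.DataFrame:
    """Mean |SHAP| value per feature, computed separately for each subgroup."""
    explainer = shap.TreeExplainer(model)       # assumes a tree-based model
    shap_values = explainer.shap_values(X)
    if isinstance(shap_values, list):           # some versions return one array per class
        shap_values = shap_values[1]            # keep the positive class
    abs_shap = pd.DataFrame(np.abs(shap_values), columns=X.columns, index=X.index)
    return abs_shap.groupby(group).mean().T     # rows: features, columns: subgroups

# Example usage (hypothetical names):
# imp = shap_importance_by_group(model, X_test, A_test)
# gap = imp.max(axis=1) - imp.min(axis=1)
# print(gap.sort_values(ascending=False).head(10))  # features relied on unevenly across groups
```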

7. Mitigation Strategies

  1. Apply a **pre-processing** mitigation technique (e.g., Reweighing, Disparate Impact Remover); see the sketch at the end of this section.

  2. Apply an **in-processing** mitigation technique (e.g., Adversarial Debiasing, Regularization).

  3. Apply a **post-processing** mitigation technique (e.g., Calibrated Equalized Odds, Reject Option Classification).

  4. Implement group-specific prediction thresholds to equalize odds.

  5. Evaluate the model's fairness on a hold-out "fairness" test set (not used for training or tuning).

  6. Re-train the model on a dataset augmented to correct for representation bias.

  7. Compare the performance and fairness of the mitigated model against the original baseline.

  8. Document the trade-offs (e.g., loss in overall accuracy) introduced by the mitigation technique.

  9. Test the model for "re-biasing" (i.e., fixing bias for one group but worsening it for another).

  10. Analyze the long-term impact of deploying a "fair" model (e.g., "feedback loops").
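
Item 1's Reweighing (Kamiran & Calders) can be implemented directly as sample weights that make the sensitive attribute and the label look statistically independent; library versions (e.g., AIF360's Reweighing) compute the same weights. A minimal sketch with pandas and scikit-learn, assuming hypothetical training arrays `X_train`, `y_train`, and `A_train` (the sensitive attribute):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def reweighing_weights(y, a) -> pd.Series:
    """Kamiran-Calders reweighing: w(y, a) = P(y) * P(a) / P(y, a)."""
    df = pd.DataFrame({"y": np.asarray(y), "a": np.asarray(a)})
    p_y = df["y"].value_counts(normalize=True)
    p_a = df["a"].value_counts(normalize=True)
    p_ya = df.groupby(["y", "a"]).size() / len(df)   # joint distribution P(y, a)
    return df.apply(
        lambda row: p_y[row["y"]] * p_a[row["a"]] / p_ya[(row["y"], row["a"])],
        axis=1,
    )

# The weights plug into any estimator that accepts sample_weight
# (names below are hypothetical):
# weights = reweighing_weights(y_train, A_train)
# model = LogisticRegression(max_iter=1000)
# model.fit(X_train, y_train, sample_weight=weights)
```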

8. Deployment & Monitoring

  1. Develop and publish a "bias impact statement" or "model card" before deployment.

  2. Implement a "human-in-the-loop" (HITL) review process for high-stakes, low-confidence predictions.

  3. Define and document the "red-line" for unacceptable bias or performance disparity.

  4. Roll out the model using a canary release to monitor initial bias in a live environment.

  5. Create an "Algorithmic Bias" dashboard for internal stakeholders to monitor key metrics.

  6. Implement a continuous monitoring system to track fairness metrics in real-time.

  7. Set up automated alerts for "fairness metric" drift or sudden changes (see the sketch at the end of this section).

  8. Monitor the model for performance and fairness decay across all subgroups over time.

  9. Establish a clear feedback channel for end-users to report perceived biased outcomes.

  10. Schedule a regular (e.g., quarterly) "bias audit" of the live model by an independent team.

  11. Analyze the model's performance on new, out-of-distribution data from all subgroups.

  12. Create a plan for rapid model disabling ("kill switch") if severe bias is detected.

  13. Test the model for fairness in simulated, adversarial scenarios.

  14. Ensure the "right to explanation" can be fulfilled for individuals in all groups.

  15. Compare model-driven decisions against a "human-only" baseline for bias.

  16. Audit the user interface (UI) that presents the model's output for priming or bias.

  17. Ensure all bias-related findings and mitigations are version-controlled with the model.

  18. Plan and budget for the computational and human cost of continuous bias monitoring and re-mitigation.
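
Items 6-8 can start as a scheduled job that recomputes one fairness metric on recent traffic and alerts when it crosses the documented red line (item 3). A minimal sketch, assuming a hypothetical `fetch_recent_scored_traffic()` returning a DataFrame with `y_pred` and `group` columns and a hypothetical `send_alert` hook:

```python
import logging
import pandas as pd

log = logging.getLogger("fairness_monitor")

# Red line agreed with stakeholders; this threshold is illustrative only.
MAX_DEMOGRAPHIC_PARITY_GAP = 0.10

def demographic_parity_gap(df: pd.DataFrame) -> float:
    """Largest difference in selection rate between any two groups."""
    rates = df.groupby("group")["y_pred"].mean()
    return float(rates.max() - rates.min())

def check_fairness_drift(fetch_recent_scored_traffic, send_alert) -> None:
    """Intended to run on a schedule (e.g., hourly) against live predictions."""
    df = fetch_recent_scored_traffic()          # hypothetical data-access hook
    gap = demographic_parity_gap(df)
    log.info("demographic parity gap on recent traffic: %.3f", gap)
    if gap > MAX_DEMOGRAPHIC_PARITY_GAP:
        # Hypothetical alerting hook (Slack, email, paging system, ...).
        send_alert(f"Fairness red line crossed: parity gap {gap:.3f}")
```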
