
Reading Note: Noise-tolerant fair classification


This post is a reading note for "Noise-tolerant fair classification" by Lamy, Alex, et al. The paper mainly focuses on the question: can one still learn fair classifiers when the sensitive features (e.g., race or gender) are noisy?

A quick answer is yes. The authors claim that if one measures fairness using the mean-difference score, and the sensitive features are subject to noise from the mutually contaminated learning model, then owing to a simple identity we only need to rescale the desired fairness tolerance.

To understand this paper, we must first review the following two major concepts:

  1. Mutually contaminated learning
  2. Fairness learning

where mutually contaminated learning is a model for learning from samples whose labels are corrupted.

Mutually contaminated learning

In the framework of learning from mutually contaminated distributions (MC learning), instead of observing samples from the "true" (or "clean") joint distribution $D$, one observes samples from a corrupted distribution $D_{\mathrm{corr}}$. MC learning assumes that the corrupted class-conditional distributions are mixtures of their true counterparts:

$$P_{+,\mathrm{corr}} = (1 - \alpha)\, P_{+} + \alpha\, P_{-}, \qquad P_{-,\mathrm{corr}} = \beta\, P_{+} + (1 - \beta)\, P_{-},$$

where $P_{+}$ and $P_{-}$ represent the distributions of samples with positive and negative labels, respectively, and $\alpha, \beta \in [0, 1)$ are unknown noise parameters with $\alpha + \beta < 1$.
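To make the mixture concrete, here is a minimal sketch (not from the paper) that simulates MC-corrupted samples by mixing two hypothetical clean class-conditional distributions; the Gaussians and the values of $\alpha$ and $\beta$ are assumptions purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical clean class-conditionals P_+ and P_- (1-D Gaussians for illustration).
def sample_clean(label, n):
    mean = 1.0 if label == +1 else -1.0
    return rng.normal(loc=mean, scale=1.0, size=n)

alpha, beta = 0.2, 0.1  # assumed noise parameters, alpha + beta < 1

def sample_corrupted(label, n):
    """Draw n samples from the corrupted class-conditional:
    P_{+,corr} = (1 - alpha) P_+ + alpha P_-   (label = +1)
    P_{-,corr} = beta P_+ + (1 - beta) P_-     (label = -1)
    """
    weight_pos = (1.0 - alpha) if label == +1 else beta
    from_pos = rng.random(n) < weight_pos  # which clean component each sample comes from
    return np.where(from_pos, sample_clean(+1, n), sample_clean(-1, n))

x_pos_corr = sample_corrupted(+1, 1000)  # observed "positive" samples
x_neg_corr = sample_corrupted(-1, 1000)  # observed "negative" samples
```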

Fairness learning

Typically, fairness is achieved by adding constraints that depend on the sensitive feature and then adjusting one's learning procedure to satisfy these constraints. There are two central objectives: designing appropriate application-specific fairness criteria, and developing predictors that respect the chosen fairness conditions. Fairness objectives can be categorized into individual- and group-level fairness; in this paper, the authors focus only on group-level fairness.

For fairness-aware binary classification, the formulation is typically

$$\min_{f \in \mathcal{F}} \; \mathbb{E}_{(x, y) \sim D}\big[\ell(f(x), y)\big] \quad \text{s.t.} \quad |\Lambda(f, D)| \le \tau,$$

where $\ell$ is the loss function and $\Lambda$ is the fairness score function. The fairness score function is determined by the chosen definition of fairness. There are two fairness definitions discussed in this paper.

  • Demographic parity (DP): the classifier's prediction is independent of the sensitive attribute, i.e., $\mathbb{P}(f(X) = 1 \mid A = 0) = \mathbb{P}(f(X) = 1 \mid A = 1)$.

  • Equality of opportunity (EO): the classifier's prediction is independent of the sensitive attribute among the positively labeled samples, i.e., $\mathbb{P}(f(X) = 1 \mid A = 0, Y = 1) = \mathbb{P}(f(X) = 1 \mid A = 1, Y = 1)$.

The corresponding fairness score functions are then defined by

  • disparity of demographic parity (DDP)

    $$\Lambda_{\mathrm{DP}}(f, D) = \mathbb{E}_{x \sim D_{0}}[f(x)] - \mathbb{E}_{x \sim D_{1}}[f(x)]$$

  • disparity of equality of opportunity (DEO)

    $$\Lambda_{\mathrm{EO}}(f, D) = \mathbb{E}_{x \sim D_{0,1}}[f(x)] - \mathbb{E}_{x \sim D_{1,1}}[f(x)]$$

where $D_{a}$ denotes the distribution of $X$ conditioned on $A = a$, and $D_{a, y}$ the distribution of $X$ conditioned on $A = a$ and $Y = y$.
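As a quick illustration (my own sketch, not code from the paper), the empirical versions of these two scores can be computed from a vector of classifier scores, a binary sensitive attribute, and the labels; all variable names here are hypothetical.

```python
import numpy as np

def ddp(scores, a):
    """Empirical disparity of demographic parity:
    mean score of group A=0 minus mean score of group A=1."""
    return scores[a == 0].mean() - scores[a == 1].mean()

def deo(scores, a, y):
    """Empirical disparity of equality of opportunity:
    the same mean difference, restricted to positively labeled samples."""
    pos = (y == 1)
    return scores[pos & (a == 0)].mean() - scores[pos & (a == 1)].mean()

# Hypothetical data: scores in [0, 1], binary sensitive attribute a, binary label y.
rng = np.random.default_rng(0)
scores = rng.random(1000)
a = rng.integers(0, 2, size=1000)
y = rng.integers(0, 2, size=1000)
print(ddp(scores, a), deo(scores, a, y))
```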

Noise-tolerant fair classification

The authors use MC learning as the noise model for the sensitive attribute rather than the label. The distributions of samples from the two groups can then be written in the MC manner: for unknown noise parameters $\alpha, \beta \in [0, 1)$ with $\alpha + \beta < 1$,

$$D_{\mathrm{corr}, 0} = (1 - \alpha)\, D_{0} + \alpha\, D_{1}, \qquad D_{\mathrm{corr}, 1} = \beta\, D_{0} + (1 - \beta)\, D_{1}.$$

Similarly, for EO fairness,

$$D_{\mathrm{corr}, 0, 1} = (1 - \alpha)\, D_{0, 1} + \alpha\, D_{1, 1}, \qquad D_{\mathrm{corr}, 1, 1} = \beta\, D_{0, 1} + (1 - \beta)\, D_{1, 1}.$$
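As a sanity check (my own one-line derivation from the mixture identities above, not reproduced from the paper), taking expectations of $f$ under the corrupted group distributions already reveals the scaling used in the next section:

$$\mathbb{E}_{D_{\mathrm{corr}, 0}}[f] - \mathbb{E}_{D_{\mathrm{corr}, 1}}[f] = \big[(1 - \alpha)\,\mathbb{E}_{D_0}[f] + \alpha\,\mathbb{E}_{D_1}[f]\big] - \big[\beta\,\mathbb{E}_{D_0}[f] + (1 - \beta)\,\mathbb{E}_{D_1}[f]\big] = (1 - \alpha - \beta)\,\big(\mathbb{E}_{D_0}[f] - \mathbb{E}_{D_1}[f]\big).$$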

Fairness constraints under MC learning

To learn a fair classifier with corrupted sensitive attributes, the authors prove Theorem 2, which states that the fairness constraint on the clean distribution is equivalent to a scaled constraint on the noisy distribution:

$$\Lambda(f, D_{\mathrm{corr}}) = (1 - \alpha - \beta)\, \Lambda(f, D),$$

so requiring $|\Lambda(f, D)| \le \tau$ on the clean data is the same as requiring $|\Lambda(f, D_{\mathrm{corr}})| \le (1 - \alpha - \beta)\, \tau$ on the noisy data.

Theorem 2 has an important algorithmic implication. Suppose we pick a fairness constraint and seek to solve Equation 2 for a given tolerance $\tau$; then, given samples from $D_{\mathrm{corr}}$, it suffices to simply change the tolerance to $\tau' = (1 - \alpha - \beta)\, \tau$. In practice, $\alpha$ and $\beta$ will be unknown; however, several algorithms have been proposed to estimate them from noisy data alone. Thus, we may use these to construct estimates $\hat{\alpha}$ and $\hat{\beta}$, and plug them in to construct an estimate of $\tau'$.

The corresponding algorithm is simple: estimate the noise parameters from the corrupted data, rescale the fairness tolerance according to Theorem 2, and then run any off-the-shelf fair classification method on the noisy data with the rescaled tolerance.
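The following is a minimal sketch of that procedure (my own summary, not the paper's implementation); `fair_fit` and `estimate_noise` are hypothetical placeholders for an off-the-shelf fair classifier and a noise-rate estimator.

```python
def noise_tolerant_fair_fit(X, y, a_corr, tau, fair_fit, estimate_noise):
    """Sketch of the noise-tolerant procedure:
    1) estimate the noise parameters from the corrupted data,
    2) rescale the fairness tolerance (Theorem 2),
    3) run a fair classification method with the rescaled tolerance.
    """
    alpha_hat, beta_hat = estimate_noise(X, a_corr)   # any noise-rate estimator
    tau_scaled = (1.0 - alpha_hat - beta_hat) * tau   # tau' = (1 - alpha - beta) * tau
    return fair_fit(X, y, a_corr, tau_scaled)         # train on the noisy sensitive attribute
```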

Experiments

There are two application scenarios for noise-tolerant fair learning:

  • (Privacy setting) Even if noise-free sensitive features are available, we may wish to add noise so as to obfuscate sensitive attributes.
  • (PU setting) We wish to analyze data where the presence of the sensitive feature is only known for a subset of individuals, while for others the feature value is unknown.

where PU means that samples' sensitive attributes are either positive or unlabeled.

In the experiments, the sensitive feature's value is randomly flipped with probability $\rho_{+}$ if its value was 1, or with probability $\rho_{-}$ if its value was 0. For the privacy setting, $\rho_{+} = \rho_{-}$, while for the PU setting, $\rho_{+} = 0$ or $\rho_{-} = 0$.
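A minimal sketch of this corruption step (my own illustration, assuming a binary sensitive attribute and hypothetical flip probabilities):

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt_sensitive(a, rho_plus, rho_minus):
    """Flip a binary sensitive attribute: 1 -> 0 with probability rho_plus,
    0 -> 1 with probability rho_minus."""
    u = rng.random(a.shape[0])
    flip = np.where(a == 1, u < rho_plus, u < rho_minus)
    return np.where(flip, 1 - a, a)

a = rng.integers(0, 2, size=10)
a_privacy = corrupt_sensitive(a, rho_plus=0.2, rho_minus=0.2)  # privacy setting: equal flip rates
a_pu      = corrupt_sensitive(a, rho_plus=0.0, rho_minus=0.4)  # PU setting: flips in one direction only
```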

The privacy-setting experiment is conducted on the COMPAS dataset (the result figures can be found in the original paper).

The PU-setting experiment is conducted on the Law School dataset; one can refer to the original paper for detailed descriptions and result figures.

Reference: Lamy, Alex, et al. "Noise-tolerant fair classification." Advances in Neural Information Processing Systems. 2019.
