More and more companies are automating processes with the help of ML. This has tremendous advantages, since an algorithm can scale almost without limit. Moreover, when rich datasets are available, it can often detect patterns that humans cannot discover. As a well-known illustration, think about how DeepMind used an ML algorithm to optimize the energy consumption of Google's data centers. When algorithms are used to make important decisions that impact people's lives, such as deciding on medical treatment, granting a loan, or performing risk assessments in parole hearings, it is of paramount importance that the algorithm is fair. Because the models are becoming ever more complicated, this is not easy to assess. The public, legislators and regulators have all become aware of this issue; see e.g. a report on algorithmic systems, opportunities and civil rights by the Obama administration and (in Europe) recital 71 of GDPR.
In order to detect bias/unfairness, it is important to come up with a proper (mathematical) definition so that we can measure deviations from it. Below we describe a few alternatives.
Let us assume we are building a model to determine whether somebody may receive a loan. Typically such a model will use information (so-called attributes) such as the applicant's credit history, marital status, education, profession, etc. in order to estimate the probability that the applicant will be able to pay off their debts. The dataset will also include historical data on defaults (i.e. people who were not able to repay their loan). Now, given the above, the financial institution wants to make sure that the algorithm is fair with respect to gender, skin colour, etc. These are called protected attributes.
A traditional solution is to simply remove these protected attributes from the dataset altogether. Of course, when testing such a model, changing the value of any of the protected attributes is not going to affect the output of the model. However, through so-called redundant encodings an algorithm might be able to guess the value of the protected attributes from other information. As an example, let us assume we train our credit model on a dataset representing the people of Belgium. The dataset has a protected attribute called language that can take two values (French or Dutch). People whose language attribute equals Dutch live primarily in the northern part of the country, while the French-speaking citizens live primarily in the southern part. Hence our algorithm can infer the language attribute (statistically) by looking at the address. Therefore, simply removing this attribute from the dataset does not help.
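To make the redundant-encoding argument concrete, here is a minimal sketch on synthetic data. The population, the two-valued region attribute (a stand-in for the address) and the 90% alignment between region and language are assumptions chosen purely for illustration; they are not real Belgian statistics.

```python
import random
from collections import Counter, defaultdict


def make_population(n=10_000, seed=0):
    """Build synthetic (region, language) pairs.

    Assumption for illustration only: 90% of people live in the
    region associated with their language.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        language = rng.choice(["Dutch", "French"])
        home = "North" if language == "Dutch" else "South"
        other = "South" if home == "North" else "North"
        region = home if rng.random() < 0.9 else other
        people.append((region, language))
    return people


def proxy_accuracy(people):
    """Fraction of people whose language is guessed correctly from the
    region alone, using the majority language observed in each region."""
    counts = defaultdict(Counter)
    for region, language in people:
        counts[region][language] += 1
    guess = {region: c.most_common(1)[0][0] for region, c in counts.items()}
    hits = sum(guess[region] == language for region, language in people)
    return hits / len(people)


people = make_population()
print(f"language recovered from region alone: {proxy_accuracy(people):.1%}")
```

Even though the language column is never shown to the guessing rule at prediction time, the region recovers it for the large majority of this synthetic population, which is exactly why dropping the protected column is not sufficient.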
Another approach is so-called demographic parity. In that case one requires that the value of a protected attribute is uncorrelated with the output of the algorithm. Let us assume that the binary target variable Y (the ground truth) equals 1 when the loan should be granted, that the protected (binary) attribute is called A, and that the forecast of the model is called Z. Demographic parity then means that
Pr[Z=1 | A=0] = Pr[Z=1 | A=1]
This notion, however, is also flawed. A key issue with the approach is that if the ground truth (i.e. default in our example) does depend on the protected attribute, the perfect predictor (i.e. Z=Y) is ruled out, and therefore the utility and predictive power of the model are reduced. Moreover, by requiring demographic parity, the model has to yield (on average) the same outcome for the different values of the protected attribute. In our example, demographic parity would imply that the model would have to refuse good candidates from one category and accept bad candidates from the other category in order to reach the same average level. Concretely, assuming that Dutch speakers have historically defaulted on 3% of their loans while French speakers have defaulted on only 1%, demographic parity would typically work to the disadvantage of the French speakers.
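The conflict between demographic parity and the perfect predictor can be checked numerically. The sketch below is an illustration under stated assumptions: the encoding A=0 for Dutch and A=1 for French, the group size of 1000 borrowers each, and the helper names positive_rate and demographic_parity_gap are all hypothetical; only the 3% and 1% default rates come from the example above.

```python
def positive_rate(z, a, group):
    """Estimate Pr[Z=1 | A=group] from paired lists of 0/1 values."""
    selected = [zi for zi, ai in zip(z, a) if ai == group]
    if not selected:
        raise ValueError(f"no samples with A={group}")
    return sum(selected) / len(selected)


def demographic_parity_gap(z, a):
    """Absolute difference |Pr[Z=1 | A=0] - Pr[Z=1 | A=1]|.

    Demographic parity holds exactly when this gap is zero.
    """
    return abs(positive_rate(z, a, 0) - positive_rate(z, a, 1))


# Assumed toy population: 1000 borrowers per group.
# Group A=0 (Dutch): 3% default, so 970 repay (Y=1) and 30 default (Y=0).
# Group A=1 (French): 1% default, so 990 repay and 10 default.
y = [1] * 970 + [0] * 30 + [1] * 990 + [0] * 10
a = [0] * 1000 + [1] * 1000

z = list(y)  # the perfect predictor Z = Y
print(f"parity gap of the perfect predictor: {demographic_parity_gap(z, a):.3f}")
```

The perfect predictor grants loans to 97% of one group and 99% of the other, so its parity gap is 0.02 rather than zero. Closing that gap requires changing some correct decisions, for instance refusing good candidates in the group with fewer defaults or accepting bad candidates in the other.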
A more subtle notion of fairness was proposed by Hardt et al.:
Pr[Z=1 | Y=y, A=0] = Pr[Z=1 | Y=y, A=1]
for y=0,1. In other words, relatively speaking, the model has to be right (or wrong) equally often for either value of the protected attribute. Unlike demographic parity, this definition is satisfied by the perfect predictor and therefore incentivizes more accurate models. The property is called oblivious because it depends only on the joint distribution of A, Y and Z.
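This criterion (which Hardt et al. call equalized odds) can be measured in the same spirit. The sketch below is one possible way to do it, not a reference implementation: the helper names, the toy population of 1000 borrowers per group and the encoding A=0 / A=1 are assumptions made for illustration.

```python
def conditional_positive_rate(z, y, a, y_value, group):
    """Estimate Pr[Z=1 | Y=y_value, A=group] from paired 0/1 lists."""
    selected = [
        zi for zi, yi, ai in zip(z, y, a) if yi == y_value and ai == group
    ]
    if not selected:
        raise ValueError(f"no samples with Y={y_value}, A={group}")
    return sum(selected) / len(selected)


def equalized_odds_gaps(z, y, a):
    """Return {y_value: |Pr[Z=1 | Y=y, A=0] - Pr[Z=1 | Y=y, A=1]|}.

    The criterion holds when both gaps (for y=0 and y=1) are zero.
    """
    return {
        y_value: abs(
            conditional_positive_rate(z, y, a, y_value, 0)
            - conditional_positive_rate(z, y, a, y_value, 1)
        )
        for y_value in (0, 1)
    }


# Assumed toy population with different base rates per group.
y = [1] * 970 + [0] * 30 + [1] * 990 + [0] * 10
a = [0] * 1000 + [1] * 1000

perfect = list(y)                 # Z = Y in both groups
lenient = y[:1000] + [1] * 1000   # correct for A=0, accepts everyone with A=1

print("perfect predictor:", equalized_odds_gaps(perfect, y, a))
print("lenient for A=1:  ", equalized_odds_gaps(lenient, y, a))
```

The perfect predictor has zero gap for both values of y even though the base rates differ between the groups. The second predictor, which accepts every applicant with A=1, is flagged through the y=0 gap: bad candidates are accepted in one group only.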
As can be seen from the above, mastering fairness in ML/AI is crucial. This is why many tech giants such as Google, Facebook, Microsoft and IBM have created initiatives to tackle this problem. However, given that bias in algorithms is just one instance of model risk, we believe that the best approach is to separate concerns and hand off bias detection to entities that are independent of the model developers. In other words, we suggest leveraging the separation between the first and second lines of defence to tackle bias efficiently.