Biases in Data Science Lifecycle
Dinh-an Ho [email protected]
Oya Beyan [email protected]
Abstract
In recent years, data science has become an indispensable part of our society. Over time, we have become reliant on this technology because of its ability to extract value and new insights from data in any field - business, social life, research and society. At the same time, it raises questions about how justified we are in placing our trust in these technologies. There is a risk that such power may lead to biased, inappropriate or unintended actions. Therefore, ethical issues that might arise from data science practices should be carefully considered, and these potential problems should be identified during the data science lifecycle and mitigated if possible. However, a typical data scientist does not have enough knowledge to identify these challenges, and it is not always possible to include an ethics expert during data science production. The aim of this study is to provide a practical guideline to data scientists and increase their awareness. In this work, we reviewed different sources of bias and grouped them under the different stages of the data science lifecycle. The work is still in progress. The aim of early publishing is to collect community feedback and improve the curated knowledge base for bias types and solutions. We would like to ask for your feedback in one of the following ways: 1. please participate in the survey https://forms.gle/tN75M9GMBgTj3rCH7, or 2. leave your comment in the Google Doc https://bit.ly/35brLic.
Data science has developed into an indispensable part of our society. We use this technology to collect information and extract knowledge from data, and almost all sectors rely on it for this purpose. New innovations keep making this work more efficient. However, we have to think about ethical risks. Since human data is often involved, biases can occur. The ethical consequences that could result from data science practices should therefore be carefully considered, and potential problems should be identified and, if possible, mitigated during the data science lifecycle. However, a typical data scientist does not have sufficient knowledge to identify these challenges, and it is not always possible to involve an ethics expert in data science production. The aim of this study is to provide a practical guideline to data scientists and increase their awareness. In this work we describe different sources of bias in each stage of data science, provide some examples and give references to best practices. This work is conducted as part of the master thesis "An Ethics guideline for data scientist: developing an executable guideline for responsible data science" at RWTH Aachen University, Chair of Computer Science 5 (Information Systems). The main goal is to give data scientists an overview of the biases that can occur when programming and analyzing data, and how these problems can be solved. The final outcome of the thesis will be a practical tool which data scientists can use in their daily work environment to access the related information and report their solutions for biases identified in their work. The thesis is still in progress.

In this paper, we present the outcome of the literature review. In the next sections, we first present our methodology, then provide the conceptual steps of a data science lifecycle, and then present a collection of sources of bias in each phase, with examples and best practices for mitigating them. The findings in this document are still incomplete at this stage and will be completed as the master thesis progresses. The aim of early publishing is to collect community feedback and improve the curated knowledge base for bias types and solutions.

The primary purpose of this chapter is to identify the different types of biases that appear throughout the phases of the data science pipeline and to map each bias to its respective phase. For this multidisciplinary task, we explored 85 articles (cited throughout) from different journals using Google Scholar, the IEEE digital library, the ACM digital library, the Wiley online library, Semantic Scholar, ScienceDirect, arXiv and other sources. We reviewed these articles and collected relevant information regarding the sources of bias and mitigation methods. We then mapped each source to the data science lifecycle. Each phase of the data science pipeline is studied in terms of the biases that can appear in that phase, along with use-case examples and best practices that practitioners can apply when facing these biases.

The search input was based on the data science phases; for example, one query was 'data ingestion bias', and the other biases were searched analogously. Examples were found based on the most read papers for the specific biases. In parallel, examples and best practices (for instance, in the case of algorithmic bias) were searched with keywords such as 'algorithmic bias example', 'algorithmic bias best practices' and 'algorithmic bias mitigation'. Papers with matching keywords were collected and added to the review.
In this research, we list all biases at each phase so that we have an overview of the biases that occur in that phase. There are several ways to define the phases of data science. Figure 1 shows one possibility, with the following phases:

1. Data Ingestion: the data are collected and imported by the data scientists from databases or self-produced data collections.
2. Data Scrubbing: the data is cleaned so that machines can understand and process it.
3. Data Visualization: significant patterns and trends are filtered by statistical methods.
4. Data Modeling: data models are built and predictions and forecasts are made using, e.g., artificial intelligence algorithms.
5. Data Analysis: the results are interpreted, and knowledge is extracted.

Figure 1: The data science pipeline proposed by Dineva et al. [24]

Another model by Suresh et al. [77] describes the following phases: world, population, dataset, training data, model, model output and real-world implications. There are parallels between the two models. The first model starts directly from the dataset and even includes the data scrubbing phase, as this is an important part of data science analysis; however, it includes the training data in this phase. The modeling and analysis phases are identical. There are other well-known models, but we agree to use the pipeline of Figure 1 as the lifecycle.
In this section we iteratively go through all data science phases and indicate which biases may occur. Each bias is given a short description, examples and best practices. An overview of all biases is shown in Table 1.
Table 1: Overview of biases

Data Ingestion: Data Bias, Sampling Bias, Measurement Bias, Survey Bias, Seasonal Bias, Survivorship Bias, Selection Bias, Historical Bias
Data Scrubbing: Exclusion Bias, Data Enrichment Bias
Data Visualization: Cognitive Biases (Framing Effect, Availability Bias, Overconfidence, Anchoring, Confirmation Bias and Signal Error)
Data Modeling: Algorithmic Bias, Hot Hand Fallacy, Bandwagon Bias, Group Attribution Bias, Aggregation Bias
Data Analysis: Deployment Bias, Rescue Bias, Overfitting, Underfitting
In this phase, the data are collected and imported by the data scientists from databases or self-produced data collections. The following describes the biases that can occur in this stage.

Data Bias (Representation Bias)
Data bias is a systematic distortion in the data that compromises its representativeness. It is directly related to sampling, which determines whether the sample is representative of the larger population or not. It occurs during the data accumulation phase. Although capturing a bias-free dataset is not feasible, data scientists can estimate the bias in the data by comparing the sample with multiple samples having different contexts [60].
Example:
In [10], the authors claimed that machine learning algorithms discriminate against people based on race, gender and ethnicity. They show that the Adience and IJB-A databases primarily contain light-skinned subjects (86.2 percent and 79.6 percent, respectively), which can bias models against the underrepresented dark-skinned groups.
Best practice:
In [52], a novel Representation Bias Removal (REPAIR) technique is introduced to resolve the generalization issues present in training datasets by resampling the dataset. Proper labeling of the data, much like a nutrition chart, is another way to reduce data bias through task-oriented categorization of data [40]. Using supporting sheets for datasets can be valuable in lessening data bias. Advanced data mining and proper data targeting are other options that data recruiters can use to obtain less discriminatory data [56].
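As a minimal, hedged sketch of how a data scientist might check representativeness in practice, in the spirit of the comparison idea in [60], the snippet below tests whether the share of a sensitive attribute in a collected sample matches an assumed reference population using a chi-square goodness-of-fit test. The column name, the sample and the reference proportions are all illustrative assumptions, not values from the cited works.

```python
import pandas as pd
from scipy.stats import chisquare

# Illustrative sample; "skin_tone" and the counts are assumptions.
sample = pd.DataFrame({"skin_tone": ["light"] * 862 + ["dark"] * 138})

# Assumed reference proportions for the target population.
reference = {"light": 0.5, "dark": 0.5}

observed = sample["skin_tone"].value_counts()
expected = [reference[k] * len(sample) for k in observed.index]

stat, p_value = chisquare(f_obs=observed.values, f_exp=expected)
print(f"chi2={stat:.1f}, p={p_value:.3g}")
# A tiny p-value signals that the sample deviates strongly from the reference
# population, i.e. potential representation bias worth investigating.
```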
Selection bias occurs when the wrong contributors are selected or allowed to contribute, so that proper randomization is not achieved.
Example:
Selection bias occurs, for example, if residents of rural areas were selected to participate in a project related to a city transportation network. Due to the nature of VGI (volunteered geographic information) projects, selection bias is one of the most important and influential types of bias and is also relatively hard to detect and treat [6].
Best practice:
To correct this bias, it is especially important to ensure that selection bias is avoided when recruiting and retaining the sample population. Picking subgroups randomly can also help limit selection bias [38].
Sampling bias is closely related to selection bias and can occur when the members of the data are not objectively representative of the target population about which conclusions are to be drawn. In addition, errors in the process of collecting samples generate sampling bias, while errors in the subsequent processes cause selection bias [28]. Non-representative samples often lead to models that exhibit systematic errors. In biased sampling, the whole dataset is divided into two groups, namely minority and majority groups; hence the model may be trained according to the dominating and prejudicial behavior of the assessments. Therefore, proper selection of training data is a crucial part for data scientists, as it is extremely challenging for them to establish the ground truth [68].
Examples:
1. Tainted training examples might wrongly instruct the machine to see features that actually predict success on the job as indicators of poor performance.
2. A classic example of a biased sample happens in politics, such as the famous 1936 opinion polling for the U.S. presidential election carried out by the American Literary Digest magazine, which over-represented rich individuals and predicted the wrong outcome [74].
Best practice:
If some groups are known to be under-represented and the degree of under-representation can be quantified, then sample weights can correct the bias [55]. Simple random sampling and stratified random sampling are valuable techniques to mitigate sampling bias [71]. Stratification involves dividing the whole population into different subgroups, for instance, measuring similar attributes of multiple subgroups under the same conditions. Hence, this approach offers an in-depth inspection of the relations among the groups and highly precise scores, as variability is low in homogeneous groups [23].
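The following is a small, hedged sketch of the two remedies just mentioned, stratified random sampling and corrective sample weights, using pandas. The column names, group shares and sampling fraction are illustrative assumptions rather than part of the cited works.

```python
import pandas as pd

# Toy population; the columns "group" and "income" are assumptions.
population = pd.DataFrame({
    "group": ["A"] * 900 + ["B"] * 100,
    "income": list(range(900)) + list(range(100)),
})

# Stratified random sampling: draw the same fraction from every subgroup
# so that minority groups are not drowned out by chance.
stratified = (
    population.groupby("group", group_keys=False)
    .apply(lambda g: g.sample(frac=0.1, random_state=0))
)
print(stratified["group"].value_counts())

# Alternatively, keep the biased sample but attach corrective weights so
# that under-represented groups count proportionally more in the analysis.
target_share = {"A": 0.5, "B": 0.5}          # assumed population shares
sample_share = population["group"].value_counts(normalize=True)
population["weight"] = population["group"].map(
    lambda g: target_share[g] / sample_share[g]
)
```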
Measurement bias arises when the data analyst tries to get desired results by selecting, operating on, and measuring a particular feature [56]. This may cause us to skip important factors or create groups or noise in the process, which can lead to disaster. It relates to data bias to some extent, but the main difference is that data bias is due to inherent bias in the data that gives biased outcomes because of non-standardized data, while measurement bias, on the other hand, involves the addition of unnecessary data, unconsciously or deliberately.
Example:
Measurement bias was spotted in the risk prediction tool COMPAS, in which prior arrests and family arrests were employed as proxy variables to measure the level of crime. COMPAS predicted incorrectly across dissimilar subgroups: as marginalized groups are controlled and policed more often, they have a higher rate of arrests [77].
Best practice:
Systematic errors cannot be avoided simply by collecting more data; instead, one should use multiple measuring devices (or observers or instruments) and data specialists who compare the output of these devices [55].
As the name suggests, survey bias occurs when researchers receive prejudiced, inconsistent or tailored feedback, or no feedback at all, to interviews, surveys and questionnaires from the respondents. The main reason behind this is the presence of secret or sensitive topics of discussion concerning income, sex, drugs, race, violence, etc. Consequently, self-reported data can be influenced by two types of external bias: (1) social approval (which can underestimate the original value); and (2) recall bias (people mostly recall and answer in ways that might be erroneous) [55]. The concept of data linkage is very important for data analysts to understand here. Data linkage is involved in the data collection process when information about the same entity is gathered from two or more distinct sources. Probabilistic matching and individual reference identifiers can be useful when joining two or more survey datasets [16].
Sometimes the available data is related to seasonal entities, which simply means that the dataset exhibits seasonal growth patterns. Data interpreters who use this kind of situational data (time series) for training supervised models are said to be seasonally biased. Additionally, predictive models are gravely impacted by seasonality because of the dynamic fluctuations present in the records.
Example:
The Indian financial year ends on 31st March. This is a busy time for the Indian insurance industry because people tend to buy more insurance products to claim rebates at that time. An analysis of the past 10 years of insurance business data shows that 25-30% of the business of the insurance industry in India comes in the month of March. Similarly, there is a surge in sales of consumer goods in the UK and US leading up to Christmas [75]. Data analysts have to keep track of the trends the data has followed in previous years to get optimized results from their models.
Best practice:
Having in-depth knowledge about the seasonal trends of the targeted industry is essential to avoid seasonal bias. Targeted industry is used as a generalized term that covers whatever industry is under examination (examples are given above). Data specialists can compare the values at peak times with normal-day values, and they should measure only what they need. Reviewing the historical trend to predict future patterns can also be a good approach for seasonal adjustments [9].
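As a hedged sketch of how seasonal structure can be made explicit before modeling, the snippet below applies a classical decomposition with statsmodels' seasonal_decompose to a synthetic monthly series with a March surge. The series itself and the additive model choice are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly sales with a March spike, purely for illustration.
idx = pd.date_range("2010-01-01", periods=120, freq="MS")
sales = pd.Series(100 + np.arange(120) * 0.5, index=idx)
sales[idx.month == 3] += 40  # recurring surge every March

# Split the series into trend, seasonal and residual components so the
# recurring March effect is not mistaken for a genuine change in level.
result = seasonal_decompose(sales, model="additive", period=12)
print(result.seasonal.head(12))        # estimated monthly seasonal pattern
deseasonalized = sales - result.seasonal
```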
Survivorship bias occurs when only certain successful subsets of a group are considered while the failures are dropped out of observation. This type of dataset selection skews the average output upward, showing fake performance [41]. When data scientists try to make sense of incomplete data, they fall prey to survivorship bias.
Example:
During World War II, researchers from the non-profit research group the Center for Naval Analyses were tasked with a problem: they needed to reinforce the military's fighter planes at their weakest spots. To accomplish this, they turned to data. They examined every plane that came back from a combat mission and made note of where bullets had hit the aircraft. Based on that information, they recommended that the planes be reinforced at those precise spots. The problem, of course, was that they only looked at the planes that returned and not at the planes that didn't. Data from the planes that had been shot down would almost certainly have been much more useful in determining where fatal damage to a plane was likely to have occurred, as those were the ones that suffered catastrophic damage [2].
Best practice:
Data scientists may alleviate survivorship bias in backtest with survivor-ship bias free datasets and/or more recent data. The former one includes information ofdelisted equities during the test period while it is likely that fewer stocks are delisting ina more recent, shorter time period [82].
Pre-existing (historical) bias arises due to social and technical disagreements in the world as it is, and it seeps into the data even after selecting features and collecting samples perfectly [77].

Example:
In 2018, an image search for women CEOs showed few results, as only 5 percent of Fortune 500 CEOs were women. Even if the output reflects reality, whether algorithms should account for or avoid this inherent discrimination is a hot topic for data scientists [77].
Best practice:
Recognizing historical bias requires a retrospective understanding of the application and data generation process over time. Historical bias is often concerned with demonstrating the representational harms (such as reinforcing a stereotype) against a distinct group [77].
This phase is also known as the data cleaning phase. In this phase, the data is cleaned so that machines can understand and process it. The two biases that can occur in this stage are described below.
Data cleansing is an essential phase of the data science lifecycle that comes after data collection. From an ethical perspective, the removal of corrupt or unethical data, involving both upper and lower extremes and exceptions, is crucially important. For instance, outliers (values that deviate from the pattern) and duplications are removed from big raw data to make it less redundant, more consistent and more reliable for model training. Since excluding un-actionable and duplicate insights is an important part of cleaning noise from the data, experts can become biased while doing it. Exclusion bias occurs when data handlers do not identify and remove the undesired chunks of data that should be removed in order to make the data ethical and to maintain the accuracy of the results [62].
Examples:
1. We do customer profiling and find that the average annual income of customers is 0.8 million dollars, but there are two customers with annual incomes of 4 million and 4.2 million dollars. These two observations will be seen as outliers. Exclusion bias will occur if data managers do not exclude these two customers, as their annual income is much higher than the rest of the population [34]. Exclusion bias can also arise when some important chunk is deleted from the data source while refining the data.
2. Exclusion of subjects who have recently migrated into the study area (this may occur when newcomers are not available in a register used to identify the source population). Excluding subjects who move out of the study area during follow-up is rather equivalent to dropout or non-response, a selection bias in that it affects the internal validity of the study [3].
Best practice:
To avoid this bias, file manipulators and data scientists should have intimate knowledge of the data attributes, database sources and the data collection process in order to verify what to exclude or not (ethical exclusion). Secondly, extreme values in the data can be precisely detected through various techniques, including the k-Nearest Neighbor technique, ARIMA methodology, regression analysis [63] or angle-based outlier detection [81].
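The snippet below is a hedged sketch of such outlier screening using scikit-learn's LocalOutlierFactor, a k-nearest-neighbour based detector, applied to the annual-income example above. The generated incomes and the n_neighbors setting are illustrative assumptions, and flagged records should be reviewed rather than silently dropped, so that the exclusion itself does not introduce bias.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Annual incomes in millions of dollars; the two extreme customers from the
# example above are appended to an otherwise ordinary population (made-up data).
incomes = np.concatenate([np.random.normal(0.8, 0.1, 200), [4.0, 4.2]]).reshape(-1, 1)

# k-nearest-neighbour based outlier scoring; n_neighbors is an assumption.
lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(incomes)          # -1 marks suspected outliers

outliers = incomes[labels == -1].ravel()
print("flagged as outliers:", np.round(outliers, 2))
# Review flagged records before excluding them, so that legitimate but rare
# cases are not removed and exclusion bias is not introduced.
```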
Data Enrichment Bias

After excluding outliers, data should be structured properly by managing abnormal segments and missing segments. Organizing the available data means eliminating typos or grammatical mistakes from the data and handling data that is missing because of incomplete responses from participants (non-response bias). Data enrichment bias occurs due to typing mistakes of data entry operators or when they misinterpret the context of the data and add wrong (extreme) input to fill empty fields [83]. Data cleaning is a time-consuming task, but it can ultimately improve the decision-making process.
Example:
Imagine a data entry operator who mistakenly types "Hspital" instead of "Hospital", or who marks a student with less than 50% of the marks as passed while others in the list with the same percentage are marked as failed.
Best practice:
The serendipitous search in AI algorithms has enough potential to mitigate data enrichment bias by exploring unexplored parts of the data through different ranking parameters [70]. A quality assurance committee with diverse experience across several disciplines should be formalized to review the data sources repeatedly through the lens of morality, to ensure fairness and reduce data discrimination [70]. Using techniques like the kernel-based local outlier factor (LOF) to identify incorrect data can be helpful [83]. Furthermore, data handlers can create a separate category for records having missing values or flag them with 0 (if numeric) to make the algorithm aware of them [30].
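As a small, hedged illustration of the last two suggestions, the pandas sketch below flags numeric missing values with an explicit indicator (and the value 0) and assigns categorical gaps to a separate "missing" category. The toy records and column names are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy records with gaps; the column names are illustrative assumptions.
df = pd.DataFrame({
    "age": [34, np.nan, 52, np.nan],
    "city": ["Aachen", None, "Cologne", "Bonn"],
})

# Keep an explicit indicator so the model "knows" a value was missing,
# instead of silently imputing and hiding the gap.
df["age_missing"] = df["age"].isna().astype(int)
df["age"] = df["age"].fillna(0)                     # numeric flag value 0

# For categorical fields, a separate "missing" category is often safer
# than dropping the rows, which could introduce exclusion bias.
df["city"] = df["city"].fillna("missing")
print(df)
```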
In this phase, significant patterns and trends are filtered by statistical methods. Cognitive bias can occur in the data visualization phase. This bias is divided into different types.
Studies have shown that visual transformations of data actually affect both decision making and the results. Tools that present data in visual formats always try to make visualization easy for the auditors, and by doing so they may alter the original pattern of the data [54]. Such improperly created graphs can trigger cognitive biases in the viewers [7]. Five types of cognitive biases are discussed in [18].

1. Data Visualization and Framing Effect: Individuals respond to a particular problem in different ways depending on how the problem is framed to them, a bias called the framing effect [17]. Data visualization tools often prioritize presenting the data in the most comprehensible way. In doing so, they may alter the original sequence of the data and trigger the framing bias that the auditor has to cope with.

Example: A graph highlighting that 30 percent (70 percent) of a client's usual trade credits from suppliers are denied (awarded) might impact auditors' assessment of the probability and severity of their audit client's financial difficulties, which is one of the key conditions of an entity's ability to continue as a going concern (AU 341.06, An Entity's Ability to Continue as a Going Concern, Consideration of Conditions and Events). Due to the framing effect, auditors who receive or process the negatively framed information (credit denial) are more likely to have substantial doubt about the client's ability to continue as a going concern. Therefore, improperly designed visualizations can trigger and/or aggravate framing effects [18].

Best practice: Data analysts should use visualization tools strategically, for example, using an effective tool at the early stage of the lifecycle [5].

2. Data Visualization and Availability Bias: Availability bias relates to survivorship bias, expressing the tendency to use already available information and consider such information more relevant than evidence that is hard to attain. Data visualization directly enhances the vividness and evaluability of data, which influences availability bias; on the other hand, it compromises the overall quality of making effective decisions [54].

Example: After reading an article about lottery winners, you start to overestimate your own likelihood of winning the jackpot and start spending more money than you should each week on lottery tickets [78], [21].

Best practice: People should spend proper time and effort contemplating other options, properly weighting them in terms of how well they meet the objective, and considering the reliability, validity, certainty and accuracy of the information [31].

3. Data Visualization and Overconfidence Bias: Overconfidence bias refers to an analyst's tendency to overestimate their own ability to perform tasks or to make accurate diagnoses or other judgments and decisions [32]. Generally, decision makers feel more confident with graphical visualizations of the data than with the textual format. Overconfidence leads to less cautious behavior, which can be dangerous when making sensitive decisions.

Example: In a survey of 300 fund managers, the managers were asked whether they believe their managerial abilities are average, above average or below average. The figures show that 74 percent believed that they were above average and the remaining 26 percent thought they were average, while no one thought they were below average. It is clear that these findings are statistically impossible or manipulated, which is not suitable for data modeling [15].

Best practice: Overconfidence can lead to overestimation and over-precision, which is intolerable in statistical analysis. Therefore, one can channel one's overconfidence by creating a scientific mindset, challenging one's viewpoints, listening to criticism and maintaining a constant learning attitude [1].

4. Data Visualization and Anchoring Bias: Anchoring bias refers to the situation in which individuals rely too much on the initial piece of information offered and make future decisions by using this information. While visualizing data, anchoring bias may disturb future interpretations and evaluations of insights coming from the same data based on preliminary evidence [18].

Example: E-commerce stores take advantage of anchoring techniques by showing costly things first. Seeing 500 dollar shirts first and 60 dollar shirts in second place, one will be prone to see the second shirt as cheap.

Best practice: Critical thinking can be beneficial in avoiding it. One can study one's own anchoring behavior and analyze its prospects. Making future decisions based on historical patterns can also limit anchoring bias. Asking a colleague for a review is not a bad choice either [57].

5. Data Visualization, Confirmation Bias and Signal Error: One can be the victim of both confirmation bias and signal error when there is a huge amount of data at hand. Signal error occurs when data analysts simply overlook major gaps in the data that make it inconsistent or unreliable. Confirmation bias, on the other hand, is the situation in which model builders unconsciously process the subset of data visualizations that confirms their prior feelings and viewpoints [25], [18]. In addition, a trainer may actually keep training a model until it produces a result that aligns with their original hypothesis; this is called experimenter's bias. All of these biases can impede decision making.

Example: Peter O. Gray [35] in his book presents an example of confirmation bias in a doctor's diagnosis. He explains that a doctor forecasts the disease after asking the patient some questions and then looks for evidence that tends to confirm his or her diagnosis while overlooking signs that tend to defeat the analysis. The same is the case with data scientists, who often tend to ignore data that contradicts their hypothesis, which ultimately has a negative impact on the process. Therefore, a data scientist should know all his or her biases and think scientifically to avoid such blunders.

Best practice: Confirmation bias can be countered by continuously challenging your thoughts, by finding alternative sources of information and by testing it [58].
In this phase, data models are built and predictions and forecasts are made using, e.g., artificial intelligence algorithms.
Studies have shown that the probability of unfairness in the data is much greater than that in the algorithms. More precisely, datasets are often already discriminatory before passing through the algorithms that then exhibit biased decisional pictures [46]. Machine learning algorithms based on AI are commonly used to train models in a supervised learning framework. Fairness is an increasingly important concern as autonomous models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing [8]. To understand responsibility for model failure, understanding the accountability matrix for algorithms is essential. Algorithmic bias is when an algorithm does not neutrally extract or transform the data. Scholars are trying hard to figure out ways of mitigating the algorithmic biases present in Google searches, Facebook feeds, or YouTube recommendations [22].
Example:
Online retailer Amazon, whose global workforce is 60 percent male and where men hold 74 percent of the company's managerial positions, recently discontinued use of a recruiting algorithm after discovering gender bias. The data that engineers used to create the algorithm were derived from the resumes submitted to Amazon over a 10-year period, which were predominantly from white males. The algorithm was taught to recognize word patterns in the resumes, rather than relevant skill sets, and these data were benchmarked against the company's predominantly male engineering department to determine an applicant's fit. As a result, the AI software penalized any resume that contained the word "women's" in the text and downgraded the resumes of women who attended women's colleges, resulting in gender bias [80].
Sources of algorithmic bias [22]:

1. Biased training data can be a source of algorithmic bias.
2. Algorithms can be biased via differential use of information (using morally irrelevant categories to make morally relevant and sensitive judgments).
3. During data processing, the algorithm itself can be biased; this is called "algorithmic processing bias". The most obvious instance of algorithmic processing bias is the use of a statistically biased estimator in the algorithm for better future predictions, so this bias mostly occurs due to a deliberate choice.
4. Algorithmic bias can also occur when a specific model is employed outside of its context, commonly known as transfer context bias (for instance, using an autonomous system worldwide that was designed to be used in the United States). This is basically a user bias but is labeled as algorithmic bias.
5. Sometimes the information that an algorithm produces mismatches the information that the user expects; this is known as interpretation bias. In manual systems, misjudging the results of algorithms is actually a user bias but is also known as algorithmic bias. In autonomous systems, biased judgments about causal structure or strength (i.e., judgments that deviate from the actual causal structure in the world) can easily be misused in biased ways by autonomous systems.
6. Algorithmic bias can occur when algorithms make decisions based on race, usually called racial bias. It may be due to the unintentional or explicit inclusion of racial characteristics by the developer in the databank, or it may be due to historical bias in the data. Advanced health-care systems rely on commercial prediction algorithms to identify and help patients with complex health needs; therefore, there are high risks attached to biased predictions. A clear example of racial disparity is that African American patients are considerably sicker than white patients, as evidenced by signs of uncontrolled illnesses [59].
Best practice:
AIF360 is the first system to bring together in one open source toolkit: bias metrics, bias mitigation algorithms, bias metric explanations, and industrial usability. By integrating these aspects, AIF360 can enable stronger collaboration between AI fairness researchers and practitioners, helping to translate research results to practicing data scientists, data engineers, and developers deploying solutions in a variety of industries [8]. The fairness metrics and the mitigation algorithms the tool uses are listed below.

Pre-processing:
1. Disparate impact remover is a preprocessing technique that edits feature values to increase group fairness while preserving rank-ordering within groups [29].
2. Learning fair representations is a pre-processing technique that finds a latent representation which encodes the data well but obfuscates information about protected attributes [84].
3. Optimized preprocessing is a preprocessing technique that learns a probabilistic transformation that edits the features and labels in the data with group fairness, individual distortion, and data fidelity constraints and objectives [11].
4. Reweighing is a preprocessing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification [42].

In-processing:
1. Adversarial debiasing is an in-processing technique that learns a classifier to maximize prediction accuracy and simultaneously reduce an adversary's ability to determine the protected attribute from the predictions [85].
2. GerryFair Model is an algorithm for learning classifiers that are fair with respect to rich subgroups [12].
3. The meta algorithm takes the fairness metric as part of the input and returns a classifier optimized with respect to that fairness metric [14].
4. Prejudice remover is an in-processing technique that adds a discrimination-aware regularization term to the learning objective [44].

Post-processing:
1. Calibrated equalized odds postprocessing is a post-processing technique that optimizes over calibrated classifier score outputs to find probabilities with which to change output labels with an equalized odds objective [65].
2. Equalized odds postprocessing is a post-processing technique that solves a linear program to find probabilities with which to change output labels to optimize equalized odds [37].
3. Reject option classification is a post-processing technique that gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups in a confidence band around the decision boundary with the highest uncertainty [43].

There are many metrics that measure individual and group fairness:
1. Statistical Parity Difference: the difference in the rate of positive results that the unprivileged group receives compared to the privileged group.
2. Equal Opportunity Difference: the difference in true positive rates between unprivileged and privileged groups.
3. Average Odds Difference: the average difference between the false positive and true positive rates between unprivileged and privileged groups.
4. Disparate Impact: the ratio of the rate of favorable outcomes for the unprivileged group to that of the privileged group.
5. Theil Index: measures the inequality in benefit allocation for individuals.
6. Euclidean Distance: the average Euclidean distance between the samples from two datasets.
7. Mahalanobis Distance: the average Mahalanobis distance between the samples from two datasets.
8. Manhattan Distance: the average Manhattan distance between the samples from two datasets.
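As a hedged illustration of how such a toolkit can be used, the sketch below applies AIF360's Reweighing pre-processing and a group fairness metric to its bundled Adult dataset. It is a minimal sketch assuming the aif360 package is installed and the raw Adult data files have been downloaded into it; the choice of protected attribute and the privileged/unprivileged group definitions are illustrative.

```python
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Protected attribute and group definitions are illustrative assumptions.
privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

data = AdultDataset()  # requires the raw Adult data files to be available

# Measure group fairness before mitigation.
metric_before = BinaryLabelDatasetMetric(
    data, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("Statistical parity difference (before):",
      metric_before.statistical_parity_difference())

# Reweighing assigns instance weights per (group, label) combination.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
data_transf = rw.fit_transform(data)

metric_after = BinaryLabelDatasetMetric(
    data_transf, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("Statistical parity difference (after):",
      metric_after.statistical_parity_difference())
print("Disparate impact (after):", metric_after.disparate_impact())
```

A value of the statistical parity difference close to zero (and a disparate impact close to one) after reweighing indicates that the weighted data no longer favors the privileged group in the base rates.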
FAIRECSYS is an algorithm that mitigates algorithmic bias by post-processing the recommendation matrix with minimum impact on the utility of the recommendations provided to the end-users [26]. Giving people control of their digital footprints can also help reduce data bias. Algorithm transparency is another way to address the issue of algorithmic bias. Policy makers should design and implement discrimination-free laws to counter the lag in proper decision making that is due to racial bias [50].
In the data science lifecycle, the hot hand fallacy appears when data experts use a particular model repeatedly for all problems based on its historical performance, without testing other suitable models. In this phenomenon, a person who got the best results recently is assumed to have a greater chance of success in the future [48].
Example:
When we throw a coin 20 times, there is a 50 percent chance of getting four heads in a row, a 25 percent chance of five in a row and a 10 percent chance of a run of six. However, if you show such a sequence to most individuals, they will consider these to be patterns in the data and not at all random. This explains the hot hand fallacy, in which we think we are on a winning streak, in whatever that may be, from cards to basketball to football. In each of these areas where the data is random but happens to include a sequence, we massively over-interpret the importance of this pattern [76].
Best practice:
Data scientists should think systematically and treat each problem independently according to its requirements. If we again examine the coin toss example, just because you threw tails three times and won, it does not mean the fourth toss will also result in a win. Therefore, one should treat each problem separately and try to draw logical conclusions for the choices [13].
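To make the quoted streak probabilities concrete, the short simulation below (a sketch added here, not taken from [76]) estimates how often 20 fair coin tosses contain a run of four, five or six heads, showing that such "patterns" are expected in purely random data.

```python
import random

def has_run(n_tosses: int, run_length: int) -> bool:
    """Return True if a random sequence of fair coin tosses contains a
    run of `run_length` (or more) consecutive heads."""
    streak = 0
    for _ in range(n_tosses):
        if random.random() < 0.5:       # heads
            streak += 1
            if streak >= run_length:
                return True
        else:
            streak = 0
    return False

random.seed(0)
trials = 100_000
for run in (4, 5, 6):
    share = sum(has_run(20, run) for _ in range(trials)) / trials
    print(f"P(run of {run} heads in 20 tosses) ~ {share:.2f}")
# The estimated probabilities come out roughly in line with the 50/25/10
# percent figures quoted above: streaks are expected in random sequences.
```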
The bandwagon effect is a type of cognitive bias that refers to the individual tendency to follow the behavior of the masses [51]. In model building, this effect appears when we are driven by the impulse to adopt a specific methodology just because it has previously been adopted by others. Hence, data scientists blindly select the model without any evaluation, as human brains love shortcuts.
Example:
Tonsillectomy is cited as a recent example of medical bandwagons. Although the practice is said to be beneficial in some specific cases, scientific support for the universal use it saw was lacking. Doctors were drawn to tonsillectomy not on the basis of its effectiveness, but because they saw it was widely used [66].

Best practice:
Although the bandwagon effect can help with adopting healthy behaviors, data analysts should think twice before jumping on it. One must reconsider whether one is being rational or being influenced by the environment or group when making influential judgments. One must evaluate algorithms before rushing towards them without knowing their constraints [49].
Group attribution bias is the belief that a person's traits always follow the ideologies of a group to which he or she belongs, and that the decisions of the group manifest the beliefs of every member [33]. This bias consists of two categories:

• In-group bias: a preference for members of a group to which you also belong, or for characteristics that you also share.

Example: Two engineers training a résumé-screening model for software developers are predisposed to believe that applicants who attended the same computer-science academy as they both did are more qualified for the role [33].

• Out-group homogeneity bias: a tendency to stereotype individual members of a group to which you do not belong, or to see their characteristics as more uniform.

Example: Two engineers training a résumé-screening model for software developers are predisposed to believe that all applicants who did not attend a computer-science academy do not have sufficient expertise for the role [33].

Best practice: In order to avoid group attribution biases, data scientists should not behave judgmentally; rather, they should analyze the situation and respond to it efficiently. Emotional and cultural intelligence is another skill that can be handy in mitigating fundamental attribution errors when designing a model. Self-analysis is one of the most versatile techniques to avoid severe favoritism [64].
Aggregation bias occurs during model creation, when one framework is used for groups having distinct conditional distributions P(Y|X). In many applications, the population of interest is heterogeneous, and a single model does not fit all subgroups. Aggregation bias can lead to a model that is only suitable for the dominant population, or a model that does not fit any group at all (if combined with representation bias) [77].
Example:
Diabetes patients have known differences in associated complications across ethnicities [73]. Studies have also suggested that HbA1c levels (widely used to diagnose and monitor diabetes) differ in complex ways across ethnicities and genders [39]. Because these factors have different meanings and importance within different subpopulations, a single model to predict complications is unlikely to be best-suited for any group in the population, even if they are equally represented in the training data [77].
Best practice:
Coupled learning techniques, for instance multitask learning, and fair representation learning approaches, in which the data is projected into a shared space, can be useful for countering aggregation bias [53], [61].
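As a hedged sketch of the underlying problem, the code below (illustrative, not taken from [53] or [61]) compares one pooled linear model against separate per-subgroup models on synthetic data whose subgroups follow deliberately different conditional distributions P(Y|X). The data and model choice are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Two subgroups with different relationships between X and Y.
x_a = rng.uniform(0, 10, 500); y_a = 2.0 * x_a + rng.normal(0, 1, 500)
x_b = rng.uniform(0, 10, 500); y_b = -1.0 * x_b + 5 + rng.normal(0, 1, 500)

X = np.concatenate([x_a, x_b]).reshape(-1, 1)
y = np.concatenate([y_a, y_b])
group = np.array([0] * 500 + [1] * 500)

# One aggregated model for everyone: fits neither group well.
pooled = LinearRegression().fit(X, y)
print("pooled MSE:", mean_squared_error(y, pooled.predict(X)))

# Separate models per subgroup capture each conditional distribution.
for g in (0, 1):
    mask = group == g
    per_group = LinearRegression().fit(X[mask], y[mask])
    print(f"group {g} MSE:", mean_squared_error(y[mask], per_group.predict(X[mask])))
```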
Evaluation Bias

Evaluation bias happens during model iteration and evaluation when the training or benchmark data do not represent the target population. Evaluation bias can also arise from the use of performance metrics that are not appropriate for the way in which the model will be used. This can be intensified by the use of inappropriate metrics that are deployed to report a performance boost [77].
Example:
The example from Buolamwini and Gebru [10] discussed above under data bias, in the context of face recognition, refers to the drastically inferior performance of commercially used face analysis algorithms. Looking at some common facial analysis benchmark datasets, it becomes apparent why such algorithms were considered appropriate for use: only 7.4 percent and 4.4 percent of the images in benchmark datasets such as Adience and IJB-A, respectively, are of dark-skinned female faces. Algorithms that underperform on this slice of the population therefore suffer quite little in their evaluation performance on these benchmarks. The algorithms' underperformance was likely caused by representation bias in the training data, but the benchmarks failed to discover and penalize this. Since this study, other algorithms have been benchmarked on more balanced face datasets, changing the development process to encourage models that perform well across groups [67].
Best practice:
To mitigate evaluation bias, an approach called subgroup evaluation can be used, in which group metrics are compared in order to understand them clearly. Using multiple metrics and confidence intervals is another useful technique when choosing relevant metrics for modeling [10], [67]. Targeted data augmentation (e.g., SMOTE) is also used to populate parts of the data distribution that are underrepresented [20].
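The snippet below is a hedged sketch of the two practices just mentioned: subgroup evaluation, i.e. reporting a metric per group rather than one aggregate score, and targeted augmentation of an under-represented class with SMOTE from the imbalanced-learn package. The synthetic data, the group attribute and the model choice are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from imblearn.over_sampling import SMOTE

# Illustrative data: an imbalanced binary task plus a synthetic group
# attribute where group 1 is heavily under-represented.
X, y = make_classification(n_samples=2000, n_features=5,
                           weights=[0.9, 0.1], random_state=0)
group = (np.random.default_rng(0).random(2000) < 0.1).astype(int)

model = LogisticRegression().fit(X, y)
pred = model.predict(X)

# Subgroup evaluation: report the metric separately for each group rather
# than hiding poor performance on a small group inside the overall score.
for g in (0, 1):
    mask = group == g
    print(f"group {g}: n={mask.sum()}, "
          f"accuracy={accuracy_score(y[mask], pred[mask]):.3f}")

# Targeted augmentation: oversample the minority class with SMOTE so the
# training/benchmark data covers the underrepresented part of the distribution.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("class balance after SMOTE:", np.bincount(y_res))
```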
In the data analysis phase, the results are interpreted and knowledge is extracted. In the following we describe the biases that occur in this phase.
Deployment bias occurs after model deployment, when a system is used or interpreted in inappropriate ways. A model is built to carry out a particular task, but deployment bias arises when an autonomous model is moderated by institutional structures (also called the framing trap) [69] or when its recommendations are interpreted erroneously by humans [36].
Example:
Risk assessment tools in the criminal justice system predict a risk score, but a judge may interpret this in unexpected ways before making his or her final decision [77].
Best practice:
Testing the model in a real-world environment can help minimize its harms. User training for the model could be an effective step to handle deployment bias. Ethical model training using unbiased and transparent data, together with careful planning, can help achieve optimal results [19].

Rescue Bias
Rescue bias is an interpretive bias related to the analyzer's personal preconceptions, which pushes him or her to discount the data by finding selective faults in a trial that undermines his or her expectations. In other words, it is a planned attempt to weaken the findings and draw pre-planned conclusions. One may fall for rescue bias after seeing unexpected or below-par results of an experiment [45].
Example:
Binge eating disorder could be excluded from ICD-11 due to arguments that it might stigmatize people who eat a lot or individuals who have a high body mass index. However, given the elevated mortality and other health risks associated with eating disorders, this would have a significant adverse impact, particularly on young women [72].
Best practice:
To control behavioral biases like this, data experts should think critically and try to be logical. Self-criticism or self-evaluation is one of the most efficient procedures for avoiding bias [58].
Overfitting and underfitting are two common problems with machine learning models. Overfitting occurs when the model captures the patterns of the training data so closely that it no longer improves its ability to solve the general problem. Underfitting is the reverse of overfitting: the statistical model cannot capture the underlying trend of the data, or the model over-idealizes its experience [47].
Example:
Let's assume we want to predict whether a student will land a job interview based on her resume, and that we train a model on a dataset of 10,000 resumes and their outcomes. When we try the model on the original dataset, it predicts outcomes with 99 percent accuracy. But when we run the model on a new (unseen) dataset of resumes, we only get 50 percent accuracy [27].
Best practice:
Process mining algorithms can reduce the gap between overfitting and underfitting [79]. In [4], the authors present two important techniques, namely penalty methods and early stopping, to limit these effects (overfitting and underfitting). Furthermore, a novel L1/4 regularization method to overcome this issue is described in [47].
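As a hedged sketch of how overfitting shows up and how early stopping can limit it, the example below compares train and test accuracy for an unconstrained and an early-stopped gradient boosting classifier in scikit-learn. The dataset and hyperparameters are illustrative assumptions, not the exact methods of [4] or [47].

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative dataset; parameters are assumptions chosen for demonstration.
X, y = make_classification(n_samples=3000, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# A deliberately deep, unconstrained model tends to overfit ...
overfit = GradientBoostingClassifier(max_depth=8, n_estimators=500,
                                     random_state=0).fit(X_train, y_train)
print("overfit    train/test:", overfit.score(X_train, y_train),
      overfit.score(X_test, y_test))

# ... while early stopping halts training once a held-out validation split
# stops improving, narrowing the gap between training and test performance.
regularized = GradientBoostingClassifier(max_depth=3, n_estimators=500,
                                         validation_fraction=0.2,
                                         n_iter_no_change=10,
                                         random_state=0).fit(X_train, y_train)
print("early-stop train/test:", regularized.score(X_train, y_train),
      regularized.score(X_test, y_test))
```

A large gap between training and test scores is the practical symptom of overfitting described in the example above; the early-stopped model trades a little training accuracy for better generalization.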
In this research paper we analyzed the different phases of the data science lifecycle and which biases can occur in these phases. We have provided all of these biases with a description, examples and best practices. In further research we will deal in more depth with mitigation methods and with how data scientists can avoid the biases described in this document. Furthermore, we will create a vocabulary from the findings of this research in order to build a knowledge graph for data scientists. This graph should help them to identify relations between the keywords.
References

[1] Theodora S. Abigail. Avoid Overconfidence Bias at the Workplace with these 7 Actionable Tips, 2018 (accessed Jul. 26, 2020).
[2] R. Agarwal, 2020 (accessed July 20, 2020).
[3] AIU. Sample Bias, Bias of Selection and Double-Blind, 2020 (accessed July 23, 2020).
[4] Haider Allamy. Methods to avoid over-fitting and under-fitting in supervised machine learning (comparative study). December 2014.
[5] Vairam Arunachalam, Buck K. W. Pei, and Paul John Steinbart. Impression management with graphs: Effects on choices. Journal of Information Systems, 16(2):183–202, September 2002.
[6] A. Basiri, M. Haklay, and Z. Gardner. The impact of biases in the crowdsourced trajectories on the output of data mining processes. 2018.
[7] V. Beattie and M. Jones. The use and abuse of graphs in annual reports: Theoretical framework and empirical study. Accounting and Business Research, 22:291–303, 1992.
[8] Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John T. Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. CoRR, abs/1810.01943, 2018.
[9] J. Brownlee. How to Identify and Remove Seasonality from Time Series Data with Python, 2016 (accessed July 23, 2020).
[10] J. Buolamwini. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, 2018 (accessed July 22, 2020).
[11] Flavio Calmon, Dennis Wei, Bhanukiran Vinzamuri, Karthikeyan Natesan Ramamurthy, and Kush R. Varshney. Optimized pre-processing for discrimination prevention. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 3992–4001. Curran Associates, Inc., 2017.
[12] Flavio P. Calmon, Dennis Wei, Karthikeyan Natesan Ramamurthy, and Kush R. Varshney. Optimized data pre-processing for discrimination prevention, 2017.
[13] Capital. Your ultimate guide to avoid hot hand fallacy in trading, 2018 (accessed Jul. 24, 2020).
[14] L. Elisa Celis, Lingxiao Huang, Vijay Keswani, and Nisheeth K. Vishnoi. Classification with fairness constraints. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*). ACM Press, 2019.
[15] CFI. Overconfidence Bias - Definition, Overview and Examples in Finance, 2012 (accessed Aug. 20, 2020).
[16] Ray Chambers and Andrea Diniz da Silva. Improved secondary analysis of linked data: a framework and an illustration. Journal of the Royal Statistical Society: Series A (Statistics in Society), 183(1):37–59, 2020.
[17] C. Janie Chang, Sin-Hui Yen, and Rong-Ruey Duh. An empirical examination of competing theories to explain the framing effect in accounting-related decisions. Behavioral Research in Accounting, 14(1):35–64, 2002.
[18] Chengyee Chang and Yan Luo. Data visualization and cognitive biases in audits. Managerial Auditing Journal, ahead-of-print, May 2019.
[19] Charla Griffy-Brown and Mark Chun. Avoiding Bias and Identifying Risks when Deploying Artificial Intelligence - What Business Leaders Need to Know, 2020 (accessed Jul. 25, 2020).
[20] Irene Chen, Fredrik D. Johansson, and David Sontag. Why is my classifier discriminatory? In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 3539–3550. Curran Associates, Inc., 2018.
[21] Kendra Cherry. Availability Heuristic Affecting Your Decision Making, 2019 (accessed Aug. 20, 2020).
[22] David Danks and Alex John London. Algorithmic bias in autonomous systems. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, August 2017.
[23] Patrick Dattalo. Ethical Dilemmas in Sampling, 2010 (accessed July 26, 2020).
[24] Kristina Dineva and Tatiana Atanasova. OSEMN process for working over data acquired by IoT devices mounted in beehives. January 2018.
[25] H. R. K. Yunfei Du. Data Science for Librarians, 2020 (accessed Jul. 23, 2020).
[26] Bora Edizel, Francesco Bonchi, Sara Hajian, André Panisson, and Tamir Tassa. FaiRecSys: mitigating algorithmic bias in recommender systems. International Journal of Data Science and Analytics, 9(2):197–213, March 2019.
[27] EDS. Overfitting in Machine Learning: What It Is and How to Prevent It, 2020 (accessed Jul. 26, 2020).
[28] Barbara Fadem. Behavioral Science. Board Review Series. Lippincott Williams & Wilkins, Baltimore, 4th edition, 2005.
[29] Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). ACM Press, 2015.
[30] Formplus. Data Cleaning: Definition, Methods, and Uses in Research, 2020 (accessed Jul. 24, 2020).
[31] Landsittel Glover and Thomson. The Institute of Management Accountants (IMA), The Institute of Internal Auditors (IIA), Committee of Sponsoring Organizations of the Treadway Commission, Preface, COSO Board Members, 2012 (accessed Aug. 20, 2020).
[32] Steven Glover, Douglas Prawitt, Sam Ranzilla, Robert Chevalier, and George Herrmann. Elevating professional judgment in auditing and accounting: The KPMG professional judgment framework. KPMG Monograph, January 2011.
[33] Google. Fairness: Types of Bias, Machine Learning Crash Course, 2020 (accessed Jul. 25, 2020).
[34] Praveen Govindaraj. Outliers - What it say during data analysis, 2018 (accessed July 23, 2020).
[35] Peter O. Gray. Psychology, 2002 (accessed Jul. 23, 2020).
[36] Ben Green and Yiling Chen. Disparate interactions. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*). ACM Press, 2019.
[37] Moritz Hardt, Eric Price, and Nathan Srebro. Equality of opportunity in supervised learning, 2016.
[38] James J. Heckman. Selection Bias and Self-Selection, pages 242–266. Palgrave Macmillan UK, London, 2010.
[39] William H. Herman and Robert M. Cohen. Racial and ethnic differences in the relationship between HbA1c and blood glucose: Implications for the diagnosis of diabetes. The Journal of Clinical Endocrinology & Metabolism, 97(4):1067–1072, April 2012.
[40] Sarah Holland, Ahmed Hosny, Sarah Newman, Joshua Joseph, and Kasia Chmielinski. The dataset nutrition label: A framework to drive higher data quality standards, 2018.
[41] N. Ingram. Survivorship bias – lessons from World War Two aircraft, 2016 (accessed July 20, 2020).
[42] Faisal Kamiran and Toon Calders. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33, December 2011.
[43] Faisal Kamiran, Asim Karim, and Xiangliang Zhang. Decision theory for discrimination-aware classification. IEEE, December 2012.
[44] Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. Fairness-aware classifier with prejudice remover regularizer. In Machine Learning and Knowledge Discovery in Databases, pages 35–50. Springer Berlin Heidelberg, 2012.
[45] Ted J. Kaptchuk. Effect of interpretive bias on research evidence. BMJ, 326(7404):1453–1455, 2003.
[46] Keith Kirkpatrick. It's not the algorithm, it's the data. Communications of the ACM, 60(2):21–23, January 2017.
[47] Johnson Kolluri, Vinay Kumar Kotte, M.S.B. Phridviraj, and Shaik Razia. Reducing overfitting problem in machine learning using novel L1/4 regularization method. IEEE, June 2020.
[48] Marko Kovic and Silje Kristiansen. The gambler's fallacy fallacy (fallacy). Journal of Risk Research, 22(3):291–302, September 2017.
[49] The Decision Lab. Bandwagon Effect - Biases and Heuristics, 2020 (accessed Jul. 25, 2020).
[50] Nicol Turner Lee. Detecting racial bias in algorithms and machine learning. Journal of Information, Communication and Ethics in Society, 16(3):252–260, August 2018.
[51] Y.-J. Lee and G. Woo. Analyzing the dynamics of stock networks for recommending stock portfolio. Journal of Information Science and Engineering, 35:411–427, March 2019.
[52] Yi Li and Nuno Vasconcelos. REPAIR: Removing representation bias by dataset resampling, 2019.
[53] Paul Pu Liang, Terrance Liu, Liu Ziyin, Nicholas B. Allen, Randy P. Auerbach, David Brent, Ruslan Salakhutdinov, and Louis-Philippe Morency. Think locally, act globally: Federated learning with local and global representations, 2020.
[54] Nicholas H. Lurie and Charlotte H. Mason. Visual representation: Implications for decision making. Journal of Marketing, 71(1):160–177, 2007.
[55] Fernando Martínez-Plumed, Cèsar Ferri, David Nieves, and José Hernández-Orallo. Fairness and missing values, 2019.
[56] Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning, 2019.
[57] Kelly A. Morrow and Cecilia T. Deidan. Bias in the counseling process: How to recognize and avoid it. Journal of Counseling & Development, 70(5):571–577, 1992.
[58] MTCT. Avoiding Psychological Bias in Decision Making, 2020 (accessed Jul. 26, 2020).
[59] Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464):447–453, October 2019.
[60] Alexandra Olteanu, Carlos Castillo, Fernando Diaz, and Emre Kiciman. Social data: Biases, methodological pitfalls, and ethical boundaries. SSRN Electronic Journal, January 2016.
[61] Luca Oneto, Michele Doninini, Amon Elders, and Massimiliano Pontil. Taking advantage of multitask learning for fair classification. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. ACM, January 2019.
[62] J. W. Osborne. Best Practices in Quantitative Methods, 2008 (accessed July 20, 2020).
[63] Thor-Bjorn Ottosen and Prashant Kumar. Outlier detection and gap filling methodologies for low-cost air quality measurements. Environ. Sci.: Processes Impacts, 21:701–713, 2019.
[64] Pat. The Fundamental Attribution Error: What It is and How to Avoid It, 2017 (accessed Jul. 25, 2020).
[65] Geoff Pleiss, Manish Raghavan, Felix Wu, Jon Kleinberg, and Kilian Q. Weinberger. On fairness and calibration, 2017.
[66] L. Rikkers. The bandwagon effect. Journal of Gastrointestinal Surgery, 6(6):787–794, December 2002.
[67] Hee Jung Ryu, Hartwig Adam, and Margaret Mitchell. InclusiveFaceNet: Improving face attribute detection with race and gender diversity, 2018.
[68] S. Barocas, E. Bradley, V. Honavar, and F. Provost. Meet the New Boss: Big Data: Companies Trade in Hunch-Based Hiring for Computer Modeling, 2016 (accessed July 22, 2020).
[69] Andrew D. Selbst, Danah Boyd, Sorelle A. Friedler, Suresh Venkatasubramanian, and Janet Vertesi. Fairness and abstraction in sociotechnical systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*). ACM Press, 2019.
[70] EXL Service. Ethics and Bias in Artificial Intelligence - What Every Executive Need, 2020 (accessed Aug. 20, 2020).
[71] Gaganpreet Sharma. Pros and cons of different sampling techniques, 2019 (accessed July 26, 2020).
[72] Frédérique R. E. Smink, Daphne van Hoeken, and Hans W. Hoek. Epidemiology of eating disorders: Incidence, prevalence and mortality rates. Current Psychiatry Reports, 14(4):406–414, May 2012.
[73] Elias K. Spanakis and Sherita Hill Golden. Race/ethnic difference in diabetes and diabetic complications. Current Diabetes Reports, 13(6):814–823, September 2013.
[74] Peverill Squire. Why the 1936 Literary Digest poll failed. Public Opinion Quarterly, 52(1):125–133, January 1988.
[75] T. Srivastava. Festive season special: Building models on seasonal data, 2013 (accessed July 23, 2020).
[76] Colin Strong. The challenge of big data: What does it mean for the qualitative research industry? Qualitative Market Research: An International Journal, 17(4):336–342, September 2014.
[77] Harini Suresh and John V. Guttag. A framework for understanding unintended consequences of machine learning, 2020.
[78] A. Tversky and D. Kahneman. Availability: A heuristic for judging frequency and probability. 1973.
[79] W. M. P. van der Aalst, V. Rubin, H. M. W. Verbeek, B. F. van Dongen, E. Kindler, and C. W. Günther. Process mining: a two-step approach to balance between underfitting and overfitting. Software & Systems Modeling, 9(1):87–111, November 2008.
[80] James Vincent. Amazon reportedly scraps internal AI recruiting tool that was biased against women, 2018 (accessed Jul. 24, 2020).
[81] Tian Wang, Haoxiong Ke, Xi Zheng, Kun Wang, Arun Kumar Sangaiah, and Anfeng Liu. Big data cleaning based on mobile edge computing in industrial sensor-cloud. IEEE Transactions on Industrial Informatics, 16(2):1321–1329, February 2020.
[82] Leo Wong. Introductory Backtesting Notes for Quantitative Trading Strategies: Useful Metrics and Common Pitfalls, 2019 (accessed July 20, 2020).
[83] X. Xu, Y. Lei, and Z. Li. An incorrect data detection method for big data cleaning of machinery condition monitoring. IEEE Transactions on Industrial Electronics, 67(3):2326–2336, 2020.
[84] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork. Learning fair representations. Pages 1362–1370, January 2013.
[85] Brian Hu Zhang, Blake Lemoine, and Margaret Mitchell. Mitigating unwanted biases with adversarial learning. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2018.