A Human-Centered Review of the Algorithms used within the U.S. Child Welfare System
Devansh Saxena, Karla Badillo-Urquiola, Pamela J. Wisniewski, Shion Guha
Devansh Saxena, Marquette University, Milwaukee, WI, [email protected]
Karla Badillo-Urquiola, University of Central Florida, Orlando, FL, [email protected]
Pamela J. Wisniewski, University of Central Florida, Orlando, FL, [email protected]
Shion Guha, Marquette University, Milwaukee, WI, [email protected]
ABSTRACT
The U.S. Child Welfare System (CWS) is charged with improving outcomes for foster youth; yet, it is overburdened and underfunded. To overcome this limitation, several states have turned towards algorithmic decision-making systems to reduce costs and determine better processes for improving CWS outcomes. Using a human-centered algorithmic design approach, we synthesize 50 peer-reviewed publications on computational systems used in CWS to assess how they were being developed, common characteristics of predictors used, as well as the target outcomes. We found that most of the literature has focused on risk assessment models but does not consider theoretical approaches (e.g., child-foster parent matching) nor the perspectives of caseworkers (e.g., case notes). Therefore, future algorithms should strive to be context-aware and theoretically robust by incorporating salient factors identified by past research. We provide the HCI community with research avenues for developing human-centered algorithms that redirect attention towards more equitable outcomes for CWS.
Author Keywords
Child Welfare System; Algorithmic Decision-Making; Human-Centered Algorithm Design
CCS Concepts
• Applied computing → Computing in government;
• Information systems → Decision support systems;
INTRODUCTION
As of September 2016, there were 437,465 children in the child welfare system (CWS) in the United States [87]. This is a significant (10%) rise in just 4 years since September 2012 [87], and this number is expected to keep rising unless significant efforts are made to improve youth outcomes [87]. Child abuse and neglect are severe issues that policymakers in the United States continue to battle with, and they are consistently at the foreground of public policy [37]. In recent years, CWS has been the center of public and media scrutiny [38] because of the potential damage done to the children who are
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CHI ’20, April 25–30, 2020, Honolulu, HI, USA.
Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-6708-0/20/04 ...$15.00. http://dx.doi.org/10.1145/3313831.3376229

removed from the care of their parents [45]. Therefore, there is significant pressure on CWS to systematize the decision-making process and show that these decisions were unbiased and evidence-based [85]. For most policymakers, algorithmic decisions are perceived to be the epitome of being unbiased, evidence-based, and objective [109, 2]. Thus, algorithms have been developed for almost every aspect of services provided by CWS in different states. For instance, models have been developed for predicting the risk of a future maltreatment event of a child [110], recommending appropriate placement settings [97], and matching children with foster parents who can meet the unique needs of every child [80]. Many of these algorithms have achieved various degrees of early success and have been shown to reduce costs [93] for CWS. However, they have also come under significant criticism for being biased [34, 16], opaque [109], complex and hard to explain [110, 33], too reductive [36] and non-contextual [99], and for not incorporating factors that arise from relevant social science research literature [30].

The SIGCHI community is at the forefront of research on algorithmic bias [43, 20, 69], and has begun to examine some of the challenges of algorithmic decision-making within CWS. Brown et al. [25] studied community perspectives on algorithmic decision-making systems in CWS and found several aspects of algorithmic systems that bolstered distrust and perpetuated bias, concerns over the lack of contextual understanding and 'black-box' nature of the algorithms, as well as concerns about how these algorithms may negatively impact child-welfare workers' decisions. Moreover, scholars outside of HCI have discussed how algorithms impact decision-making in CWS [29, 99, 100, 51].
Engaging in research that helps people and organizations, such as CWS, is well-suited to and important for the HCI community. Therefore, a critical step in building a strategic research agenda is to synthesize the breadth of work that has already been done to identify a pathway forward. To forge this path, we posed the following high-level research questions:
RQ1: What methods have been used to build algorithms in the child welfare system?
RQ2: What factors (i.e., independent variables) have been shown to be salient in predicting CWS outcomes?
RQ3: What outcomes (i.e., dependent variables) have CWS organizations been predicting?
To answer these questions, we conducted a comprehensive literature review (n=50) of algorithms used for decision-making in CWS in the United States. We qualitatively analyzed these articles using the lens of human-centered algorithm design [13]. Overall, we found that the majority of the algorithms in CWS are empirically constructed, even though the empirical knowledge is quite fragmented [55]. Our results also revealed considerable differences between the predictors currently being used and those found salient in the child-welfare literature. Finally, CWS has traditionally focused on 'risk assessment,' rather than positive outcomes that improve the lives of foster children. Based on Wobbrock and Kientz's encapsulation of research contributions in HCI [117], this paper is a survey of the existing literature and makes the following unique research contributions:
1. We apply a human-centered conceptual framework [13] to critically review the algorithms used within the U.S. child welfare system.
2. We introduce domain knowledge from the child welfare system to embed it within the SIGCHI community to allow for collaborative research between the two disciplines.
3. We identify the potential gaps in the existing literature and recommend future research opportunities with careful attention to the human-centered design of algorithms to benefit CWS.
In the following sections, we discuss Human-Centered Algorithm Design and how we used this framework to inform our literature review methodology. Next, we situate our research within the SIGCHI community.
A Human-Centered Approach to Algorithm Design
As algorithms begin to permeate every aspect of social life, HCI researchers have begun to ask, "Where is the Human?", that is, recognizing that humans are a critical, if not the central, component of many domains for which Artificial Intelligence (AI) systems are being developed. A workshop organized at CHI 2019 [60] tackled this topic to identify several pertinent issues in algorithmic design, such as the opaque and isolated development of algorithms and a lack of involvement of the human stakeholders, who use these systems and are most affected by them. To address these problems, Baumer proposed Human-Centered Algorithm Design (HCAD) [13], a conceptual framework founded in practices derived from human-centered design [61]. It incorporates human and social interpretations through both the design and evaluation phases [13]. Baumer [13] lays out three strategies that help algorithm design become more human-centered, namely, 1) theoretical, 2) speculative, and 3) participatory strategies. We draw from the theoretical perspective to frame our research questions and as the qualitative lens for our analysis. The human-centered theoretical design strategy informs algorithm design as follows:
• Meaning-making: Theoretical foundations provide a much-needed scaffolding for dealing with complexity, identifying and evaluating design opportunities [89]. Designers need to study the socio-cultural domain in which they intend to situate their work.
• Design: Theoretical approaches aim to incorporate concepts and theories from social sciences into data science [13].
• Evaluation: The stakeholders' social interpretations of results can help ensure that the algorithm has higher utility and integrates well with practice.
CWS is one such domain that suffers from a complete lack of human perspectives through the design process. Therefore, our work focuses on how HCAD strategies can be employed to answer critical research questions in CWS.
BACKGROUND
We situate our research within the SIGCHI community and provide an overview of the work that has been done to develop integrated data systems for CWS.
SIGCHI Research to Support the Child Welfare System
The SIGCHI community has recognized the importance of conducting research with organizations that help disadvantaged communities, such as those experiencing homelessness [108, 118] or recovering from substance abuse [75]. For example, Strohmayer, Comber, and Balaam [108] partnered with a center for people of low social stability to understand homeless young adults' perceptions of education. Similarly, Woelfer and Hendry [118] created a community technology center at a local service agency to work with homeless young people, case managers, and outreach workers. SIGCHI researchers have also started to engage with CWS to find ways to improve the lives of youth who have been displaced from their families. Some SIGCHI research has focused on foster youth and parents. For instance, Gray et al.'s [56] research with fostered and adopted children introduces a new digital memory box for creating and storing childhood memories. More recently, researchers have begun to study algorithmic decision-making systems within the child-welfare community. Badillo-Urquiola et al. [9] presented the challenges foster parents face mediating teens' technology use within the home. Most relevant to our current work, Brown et al. [25] engaged in a participatory design effort and conducted workshops with families involved in CWS, child-welfare workers, and service providers. They found that participants were uncomfortable with algorithmic systems. Participants felt that these systems used deficit-based frameworks to make decisions and questioned the bias present within the data. Based on their findings, the investigators provide recommendations for researchers and designers to work together with public service agencies to develop systems that provide a higher comfort level to the community. Our study builds upon this related work by critically investigating the algorithms used within CWS and highlighting opportunities for future research. We provide a foundation for implementing human-centered approaches in the design and development of algorithmic systems for CWS.
Sociotechnical Systems for Child-Welfare
In this section, we provide necessary background context about integrated data systems that laid the foundation for algorithmic work in CWS. In 1995, the federal government launched the SACWIS (State Automated Child Welfare Information System) initiative to provide states with a federally funded and automated case management tool. These data systems allow states to collect and maintain data for program management and informing their decision making [66]. States that implement SACWIS must also report their data to federal databases, such as NCANDS (National Child Abuse and Neglect Data System) [86] and AFCARS (Adoption and Foster Care Analysis and Reporting System) [87], to allow for the continual curation of comprehensive national databases. These data systems became the foundation for actuarial risk assessment tools, which have been mandated into practice, even though controversy still remains as to whether these tools should override the judgment of caseworkers who are most knowledgeable about a particular child's case [99, 98, 100, 111, 29].

Past survey papers have analyzed algorithms in CWS from a macro perspective, focusing on their reliability and validity with respect to consensus-based or clinical risk assessment models [99, 29]. Yet, they do not examine the mathematical or human-centered construction of these algorithms, that is, the techniques, the variable sets, or the outcomes predicted. This is especially important in CWS because each case of child neglect or abuse is contextually different and cannot be evaluated using the same set of significant predictors derived empirically [29]. To this end, we conducted a systematic literature review and identified the potential gaps in the literature with careful attention to the development of algorithms across time, as well as the methods and variable sets used.
METHODS
We describe our scoping criteria, systematic literature search, and data analysis process.
Scoping Criteria: Defining Algorithms
To understand how "algorithms" are used in CWS, we first need to contextualize what we mean by algorithms. We conceptualized "algorithms" through the lens of Street-level Algorithms, a term recently coined by Alkhatib and Bernstein [6] in the HCI community. Street-level algorithms are algorithmically based systems that directly interact with and make on-the-ground decisions about human lives and welfare in a sociotechnical system [6]. From a more technical perspective, we use recent inclusive definitions [48, 68] for a whole suite of computational methods from statistical modeling (e.g., generalized linear models) and machine learning. This allowed us to take a holistic viewpoint toward most forms of quantitative data analysis in CWS. Statistical modeling and machine learning are not mutually exclusive, but we differentiate between them based on assumptions made about the data as specified by Breiman [23].
Systematic Literature Search
This study has been undertaken as a systematic literature review based on the guidelines proposed by Webster and Watson [115]. The unit of analysis for this literature review was peer-reviewed articles. We wanted to examine not just the algorithms currently being used in CWS but also newer solutions (algorithms) being proposed by researchers to better assess the current state of research. We used the following search terms to find papers at the intersection of CWS and algorithms: "child protective services," "child welfare," "foster care," "child and family services," "algorithm," "computation," "regression," "machine learning," "neural network," "data-driven," "actuarial," "computer program," "application". We used the following inclusion criteria for the articles:
• The paper was peer-reviewed, published work or a systems (or policy) report produced by a government agency.
• The study (or report) engaged in a technical discussion about the computational methods, predictors and outcomes.
Articles that did not meet these two criteria were considered irrelevant for this study and were not included in our review. We conducted a comprehensive search to identify relevant research across multiple disciplines. We searched a diverse set of digital libraries which included the ACM Digital Library, IEEE Xplore, Routledge, Elsevier, and Springer. We chose these libraries to take into account research published in multi-disciplinary conferences and journals. We then cross-referenced the citations of each article to identify additional articles or government reports that met our inclusion criteria. We did not place any constraints on our search based on the time period in which the papers were published. We identified 50 relevant articles that met our inclusion criteria.

Code                  n   Breakdown
Peer reviewed         43  (40 social science; 3 computer science)
Agency report         7   —
Theory                5   (1 implemented; 4 proposed)
Psychometric scales   30  —
Actual system         27  (RAs); (PLs); (MT)
Hypothetical system   23  (RAs); (PLs); (MT); (S-PL)
Model performance     35  —
RA: Risk Assessment model; PL: Placement Recommendation model; MT: Child-Foster parent Matching model; S-PL: Characteristics of successful placements
Table 1: Descriptive Characteristics of the Data Set

Data Analysis Approach
To analyze our data, we conducted a structured qualitative analysis to answer our over-arching research questions. We used a grounded thematic process [22] to generate codes based on the data, as shown in Table 2. We define theory in two ways: the system discussed in the study was developed using a theoretical framework, or the system was developed theoretically based upon factors considered significant in evidence-based social work. The first author coded all of the articles, and co-authors were consulted to form a consensus around codes early in the coding process and again during coding to resolve ambiguous codes. We also coded for descriptive characteristics of the article set as shown in Table 1.
RESULTS
In this section, we present our key findings from our review of the literature. We begin by first discussing the descriptive characteristics of our data set. Next, we organize and present the results by our three research questions, as shown in Table 2. Finally, we explore the relationship between the computational methods, predictors, and outcomes identified in our analyses.
Descriptive Characteristics of the Data Set
The majority of the papers (n=40 or 80%) were published in social science venues, with 3 papers (6%) published in computer science conferences [4, 7, 33], all in 2018. We also included 7 reports (14%) from non-profit organizations, including the Children's Research Center [3]. One study discussed an algorithm which was theoretically constructed based on child-welfare research literature, and four studies proposed theoretically-driven solutions. 30 papers (60%) employed psychometric scales [53] to assess the strengths, needs and risks associated with foster children and/or the biological parents. 27 papers (54%) discussed an actual algorithmic system and 23 papers (46%) proposed a new algorithmic system. Model performance was reported by 35 papers (70%).

Research Dimension            Codes                                               Count  %    Example
RQ1 (Computational Method)
  Inferential Statistics      Generalized Linear Models (GLM)                     28     56%  [110]
                              Discriminant Analysis/Statistical tests (DAS)       6      12%  [96]
  Machine Learning            Supervised Learning (SUP)                           13     26%  [33]
                              Unsupervised Learning (UNSUP)                       3      6%   [78]
RQ2 (Predictor variables)
  Demographics                Child Demographics (C-DEM)                          20     40%  [7]
                              Biological parents Demographics (P-DEM)             10     20%  [65]
  Systemic Factors            Characteristics of Agency (AGENCY)                  2      4%   [80]
                              Characteristics of Caseworker (WORKER)              1      2%   [80]
  Child Strengths             Child Strengths (CHI-S)                             11     22%  [32]
  Child Needs                 Functioning (CHI-F)                                 15     30%  [77]
                              Child Behavioral/Emotional Needs (CHI-BE)           26     52%  [70]
  Child Risks                 Suicide Risk (CHI-SR)                               9      18%  [39]
                              Child Risk Behaviors (CHI-BR)                       20     40%  [93]
                              Traumatic Experiences (CHI-T)                       30     60%  [10]
                              Child Involvement in CWS (CHI-CWS)                  9      18%  [110]
  Bio-Parent Risk/Needs       Needs and Risky behavior (PAR-NS)                   26     52%  [32]
  Foster Parents              Characteristics (income, occupation) (FP-CHAR)      4      8%   [7]
                              Preferences (FP-PREF)                               2      4%   [80]
                              Past performance (FP-PAST)                          1      2%   [80]
                              Capabilities (training/certifications) (FP-CAPS)    1      2%   [80]
RQ3 (Outcome Variables)
  Outcome                     Risk of a future maltreatment event (RISK)          28     56%  [65]
                              Placement recommendation for a child (PLACE)        15     30%  [32]
                              Matching children with foster parents (MATCH)       2      4%   [80]
                              Characteristics of a successful placement (S-PLACE) 5      10%  [107]
Table 2: Structured Codebook: Dimensions are mapped onto their respective research questions
Computational Methods used to build Algorithms (RQ1)
In this section, we discuss the computational methods used to develop algorithms and organize them into the Inferential Statistics and Machine Learning dimensions.
Inferential Statistics approaches
Inferential statistics account for the computational methods used in the majority of papers (68%), with 28 papers (56%) using a form of a generalized linear model (GLM). In Figure 1, we see a dramatic rise in the use of GLMs after 1995, i.e., the post-SACWIS era. GLMs are being used to develop mostly two types of models: actuarial risk assessment and placement recommendation models. There was a general trend around the use of GLMs for developing risk assessment models [110, 29, 49]. We also identified two major concerns surrounding GLMs: their atheoretical and reductive nature, and their performance with respect to outliers.

Social scientists use validated psychometric scales [73, 27] to quantify the level of risk. GLMs have been developed using these psychometric scales, such as the CANS algorithm [32] that only uses the most statistically significant items from the scale. This reductive and atheoretical model development has received criticism [29, 99, 100]. Each case of child neglect/abuse is contextually different, and factors that are significant for one case might be peripheral to another. Moreover, GLMs do not account for the contextual factors that influence caseworker decisions, leading to variable omission bias [29].

Outliers can significantly impact the performance of a regression model [88]. Traditionally, regression models seek to omit outliers as a means of improving predictive power while still accounting for the majority of the variance explained by significant variables [88]. However, for CWS, cases of severe abuse and neglect are the statistical outliers [21]. Regression models that are designed to predict the most moderate (average) outcomes tend to perform poorly on outliers [106]. In terms of CWS, poor performance on outliers raises several ethical and accountability concerns [40].

Four papers (8%) used discriminant analysis to differentiate between the characteristics of foster children served by different placement settings. Figure 1 illustrates that discriminant analysis was a popular technique during 1985-1990; however, with the advent of regression techniques, it gradually faded away. These papers were some of the earliest attempts at introducing algorithms to aid decision-making in CWS. However, the data was limited and its quality questionable because of the lack of standardized data collection processes [103].
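The outlier sensitivity of regression noted above is easy to demonstrate with a small, self-contained sketch. The data below are synthetic and purely illustrative (not drawn from any CWS dataset): a single extreme case pulls the ordinary least-squares slope far from the trend fit on the rest of the data.

```python
# Illustration of how a single outlier shifts an ordinary least-squares fit.
# Synthetic data only; no connection to any real child-welfare dataset.

def ols_slope(xs, ys):
    """Closed-form slope for simple linear regression y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# A clean, nearly linear relationship (y is roughly 2x).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]
slope_clean = ols_slope(xs, ys)

# Add one severe case far from the trend, analogous to a rare
# severe-maltreatment case in an otherwise "average" caseload.
slope_outlier = ols_slope(xs + [6], ys + [40.0])

print(f"slope without outlier: {slope_clean:.2f}")   # close to 2
print(f"slope with outlier:    {slope_outlier:.2f}")  # pulled far above 2
```

Robust estimators can dampen this effect, but for CWS the concern cuts the other way: the statistical outliers are precisely the severe cases a model must not get wrong.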
Machine Learning (ML) approaches
Machine Learning methods in CWS gained some momentum as early as 1986 with the introduction of PLACECON, a system designed to assist CWS with placement decisions [95]. However, with the increasing popularity of risk assessment models and limited funding available, resources were directed towards traditional regression models. Figure 1 shows a resurgence of ML methods starting in 2015 and a growing interest within the computer science communities towards studying the research problems in CWS starting in 2018 [4, 7, 33, 59]. Thirteen papers (26%) utilized ML methods in the form of decision trees, Bayesian networks or inference trees. Decision-tree learning has been popular as a means of organizing large amounts of factual and empirical knowledge in the form of rules [102]. The CART (Classification and Regression Trees) algorithm has been recently used to build a child-foster parent matching system [7]. It has also been used to identify the characteristics of the most troubled children in CWS [39] as well as to study trends in child abuse and neglect data [4]. However, with such a strong emphasis on risk assessment, the Children's Research Center [3] used ML methods to develop the Structured Decision-Making (SDM) model.

[Figure 1: Methods used to build Algorithms (RQ1)]

SDM is a decision-making framework where a risk assessment tool is used in conjunction with clinical assessment [65]. SDM utilizes an array of ML tools such as decision, value and inference trees, and Bayesian networks [57], and has been adopted by CWS in several states [19]. However, several studies have also shown that SDM produces mixed results, especially when accounting for race and ethnicity [42, 44, 64]. There is also an ongoing struggle between caseworkers' theoretical assessments and the tool's empirical judgment [99, 100]. Three papers (6%) used unsupervised ML methods in the form of neural networks [77, 78] and natural language processing (NLP) [24]. Brindley et al. [24] propose a web platform that allows foster youth to create personalized goals and talk to a chat-bot that uses NLP to parse inputs and respond intelligently with recommendations about goals, finances, and housing. McDonald et al. [78] and Marshall et al. [77] propose the use of neural networks over regression techniques because their non-parametric approach performs better at modeling non-linear relationships and interactions.

One possible reason for the perpetual conflict between ML risk assessment tools and caseworkers' assessments might be at the core of Machine Learning itself and how it handles outliers. Statistical outliers in the case of child maltreatment are the most severe cases of child abuse and neglect [21]. Researchers [11] suggest that in the case of CWS, outliers are often more important for caseworkers and demand significant attention beyond the norm. Figure 1 depicts a significant dearth in the use of unsupervised learning methods, with only two papers published in the early 2000s [78, 77] and one paper published in 2018 [24]. Employing neural networks in social sciences comes with its own complexities because there needs to be transparency about the proposed decisions [33]. Vaithianathan et al. [110] explored several ML methods such as Naive Bayes and Random Forests for risk assessment and achieved higher accuracies. However, they reverted to using a probit regression model because the outcomes were more explainable.
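Decision-tree methods such as CART are attractive in this domain partly because a fitted tree reads as a cascade of interpretable if/then rules. The sketch below hand-codes a tiny tree of that form; the features, thresholds, and labels are hypothetical illustrations, not the actual rules of the matching system in [7] (a real CART model would learn its splits from data).

```python
# Hand-coded sketch of the rule structure a CART-style tree produces.
# All features, thresholds, and labels are hypothetical, chosen only to
# show the cascading if/then form of a fitted decision tree.

def match_recommendation(child, foster_home):
    """Walk a tiny hand-written decision tree and return a recommendation."""
    if child["behavioral_needs"] == "high":
        # First split: children with high behavioral needs require
        # a home with specialized training.
        if foster_home["specialized_training"]:
            return "recommend"
        return "do not recommend"
    # Second split: otherwise, check the home's stated age-group preference.
    if foster_home["prefers_age_group"] == child["age_group"]:
        return "recommend"
    return "review manually"

child = {"behavioral_needs": "high", "age_group": "teen"}
home = {"specialized_training": True, "prefers_age_group": "teen"}
print(match_recommendation(child, home))  # prints "recommend"
```

The appeal for practice is that each leaf can be traced back through the splits that produced it, which is the transparency property that the neural-network approaches discussed above lack.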
Predictors used in Algorithms (RQ2)
In this section, we examine the predictors that are being used in algorithms in CWS. Most algorithms use over a hundred predictors, so we systematically coded them and then grouped the emergent codes into seven dimensions (see Table 2).
Demographics and Systemic Factors
Child demographics were accounted for by 20 papers (40%) and biological parents' demographics were accounted for by 10 papers (20%). Surprisingly, more than half the papers did not include child or parent demographics in their models even though racial and ethnic disparities in CWS have been recognized in the social sciences [84, 91, 47]. Figure 2 illustrates that after 1990, there was a decline in the number of studies that used demographic variables in their algorithms. The Systemic Factors dimension includes factors associated with CWS, such as characteristics of the agency and caseworkers. Two papers (4%) use variables relating to characteristics of the agency, such as location and staffing vacancies, and one paper (2%) accounted for the characteristics of the caseworker, such as caseloads and the level of training. This is surprising because child-welfare literature acknowledges the impact caseworkers have on child outcomes [94, 30]. The caseworker is a child's primary contact between the biological parents, foster parents and CWS. They navigate through the system and find services for children and families. In fact, caseworker turnover is directly associated with placement instability [30]. Factors that lead to high caseworker turnover include low salary, high caseloads, administrative burdens, low levels of training and lack of supervisory support [30]. Systemic factors are one of the biggest reasons why children experience multiple placement moves in CWS [41]. This once again alludes to the atheoretical model construction that does not account for the salient factors well-established in evidence-based social work.
Foster-child related factors
Seven codes emerged out of the coding process and were grouped into three dimensions: child strengths, child needs, and child risks. 11 papers (22%) use variables that align with Child Strengths, such as interpersonal skills, coping skills and level of optimism. Twenty-six papers (52%) took into account a child's emotional and behavioral needs, and 15 papers (30%) recorded the child's day-to-day well-being and functioning, such as their school attendance and behavior, personal hygiene and communication skills. We also coded for variables associated with risk factors that endanger child well-being. Suicide risk, risk behaviors, traumatic experiences, and child involvement with CWS were our four emergent codes that were grouped under the Child Risks dimension. 9 papers (18%) conducted a mental health screening to see if a child was suicidal or having suicidal thoughts. 20 papers (40%) accounted for risk behaviors such as self-harm, recklessness, and social misbehavior, and 30 papers (60%) accounted for traumatic experiences such as neglect, physical/sexual abuse, history of family violence, and community violence. We noticed a trend here in that almost all the risk assessment systems focused heavily on the Child Risks dimension, whereas placement recommendation systems focused on the Child Needs dimension.

[Figure 2: Predictors used in Algorithms (RQ2)]

Figure 2 depicts a rise in the number of studies that account for child strengths, child risks and child needs since 1995, that is, the post-SACWIS era. This alludes to the fact that these child characteristics are well-documented in SACWIS and are being used for modeling purposes. All the studies we reviewed accounted for foster child related factors in terms of their needs and associated risks. However, only one study accounts for the child's interactions with other people, such as siblings, relatives, and the system itself. Moore et al. [80] account for factors such as Placement with a sibling, Proximity to child's home/relatives, and Characteristics of the agency and caseworker; factors well-studied in child-welfare literature [30]. Fluke et al. [49] found that placement decisions may be made as a result of interaction effects of non-case related factors such as characteristics of the agency and/or the caseworker. A study conducted in San Diego County found that 70% of the placement moves were a result of systemic or policy related factors [63].
Biological parents related factors
26 papers (52%) accounted for the biological parents' risk behaviors and needs, such as physical/mental health, substance abuse problems, residential stability and knowledge of the child's needs. We coded these variables into the Bio-Parent Risks/Needs dimension. Figure 2 shows that biological parent factors have been consistently used by several studies; however, we see a decline during 2005-2010. We also see a rise in the use of foster child related factors during the same time period. The introduction of the CANS algorithm that focuses on the child's level of need may be a plausible explanation for this trend. Different algorithms are using biological parent related variables differently. For example, risk assessment models quantify biological parents' risky behavior so as to discern the risk of a future maltreatment. On the other hand, placement recommendation models are using this dimension to determine the level of trauma a child has experienced and recommend a placement setting based on their level of need. Factors surrounding biological parents have been studied in great detail and accounted for by most algorithms.
Foster parents related factors
Four papers (8%) that we reviewed accounted for the characteristics of the foster parents, that is, their income level, occupation, demographics, etc. Figure 2 shows that only 4 studies account for foster parent related factors, with a significant gap between 1985 and 2016 where no study accounted for these factors. Two papers (4%) look at the preferences of foster parents and one paper (2%) accounts for the foster parents' past performance and capabilities. Matching children with foster parents that are trained and prepared to meet their behavioral and emotional needs leads to increased stability for the children [30]. Matching children with foster parents that come from the same cultural background also leads to better outcomes because it leads to smoother transitions, lower stress and a feeling of security for the children [26]. These factors are well-studied in child-welfare literature [30, 92, 114]; however, very little research has been done from an algorithmic perspective. CWS has historically focused on ensuring safety and permanency rather than child well-being, that is, improving the quality of lives of foster children [16].
Target Outcomes of Algorithms (RQ3)
In this section, we examine the target outcomes of the algorithms used in CWS. Figure 3 depicts the trends in the target outcomes that algorithms have sought to model.
Risk Assessment
Predicting the risk of future maltreatment involves developing models using the empirical study of cases of child abuse/neglect [10]. The factors that show a strong association with abuse and/or neglect outcomes are selected to create an actuarial model, which is then used to assess new cases of alleged abuse/neglect. Twenty-eight papers (56%) focused on predicting risk as their target outcome. Figure 3 illustrates that risk assessment has received significantly more attention than any other outcome since the introduction of regression models in the social sciences. The greatest criticism against these models is that they are not theoretically founded; they are probabilistic in nature, not causal [10, 67, 99, 100]. Therefore, these models need to be empirically validated by follow-up studies to ensure their reliability. Direct comparison of any two actuarial models is a hard problem [10] and requires an in-depth understanding of the contexts in which the predictors were collected, measured, and weighted in the models.

Studies conducted on risk assessment models show that these models are more accurate at predicting target events like child maltreatment than unaided judgment; however, they lack utility [99]. Seven papers (14%) discuss the Structured Decision-Making (SDM) model, a framework that integrates predictive and contextual assessments. CWS in several states have developed their own versions of SDM; the differences are significant enough that we treated them independently in our review. Even though SDM is designed to assist caseworker decisions, studies have found constant disagreements between the tool (empirically-driven) and caseworker assessment (conceptually/theoretically-driven), to the point that caseworkers detest using these tools as they were intended [99]. However, caseworkers must continue to rely on these tools as a means of standardizing decisions in CWS, especially in cases of high uncertainty [100].
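The actuarial logic described above can be illustrated with a minimal sketch: factors with a strong empirical association with maltreatment outcomes receive weights, and the weighted sum is mapped onto an ordinal risk category. All factor names, weights, and thresholds below are hypothetical illustrations, not taken from any real CWS instrument.

```python
# Minimal sketch of an actuarial risk-scoring model. All factor names,
# weights, and thresholds are hypothetical; in a real instrument, weights
# would be derived from the empirical association of each factor with
# substantiated maltreatment outcomes.
FACTOR_WEIGHTS = {
    "prior_referrals": 2,
    "parent_substance_abuse": 3,
    "domestic_violence": 2,
    "child_under_five": 1,
}

def risk_score(case: dict) -> int:
    """Sum the weights of all factors present in a case."""
    return sum(w for f, w in FACTOR_WEIGHTS.items() if case.get(f))

def risk_level(score: int) -> str:
    """Map a numeric score onto an ordinal risk category."""
    if score >= 6:
        return "high"
    if score >= 3:
        return "moderate"
    return "low"

case = {"prior_referrals": True, "parent_substance_abuse": True}
print(risk_level(risk_score(case)))  # 2 + 3 = 5 -> "moderate"
```

The sketch also makes the comparison problem concrete: two instruments with different factor definitions, weights, and cut-points cannot be compared without understanding how each predictor was measured and weighted.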
Placement Recommendations and Successful Placements
Models that focused on these two target outcomes were the precursors in the development of algorithms in CWS. Figure 3 depicts that these target outcomes were being studied during the period of 1985-1990. However, no studies focusing on these outcomes were published between 1991 and 2005. A plausible explanation for this decline is the increased focus on risk assessment during that period. Twenty papers (40%) discussed recommendation systems for foster care placements.

The most prominent algorithm that determines placement criteria based on a child's level of need is the CANS algorithm [32]. Six papers (12%) discuss the CANS algorithm, which is developed using the CANS psychometric scale [73]. CWS in a few states have developed their own versions of this algorithm, and these were therefore treated independently in our review. It makes a recommendation from six levels of care in order of increasing severity: independent living, transitional living program, foster home, specialized foster care, group home, and residential treatment center. It is used in a hybrid approach in conjunction with a multi-disciplinary team, which allows CWS to follow standardized admission criteria for cases with lower levels of uncertainty [32]. This is a good initial approach to ensure child safety; however, it is a minimal approach and does not seek to improve the quality of a child's life.
Child-Foster Parent Matching
This approach seeks to match the specific needs of a child with the capabilities of foster parents, that is, placing children with foster parents who are trained and certified to manage their needs. It is a proactive approach toward improving the quality of lives of children, not just minimizing the risk of maltreatment. Figure 3 shows that child-foster parent matching has only been implemented since 2015 (2 studies). This approach differs from the placement recommendation approach in that it addresses the specific needs of the child and the preferences of the caregiver. For instance, matching with respect to child temperament, parent temperament, and parental expectations leads to increased stability [92]. Placing children who have higher emotional needs with foster parents who prefer to be emotionally involved offers these children a better chance at stability [113] than placing them in a restrictive treatment setting. Child-foster parent matching is well-studied in evidence-based social work and is known to improve stability and permanency outcomes [30, 92]. However, there is a dearth of information within CWS on how to guide this process [92]. This is a significant knowledge gap for both CWS and social scientists who seek to computationally model this approach. Moore et al. [80] recently validated a matching algorithm, implemented by CWS in the state of Kansas, that resulted in more stable placements.

              SUP    UNSUP    GLM    DAS
  RA            9       2      16      1
  PL            2       1       8      4
  MT            2       -       -      -
  S-PL          -       -       4      1

  RA : Risk Assessment model                       SUP : Supervised Machine Learning
  PL : Placement Recommendation model              UNSUP : Unsupervised Machine Learning
  MT : Child-Foster parent Matching model          GLM : Generalized Linear models
  S-PL : Characteristics of successful placements  DAS : Discriminant Analysis/Statistical tests

Table 3: Crosstabs between Methods and Outcomes
Relationship between Methods, Predictors and Outcomes
Relationship between Algorithms (RQ1) and Outcomes (RQ3)
Table 3 depicts crosstabs between the computational methods used and the outcomes from all the papers in our corpus. We saw that generalized linear models have mostly been used for developing risk assessment models (16 studies), followed by placement recommendation models (8 studies). Even with the emergence of newer machine learning methods, the majority of studies continue to focus on risk assessment: 9 studies used supervised machine learning for risk assessment, 2 studies focused on placement recommendation, and 2 studies focused on child-foster parent matching.
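Crosstabs like Table 3 can be produced directly from a coded corpus. The sketch below uses hypothetical paper codes, not our actual corpus, and tallies method-outcome pairs with Python's standard library.

```python
from collections import Counter

# Hypothetical coding of a few reviewed papers as (method, outcome) pairs;
# in a review like ours, each paper is coded along both dimensions.
papers = [
    ("GLM", "RA"), ("GLM", "RA"), ("SUP", "RA"),
    ("GLM", "PL"), ("DAS", "PL"), ("SUP", "MT"),
]

# Tally each (method, outcome) combination into a crosstab.
crosstab = Counter(papers)

print(crosstab[("GLM", "RA")])  # 2 papers used GLMs for risk assessment
```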
Relationship between Predictors (RQ2) and Outcomes (RQ3)
Table 4 depicts the crosstabs between predictors used by computational models and the outcome they seek to predict. First, we cross-examine the risk assessment models with respect to the predictors that inform child characteristics. The majority of the models use a combination of predictors that assess Child Behavioral/Emotional Needs (7 studies), Child Risk Behaviors (7 studies), and Traumatic Experiences (15 studies). Several predictors coded under these dimensions (e.g., self-harm, recklessness, physical/sexual abuse) are assessed by a caseworker at an initial investigation and made available for predictive modeling. These predictors might already exist in the data if the family has previously come under the attention of CWS. This approach of aggregating the negative aspects of people's lives while ignoring the positive aspects has been criticized because of its deficit-based nature [25]. There is also an overlap between the Traumatic Experiences of a child and the Bio-Parents Needs/Risk Behavior because the same predictors (e.g., history of physical/sexual abuse, medical trauma, parents' criminal activity) are used to conduct both a needs assessment for a child and a risk assessment for a parent.

                                      RA    PL    MT   S-PL   SUP  UNSUP   GLM   DAS
  Child demographics                   8     6     2     4     7     1      7     5
  Bio-parents demographics             5     2     2     1     5     1      2     2
  Characteristics of Agency            -     -     -     2     -     -      2     -
  Characteristics of Caseworker        -     -     -     1     -     -      1     -
  Child Strengths                      3     5     2     1     5     1      5     -
  Functioning                          3    10     2     -     5     1      7     2
  Child Behavioral/Emotional Needs     7    13     2     4     7     -     13     6
  Suicide Risk                         2     7     -     -     3     -      4     2
  Child Risk Behaviors                 7    12     1     -     6     1     10     3
  Traumatic Experiences               15    10     2     3     9     1     16     4
  Child Involvement with CWS           2     6     1     -     3     -      3     3
  Bio-Parent Risk/Needs               15     8     -     3     7     1     16     3
  Foster parent characteristics        -     -     2     2     2     -      1     1
  Foster parent preferences            -     -     1     1     1     -      -     1
  Foster parent past performance       -     -     1     -     1     -      -     -
  Foster parent capabilities           -     -     1     -     1     -      -     -

  RA : Risk Assessment model                       SUP : Supervised Machine Learning
  PL : Placement Recommendation model              UNSUP : Unsupervised Machine Learning
  MT : Child-Foster parent Matching model          GLM : Generalized Linear models
  S-PL : Characteristics of successful placements  DAS : Discriminant Analysis/Statistical tests

Table 4: Relationship between Computational Methods (RQ1), Predictors (RQ2) and Outcomes (RQ3)

Next, we cross-examine the predictors used by placement recommendation models. These models are not employed at the onset of an investigation; they are used by CWS when a child needs to be placed in a permanent foster care setting. These models are generally more equitable than risk assessment models because they try to weigh in the positive characteristics of a child, such as talents, interests, cultural identity, and school achievements, to find an appropriate placement setting that meets their needs. Table 4 shows that placement recommendation models account for predictors around Child Strengths (5 studies), Functioning (10 studies), and Child Behavioral/Emotional Needs (13 studies) to weigh in the positive aspects and needs of a child and balance that with predictors around Child Risk Behaviors (12 studies) and Traumatic Experiences (10 studies) to find a suitable placement setting well equipped to meet their needs.
Relationship between Methods (RQ1) and Predictors (RQ2)
Table 4 also depicts the crosstabs between predictors and computational methods. Most computational methods, including both supervised machine learning and generalized linear models, focused on Child Behavioral/Emotional Needs, Child Risk Behaviors, and Traumatic Experiences to assess the risk of a maltreatment event or the needs of a child. Some predictors that inform these three codes include traumatic events (e.g., physical/sexual abuse, medical trauma), child's conduct and anger management, and delinquent behavior. After an initial investigation is conducted by a caseworker and psychometric risk assessments are completed, these predictors become available for modeling. However, quantifying risk from such a narrow set of predictors has been criticized because it fails to account for the wide range of risk factors that arise as a result of systemic issues in CWS itself [54].
DISCUSSION

Algorithms Need to be Theoretical & Context-Aware (RQ1)
Overall, we found a lack of theoretically derived and validated algorithms that demonstrably took measures to integrate knowledge from the social sciences into their designs. Only one study [80] constructed its model based on the child-welfare literature. Four studies even discussed this lack of theory and proposed solutions in the form of cumulative risk models [74], causal models [98], and revised SDM models grounded in risk and resilience theory [99]. Yet, based on the published literature, such models have yet to be consistently implemented.

This finding is problematic because it shows that these algorithms ignore many factors that affect how decisions are made in CWS. For instance, decisions are often constrained by current policies or scarce resources [55]. Many current empirical models frustrate child-welfare workers because they do not account for such systemic factors. While some researchers have suggested [66] that empirical prediction is enough and that theory, context, or causal inferences are not always necessary in policy making when outcomes remain desirable, we argue that this is not a desirable stance to take in child-welfare contexts because there is significant debate on how and which types of data, models, and outcomes are to be used in predictive modeling (with or without theory). Empirical knowledge related to child-welfare practice is fragmented, and social science theories must be used to fill the gaps [54]. Therefore, we recommend that human-centered theoretical approaches be used to incorporate factors arising from evidence-based social work [30] and to understand the causal pathways that often dictate decision-making processes. Algorithms that are informed by appropriate causal theory would also have a greater likelihood of utilization than their a-theoretical counterparts [98]. Significant work has also found disconnects between the functioning of algorithms and their social interpretations [13].
We see a similar phenomenon in CWS, where caseworkers using the Structured Decision-Making (SDM) model must translate information from both forms of assessment (clinical and algorithmic), leading to uncertainty and unreliable decision making [100]. Therefore, algorithms that are meant to aid decision-making often become a source of frustration and force caseworkers to abandon their contextual judgments [99]. Human-centered theoretical approaches can help by placing the meaning-making process [89] at the center of the design process. They can help designers understand the theory of practice and uncover practitioners' sense-making processes (e.g., how they perceive quantified metrics [13]). Child-welfare workers are generally not trained in statistical thinking and make decisions based on experience, intuition, and individual heuristics [54]. Human-centered theoretical approaches can help us understand the mental models of child-welfare workers, inform feature selection (design), and interpret the results (evaluation).

Our results also indicate that several states adopted the SDM approach because it was supposed to integrate predictive and contextual assessments; however, it has fallen short of that goal [100, 99]. There are several factors at play in any child-welfare case, and it becomes critical to offer context to the case instead of focusing on a few broad factors without giving weight to important nuances [83]. For instance, understanding contextual knowledge with respect to an organization requires incorporating the organizational memory of the organization and its people [76, 5], which is inherently HCI research. Social workers are trained in writing detailed case notes by translating their context-specific experiences into text [35]. This unstructured, unanalyzed, textual data is added to
SACWIS systems [87]. We hypothesize that valuable theoretical signals from these case notes can be considered within methodological approaches like topic modeling, which can make good use of such unstructured data. Indeed, in recent years, HCI has developed a rich methodological tradition [14, 81, 31] of using signals from such unstructured data as predictors within algorithms to study complex, sociotechnical systems.
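A minimal sketch of what extracting topic signals from case notes could look like, assuming scikit-learn is available. The notes below are invented placeholders, not real case data, and real case notes would require far more preprocessing and privacy safeguards.

```python
# Sketch: derive topic-proportion signals from unstructured case notes,
# assuming scikit-learn. The notes are invented placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

notes = [
    "mother attended parenting class and visitation went well",
    "child reported anxiety after school and missed appointments",
    "visitation supervised and parenting plan progressing well",
    "school counselor reported anxiety and missed medication",
]

# Convert notes to a term-count matrix, then fit a two-topic LDA model.
counts = CountVectorizer().fit_transform(notes)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # one topic distribution per note

# Each note's topic proportions could then serve as predictors in a
# downstream model alongside structured variables.
print(doc_topics.shape)  # (4, 2)
```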
Going Beyond what is "Easily Quantifiable" (RQ2)
Our results suggest that the majority of algorithms used predictors around child and parent characteristics, such as their needs, strengths, and associated risks (see Table 2). The vast majority of the predictors used for predictive modeling are derived from information that is easily available and readily quantifiable. For example, child-welfare workers use psychometric scales [42] to assess child and parent associated risks and needs during an initial investigation, which then become available for predictive modeling. Some of these predictors are found in almost every risk assessment model even though they have no predictive validity. For instance, severity of abuse is easily quantifiable and is found in several risk assessment models even though there is little to no indication that it is related to recurrence of abuse [28]. Moreover, several predictors (parenting skills, parent conflict, etc.) have not been properly validated [54] and can lead to unreliable predictions [54]. Such issues led the Illinois CWS (in 2017) to shut down its predictive analytics program [62]. In addition, none of the predictors account for the temporality of risk assessment. After an allegation of abuse, the assumption of escalation is the baseline for risk assessment, leading to inflated risk scores and excessive interventions from CWS [104].

Human-centered theoretical approaches can result in a rigorous feature selection process that relies on predictors that have been well-studied, understood, and validated in the social sciences [13]. De Choudhury et al.'s [46] work in mental health is a good example, where the researchers validated constructs, focused on data biases and unobserved factors, and conducted sensitivity analysis. Moreover, it compels us to look toward sources of information that have heretofore been hard to quantify.
For instance, referring to our prior example around case notes, advances in natural language processing [112] now allow us to quantify and make holistic inferences about all the stakeholders involved in a child-welfare case. This can address persistent issues among cases which appear similar based on the empirical data but exhibit high variation in outcomes [54]. Furthermore, human-centered participatory design [13] allows HCI researchers to actively engage with domain experts in child-welfare to understand how risk accumulates (and how to model it), as well as to engage with other stakeholders to better understand the systemic factors around policies, laws, and organizational culture [116]. Here, PD [82] can navigate the thorny, contextual differences between different legal and policy systems and the needs/values of stakeholders. Lodato and DiSalvo [72] highlight the different forms and limitations of PD, as well as how PD can be conducted within such institutional constraints. Advances in CWS data systems [50] can accommodate the collection of several new predictors concerning child well-being and systemic factors. Here, the active consideration of the needs and values of all stakeholders can help the new predictors avoid the same reliability and validity pitfalls that exist for many of the current predictors.
Improve Lives and not just 'Minimize Risk' (RQ3)
One of the fundamental goals of CWS in the United States is to ensure positive outcomes for foster children [1]; however, as our results confirm, the majority of efforts in computational modeling continue to be focused on risk assessment (see Table 2). Risk assessment models only seek to minimize the risk of future harm, not improve the quality of lives of foster children. The target outcome of "risk of maltreatment" is poorly defined [120]. Federal and state law dictate how child abuse and neglect are defined, and the state definitions often vary and establish the grounds for intervention by CWS [1]. Algorithms are trained on cases of substantiation, that is, cases where CWS judged maltreatment to have occurred [54]. This judgment in itself is very subjective and depends on state laws, policies, and CWS intervention criteria, which are often dictated by the level of funding and caseloads [30].

Human-centered approaches can help theoretically define not only the predictors but also the target outcomes with the help of stakeholders and domain experts, to ensure these key ingredients of algorithm design are validated and reliable. Human-centered participatory design can also unravel concerns around the social interpretations of algorithmically-based systems. For instance, Brown et al. [25] investigated the community perspectives of risk assessment models in child-welfare. Child-welfare workers criticized these models because of their 'deficit-based' nature, that is, this approach only captures negative inputs to predict a negative outcome. There is growing concern that such an approach drives disproportionately negative caseworker perceptions that ultimately lead to negative actions [25]. Badillo-Urquiola et al. [9] and Pinter et al. [90] also recognized the problems with a deficit-based framing in that it creates a sense of moral panic and diverts attention away from positive outcomes.
They suggested that researchers adopt "strength-based approaches" that focus on positive factors that help improve lives. CWS should actively focus on approaches that disrupt the status quo [58] and seek to improve the lives of foster children, such as
Child-Foster Parent Matching [30, 92, 113]. This requires an ongoing engagement with foster parents and foster children to understand their specific values and needs as well as their cultural and parental expectations. HCI can contribute here by drawing on its rich tradition of work in action research, participatory design, and value sensitive design to incorporate the values and needs of stakeholders [52, 105, 8, 12, 18]. In addition, HCI researchers have developed methodological approaches that incorporate stakeholders not only into the design process but also into the data analysis and interpretation processes [15, 119]. Moreover, advocating for foster children, a vulnerable and marginalized population, is inherently a social justice issue. HCI researchers have a long history of contending with social injustices and have developed theoretical and methodological approaches that seek "not so much to predict the future, but rather to imagine a radically better one [52]." Given the paucity of human-centered research in this domain and the richness of the available social science literature [30, 92, 17, 26, 94, 41], this presents HCI researchers with a set of complex socio-technical challenges to study.
Recommendations for Future Research
Bridging AI and HCI Through Participatory Design
Our results indicate a lack of theoretically-designed algorithms (see Table 1), which adds to the frustrations of child-welfare workers who are being pressured into using these algorithms as a means of standardizing decisions [99]. This situation is further exacerbated by a lack of PD, leading to algorithmic systems that offer low utility [100]. Only one study in our corpus engaged with child-welfare workers to understand their concerns and needs [25]. PD [82] allows for the active inclusion of the people most affected by a system. Engaging child-welfare workers in the design and evaluation processes ensures that their needs are met and that the system integrates well with child-welfare practice. Child-welfare workers who use algorithms on a daily basis strongly stress the need to be able to explain these models to each other and to policymakers [83]. This depends not only on which computational methods are used to construct an algorithm and how they are deployed, but also on how outcomes are defined and measured. This offers research pathways for HCI researchers, who have increasingly started devoting attention to explaining outcomes and predictions [71, 79]. Moreover, it is imperative that researchers engage with stakeholders because there are both ethical and legal ramifications of using certain types of data. For instance, legal requirements might not allow a juvenile's criminal record or history of physical and/or sexual abuse to be used for modeling [101].
Algorithmic Decision Making via Speculative Design
We found that 56% of studies took a deficit-based approach to mitigate risks even though child-welfare literature has discussed the significance of equitable outcomes (e.g., child-foster parent matching). Recent studies based on newer technologies still continue to focus on risk assessment and uncritically reproduce the status quo. Designing against the status quo means setting our goals beyond risk assessment and moving more ambitiously toward design that challenges underlying problems [58]. Human-centered speculative design [13] can allow stakeholders to shift their focus away from algorithms and be truly innovative in how they imagine problems and their underlying causes, without being constrained by what might be technologically feasible. This is especially important for algorithm design, where the boundaries of possibility change every day [13]. For instance, child-foster parent matching has been well-documented in child-welfare literature for almost two decades, but it has only recently been explored in an algorithm [80] because of advances in decision-tree learning. Similarly, algorithmic advances also create novel avenues for studying the interactions and decision pathways resulting from different policies, practices, and programs [116].
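An illustrative sketch of framing child-foster parent matching as decision-tree learning, assuming scikit-learn. The features, toy data, and stability labels below are all hypothetical; this does not reproduce the validated algorithm in [80].

```python
# Sketch: child-foster parent matching framed as decision-tree learning,
# assuming scikit-learn. All features, data, and labels are hypothetical.
from sklearn.tree import DecisionTreeClassifier

# Each row pairs child needs with parent capabilities:
# [child_emotional_needs (0-2), parent_trained_for_needs (0/1),
#  same_cultural_background (0/1)]
pairs = [
    [2, 1, 1], [2, 0, 0], [1, 1, 1], [0, 0, 1],
    [2, 0, 1], [1, 0, 0], [0, 1, 0], [2, 1, 0],
]
stable = [1, 0, 1, 1, 0, 0, 1, 1]  # hypothetical placement stability outcomes

# A shallow tree keeps the learned matching rules interpretable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(pairs, stable)

# Score a prospective match: a high-needs child with a trained parent
# from the same cultural background.
print(tree.predict([[2, 1, 1]]))
```

A shallow depth is deliberate here: caseworkers report needing to explain models to each other and to policymakers, and a two-level tree can be read as a pair of plain-language matching rules.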
LIMITATIONS AND FUTURE WORK
We conducted a comprehensive and systematic literature review that was limited to the US-based child welfare system. We may also have missed algorithms used within CWS that are not publicly available for review. For instance, reports by non-profit organizations or state governments may have been distributed internally. Therefore, we plan to work directly with CWS agencies and conduct user interviews about the systems and algorithms being used within CWS to identify any other algorithms that have been implemented. To move toward using a human-centered approach to build new, evidence-based, and theoretically-driven algorithms, we plan to work with stakeholders in CWS to understand how different policies, practices, and programs create different decision pathways for child placements and services offered to families.
CONCLUSION
In conclusion, we recommend that the HCI community partner with CWS to do the following: (1) renew the focus on theoretically-designed algorithms with the active engagement of stakeholders through the design and evaluation phases; (2) develop algorithms for practice that incorporate a more comprehensive set of predictors well-studied in child-welfare literature, as well as predictors that have been hard to quantify thus far; and (3) focus on equitable outcomes founded in evidence-based child-welfare research that improve the quality of lives of foster children instead of merely mitigating future risks.
ACKNOWLEDGMENTS
This research is funded in part by the Facebook Computational Social Science Methodology Research Award, the William T. Grant Foundation (187941 and 190017), and the Northwestern Mutual Data Science Institute.
EFERENCES [1] 2013.
How the Child Welfare System Works . TechnicalReport. Children’s Bureau: Child Welfare InformationGateway.[2] 2018.
Using Data To Help Protect Children andFamilies Act . 115th Congress, Senate of the UnitedStates.[3] 2019. Child Welfare Goals, Legislation, andMonitoring. (2019). Retrieved April 2, 2019 from [4] Abdurazzag A Aburas, Mohammad Hassan, Hilary Lin,and Shreshtha Batshu. 2018. Child MaltreatmentForecast Using Bigdata Intelligent Approaches. In . IEEE, 302–308.[5] Mark S Ackerman and Christine Halverson. 2004.Organizational memory as objects, processes, andtrajectories: An examination of organizational memoryin use.
Computer Supported Cooperative Work (CSCW)
13, 2 (2004), 155–189.[6] Ali Alkhatib and Michael Bernstein. 2019. Street-LevelAlgorithms: A Theory at the Gaps Between Policy andDecisions. In
Proceedings of the 2019 CHI Conferenceon Human Factors in Computing Systems . ACM, 530.[7] Rachmadita Andreswari, Irfan Darmawan, and WarihPuspitasari. 2018. A Preliminary Study on DetectionSystem for Assessing Children and Foster ParentsSuitability. In .IEEE, 376–379.[8] Mariam Asad and Christopher A Le Dantec. 2015.Illegitimate civic participation: supporting communityactivists on the ground. In
Proceedings of the 18thACM Conference on Computer Supported CooperativeWork & Social Computing . ACM, 1694–1703.[9] Karla Badillo-Urquiola, Xinru Page, and PamelaWisniewski. 2019. Risk vs. Restriction: The DigitalDivide between Providing a Sense of Normalcy andKeeping Foster Teens Safe Online. In
Proceedings ofthe 2019 CHI Conference on Human Factors inComputing Systems . ACM.[10] Christopher Baird, Dennis Wagner, Theresa Healy, andKristen Johnson. 1999. Risk assessment in childprotective services: Consensus and actuarial modelreliability.
Child Welfare
78, 6 (1999), 723.[11] Zuriana Abu Bakar, Rosmayati Mohemad, AkbarAhmad, and Mustafa Mat Deris. 2006. A comparativestudy for outlier detection techniques in data mining. In . IEEE, 1–6.[12] Shaowen Bardzell. 2014. Utopias of participation:design, criticality, and emancipation. In
Proceedings of the 13th Participatory Design Conference: ShortPapers, Industry Cases, Workshop Descriptions,Doctoral Consortium papers, and Keynoteabstracts-Volume 2 . ACM, 189–190.[13] Eric PS Baumer. 2017. Toward human-centeredalgorithm design.
Big Data & Society
4, 2 (2017),2053951717718854.[14] Eric PS Baumer, David Mimno, Shion Guha, EmilyQuan, and Geri K Gay. 2017a. Comparing groundedtheory and topic modeling: Extreme divergence orunlikely convergence?
Journal of the Association forInformation Science and Technology
68, 6 (2017),1397–1410.[15] Eric PS Baumer, Xiaotong Xu, Christine Chu, ShionGuha, and Geri K Gay. 2017b. When Subjects Interpretthe Data: Social Media Non-use as a Case for Adaptingthe Delphi Method to CSCW. In
Proceedings of the2017 ACM Conference on Computer SupportedCooperative Work and Social Computing . ACM,1527–1543.[16] Lawrence M Berger, Sarah K Bruch, Elizabeth IJohnson, Sigrid James, and David Rubin. 2009.Estimating the "impact" of out-of-home placement onchild well-being: Approaching the problem of selectionbias.
Child development
80, 6 (2009), 1856–1876.[17] Joan M Blakey, Sonya J Leathers, Michelle Lawler,Tyreasa Washington, Chiralaine Natschke, TonyaStrand, and Quenette Walton. 2012. A review of howstates are addressing placement stability.
Children andYouth Services Review
34, 2 (2012), 369–378.[18] Alan Borning and Michael Muller. 2012. Next steps forvalue sensitive design. In
Proceedings of the SIGCHIconference on human factors in computing systems .ACM, 1125–1134.[19] Emily Adlin Bosk. 2018. What counts? quantification,worker judgment, and divergence in child welfaredecision making.
Human Service Organizations:Management, Leadership & Governance
42, 2 (2018),205–224.[20] Engin Bozdag. 2013. Bias in algorithmic filtering andpersonalization.
Ethics and information technology
The journal of childpsychology and psychiatry and allied disciplines
40, 8(1999), 1221–1229.[22] Virginia Braun and Victoria Clarke. 2006. Usingthematic analysis in psychology.
Qualitative researchin psychology
3, 2 (2006), 77–101.[23] Leo Breiman and others. 2001. Statistical modeling:The two cultures (with comments and a rejoinder bythe author).
Statistical science
16, 3 (2001), 199–231.24] Meredith Brindley, James P Heyes, and Darrell Booker.2018. Can Machine Learning Create an Advocate forFoster Youth?
Journal of Technology in HumanServices
36, 1 (2018), 31–36.[25] Anna Brown, Alexandra Chouldechova, EmilyPutnam-Hornstein, Andrew Tobin, and RhemaVaithianathan. 2019. Toward AlgorithmicAccountability in Public Services: A Qualitative Studyof Affected Community Perspectives on AlgorithmicDecision-making in Child Welfare Services. In
Proceedings of the 2019 CHI Conference on HumanFactors in Computing Systems . ACM, 41.[26] Jason D Brown, Natalie George, Jennifer Sintzel, andDavid St Arnault. 2009. Benefits of cultural matchingin foster care.
Children and Youth Services Review
Social WorkResearch
19, 3 (1995), 174–183.[28] Michael J Camasso and Radha Jagannathan. 2000.Modeling the reliability and predictive validity of riskassessment in child protective services.
Children and Youth Services Review 22, 11-12 (2000), 873–896.
[29] Michael J Camasso and Radha Jagannathan. 2013. Decision making in child protective services: A risky business? Risk Analysis 33, 9 (2013), 1636–1649.
[30] Sarah Carnochan, Megan Moore, and Michael J Austin. 2013. Achieving placement stability. Journal of Evidence-Based Social Work 10, 3 (2013), 235–253.
[31] Stevie Chancellor, Zhiyuan Lin, Erica L Goodman, Stephanie Zerwas, and Munmun De Choudhury. 2016. Quantifying and predicting mental illness severity in online pro-eating disorder communities. In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing. ACM, 1171–1184.
[32] Ka Ho Brian Chor, Gary M McClelland, Dana A Weiner, Neil Jordan, and John S Lyons. 2012. Predicting outcomes of children in residential treatment: A comparison of a decision support algorithm and a multidisciplinary team decision model. Children and Youth Services Review 34, 12 (2012), 2345–2352.
[33] Alexandra Chouldechova, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan. 2018. A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions. In Conference on Fairness, Accountability and Transparency. 134–148.
[34] Christopher E Church and Amanda J Fairchild. 2017. In Search of a Silver Bullet: Child Welfare's Embrace of Predictive Analytics. Juvenile and Family Court Journal 68, 1 (2017), 67–81.
[35] James Clifford and George E Marcus. 1986. Writing Culture: The Poetics and Politics of Ethnography. Univ of California Press.
[36] Patricia Cohen, Stephen G West, and Leona S Aiken. 2014. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Psychology Press.
[37] US Congress. 2008. Fostering Connections to Success and Increasing Adoptions Act of 2008. (2008).
[38] Lindsay D Cooper. 2005. Implications of media scrutiny for a child protection agency. J. Soc. & Soc. Welfare 32 (2005), 107.
[39] Katharan D Cordell, Lonnie R Snowden, and Laura Hosier. 2016. Patterns and priorities of service need identified through the Child and Adolescent Needs and Strengths (CANS) assessment. Children and Youth Services Review 60 (2016), 129–135.
[40] Michael Corrigan. 2019. Building A Comprehensive Child Welfare Information System. (Jan 2019). https://chronicleofsocialchange.org/child-welfare-2/building-comprehensive-child-welfare-information-system/33426
[41] Theodore P Cross, Eun Koh, Nancy Rolock, and Jennifer Eblen-Manning. 2013. Why do children experience multiple placement changes in foster care? Content analysis on reasons for instability. Journal of Public Child Welfare 7, 1 (2013), 39–58.
[42] Amy D'andrade, Michael J Austin, and Amy Benton. 2008. Risk and safety assessment in child welfare: Instrument comparisons. Journal of Evidence-Based Social Work 5, 1-2 (2008), 31–56.
[43] David Danks and Alex John London. 2017. Algorithmic Bias in Autonomous Systems. In IJCAI. 4691–4697.
[44] EW Danktert and Kristen Johnson. 2013. Risk assessment validation: A prospective study.
Los Angeles: California Department of Social Services, Children and Family Services Division (2013).
[45] Elizabeth Davoren. 1975. Foster placement of abused children. Children Today 4, 3 (1975), 41.
[46] Munmun De Choudhury and Emre Kiciman. 2018. Integrating Artificial and Human Intelligence in Complex, Sensitive Problem Domains: Experiences from Mental Health. AI Magazine 39, 3 (2018), 69–80.
[47] Alan J Dettlaff, Stephanie L Rivaux, Donald J Baumann, John D Fluke, Joan R Rycraft, and Joyce James. 2011. Disentangling substantiation: The influence of race, income, and risk on the substantiation decision in child welfare. Children and Youth Services Review 33, 9 (2011), 1630–1637.
[48] David Donoho. 2017. 50 years of data science. Journal of Computational and Graphical Statistics 26, 4 (2017), 745–766.
[49] John D Fluke, Martin Chabot, Barbara Fallon, Bruce MacLaurin, and Cindy Blackstock. 2010. Placement decisions and disparities among aboriginal groups: An application of the decision making ecology through multi-level analysis. Child Abuse & Neglect 34, 1 (2010), 57–69.
[50] Administration for Children and Families. Comprehensive Child Welfare Information System (106 ed.). Vol. 81. Federal Register: The Daily Journal of the United States.
[51] Patrick J Fowler, Katherine E Marcal, Saras Chung, Derek S Brown, Melissa Jonson-Reid, and Peter S Hovmand. 2019. Scaling up housing services within the child welfare system: policy insights from simulation modeling. Child Maltreatment (2019), 1077559519846431.
[52] Sarah Fox, Jill Dimond, Lilly Irani, Tad Hirsch, Michael Muller, and Shaowen Bardzell. 2017. Social Justice and Design: Power and oppression in collaborative systems. In Companion of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 117–122.
[53] R Michael Furr. 2017. Psychometrics: An Introduction. Sage Publications.
[54] Eileen Gambrill and Aron Shlonsky. 2000. Risk assessment in context. (2000).
[55] Eileen Gambrill and Aron Shlonsky. 2001. The need for comprehensive risk management systems in child welfare. Children and Youth Services Review 23, 1 (2001), 79–107.
[56] Stuart Gray, Kirsten Cater, Chloe Meineck, Rachel Hahn, Debbie Watson, and Tom Metcalfe. 2019. trove: A digitally enhanced memory box for looked after and adopted children. In Proceedings of the 18th ACM International Conference on Interaction Design and Children. ACM, 458–463.
[57] Robin Gregory, Lee Failing, Michael Harstone, Graham Long, Tim McDaniels, and Dan Ohlson. 2012. Structured Decision Making: A Practical Guide to Environmental Management Choices. John Wiley & Sons.
[58] Ellie Harmon, Matthias Korn, Ann Light, and Amy Voida. 2016. Designing against the status quo. In Proceedings of the 2016 ACM Conference Companion Publication on Designing Interactive Systems. ACM, 65–68.
[59] Teresa M Harrison, Donna Canestraro, Theresa Pardo, Martha Avila-Marilla, Nicolas Soto, Megan Sutherland, Brian Burke, and Mila Gasco. 2018. A tale of two information systems: transitioning to a data-centric information system for child welfare. In Proceedings of the 19th Annual International Conference on Digital Government Research: Governance in the Data Age. ACM, 108.
[60] Kori Inkpen, Stevie Chancellor, Munmun De Choudhury, Michael Veale, and Eric PS Baumer. 2019. Where is the Human?: Bridging the Gap Between AI and HCI. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, W09.
[61] Luma Institute. 2012. Innovating for People: Handbook of Human-Centered Design Methods. LUMA Institute, LLC.
[62] David Jackson and Gary Marx. 2017. Data mining program designed to predict child abuse proves unreliable, DCFS says. (Dec 2017).
[63] Sigrid James. 2004. Why do foster care placements disrupt? An investigation of reasons for placement change in foster care. Social Service Review 78, 4 (2004), 601–627.
[64] K Johnson and D Wagner. 2003. California Structured Decision Making. Risk Assessment Revalidation: A Prospective Study. Children's Research Center. SAMHSA's National Registry of Evidence-based Programs and Practices (NREPP). (2003).
[65] Will Johnson. 2004. Effectiveness of California's child welfare structured decision making (SDM) model: a prospective study of the validity of the California Family Risk Assessment. Madison (Wisconsin, USA): Children's Research Center (2004).
[66] Jon Kleinberg, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer. 2015. Prediction policy problems.
American Economic Review.
[67] Social Work Research 32, 2 (2008), 105–116.
[68] J Nathan Kutz. 2013. Data-Driven Modeling & Scientific Computation: Methods for Complex Systems & Big Data. Oxford University Press.
[69] Anja Lambrecht and Catherine Tucker. 2019. Algorithmic Bias? An Empirical Study of Apparent Gender-Based Discrimination in the Display of STEM Career Ads. Management Science (2019).
[70] Mark D Lardner. 2015. Are restrictiveness of care decisions based on youth level of need? A multilevel model analysis of placement levels using the child and adolescent needs and strengths assessment. Residential Treatment for Children & Youth 32, 3 (2015), 195–207.
[71] Min Kyung Lee and Su Baykal. 2017. Algorithmic mediation in group decisions: Fairness perceptions of algorithmically mediated vs. discussion-based social division. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 1035–1048.
[72] Thomas Lodato and Carl DiSalvo. 2018. Institutional constraints: the forms and limits of participatory design in the public realm. In Proceedings of the 15th Participatory Design Conference: Full Papers - Volume 1. ACM, 5.
[73] John S Lyons, Dana Aron Weiner, and Melanie Buddin Lyons. 2004. Measurement as communication in outcomes management: The child and adolescent needs and strengths (CANS). The Use of Psychological Testing for Treatment Planning and Outcomes Assessment. Volume 2: Instruments for Children and Adolescents (2004).
[74] Michael J MacKenzie, Jonathan B Kotch, and Li-Ching Lee. 2011. Toward a cumulative ecological risk model for the etiology of child maltreatment. Children and Youth Services Review 33, 9 (2011), 1638–1647.
[75] Diana MacLean, Sonal Gupta, Anna Lembke, Christopher Manning, and Jeffrey Heer. 2015. Forum77: An analysis of an online health forum dedicated to addiction recovery. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing. ACM, 1511–1526.
[76] James G March. 1991. Exploration and exploitation in organizational learning. Organization Science 2, 1 (1991), 71–87.
[77] David B Marshall and Diana J English. 2000. Neural network modeling of risk assessment in child protective services. Psychological Methods 5, 1 (2000), 102.
[78] Thomas P McDonald, John Poertner, and Gardenia Harris. 2002. Predicting placement in foster care: A comparison of logistic regression and neural network analysis. Journal of Social Service Research 28, 2 (2002), 1–20.
[79] Hannah Miller Hillberg, Zachary Levonian, Daniel Kluver, Loren Terveen, and Brent Hecht. 2018. What I See is What You Don't Get: The Effects of (Not) Seeing Emoji Rendering Differences across Platforms. Proceedings of the ACM on Human-Computer Interaction 2, CSCW (2018), 124.
[80] Terry D Moore, Thomas P McDonald, and Kari Cronbaugh-Auld. 2016. Assessing risk of placement instability to aid foster care placement decision making. Journal of Public Child Welfare 10, 2 (2016), 117–131.
[81] Michael Muller, Shion Guha, Eric PS Baumer, David Mimno, and N Sadat Shami. 2016. Machine learning and grounded theory method: Convergence, divergence, and combination. In Proceedings of the 19th International Conference on Supporting Group Work. ACM, 3–8.
[82] Michael J Muller. 2009. Participatory design: the third space in HCI. In Human-Computer Interaction. CRC Press, 181–202.
[83] Judge Michael Nash. 2017. Examination of using Structured Decision Making and Predictive Analytics in assessing Safety and Risk in Child Welfare. County of Los Angeles Office of Child Protection (May 2017).
[84] Barbara Needell, M Alan Brookhart, and Seon Lee. 2003. Black children and foster care placement in California. Children and Youth Services Review 25, 5-6 (2003), 393–408.
[85] Kathleen G Noonan, Charles F Sabel, and William H Simon. 2009. Legal accountability in the service-based welfare state: Lessons from child welfare reform. Law & Social Inquiry 34, 3 (2009), 523–568.
[86] US Department of Health and Human Services. 2017. Child Maltreatment 2017. Children's Bureau (Ed.) (2017).
[87] US Department of Health, Human Services, and others. 2017. The AFCARS report: Preliminary FY 2016 estimates as of Oct 2017. Children's Bureau (Ed.).
[88] An Introduction to Statistical Methods and Data Analysis. Nelson Education.
[89] Sarah Pink, Kerstin Leder Mackley, Val Mitchell, Marcus Hanratty, Carolina Escobar-Tello, Tracy Bhamra, and Roxana Morosanu. 2013. Applying the lens of sensory ethnography to sustainable HCI. ACM Transactions on Computer-Human Interaction (TOCHI) 20, 4 (2013), 25.
[90] Anthony T Pinter, Pamela J Wisniewski, Heng Xu, Mary Beth Rosson, and Jack M Caroll. 2017. Adolescent online safety: Moving beyond formative evaluations to designing solutions for the future. In Proceedings of the 2017 Conference on Interaction Design and Children. ACM, 352–357.
[91] Emily Putnam-Hornstein, Barbara Needell, Bryn King, and Michelle Johnson-Motoyama. 2013. Racial and ethnic disparities: A population-based examination of risk factors for involvement with child protective services. Child Abuse & Neglect 37, 1 (2013), 33–46.
[92] Richard E Redding, Carrie Fried, and Preston A Britner. 2000. Predictors of placement outcomes in treatment foster care: Implications for foster parent selection and service delivery. Journal of Child and Family Studies 9, 4 (2000), 425–447.
[93] Jeanne S Ringel, Dana Schultz, Joshua Mendelsohn, Stephanie Brooks Holliday, Katharine Sieck, Ifeanyi Edochie, and Lauren Davis. 2018. Improving child welfare outcomes: balancing investments in prevention and treatment. Rand Health Quarterly 7, 4 (2018).
[94] Joseph P Ryan, Philip Garnier, Michael Zyphur, and Fuhua Zhai. 2006. Investigating the effects of caseworker characteristics in child welfare. Children and Youth Services Review 28, 9 (2006), 993–1006.
[95] John R Schuerman and Lynn Harold Vogel. 1986. Computer support of placement planning: the use of expert systems in child welfare. Child Welfare 65, 6 (1986), 531–543.
[96] A James Schwab and Susan S Wilson. 1989. The continuum of care system: Decision support for practitioners. Computers in Human Services 4, 1-2 (1989), 123–140.
[97] A James Schwab Jr, Michael E Bruce, and Ruth G McRoy. 1984. Matching children with placements. Children and Youth Services Review 6, 2 (1984), 125–133.
[98] Craig Schwalbe. 2004. Re-visioning risk assessment for human service decision making. Children and Youth Services Review 26, 6 (2004), 561–576.
[99] Craig S Schwalbe. 2008. Strengthening the integration of actuarial risk assessment with clinical judgment in an evidence based practice framework. Children and Youth Services Review 30, 12 (2008), 1458–1464.
[100] Aron Shlonsky and Dennis Wagner. 2005. The next step: Integrating actuarial risk assessment and clinical judgment into an evidence-based practice framework in CPS case management. Children and Youth Services Review 27, 4 (2005), 409–427.
[101] Ravi Shroff. 2017. Predictive Analytics for City Agencies: Lessons from Children's Services. Big Data 5, 3 (2017), 189–196.
[102] Fiore Sicoly. 1989a. Computer-aided decisions in human services: Expert systems and multivariate models. Computers in Human Behavior 5, 1 (1989), 47–60.
[103] Fiore Sicoly. 1989b. Prediction and decision making in child welfare. Computers in Human Services 5, 3-4 (1989), 43–56.
[104] Douglas G Simpson, Peter B Imrey, Olga Geling, and Susan Butkus. 2000. Statistical estimation of child abuse rates from administrative databases. Children and Youth Services Review 22, 11-12 (2000), 951–971.
[105] Susan Leigh Star and Anselm Strauss. 1999. Layers of silence, arenas of voice: The ecology of visible and invisible work. Computer Supported Cooperative Work (CSCW) 8, 1-2 (1999), 9–30.
[106] James P Stevens. 1984. Outliers and influential data points in regression analysis. Psychological Bulletin.
[107] Social Casework 64, 1 (1983), 11–17.
[108] Angelika Strohmayer, Rob Comber, and Madeline Balaam. 2015. Exploring learning ecologies among people experiencing homelessness. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. ACM, 2275–2284.
[109] Rebecca Tushnet. 2018. The Difference Engine: Perpetuating Poverty through Algorithms. Jotwell: J. Things We Like (2018), 1.
[110] Rhema Vaithianathan, Emily Putnam-Hornstein, Nan Jiang, Parma Nand, and Tim Maloney. 2017. Developing predictive models to support child maltreatment hotline screening decisions: Allegheny County methodology and implementation. Center for Social Data Analytics (2017).
[111] D Wagner, K Johnson, and W Johnson. 1998. Using actuarial risk assessment to target service interventions in pilot California counties. (1998).
[112] Hanna M Wallach. 2006. Topic modeling: beyond bag-of-words. In Proceedings of the 23rd International Conference on Machine Learning. ACM, 977–984.
[113] James A Walsh and Roberta A Walsh. 1990. Studies of the maintenance of subsidized foster placements in the Casey Family Program. Child Welfare 69, 2 (1990), 99–114.
[114] Daniel Webster, Richard P Barth, and Barbara Needell. 2000. Placement stability for children in out-of-home care: A longitudinal analysis. Child Welfare 79, 5 (2000), 614–632.
[115] Jane Webster and Richard T Watson. 2002. Analyzing the past to prepare for the future: Writing a literature review. MIS Quarterly (2002), xiii–xxiii.
[116] James K Whittaker. 2017. The Child Welfare Challenge: Policy, Practice, and Research. Routledge.
[117] Jacob O Wobbrock and Julie A Kientz. 2016. Research contributions in human-computer interaction. Interactions 23, 3 (2016), 38–44.
[118] Jill Palzkill Woelfer and David G Hendry. 2010. Homeless young people's experiences with information systems: life and work in a community technology center. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1291–1300.
[119] Susan P Wyche, Paul M Aoki, and Rebecca E Grinter. 2008. Re-placing faith: reconsidering the secular-religious use divide in the United States and Kenya. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 11–20.
[120] Susan J Zuravin. 1999. Child neglect.