Avoiding Help Avoidance: Using Interface Design Changes to Promote Unsolicited Hint Usage in an Intelligent Tutor
Mehak Maniktala · Christa Cody · Tiffany Barnes · Min Chi
Abstract
Within intelligent tutoring systems, considerable research has investigated hints, including how to generate data-driven hints, what hint content to present, and when to provide hints for optimal learning outcomes. However, less attention has been paid to how hints are presented. In this paper, we propose a new hint delivery mechanism called "Assertions" for providing unsolicited hints in a data-driven intelligent tutor. Assertions are partially-worked example steps designed to appear within a student workspace, and in the same format as student-derived steps, to show students a possible subgoal leading to the solution. We hypothesized that Assertions can help address the well-known hint avoidance problem. In systems that only provide hints upon request, hint avoidance results in students not receiving hints when they are needed. Our unsolicited Assertions do not seek to improve student help-seeking, but rather seek to ensure students receive the help they need. We contrast Assertions with Messages, text-based, unsolicited hints that appear after student inactivity. Our results show that Assertions significantly increase unsolicited hint usage compared to Messages. Further, they show a significant aptitude-treatment interaction between Assertions and prior proficiency, with Assertions leading students with low prior proficiency to generate shorter (more efficient) posttest solutions faster. We also present a clustering analysis that shows patterns of productive persistence among students with low prior knowledge when the tutor provides unsolicited help in the form of Assertions. Overall, this work provides encouraging evidence that hint presentation can significantly impact how students use them, and that Assertions can be an effective way to address help avoidance.
M. Maniktala · C. Cody · T. Barnes · M. Chi
Department of Computer Science, North Carolina State University, Raleigh, North Carolina, USA
E-mail: [email protected]
Keywords intelligent tutoring system · help avoidance · user experience · unsolicited hints · aptitude-treatment interaction · logic proofs · productive persistence · clustering · problem solving

1 Introduction

Studies suggest that hints, when provided appropriately, can augment students' learning experience [15, 68] and improve their performance [11]. However, students may not use hints optimally [2, 31]; some abuse hints to expedite problem completion, and some avoid seeking help when they are in need [1, 65]. Our goal is to redesign the hint interface to solve this help avoidance problem. Considerable research has investigated hints from several perspectives, including hint generation [10, 63], adaptive hint content [22, 41, 83], student help-seeking behavior [2, 65], and hint timing [69]. However, few studies have specifically investigated how hint interfaces could reduce help avoidance (e.g. [41, 50]).

Most intelligent tutoring systems (ITSs) provide solicited hints on-demand, i.e., upon student request [84]. Other tutors try to circumvent help avoidance by providing unsolicited hints when the system "determines" they are needed, for example, after a long period of inactivity [32]. However, students often ignore these unsolicited hints [22, 55]. In this work, we designed a new interface for unsolicited hints, called Assertions, to address this issue, and compared its impact on student learning outcomes with that of Messages, text-based unsolicited hints that appear after student inactivity. The ultimate goal of our research is to combine the new Assertions interface with a data-driven method to determine when providing an unsolicited hint would be most beneficial and least disruptive for students.

Our Assertions interface was designed based on user experience and multimedia design principles, including contiguity [50], attention [34], expectation [76], and persuasion [20, 29]. First and foremost, we hypothesized that placing Assertions contiguously within the area of student attention would make unsolicited hints more noticeable. Second, we believed students could more quickly interpret Assertions based on the expectation set by formatting them like other problem-solving steps. Finally, we used persuasive language asking students to use the Assertions as problem-solving subgoals. These features help Assertions act as partially-worked example steps, so they may garner the same benefits as worked examples, which have been shown to improve learning efficiency [49]. We hypothesized that Assertions would reduce help avoidance for all students by increasing the percentage of times help was received when it was needed. Further, we hypothesized that Assertions would have an aptitude-treatment interaction effect, fostering productive persistence and improving posttest performance among students with low prior proficiency. Persistence during training that leads to mastery of a subject or positive posttest outcomes is called productive persistence [39].
The main contribution of this work is a principled design for a hint interface, Assertions, and a study showing that Assertions can be used to significantly reduce help avoidance for all students through interface design alone. Our new proposed Assertions appear as partially-worked example steps, reducing the barriers to help usage while leveraging the benefits of worked examples. The second contribution of this work is a new cluster-based method that combines posttest performance, effort (to quantify persistence), and unsolicited hint usage to discover productive persistence. Based on these clusters, we were able to show that students with low prior proficiency who received Assertions exhibit productive persistence. Since Assertions are automatically provided to students, they can be thought of from two perspectives: either as unsolicited hints, or as partially-worked example steps. Therefore, in our related work and design sections below, we discuss Assertions from both of these perspectives.
Factory, Fossati et al. [33] devised the Procedural Knowledge Model (PKM), which uses students' global problem-solving behaviors to generate data-driven feedback for the iList tutor for programming with linked lists. Price, Barnes, and colleagues extended the Hint Factory approach to generate data-driven hints for novice programming [63, 64, 67]. Later, Paaßen et al. created the Continuous Hint Factory to allow hint generation for previously unobserved states [60], while Price et al. devised the SourceCheck algorithm, which leveraged similar representations to generate hints based on a set of student solutions rather than the trace data that the original Hint Factory uses [62]. Rivers et al. developed a data-driven hint generator for ITAP (Intelligent Teaching Assistant for Programming) that uses a similar set of tools, including state abstraction, path construction, and state reification, to generate personalized hints [70]. This method extends the Hint Factory by enhancing the solution space and creating new edges for states that are disconnected. This allows the ITAP method to generate hints even for states that are not present in the prior data. For this work, we extended the Hint Factory to provide personalized hints for logic with 100% availability, as described in Section 3.

Aleven et al. have shown that students often display poor help-seeking behaviors within intelligent tutors, including help avoidance, where students could benefit from seeking help but choose not to, and help abuse, where students use help excessively when they could solve a problem without assistance [2]. Studies by Price et al., Almeda et al., and Roll et al. have confirmed that help avoidance is pervasive across domains and systems, with students ignoring hints [4, 66, 71]. In one study, Roll et al. showed that meta-cognitive feedback improved students' help-seeking skills but did not affect their domain learning [71].
Price et al.'s research study on help-seeking by novice programmers showed that students have several reasons for not requesting on-demand hints, including uncertainty about whether system help would be useful, or a desire to be independent [66]. Some tutoring systems prevent help avoidance by providing unsolicited hints rather than relying on student help-seeking through "on-demand" hint requests [5, 46, 56]. Arroyo et al. [5] and Murray et al. [56] showed that unsolicited hints promoted learning gains for a subset of students. However, a study by Muir and Conati showed that students often ignore unsolicited hints [55].

Several studies have tried to encourage students to use unsolicited help by changing its content or placement. For example, Cody et al. showed that unsolicited, data-driven hints were more likely to be used if their content focused on next-step hints rather than more abstract, high-level hints [21]. Conati et al. used eye-tracking to show that factors such as hint timing, and students' attitudes and prior knowledge, can affect students' attention towards unsolicited hints in a number factorization game [22]. Kardan and Conati showed that unsolicited hints with tailored hint content, along with highlighting and proximal hint placement, improved student learning in a controlled study with AISPACE [41].

Despite their potential benefits, we argue that attempting to understand or use hints, and especially unsolicited ones, can increase students' cognitive load while learning new concepts within a tutoring system. This is because students have to mentally integrate several sources of information, including on-demand hints, unsolicited hints, and the student's own current solution attempt. Adding to this is the fact that, in many existing tutoring systems, the hints and the student solution workspace are physically located in different areas of the interface.
As a result, we believe that by physically integrating those sources of information together, Assertions naturally reduce students' working memory load and thus would facilitate student learning by accelerating the changes in their long-term memory associated with schema acquisition [77, 78].

2.2 Worked Examples

Since we posit that Assertions can be seen not only as unsolicited hints, but from another perspective as partially-worked examples for single problem-solving steps, we discuss the impacts of worked examples here. Extensive research has shown that worked examples, i.e. showing step-by-step problem solutions, can be as effective as problem solving for learning the same content, yet the former generally need much less time [49, 54]. In our prior work, we added whole-problem worked examples to our tutor to help students learn the problem interface and problem-solving skills. In [54], we found that the students who received data-driven worked examples were much more likely to complete the tutor, and did so in less time. In another study [72], we found that when we used reinforcement learning (RL) to determine when to present whole-problem worked examples, the slow learners who received worked examples based on this RL policy had significantly higher learning gains than their peers who received worked examples at random. Further, our results from a study on worked examples in Deep Thought [43] show that whole-problem worked examples benefit students early in the tutoring, but are comparable to hint-based scaffolding. We also observed that worked examples were less beneficial later in the tutoring sessions for lower proficiency students. Our work with Pyrenees, a probability tutor, suggests that step-level worked examples can also promote learning [92]. This work suggests that students do not resist following these step-level worked examples, which are essentially unsolicited hints provided in the student workspace.

One mechanism proposed by Sweller et al.
for the success of worked examples is through reduction in cognitive load when students are learning new concepts [79]. Their work discusses the principles underlying cognitive load theory and how worked examples reduce the need for learners to engage in inference processes which might otherwise place heavy demands on students' working memory. On the other hand, much prior work found that asking students to justify their solution steps, referred to as self-explanation, can greatly improve their learning [3, 19, 24]. Furthermore, asking students to explain expert-designed worked examples can be more effective than problem solving alone [18, 88]. For example, Weerasinghe and Mitrovic explored the impact of self-explanations in KERMIT-SE, a tutor for the open-ended domain of database design. They engaged students in tutorial dialogues upon errors in
solutions and found that it improved student performance in both conceptual and procedural knowledge [88, 89]. In this work, we design our new Assertions hint interface to act as expert-designed partially-worked example steps with self-explanations. However, there are two key differences between our work and that by Weerasinghe and Mitrovic: Assertions are provided to guide students on the next step instead of the current step, and they are provided after correct steps instead of incorrect steps. As described in Section 3, Assertions provide students with the content of a useful step, but students must provide an explanation before they can use the hint content in their solutions.

2.3 Aptitude-Treatment Interaction

Prior research in instructional strategies has shown the existence of aptitude-treatment interaction (ATI), where certain students are more sensitive to variations in the learning environment compared to less sensitive students who perform regardless of the treatment [25, 74]. Researchers have explored the complex relationship between student aptitude and their interaction with unsolicited help. While Razzaq et al. found that students learned more reliably with hints they requested than with unsolicited hints [69], Arroyo et al. observed higher learning gains for low performing students when unsolicited hints were provided [5]. Further, Murray et al. found that unsolicited help avoided the negative effects of frustration and saved students time when they were struggling [56]. Muir and Conati showed that students with low prior knowledge are likely to need hints the most, but they do not look at the hints as often [55]. Kardan and Conati found that changes in unsolicited hint content and interface had a more pronounced effect on learning for students with lower initial knowledge [41].
Similar to these studies, we hypothesize that an improved interface for unsolicited hints can increase hint usage and outcomes, especially for students with low prior knowledge. In this work, we believe that students whose initial tutor performance is lower may need more assistance to develop strategies for solving logic proofs, and therefore may benefit more from an improvement in the hint interface.

2.4 Productive Persistence

Recently, there has been increased interest in non-cognitive skills like persistence and self-control within education research [39]. Task persistence is defined as the continuation of a task despite difficulty. To quantify persistence, researchers have used metrics of effort [28]. However, not all persistence is productive. Beck and Gong [12] define unproductive persistence, or "wheel spinning," as when a student spends an excessively long time struggling to learn a topic without achieving mastery. They showed that if a student did not master a skill in ASSISTments (an online math learning platform) or the Cognitive Algebra Tutor in a reasonable amount of time, the student was likely to struggle and never master the skill. Their work presents connections between wheel-spinning and negative student behaviors such as disengagement and gaming, as well as recommendations to improve ITS design to address these issues. Nelson et al. are well known for their heuristic model of the help-seeking process, where they suggest that unproductive persistence may be associated with help avoidance [58]. Studies suggest that persistent effort that leads to mastery of a topic is productive persistence [39], and it is often associated with short-term outcomes like improvement in performance [13, 61], and longer-term outcomes in higher education and future earnings [27, 35]. Recent studies in educational data mining have attempted to predict when an intervention can help students by distinguishing between productive and unproductive behavior using decision trees [39] and Recurrent Neural Networks (RNNs) [14]. The work by Kai et al. on ASSISTments used decision trees to identify when students are struggling and how to make students' persistence more productive. They found that interleaved practice of different skills is more advantageous than blocked practice, where the opportunities to learn a given skill are massed one after another. Another study on ASSISTments by Botelho et al. used RNNs to detect stopout (low persistence) and wheel-spinning (unproductive persistence) early in order to intervene and prevent unproductivity. They found that these models have high AUC and are also able to learn a set of features that generalize to predict each other. In this paper, we apply clustering to discover patterns of productivity, persistence, and unsolicited hint usage in our tutor.

Fig. 1: Tutor's interface: student workspace (left), rules (middle), info box (right), and the Hint button and message box (bottom-left)
Fig. 2: A sample solution of a training problem in Deep Thought
Deep Thought (DT, Figure 1) is an intelligent tutor for solving open-ended, multi-step propositional logic problems that has data-driven features including next-step hints [8, 75], as well as adaptive problem selection [51, 53] and pedagogical policies for worked example presentation induced via reinforcement learning [6, 52, 72, 73]. Figure 1 shows the current tutor interface: the left window is the workspace where students construct solutions, the central window lists the domain rule buttons, and the right window provides instructions and information such as the rules that are meant to be practiced in the current problem. Each problem-solving statement is graphically represented as a node. Deep Thought shows several problem-provided statements (that are meant to be used as existing or known facts) at the top of the workspace, and a conclusion to derive at the bottom. Students iteratively carry out problem-solving steps by deriving new statements from old ones using domain rules. This is a typical procedure used across STEM domains: applying principles or rules to known information to derive new facts [59]. For example, in physics, if we know values for mass (m) and acceleration (a), we can apply the rule F = ma with those values to find force (F). In this paper, a problem-solving step consists of a new derived statement and its justification, where the justification includes specifying the domain rule and the source statements used to show that the new derived statement is true. In logic, problem solving continues until the conclusion is the derived statement in a step that is justified.

Figure 1 shows an example problem with four nodes 1-4 for the problem-provided statements (2: B, 1: A → C, 3: C → E, and 4: D ∧ ¬E) at the top of the workspace. The conclusion to be derived (C: ¬A ∧ B) is at the bottom, with a question mark indicating that it is not yet justified. Each problem-solving step involves the same process: clicking on 1-2 source nodes and a rule button, and entering the new derived statement. The tutor verifies whether the source nodes and rule correctly justify the derived statement. Once a step is verified, a new node appears, colored based on how often the same node was necessary in previous student solutions to this problem, where green means frequent, yellow is infrequent, and gray is never. We call a node 'necessary' or 'needed' when its deletion would make a solution incomplete. These colorings give students an indication of whether they are on an optimal problem-solving path.

We now walk through the student experience of solving the problem shown in Figure 1 to obtain the solution shown in Figure 2. First, the student clicks on node 4 and rule Simp, and is asked to type the new derived statement, D. The tutor verifies that Simp applied to node 4 is a correct justification, and draws node 5, labeled with Simp and an arrow from node 4 to 5. Node 5 is colored gray since it was never needed by previous students solving that same particular problem. Next, the student applies the same process to derive and justify node 6, which is green since it was frequently necessary in historical solutions. To derive node 7, the student clicks on node 1 and the Impl rule, and types in the derived statement ¬A ∨ C. After it is verified, node 7 appears, with the label Impl and an arrow from node 1 to 7. The student then clicks "Get Hint" to request a hint, and "Try to derive ¬C" appears in the message box. Next, the student tries to follow the hint by selecting nodes 3 and 6 and the rule MP. The tutor detects this incorrect rule application, records the error in the data log, and provides an error-specific message, but since it was a mistake, no new node is created. Since nodes 3 and 6 are still selected, the student clicks on the correct rule, MT, and types in the derived statement ¬C.
This process correctly justified the hint content statement ¬C, so node 8 appears with MT and arrows from nodes 3 and 6. The student similarly clicks on nodes 7 and 8 and rule DS to derive node 9. Finally, the student clicks on nodes 2 and 9 and rule Conj to derive the conclusion, and the tutor detects that the problem is complete.

3.1 Hints in Deep Thought

Deep Thought uses the Hint Factory [75] to generate hints, where the hint content depends only on the current problem-solving state, a snapshot of a student problem-solving attempt. The Hint Factory works by treating problem-solving data from prior students as a Markov Decision Process and using value iteration to assign values to each state based on its distance from a valid observed solution. Then, the hint source is set for a current student's state by selecting the subsequent reachable state with the highest value. If the current state is not found, we roll back current student solution states until a matching state and its hint source are found. Finally, the Hint Factory-derived hint content is the newest derived statement in the hint source state. Deep Thought inserts this derived statement, the hint content HC, into a template depending on the hint type, described below.

Fig. 3: Differences between Assertions and Messages while delivering a logic hint statement A → E. (a) The Assertion is presented in the workspace, with the format of a student-derived step, and with a "Subgoal" label. (b) The Message hint is provided textually below the student workspace.

In this study, there are three types of hints, including on-demand hint requests and two types of unsolicited hints: Messages and Assertions. The content of on-demand and unsolicited hints is identical, and no additional justification/derivation help is given. Students request on-demand hints by clicking the "Get Hint" button, and the system shows "Try to derive HC" in the message box. Both Messages and Assertions are unsolicited hints, meaning that they are not requested by students. Messages appear automatically after one minute of student inactivity, using the same Messages interface as on-demand hints.
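The Hint Factory's value-iteration and hint-source selection described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the reward, discount, and data layout (a `transitions` map from each observed state to its observed successor states) are our own assumptions.

```python
def value_iteration(states, transitions, goal_states, gamma=0.9, tol=1e-6):
    """Assign each state a value reflecting its closeness to a valid solution."""
    V = {s: (100.0 if s in goal_states else 0.0) for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in goal_states or not transitions.get(s):
                continue  # goal values stay fixed; dead ends keep their value
            # Taking a step costs -1; the best successor's value is discounted.
            new_v = -1.0 + gamma * max(V[t] for t in transitions[s])
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            return V

def hint_source(state, transitions, V):
    """The hint source is the reachable successor state with the highest value."""
    successors = transitions.get(state)
    return max(successors, key=V.get) if successors else None
```

The hint content would then be read off as the newest derived statement in the returned hint-source state.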
Assertions appear automatically after about 40% of steps. Since the mean student solution length in the training problems is 9 steps, this means that students are likely to encounter 3-4 hints per problem. The Assertions interface consists of 4 parts: (1) adding a new cyan-colored node containing the hint content HC in the workspace, (2) labeling the node as a "Subgoal," (3) including a question mark icon showing that the node is not yet justified, and (4) stating "Try to justify the added goal" in the message box. Figure 3 shows the Messages and Assertions interfaces suggesting the same logic statement A → E in different formats. Students must explain how the node is to be derived by justifying it before they can use the hint content in their solutions. While this is not a typical verbal self-explanation, we argue that, by justifying the step, the student is demonstrating that they know what domain principle (rule) and prior statements can be used to explain why the new derived statement is true. In the next section, we describe the design principles used to create the new Assertions interface.

• Contiguity and Attention: Moreno et al.'s spatial contiguity principle for multimedia learning materials states that a graphic should not be physically separated from its explanatory text [26, 50]. Hegarty et al. showed that contiguity supports student memory and understanding [36]. Butcher and Aleven showed that when interactive support was placed near a geometry diagram, student learning outcomes improved [16, 17]. Kardan and Conati, in a controlled study on AI SPACE, tailored the hint content and used hint highlighting and proximal hint placement to draw students' attention towards unsolicited hints, which improved learning for low prior knowledge students [41]. In this work we use similar proximal hint placement for Assertions but provide the same content in both Assertions and Messages. We strategically place Assertion hints where the student needs them.
Although the message box is close to the workspace, it may still be subject to 'change blindness' [34], where students paying attention to nodes within the workspace may filter Messages out and simply not notice their appearance. Therefore, we provide Assertions in the workspace, where students have already focused their attention. Together, contiguity and attention are meant to help students notice the appearance of Assertion hints.

• Expectation: Research by Summerfield explains that the speed of visual interpretation is optimized by leveraging past experiences to form expectations [76]. Based on this principle, we design Assertions to leverage student expectations through an isomorphic visual format that may work together with reduced text to decrease cognitive load. First, the hint content HC of Assertions appears in the same visual node format as student-derived statements, enabling students to visually interpret an Assertion hint faster. Second, Messages require students to read the text "Try to derive HC" and determine that HC is a statement that should appear on a graphical node. This additional cognitive processing may pose a barrier that some students may not overcome [80], and this may be especially true for students with low prior knowledge [40]. Therefore, formatting the Assertions hint content HC as nodes may help students by leveraging visual expectation, or by reducing overall cognitive load [80].

• Persuasion: Dillard suggests that user experiences can be enhanced by using persuasion [29]. Cialdini has created six principles of influence, including reciprocity, commitment and consistency, liking, social proof, authority, and scarcity, that can be used to influence people's behaviors [20]. Assertions have two persuasive design aspects. First, we posit that adding Assertions directly to the workspace may make them seem required, leveraging the authority of the tutoring system itself.
Assertion nodes are accompanied by a "Subgoal" label (Figure 3a) and the message "Try to justify the added goal", persuasive and authoritative texts suggesting that justifying Assertions is just part of using the tutor. The difference in the text accompanying Assertions and Messages is that an Assertion is called a "goal" but Message hints do not use that terminology while providing hints. Second, Assertion nodes are also formatted with a question mark like the conclusion. This formatting leverages both the visual expectation principle above and Cialdini's consistency notion that people prefer to be consistent. Once they get used to following tutor instructions and justifying nodes that have question marks, Assertions can rely on people's natural consistency, which influences them to continue to make similar consistent choices. Previous studies on help-seeking and hint usage suggest that students have many different reasons for help avoidance, including their attitudes towards hints and their preference for autonomy [65]. Persuasive design elements may circumvent these preferences by simply influencing students to do what is suggested.
Based on our foundational design principles and literature review, we propose the following three hypotheses: (H1) Assertions will increase unsolicited hint usage for all students irrespective of their prior knowledge. (H2) Assertions will lead students with low prior knowledge to form shorter proofs faster in the posttest. (H3) Assertions will foster productive persistence among students with low prior knowledge.

4.1 Participants

The study was conducted with 122 participants at North Carolina State University, the top engineering university in the state, where Deep Thought was given as a homework assignment to a class of 312 undergraduate students in the College of Engineering majoring in Computer Science, Computer Engineering, or Electrical Engineering in a Fall 2018 discrete mathematics course. We do not have specific demographics of study participants, but the Fall 2018 College of Engineering demographics include 25.3% women, 67.2% white.

4.2 Conditions

We used stratified sampling to split students based on their pretest performance, and then randomly assigned them to the conditions, with Assertions as the treatment and Messages as the control. The condition assignment resulted in N = 73 in Assertions and N = 49 in Messages. The total number of participants who completed the study was 105 (61 in Assertions, 44 in Messages), but after removing logs with system errors, the dataset had 100 students, with 57 in Assertions and 43 in Messages. We performed a χ² test of independence to examine the impact of completion rate and system errors on the groups and found no significant differences among the two groups: χ²(2, N = 122) = 1., p = 0.91. This implies that the group sizes were not significantly impacted by the tutor completion rate or logging errors.

4.3 Procedure

The student procedure is as follows: The tutor provides students with practice solving logic problems, divided into four sections: introduction, pretest, training, and posttest. The introduction presents two worked examples to familiarize students with the tutor interface. Next, students solve two problems in a pretest, which is used to determine students' incoming competence. Students are assigned a condition based on their pretest performance. The pretest problems are designed to be easy and short, using a few straightforward rules, and this is reflected in their short optimal solution lengths (Mean = 3., SD = 0.). Students then enter a training section with five training levels with gradually increasing difficulty, and this is reflected in the average length of optimal solutions during training, with a mean optimal solution length of 4.99 steps (SD = 1.). Each level has four training problems. Students may skip a maximum of three problems per level, with each skip taking students to easier problems. Students may also restart problems using the "Restart" button below the workspace. In both conditions, students in the training levels may request on-demand hints and always receive immediate feedback on rule application errors (see Section 3). Students in the Messages (control) condition received unsolicited message hints upon one minute of inactivity. Students in the Assertions (treatment) condition were given Assertions after about 40% of their steps. The algorithm we use to provide Assertions uses two steps. In the first step, we decide at random whether the step should get a hint, with 50% probability. In the second step, we check the constraint that Assertions should not be given in more than two consecutive steps. This resulted in an actual Assertion provision rate of 40%. Note that both Messages and Assertions remain on the screen until a student justifies them. Further, only one unsolicited hint, regardless of interface, may be present at a time, and the hint content is not updated based on new student work. Finally, students take a more difficult posttest with four problems, with longer optimal solution lengths compared to the other sections (Mean = 7., SD = 1.). (The tutor allows students to delete Assertions, but only two Assertions were deleted in the entire dataset, suggesting that students did not realize this was possible.)

Fig. 4: Example scenarios of Assertion hint A → E usage. (a) The Assertion A → E node appears in the student workspace; if it is never justified, it remains as-is. (b) The student has justified the hint by selecting nodes 1 and 3 and rule HS. (c) A student solution where hint A → E was justified but not needed. (d) Another student solution where the hint was both justified and needed.

A hint is justified when a student applies rules to existing statements to derive it.
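The two-step provision policy above can be sketched as follows. This is a hypothetical illustration, not the tutor's code; the function and parameter names are our own.

```python
import random

def should_give_assertion(history, p=0.5, max_consecutive=2):
    """Decide whether the current step receives an Assertion.

    history: booleans for previous steps (True = an Assertion was given).
    Step 1: flip a coin with probability p.
    Step 2: veto the hint if the previous max_consecutive steps all
    received Assertions, so no run of Assertions exceeds that length.
    """
    if len(history) >= max_consecutive and all(history[-max_consecutive:]):
        return False
    return random.random() < p
```

Simulating this policy over many steps yields an overall provision rate somewhat below the 50% coin-flip rate, close to the roughly 40% actually observed, because the consecutive-step constraint vetoes a fraction of the flips.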
Figure 4a shows an Assertion suggesting A → E. When a student selects nodes 1 and 3 and the rule HS to derive the hint A → E, the Assertion hint A → E is said to be justified, and it becomes a numbered node 5 as in Figure 4b. The student may continue to solve the problem as in Figure 4c, without ever having used node 5 to justify any other node. As in this case, whenever an Assertion was justified but could be deleted without making the solution incomplete, we say the Assertion was justified but not needed. Another student may solve the problem as in Figure 4d, where the same hint statement A → E is both justified and needed: if we remove node 5 from the solution, it becomes incomplete, since nodes 7: ¬A and C: ¬A ∧ B could not be derived without it. We assume that if students justify a hint, they have paid attention to it. The Hint Justification Rate (HJR) is defined as the number of hints justified divided by the total number of hints given across the training problems. As in other multi-step open-ended problem domains, students may derive several statements that are not needed to solve a problem, making the solution longer than necessary. For a hint to be called needed, students must first justify it, but must also figure out how they can use it to derive the conclusion.
The Hint Needed Rate (HNR) is defined as the number of hints needed divided by the total number of hints given across the training problems. We use unsolicited HJR to evaluate student attention towards unsolicited help, and unsolicited HNR to measure the influence of unsolicited hints on student problem solving.

4.5 Performance Measures

Our test performance measures include solution length optimality, problem-solving time, and rule application accuracy. In open-ended domains, solution length, i.e., the number of derived statements in a complete solution, is a valuable performance metric, as there is a vast diversity of possible student solution paths. Our aim in increasing unsolicited hint usage is to guide students to learn efficient problem-solving strategies by incorporating the partially worked example Assertion steps as necessary statements in their solutions. Since the posttest consists of four problems, we evaluate students based on their average solution length in the posttest, and shorter lengths are better. (Note that solution length can only be calculated for complete solutions, and our data consists only of students who successfully completed the study by finishing the mandatory pre- and post-test problems; N = 5 (10%) in Messages and N = 12 (16%) in Assertions did not finish the tutor. A chi-square test shows no significant difference in the completion and non-completion group sizes between the two conditions: χ²(1, N = 122) = 0.…, p = 0.….)

Problem-solving time is also an important performance metric in open-ended domains. Similar to other studies [41, 81], we assess students on the total time they spend solving problems. To account for outliers when calculating problem-solving time, we cap each click-based interaction time at five minutes, i.e., if a student took more than five minutes to perform an interaction, we cap it at five. (The 99th percentile of interaction time in Fall 2018 was 99.03 s; 811 out of 260,750 interaction logs for the 100 students in the study had an action time greater than 5 min.) A shorter problem-solving time suggests better performance.
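Concretely, the two usage rates and the interaction-time cap can be computed as in the sketch below. The record format is hypothetical; the tutor's real logs differ:

```python
def hint_usage_rates(hints):
    """Compute Hint Justification Rate (HJR) and Hint Needed Rate (HNR).

    `hints` holds one record per unsolicited hint given, e.g.
    {"justified": True, "needed": False}. Hints needed are a
    subset of hints justified.
    """
    total = len(hints)
    if total == 0:
        return 0.0, 0.0
    hjr = sum(h["justified"] for h in hints) / total
    hnr = sum(h["needed"] for h in hints) / total
    return hjr, hnr

def capped_interaction_time(seconds, cap=300):
    """Cap a single click-based interaction at five minutes (300 s)."""
    return min(seconds, cap)

hints = [
    {"justified": True, "needed": True},
    {"justified": True, "needed": False},
    {"justified": False, "needed": False},
    {"justified": True, "needed": True},
]
hjr, hnr = hint_usage_rates(hints)  # hjr = 0.75, hnr = 0.5
```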
We hypothesized that increased usage of unsolicited hints will help students learn to solve problems more quickly and with shorter solution lengths, and that these effects will be more pronounced for students with low prior knowledge.

Finally, Accuracy is defined as the number of correct rule applications divided by the total number of applications. A higher accuracy value suggests better knowledge of how to apply domain rules. Since the tutor is designed to provide immediate feedback on incorrect rule applications without penalties, even within the pre- and post-tests (see Section 3), we do not hypothesize differences in accuracy between the two conditions. We report accuracy for both conditions, however, for completeness.

4.6 Prior Proficiency

We hypothesize that an increase in unsolicited hint usage significantly impacts the performance of students with low prior knowledge. Our prior work [72] suggests that students with different incoming competencies can experience a treatment differently. To account for such aptitude-treatment interaction effects, we quantify prior knowledge by splitting the students into Low and High Prior Proficiency groups using a normalized pretest performance score that combines the number of problem-solving steps, the average time spent on each step, and accuracy. The three performance measures are normalized separately and equally weighted in a combined score that is again normalized. Students with pretest performance above the cutoff were assigned to the High group.

Our measurement of effort is highly motivated by prior research. More specifically, Ventura et al. defined a metric for students' effort as the amount of time spent on unsolved problems, and they found a significant correlation between the effort measured during training and a self-report measure of persistence [86]. Later, in another study, they used this effort metric to measure persistence in an educational game that teaches qualitative physics [85]. In our tutor, students can skip up to three problems per training level, and thus we also measure the time students spent on these unsolved skipped problems as a measure of effort. Moreover, Dumdumaya et al. defined an effort metric, the number of reattempts made on a problem after a failed attempt, that predicted task persistence [30]. In our tutor, this corresponds to the number of restarts on problems that students eventually solve. In the following, we separately track effort through two research-based measures: (1) time spent on unsolved (skipped) training problems, and (2) the number of restarts on solved training problems. For the purpose of this analysis, we define productive persistence as persistent (high) effort that results in higher posttest performance.
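The prior-proficiency split described above can be sketched as follows. The exact normalization and the median cutoff are our assumptions; lower step counts and times are treated as better:

```python
import numpy as np

def proficiency_split(steps, avg_step_time, accuracy):
    """Median-split students into Low/High prior proficiency from a
    combined, re-normalized pretest score (a sketch of the procedure
    described above, not the study's exact formula)."""
    def z(x):
        x = np.asarray(x, dtype=float)
        s = x.std()
        return (x - x.mean()) / (s if s > 0 else 1.0)
    # Fewer steps and less time per step are better, so negate them.
    score = z(-z(steps) - z(avg_step_time) + z(accuracy))
    return np.where(score > np.median(score), "High", "Low")
```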
5 Results

After cleaning the data as described in Section 4.2, an average of 2,483 interactions were logged and analyzed per student in our final sample of 100 students (57 in the Assertions condition and 43 in Messages). We partitioned the students based on Prior Proficiency into Low (n = 41) and High (n = 59) groups. We then partitioned by Condition and Prior Proficiency, resulting in four groups: Assertions-Low (n = 25), Assertions-High (n = 32), Messages-Low (n = 16), and Messages-High (n = 27).

Before investigating any of our hypotheses, we first compared the number of on-demand hints between the two conditions to ensure that any differences between groups could not be explained by differences in on-demand hint requests. Similar to other tutors [47, 65], students in this study rarely requested on-demand help, irrespective of Condition or Prior Proficiency. We found no significant differences in the number of on-demand hint requests between conditions or by prior proficiency, with all conditions requesting, on average, less than one on-demand hint per problem. Students in the Assertions condition requested few on-demand hints per problem (Mean = 0.79, SD = 3.92; Assertions-Low: Mean = 0.67, SD = 2.82; Assertions-High: Mean = 0.89, SD = 3.07). Students in the Messages condition similarly requested few on-demand hints per problem (Mean = 0.55, SD = 2.72; Messages-Low: Mean = 0.46, SD = 2.36; Messages-High: Mean = 0.59, SD = 2.43).
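Before choosing between parametric and rank-based tests, normality of such count data is screened; a SciPy sketch on synthetic, zero-inflated counts (not the study's data):

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)
# On-demand hint counts are heavily skewed toward zero, so a
# Shapiro-Wilk test is expected to reject normality, motivating
# rank-based procedures such as the Aligned Ranks Transformation ANOVA.
counts = rng.poisson(0.7, size=57).astype(float)
w_stat, p_value = shapiro(counts)
```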
The on-demand hint data was not normally distributed, as tested by Shapiro-Wilk's test (Assertions: W = 0.744, p < 0.05; Messages: W = 0.752, p < 0.05). A two-way Aligned Ranks Transformation ANOVA with the two factors Condition {Assertions, Messages} and Prior Proficiency {Low, High} on the number of on-demand hints shows no significant main effects (Condition: F(1,100) = 0.132, p > 0.05; Prior Proficiency: F(1,100) = 1.075, p = 0.302) or interaction (F(1,100) = 0.006, p = 0.940). Based on this analysis, the remaining analyses focus only on usage of unsolicited Assertion and Message hints.

5.1 H1: Assertions increase the unsolicited hint usage for all students irrespective of their prior knowledge

Table 1 shows the unsolicited hint metrics. Since the hint data is not normally distributed, we performed a two-way Aligned Ranks Transformation ANOVA [90] on each of the unsolicited hint metrics with the two factors Condition {Assertions, Messages} and Prior Proficiency {Low, High}. We found a significant main effect of Condition on the number of unsolicited hints given (F(1,100) = 40.26, p < 0.05), as well as on HJR (F(1,100) = 191.10, p < 0.05) and HNR (F(1,100) = 62.30, p < 0.05). The main effects of Prior Proficiency were not significant (Hints Given: F(1,100) = 0.008, p = 0.929; HJR: F(1,100) = 0.221, p = 0.639; HNR: F(1,100) = 0.009, p = 0.924). The distribution parameters for each of the unsolicited hint metrics per Prior Proficiency group are provided in Appendix A. There was only a main effect of the Condition, as shown above. This shows that the Assertions had a significant impact on unsolicited hint usage for all students, regardless of incoming proficiency, confirming hypothesis H1.

Table 1: Comparison of unsolicited hint metrics between the two conditions, where a two-way Aligned Ranks Transformation ANOVA for each metric shows only a main effect of Condition (p < 0.05)

Unsolicited Hint Metric          Assertions      Messages
Hints Given in Training          48.82 (9.85)*   32.74 (10.64)
Hint Justification Rate (HJR)    0.93 (0.07)*    0.63 (0.18)
Hint Needed Rate (HNR)           0.82 (0.09)*    0.62 (0.17)

(HJR and HNR are the proportions of hints justified and needed, respectively. Shapiro-Wilk's tests: Unsolicited Hints Given, Assertions W = 0.904, p < 0.05, Messages W = 0.942, p = 0.030; Unsolicited HJR, Assertions W = 0.887, p < 0.05, Messages W = 0.959, p < 0.05; Unsolicited HNR, Assertions W = 0.904, p < 0.05, Messages W = 0.945, p < 0.05.)

Table 2: Comparison of Posttest Performance metrics between the two conditions within each Prior Proficiency group. Average Solution Length (p = 0.033) and Total Time (p = 0.008) are significantly different between the Assertions-Low and the Messages-Low groups.

5.2 H2: Assertions will lead students with low prior knowledge to form shorter proofs faster in the posttest

Since all performance data were normal, we performed t-tests to compare conditions. A t-test on the average pretest solution length between the Assertions (Mean = 7.54 nodes, SD = 1.87 nodes) and Messages (Mean = 7.64, SD = 2.21) conditions showed no significant difference (t(99) = 0.791, p = 0.215). We also observed no significant difference in pretest problem-solving time (t(99) = 0.683, p = 0.248) between the Assertions (Mean = 27.16 min, SD = 8.29 min) and Messages (Mean = 25.89 min, SD = 10.23 min) conditions. While the H2 hypothesis does not predict differences in accuracy between conditions, students were assigned to a condition based on their pretest performance, which includes rule application accuracy, so we compare it here. A t-test on pretest rule accuracy between the Assertions (Mean = 0.52, SD = 0.16) and Messages (Mean = 0.52, SD = 0.14) conditions shows no significant difference (t(99) = 0.111, p = 0.455).

As mentioned earlier, hypothesis H2 is based on the reasoning that Assertions may guide students towards optimal strategies, which can lead students with low prior proficiency to form shorter solutions in less time. We examined the correlation between the dependent variables (average solution length and total time) to assess their overlap, both for the entire population and for the low prior proficiency group.
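The pretest equivalence checks above are independent-samples t-tests; a SciPy sketch with synthetic group data (the group sizes mirror the study, the values do not):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
# Two groups drawn from nearly identical pretest distributions,
# mirroring the equivalence check between conditions.
assertions_len = rng.normal(7.5, 1.9, 57)
messages_len = rng.normal(7.6, 2.2, 43)
t_stat, p_two_sided = ttest_ind(assertions_len, messages_len)
```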
We did not observe a significant correlation between the average posttest solution length and posttest time, either for the entire population (Corr = 0.050, p = 0.615) or for the Low Prior Proficiency group (Corr = 0.015, p = 0.916).

Table 2 shows the posttest performance of the two Conditions {Assertions, Messages} disaggregated for the Low and High Prior Proficiency groups in the first two rows, and for All students as a summary in the bottom row. To investigate our H2 hypothesis, we performed a two-way ANCOVA on average solution length and total time, with Condition {Assertions, Messages} and Prior Proficiency {Low, High} as the two factors and the respective pretest performance metric as the covariate. For average solution length, we observed a significant interaction between Condition and Prior Proficiency (F(1,100) = 4.983, p = 0.027). Neither main effect, for Condition or Prior Proficiency, was significant. We then performed the pairwise Tukey's Honestly Significant Difference (HSD) test for multiple comparisons and found a significant difference (p = 0.033) between the Assertions-Low and Messages-Low groups, showing that the Assertions-Low group formed significantly shorter proofs on the posttest than the Messages-Low group.

A two-factor ANCOVA on posttest total time as described above shows a significant interaction between Condition and Prior Proficiency (F(1,100) = 6.236, p = 0.014), and a significant main effect of Condition (F(1,100) = 6.913, p = 0.010). The main effect of Prior Proficiency was not significant. A pairwise Tukey's HSD test for multiple comparisons on the total posttest time shows a significant difference (p = 0.008) between the Assertions-Low and Messages-Low groups: the Assertions-Low group spent significantly less time on the posttest than the Messages-Low group. Figure 5 summarizes the differences between the Assertions-Low and Messages-Low groups in their posttest performance. Together with the results above, the Assertions-Low group had significantly better posttest solution length and time than the Messages-Low group, confirming our H2 aptitude-treatment interaction hypothesis for posttest performance. While we did not hypothesize improvements in posttest accuracy, we provide these results in Appendix B for completeness.

Fig. 5: Tukey's HSD shows that the Assertions-Low group performed significantly better in the posttest than the Messages-Low group in average solution length (p = 0.033) and total time (p = 0.008)

Next, we investigated the correlation between average posttest solution length and the unsolicited hint metrics.

Table 3: Correlation between average posttest solution length and unsolicited hint metrics for the entire population, and split by low and high prior proficiency groups

Table 4: Correlation between total posttest time and unsolicited hint metrics for the entire population, and split by low and high prior proficiency groups

First, the top row of Table 3 shows that the number of unsolicited hints given does not correlate with posttest solution length, suggesting that differences in posttest solution lengths between conditions cannot be attributed to the frequency of unsolicited hints. However, both HJR (second row) and HNR (third row) are significantly and negatively correlated with posttest solution length for students with Low Prior Proficiency (HJR: p < 0.05; HNR: p < 0.05). This suggests that students with low prior knowledge learn more from the hints needed, rather than from the ones they only justified (see Figure 4 differentiating hints justified and needed). A justified, but not needed, hint suggests that a student could determine how to derive the unsolicited hint content, but not how to use it. It is reasonable that lower prior proficiency students who were able to include the unsolicited hints as necessary components of their proof solutions were more likely to learn more optimal, shorter problem-solving strategies. We also observed a non-significant but positive correlation between average posttest solution length and HNR for the High prior knowledge group. While small and not significant, this inverted effect may indicate another aspect of aptitude-treatment interaction, where high prior proficiency students may potentially learn less if they take too much advantage of unsolicited hints. This result suggests that it may be preferable to build a more adaptive method to determine when to present unsolicited hints to students with high prior proficiency. (We did not test for the significance of the difference between the two correlation coefficients because the samples are not independent: Hints Needed are a subset of Hints Justified.)

Table 4 shows the correlation between posttest time and the unsolicited hint metrics.
First, Table 4 shows that the number of unsolicited hints given (top row) does not correlate with posttest time, suggesting that differences in posttest time between conditions cannot be attributed to the frequency of unsolicited hints. While HJR (second row) and HNR (third row) are significantly correlated with posttest time for the entire population, the Pearson correlation coefficients are less than 0.3, suggesting weak relationships. However, students with Low Prior Proficiency have a significant correlation (greater than 0.3) between posttest time and the unsolicited hint usage metrics HJR and HNR.

Table 1 shows that, over the entire population, significantly more (p < 0.05) unsolicited hints were given in the Assertions condition, with a main effect of Condition on the number of unsolicited hints given. Neither the main effect of Prior Proficiency nor the interaction effect was significant. It would be reasonable to expect that the frequency of hints might impact posttest performance. However, our correlation analysis shows that the significantly higher number of unsolicited hints given in the Assertions condition did not correlate with posttest performance for either solution length or time. Instead, the significant negative correlations of posttest length and time with Hint Needed Rate for students with low prior knowledge suggest that students in the Low group learned from using the unsolicited hints to achieve problem conclusions. These needed hints provided insight into efficient problem solving, by showing students optimal problem-solving steps. As shown in Table 1 above, students in the Assertions condition had higher HNR than students in the Messages condition.
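The two-factor ANCOVA used for H2, with Condition and Prior Proficiency as factors and the pretest metric as covariate, can be sketched with statsmodels on synthetic data (the column names and generated values are ours, not the study's):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
n = 100
df = pd.DataFrame({
    "cond": rng.choice(["Assertions", "Messages"], n),
    "prof": rng.choice(["Low", "High"], n),
    "pretest_len": rng.normal(7.5, 2.0, n),
})
# Synthetic posttest outcome driven only by the pretest covariate.
df["posttest_len"] = 14 + 0.5 * df["pretest_len"] + rng.normal(0, 2, n)

# Factors, their interaction, and the pretest covariate.
model = smf.ols("posttest_len ~ C(cond) * C(prof) + pretest_len",
                data=df).fit()
table = anova_lm(model, typ=2)
```

The `table` rows give F and p-values for each main effect, the interaction, and the covariate, paralleling the statistics reported above.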
Therefore, our results confirm hypothesis H2: there was an aptitude-treatment interaction effect in which Assertions helped students with low prior proficiency learn to construct more optimal (shorter) solutions more quickly on the posttest.

5.3 H3: Assertions foster productive persistence among students with low prior knowledge

We hypothesized that increased usage of unsolicited hints in the form of Assertions will lead students with low prior proficiency to exert persistent effort in training, and that this persistence will be productive (i.e., yield improved posttest performance). We clustered students on five features: two productivity measures (posttest solution length and time, where lower is better), two effort measures (time spent on unsolved (skipped) problems and the number of restarts), and unsolicited hint usage as measured by HJR. We used the Hint Justification Rate (HJR) instead of the Hint Needed Rate (HNR) since hints needed cannot be determined for unsolved problems.

Table 5: Selecting the number of clusters based on three cluster quality indices

The clustering analysis provides a deeper understanding of student behavior patterns involving productivity, effort, and proactive hint usage.
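The Ward clustering and majority-vote index selection can be sketched with SciPy and scikit-learn. The helper below is ours, applied here to synthetic well-separated data rather than the study's five features:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)
from sklearn.preprocessing import StandardScaler

def pick_k(features, candidates=(2, 3, 4, 5)):
    """Ward hierarchical clustering on standardized features; choose k
    by majority vote of Silhouette / Calinski-Harabasz (higher is
    better) and Davies-Bouldin (lower is better)."""
    X = StandardScaler().fit_transform(features)
    Z = linkage(X, method="ward")
    labels = {k: fcluster(Z, k, criterion="maxclust") for k in candidates}
    votes = {k: 0 for k in candidates}
    for score, better in ((silhouette_score, max),
                          (calinski_harabasz_score, max),
                          (davies_bouldin_score, min)):
        vals = {k: score(X, labels[k]) for k in candidates}
        votes[better(vals, key=vals.get)] += 1
    best = max(votes, key=votes.get)
    return best, labels[best]

# Three tight, well-separated synthetic blobs: all three indices
# should agree on k = 3.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(c, 0.1, size=(30, 5)) for c in (0.0, 5.0, 10.0)])
best_k, cluster_labels = pick_k(data)
```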
Table 6: Centroids (Mean) of the three clusters using hierarchical clustering with Ward's method

An ANOVA on the effort metrics would not have helped us understand how student effort varies in tandem with both productivity and hint usage; the cluster analysis is therefore better geared towards answering H3 than an ANOVA. We performed the cluster analysis using hierarchical clustering with Ward's method on standardized features. We selected the number of clusters using a majority vote across three indices: Silhouette and Calinski-Harabasz, which both reward well-separated, internally cohesive clusters (higher values are better), and the Davies-Bouldin index, which rewards the same properties but for which lower values are better. Table 5 shows that using three clusters yields the best-quality clusters.

Table 6 shows the centroids of the three clusters. We used the class average (CA), i.e., the average over the entire population, to assess the clusters on each measure. The class averages for the five clustering features were as follows (note that lower posttest times and solution lengths are better):

• Posttest Time (min): CA = 42.81
• Posttest Solution Length: CA = 14.01
• Unsolved Problem Time (min): CA = 5.16
• Restarts: CA = 2.44
• Hint Justification Rate (HJR): CA = 0.80

One cluster performed better than the class average on all five features; in the following, we refer to this cluster as Productive - High Effort - High HJR. A second cluster is labeled Productive - Low Effort - High HJR; interestingly, many of the High Prior Proficiency students ended up in this low-effort cluster but did no better than the Assertions-Low group on the posttest. Lastly, a third cluster is labeled Unproductive - Low Effort - Low HJR.

Fig. 6: Profile of the three clusters based on Condition and Prior Proficiency

We then profiled each cluster based on the pairs of Condition and Prior Proficiency, as shown in Figure 6. Interestingly, the majority of the Assertions-Low group students are in the Productive - High Effort - High HJR cluster, and the majority of the Messages-Low group students are in the Unproductive - Low Effort - Low HJR cluster. Most of the students in the Assertions-High and Messages-High groups are in the Productive - Low Effort - High HJR cluster. Since we are interested in the Low Prior Proficiency group, we performed a chi-square test to compare the distribution of the Assertions-Low and Messages-Low students across the three clusters and found a significant difference (χ²(1, N = 41) = 24.…, p < 0.05). The majority of the Assertions-Low students are in the Productive - High Effort - High HJR cluster, with the highest effort and unsolicited hint usage in training and productive posttest results, and this confirms our H3 hypothesis.

6 Discussion

6.1 H1: Assertions increase the unsolicited hint usage for all students irrespective of their prior knowledge

The hints in our tutor suggest the most optimal next-step statement to derive for any given student problem-solving state. Similar to other tutors [47, 65], in this study we found that students rarely request on-demand help, irrespective of condition. However, our results suggest that the difference in unsolicited hint usage between Messages and Assertions can be attributed to presentation alone. We found that Assertions, specifically designed using the principles of contiguity, attention, expectation, and persuasion, significantly increased both the attention students pay to unsolicited hints (HJR) and the hints' influence on students' solutions (HNR), regardless of the students' prior knowledge. Conati and Manske suggested in [23] that students pay more attention to simpler hints. Assertions provide high immediacy (making the hint content immediately usable [7]) since they leverage both spatial contiguity [50], by placing information right where it is needed, and visual expectation [76], by formatting hints to make them more intuitive to follow. Studies have also found students' attitudes towards unsolicited hints to be an important factor in help avoidance [22, 65]. Persuasive factors like increasing perceived authority [20] through formatting and language can make justifying Assertions seem to be required.
Our results show that an unsolicited hint interface that combines persuasion, making hint usage seem required, with high immediacy, making it easy to see and do, can help overcome barriers to hint usage.

6.2 H2: Assertions will lead students with low prior knowledge to form shorter proofs faster in the posttest

Several studies have found ATI effects surrounding hint usage where students with low prior knowledge or proficiency benefit more from interventions [5, 41, 56]. In particular, Kardan and Conati [41] found that attention to hints affected student performance in a tutor for teaching constraint satisfaction problems, and students with low prior knowledge experienced more pronounced effects from an adaptive hint design intervention. While their intervention dealt with both an unsolicited hint interface (highlights to direct attention) and scaffolding (incremental textual hints), our study focuses only on the interface of unsolicited hints. Our ANCOVA results showed a significant aptitude-treatment interaction between Prior Proficiency {High, Low} and Condition {Assertions, Messages}. Using Tukey's HSD tests, we inferred that the Assertions-Low group outperformed the Messages-Low group in posttest solution length and time. We also observed a significant correlation of posttest solution length and time with Hint Needed Rate (HNR) for the Low Prior Proficiency group, suggesting that using more unsolicited hints as necessary components of their proofs helped this group learn better strategies.
Assertions are designed to encourage stu-dents to follow unsolicited hints that direct students toward optimal problem-solving strategies. Results from our empirical study support the notion thatAssertions promote productive persistence. Our cluster analysis showed thatthe majority of the Assertions-Low group exerted more effort (high persis-tence) during training, justified a higher proportion of unsolicited hints, andperformed better on the posttest than the class average. We also saw a higherproportion of the Messages-Low students in the cluster that exerted less effort(low persistence) in the training, justified a lower proportion of unsolicitedhints, and performed worse on the posttest than the class average. Interest-ingly, while most of the Assertions-Low group spent more time on unsolvedproblems in training, they took a significantly shorter time on the posttestwhile creating shorter posttest solutions, suggesting that the Assertions pro-moted productive persistence (i.e. time well spent) among students with lowprior proficiency.6.4 Assertions - a new genre of hintsOverall, this study showcases the importance of effective delivery for unso-licited hints, and a new genre of hints that we call Assertions. We believe thatproviding unsolicited hints as partially worked steps reduced the cognitiveload required for learning from them. Further, increasing spatial contiguityimproved students’ attention towards hints, and the isomorphic format mayhave made it easier for them to understand and use them in their solutions. Weobserved that Assertions led students with low prior knowledge to exert moreproductive persistence in training that resulted in better posttest performance,where they formed significantly shorter, more optimal, solutions in significantlyless time than their peers in the control condition who only received Messagehints. 
Assertions provide students with additional problem-solving resourcesthat can enable them to learn through the process of self-explaining (justi-fying) expert steps. We believe that Assertions may be particularly helpfulin multi-step domains, where providing students with partially-worked steps,right next to where they are needed, periodically, and in the same format asother problem-solving steps, could lead students to do more self-explanation(through justifying or completing the partially-worked steps) and by circum-venting help avoidance. voiding Help Avoidance 27 A limitation of this work is the difference in the timing of Assertions andMessages, which could have impacted the results. While the hint frequencycorrelation analysis showed that students’ posttest performance was not im-pacted by the number of unsolicited hints given, we recognize that the hinttiming may have had an impact on students. This limitation arises from thefact that we are modifying a real adaptive system to achieve practical improve-ments. These two types of hints were designed for different purposes. Messageswere intended to help someone who was struggling but forgot about the helpfeature. Assertions were intended to be proactive for students who wouldn’task for help no matter what.Assertions were designed to address the problemthat we observed, that Messages were not helping enough people improve theirperformance or learning.
7 Conclusion

In this study, we investigated the impact of Assertions, a new genre of unsolicited hints, on hint usage and posttest performance within a data-driven tutoring system. This work is novel in that it leveraged the interface alone to address the help avoidance problem. This work did not seek to regulate students' help-seeking; rather, we sought to make unsolicited hints more effective through changes in their delivery. The Assertions hint interface made the intelligent tutor more effective, significantly improving unsolicited hint usage for all students. We further demonstrated aptitude-treatment interaction effects where students with low prior proficiency receiving Assertions performed better on the posttest, in terms of both time and solution length. Our cluster analysis shows that students with low prior knowledge who received Assertions demonstrated more productive persistence, in that they exerted more persistent effort even when failing during training and used a higher proportion of unsolicited hints, and they performed better on the posttest than their low-proficiency peers who received Messages.

There are three main limitations to this study. First, Assertions were provided significantly more frequently than Messages. Assertions did not seem to have a negative impact on learning, but rather leveled the playing field for students with low prior proficiency; moreover, our analyses demonstrated that it was not hint frequency but the Assertions interface alone that improved hint usage. The second limitation is that Assertions appeared randomly and were not adapted to individual students. Our results confirm our hypothesis that Assertions have a differential impact on students with different incoming proficiency, suggesting that there may be benefits to using individual factors to determine when to provide Assertions. A third limitation arises from splitting students into two prior proficiency groups. While some studies investigate finer-grained partitions, e.g.
low, medium/average, and high groups [41], we refrained from doing so to maintain sufficiently high sample sizes within each group.
This study was a necessary first step to identify a hint interface that could address the help avoidance problem. Future work could study the generalizability of this transformative new genre of unsolicited hints, which uses the design principles of contiguity, attention, and expectation to increase hint immediacy and persuasion, to reduce help avoidance in other tutors. Within our tutor, we plan to apply reinforcement learning and other machine learning techniques to derive an adaptive policy that decides when and whether Assertions should be provided to individual students. Since Assertions promote productive persistence among students with low prior knowledge, we also plan to develop a model that provides Assertions when the tutor detects or predicts unproductive behaviors [44, 45].
Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 1726550, "Integrated Data-driven Technologies for Individualized Instruction in STEM Learning Environments", led by Min Chi and Tiffany Barnes. We would like to thank Nicholas Lytle ([email protected]) for suggesting edits to the introduction section to enhance its clarity.
References
1. Aleven, V., Koedinger, K.R.: Limitations of student control: Do students know when they need help? In: International Conference on Intelligent Tutoring Systems, pp. 292–303. Springer (2000)
2. Aleven, V., Mclaren, B., Roll, I., Koedinger, K.: Toward meta-cognitive tutoring: A model of help seeking with a cognitive tutor. International Journal of Artificial Intelligence in Education (2), 101–128 (2006)
3. Aleven, V., Ogan, A., Popescu, O., Torrey, C., Koedinger, K.: Evaluating the effectiveness of a tutorial dialogue system for self-explanation. In: International Conference on Intelligent Tutoring Systems, pp. 443–454. Springer (2004)
4. Almeda, M.V.Q., Baker, R.S., Corbett, A.: Help avoidance: When students should seek help, and the consequences of failing to do so. In: Meeting of the Cognitive Science Society, vol. 2428, p. 2433 (2017)
5. Arroyo, I., Beck, J.E., Beal, C.R., Wing, R., Woolf, B.P.: Analyzing students' response to help provision in an elementary mathematics intelligent tutoring system. In: Papers of the AIED-2001 Workshop on Help Provision and Help Seeking in Interactive Learning Environments, pp. 34–46. Citeseer (2001)
6. Ausin, M.S., Azizsoltani, H., Barnes, T., Chi, M.: Leveraging deep reinforcement learning for pedagogical policy induction in an intelligent tutoring system. In: Proceedings of the 12th International Conference on Educational Data Mining (EDM 2019), vol. 168, p. 177. ERIC
7. Bakke, S.: Immediacy in user interfaces: An activity theoretical approach. In: International Conference on Human-Computer Interaction, pp. 14–22. Springer (2014)
8. Barnes, T., Stamper, J.: Automatic hint generation for logic proof tutoring using historical data. Journal of Educational Technology & Society (1), 3 (2010)
9. Barnes, T., Stamper, J., Croy, M.: Using Markov decision processes for automatic hint generation. Handbook of Educational Data Mining (2011)
10. Barnes, T., Stamper, J.C., Lehmann, L., Croy, M.J.: A pilot study on logic proof tutoring using hints generated from historical student data. In: EDM, pp. 197–201 (2008)
11. Bartholomé, T., Stahl, E., Pieschl, S., Bromme, R.: What matters in help-seeking? A study of help effectiveness and learner-related factors. Computers in Human Behavior (1), 113–129 (2006)
12. Beck, J.E., Gong, Y.: Wheel-spinning: Students who fail to master a skill. In: International Conference on Artificial Intelligence in Education, pp. 431–440. Springer (2013)
13. Borghans, L., Duckworth, A.L., Heckman, J.J., Ter Weel, B.: The economics and psychology of personality traits. Journal of Human Resources (4), 972–1059 (2008)
14. Botelho, A., Varatharaj, A., Patikorn, T., Doherty, D., Adjei, S., Beck, J.: Developing early detectors of student attrition and wheel spinning using deep learning. IEEE Transactions on Learning Technologies (2019)
15. Bunt, A., Conati, C., Muldner, K.: Scaffolding self-explanation to improve learning in exploratory learning environments. In: International Conference on Intelligent Tutoring Systems, pp. 656–667. Springer (2004)
16. Butcher, K.R., Aleven, V.: Integrating visual and verbal knowledge during classroom learning with computer tutors. In: Proceedings of the Annual Meeting of the Cognitive Science Society, vol. 29 (2007)
17. Butcher, K.R., Aleven, V.: Using student interactions to foster rule–diagram mapping during problem solving in an intelligent tutoring system. Journal of Educational Psychology (4), 988 (2013)
18. Chi, M.T., Bassok, M.: Learning from examples via self-explanations. Tech. rep., University of Pittsburgh, Learning Research and Development Center (1988)
19. Chi, M.T., De Leeuw, N., Chiu, M.H., LaVancher, C.: Eliciting self-explanations improves understanding. Cognitive Science (3), 439–477 (1994)
20. Cialdini, R.B.: Influence: Science and Practice, vol. 4. Pearson Education, Boston, MA (2009)
21. Cody, C., Mostafavi, B.: Investigating the impact of unsolicited next-step and subgoal hints on dropout in a logic proof tutor. In: Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education, pp. 705–705. ACM (2017)
22. Conati, C., Jaques, N., Muir, M.: Understanding attention to adaptive hints in educational games: An eye-tracking study. International Journal of Artificial Intelligence in Education (1-4), 136–161 (2013)
23. Conati, C., Manske, M.: Evaluating adaptive feedback in an educational computer game. In: International Workshop on Intelligent Virtual Agents, pp. 146–158. Springer (2009)
24. Conati, C., VanLehn, K.: Toward computer-based support of meta-cognitive skills: A computational framework to coach self-explanation (2000)
25. Cronbach, L.J., Snow, R.E.: Aptitudes and Instructional Methods: A Handbook for Research on Interactions. Irvington (1977)
26. Davies, W., Cormican, K.: An analysis of the use of multimedia technology in computer aided design training: Towards effective design goals. Procedia Technology, 200–208 (2013)
27. Deke, J., Haimson, J.: Valuing student competencies: Which ones predict postsecondary educational attainment and earnings, and for whom? Final report. Mathematica Policy Research, Inc. (2006)
28. DiCerbo, K.E.: Game-based assessment of persistence. Journal of Educational Technology & Society (1), 17–28 (2014)
29. Dillard, J.P., Seo, K.: Affect and persuasion. In: Dillard, J.P., Shen, L. (eds.) The Sage Handbook of Persuasion, pp. 150–166 (2013)
30. Dumdumaya, C., Rodrigo, M.M.: Predicting task persistence within a learning-by-teaching environment. In: Proceedings of the 26th International Conference on Computers in Education, pp. 1–10 (2018)
31. Duong, H., Zhu, L., Wang, Y., Heffernan, N.T.: A prediction model that uses the sequence of attempts and hints to better predict knowledge: "Better to attempt the problem first, rather than ask for a hint". In: EDM, pp. 316–317 (2013)
32. Fossati, D., Di Eugenio, B., Ohlsson, S., Brown, C., Chen, L.: Generating proactive feedback to help students stay on track. In: International Conference on Intelligent Tutoring Systems, pp. 315–317. Springer (2010)
33. Fossati, D., Di Eugenio, B., Ohlsson, S., Brown, C., Chen, L.: Data driven automatic feedback generation in the iList intelligent tutoring system. Technology, Instruction, Cognition and Learning (1), 5–26 (2015)
34. Healey, C., Enns, J.: Attention and visual memory in visualization and computer graphics. IEEE Transactions on Visualization and Computer Graphics (7), 1170–1188 (2012)
35. Heckman, J.J., Stixrud, J., Urzua, S.: The effects of cognitive and noncognitive abilities on labor market outcomes and social behavior. Journal of Labor Economics (3), 411–482 (2006)
36. Hegarty, M., Just, M.A.: Constructing mental models of machines from text and diagrams. Journal of Memory and Language (6), 717–742 (1993)
37. Jin, W., Barnes, T., Stamper, J., Eagle, M.J., Johnson, M.W., Lehmann, L.: Program representation for automatic hint generation for a data-driven novice programming tutor. In: International Conference on Intelligent Tutoring Systems, pp. 304–309. Springer (2012)
38. Jin, W., Lehmann, L., Johnson, M., Eagle, M., Mostafavi, B., Barnes, T., Stamper, J.: Towards automatic hint generation for a data-driven novice programming tutor. In: Workshop on Knowledge Discovery in Educational Data, 17th ACM Conference on Knowledge Discovery and Data Mining. Citeseer (2011)
39. Kai, S., Almeda, M.V., Baker, R.S., Heffernan, C., Heffernan, N.: Decision tree modeling of wheel-spinning and productive persistence in skill builders. JEDM — Journal of Educational Data Mining (1), 36–71 (2018)
40. Kanfer, R., Ackerman, P.L.: Motivation and cognitive abilities: An integrative/aptitude-treatment interaction approach to skill acquisition. Journal of Applied Psychology (4), 657 (1989)
41. Kardan, S., Conati, C.: Providing adaptive support in an interactive simulation for learning: An experimental evaluation. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 3671–3680. ACM (2015)
42. Koedinger, K.R., Aleven, V., Heffernan, N., McLaren, B., Hockenberry, M.: Opening the door to non-programmers: Authoring intelligent tutor behavior by demonstration. In: International Conference on Intelligent Tutoring Systems, pp. 162–174. Springer (2004)
43. Liu, Z., Mostafavi, B., Barnes, T.: Combining worked examples and problem solving in a data-driven logic tutor. In: International Conference on Intelligent Tutoring Systems, pp. 347–353. Springer (2016)
44. Maniktala, M., Barnes, T., Chi, M.: Extending the hint factory: Towards modelling productivity for open-ended problem-solving. In: Proceedings of the 13th International Conference on Educational Data Mining (2020)
45. Maniktala, M., Cody, C., Isvik, A., Lytle, N., Chi, M., Barnes, T.: Extending the hint factory for the assistance dilemma: A novel, data-driven HelpNeed predictor for proactive problem-solving help. JEDM — Journal of Educational Data Mining (2020)
46. Marwan, S., Lytle, N., Williams, J.J., Price, T.: The impact of adding textual explanations to next-step hints in a novice programming environment. In: Proceedings of the 2019 ACM Conference on Innovation and Technology in Computer Science Education, pp. 520–526. ACM (2019)
47. Mathews, M., Mitrovic, A.: How does students' help-seeking behaviour affect learning? In: International Conference on Intelligent Tutoring Systems, pp. 363–372. Springer (2008)
48. McLaren, B.M., Koedinger, K.R., Schneider, M., Harrer, A., Bollen, L.: Bootstrapping novice data: Semi-automated tutor authoring using student log files (2004)
49. McLaren, B.M., Lim, S.J., Koedinger, K.R.: When is assistance helpful to learning? Results in combining worked examples and intelligent tutoring. In: International Conference on Intelligent Tutoring Systems, pp. 677–680. Springer (2008)
50. Moreno, R., Mayer, R.E.: Cognitive principles of multimedia learning: The role of modality and contiguity. Journal of Educational Psychology (2), 358 (1999)
51. Mostafavi, B., Barnes, T.: Data-driven proficiency profiling: Proof of concept. In: Proceedings of the Sixth International Conference on Learning Analytics & Knowledge, pp. 324–328 (2016)
52. Mostafavi, B., Barnes, T.: Evolution of an intelligent deductive logic tutor using data-driven elements. International Journal of Artificial Intelligence in Education (1), 5–36 (2017)
53. Mostafavi, B., Eagle, M., Barnes, T.: Towards data-driven mastery learning. In: Proceedings of the Fifth International Conference on Learning Analytics and Knowledge, pp. 270–274 (2015)
54. Mostafavi, B., Zhou, G., Lynch, C., Chi, M., Barnes, T.: Data-driven worked examples improve retention and completion in a logic tutor. In: International Conference on Artificial Intelligence in Education, pp. 726–729. Springer (2015)
55. Muir, M., Conati, C.: An analysis of attention to student-adaptive hints in an educational game. In: International Conference on Intelligent Tutoring Systems, pp. 112–122. Springer (2012)
56. Murray, R.C., VanLehn, K.: A comparison of decision-theoretic, fixed-policy and random tutorial action selection. In: International Conference on Intelligent Tutoring Systems, pp. 114–123. Springer (2006)
57. Murray, T.: An overview of intelligent tutoring system authoring tools: Updated analysis of the state of the art. In: Authoring Tools for Advanced Technology Learning Environments, pp. 491–544. Springer (2003)
58. Nelson-Le Gall, S.: Help-seeking: An understudied problem-solving skill in children. Developmental Review (3), 224–246 (1981)
59. Newell, A., Simon, H.A., et al.: Human Problem Solving, vol. 104. Prentice-Hall, Englewood Cliffs, NJ (1972)
60. Paaßen, B., Hammer, B., Price, T.W., Barnes, T., Gross, S., Pinkwart, N.: The continuous hint factory: Providing hints in vast and sparsely populated edit distance spaces. arXiv preprint arXiv:1708.06564 (2017)
61. Paunonen, S.V., Ashton, M.C.: Big five predictors of academic achievement. Journal of Research in Personality (1), 78–90 (2001)
62. Price, T., Zhi, R., Barnes, T.: Evaluation of a data-driven feedback algorithm for open-ended programming. International Educational Data Mining Society (2017)
63. Price, T.W., Dong, Y., Barnes, T.: Generating data-driven hints for open-ended programming. EDM, 191–198 (2016)
64. Price, T.W., Dong, Y., Lipovac, D.: iSnap: Towards intelligent tutoring in novice programming environments. In: Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education, pp. 483–488 (2017)
65. Price, T.W., Liu, Z., Cateté, V., Barnes, T.: Factors influencing students' help-seeking behavior while programming with human and computer tutors. In: Proceedings of the 2017 ACM Conference on International Computing Education Research, pp. 127–135 (2017)
66. Price, T.W., Zhi, R., Barnes, T.: Hint generation under uncertainty: The effect of hint quality on help-seeking behavior. In: International Conference on Artificial Intelligence in Education, pp. 311–322. Springer (2017)
67. Price, T.W., Zhi, R., Dong, Y., Lytle, N., Barnes, T.: The impact of data quantity and source on the quality of data-driven hints for programming. In: International Conference on Artificial Intelligence in Education, pp. 476–490. Springer (2018)
68. Puustinen, M.: Help-seeking behavior in a problem-solving situation: Development of self-regulation. European Journal of Psychology of Education (2), 271 (1998)
69. Razzaq, L., Heffernan, N.T.: Hints: Is it better to give or wait to be asked? In: International Conference on Intelligent Tutoring Systems, pp. 349–358. Springer (2010)
70. Rivers, K., Koedinger, K.R.: Data-driven hint generation in vast solution spaces: A self-improving Python programming tutor. International Journal of Artificial Intelligence in Education (1), 37–64 (2017)
71. Roll, I., Aleven, V., McLaren, B.M., Ryu, E., Baker, R.S.J.d., Koedinger, K.R.: The Help Tutor: Does metacognitive feedback improve students' help-seeking actions, skills and learning? In: International Conference on Intelligent Tutoring Systems, pp. 360–369. Springer (2006)
72. Shen, S., Chi, M.: Reinforcement learning: The sooner the better, or the later the better? In: Proceedings of the 2016 Conference on User Modeling, Adaptation and Personalization, pp. 37–44. ACM (2016)
73. Shen, S., Mostafavi, B., Lynch, C., Barnes, T., Chi, M.: Empirically evaluating the effectiveness of POMDP vs. MDP towards the pedagogical strategies induction. In: International Conference on Artificial Intelligence in Education, pp. 327–331. Springer (2018)
74. Snow, R.E.: Aptitude-treatment interaction as a framework for research on individual differences in psychotherapy. Journal of Consulting and Clinical Psychology (2), 205 (1991)
75. Stamper, J., Barnes, T., Lehmann, L., Croy, M.: The Hint Factory: Automatic generation of contextualized help for existing computer aided instruction. In: Proceedings of the 9th International Conference on Intelligent Tutoring Systems Young Researchers Track, pp. 71–78 (2008)
76. Summerfield, C., Egner, T.: Expectation (and attention) in visual cognition. Trends in Cognitive Sciences (9), 403–409 (2009)
77. Sweller, J.: Cognitive load during problem solving: Effects on learning. Cognitive Science (2), 257–285 (1988)
78. Sweller, J.: Instructional Design in Technical Areas. Camberwell, Victoria: ACER Press (1999)
79. Sweller, J.: The worked example effect and human cognition. Learning and Instruction (2006)
80. Sweller, J.: Human cognitive architecture. Handbook of Research on Educational Communications and Technology, pp. 369–381 (2008)
81. Tchétagni, J.M., Nkambou, R.: Hierarchical representation and evaluation of the student in an intelligent tutoring system. In: International Conference on Intelligent Tutoring Systems, pp. 708–717. Springer (2002)
82. Timms, M.J.: Using item response theory (IRT) to select hints in an ITS. Frontiers in Artificial Intelligence and Applications, 213 (2007)
83. Ueno, M., Miyazawa, Y.: IRT-based adaptive hints to scaffold learning in programming. IEEE Transactions on Learning Technologies (4), 415–428 (2017)
84. VanLehn, K.: The behavior of tutoring systems. International Journal of Artificial Intelligence in Education (3), 227–265 (2006)
85. Ventura, M., Shute, V.: The validity of a game-based assessment of persistence. Computers in Human Behavior (6), 2568–2572 (2013)
86. Ventura, M., Shute, V., Zhao, W.: The relationship between video game use and a performance-based measure of persistence. Computers & Education (1), 52–58 (2013)
87. Villesseche, J., Le Bohec, O., Quaireau, C., Nogues, J., Besnard, A.L., Oriez, S., De La Haye, F., Noel, Y., Lavandier, K.: Enhancing reading skills through adaptive e-learning. Interactive Technology and Smart Education (2018)
88. Weerasinghe, A., Mitrovic, A.: Enhancing learning through self-explanation. In: International Conference on Computers in Education, 2002. Proceedings, pp. 244–248. IEEE (2002)
89. Weerasinghe, A., Mitrovic, A.: Supporting self-explanation in an open-ended domain. In: International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, pp. 306–313. Springer (2004)
90. Wobbrock, J.O., Findlater, L., Gergle, D., Higgins, J.J.: The aligned rank transform for nonparametric factorial analyses using only ANOVA procedures. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 143–146. ACM (2011)
91. Wood, H., Wood, D.: Help seeking, learning and contingent tutoring. Computers & Education (2-3), 153–169 (1999)
92. Zhou, G., Price, T.W., Lynch, C., Barnes, T., Chi, M.: The impact of granularity on worked examples and problem solving. In: CogSci (2015)

A: Unsolicited Hint Metrics for each prior proficiency group
Prior