Cognitive Reflection Test and the Polarizing Force-Identification Questions in the FCI
Allan L. Alinea
Institute of Mathematical Sciences and Physics, University of the Philippines Los Baños, College, Los Baños, Laguna 4031, Philippines, [email protected]
Abstract
The set of polarizing force-identification (PFI) questions in the FCI consists of six items all essentially asking one question: what set of forces acts on a given body? Although it may sound trivial, these questions are among the most challenging in the FCI. In this work involving 163 students, we investigate the correlation between student performance on the set of PFI questions and the Cognitive Reflection Test (CRT). We find that for scores both in the FCI as a whole and in the PFI questions, the range of values of the Pearson coefficient at the 95% confidence level suggests that cognitive reflection may be one of the contributing factors in student performance in the FCI. This is consistent with the idea that a high level of cognitive reflection may help in eliminating seemingly valid choices (misconceptions) in the FCI that are intuitive from everyday experience or "common sense" but otherwise misleading. The ability to activate System 2 in Dual Process Theory, whether from System 1 or right after reading a physics problem, may contribute to narrowing down the set of prospective valid answers to a given physics problem. Complementary to cognitive reflection are other factors associated with deep understanding of physics, whose effects are expected to become more evident with the level of difficulty of a set of physics problems. Given two students with the same level of cognitive reflection, the one with the deeper understanding of physics is more likely to get the correct answer. In our analysis, the range of the correlation coefficient for the set of PFI questions is downshifted with respect to that for the FCI as a whole. This may be attributed to the more challenging nature of the PFI questions compared to a significant fraction of the remaining questions in the FCI.

Keywords:
Polarizing Force-Identification Questions; Cognitive Reflection Test; Force Concept Inventory; Dual Process Theory; Physics Education

Accepted for publication: European Journal of Physics
In spite of the enormity and complexity of possible information configurations that it can process, it appears that the way the human mind thinks or decides on a given problem or situation can often be simplified into merely two systems. Dual process theory (DPT) in psychology distinguishes these two systems as intuitive ('System 1'), considered fast and autonomous, and analytic ('System 2'), considered deliberative and slow; see e.g., Refs. [1, 2, 3]. System 1 allows us to make quick decisions with minimal strain on our mental resources (e.g., recognizing a friend in a classroom). On the other hand, System 2 enables us to solve problems where higher-order thinking is required (e.g., solving a 10 × 10 sudoku). In dealing with different situations, it is helpful to efficiently choose the appropriate process to use. More properly, with System 1 often proceeding "unconsciously," it is important to be cognizant of the need to activate System 2 on top of System 1. The Cognitive Reflection Test (CRT) [4] is a widely used measure of the propensity to override System 1 with System 2. It is composed of a three-item set of questions designed to initially draw the problem solver into activating System 1. Considering the bat-and-ball problem in the CRT, for instance, using System 1, the tendency is to answer 10 cents. A quick reflection, however, indicates that this cannot be correct; for then, the total cost would be $0.10 + $1.10 = $1.20! The correct answer, obtained through a short algebraic manipulation or trial-and-error, is 5 cents.

Scores in the CRT are found to have moderate correlation with performance on the Wonderlic Personnel Test [4], heuristics-and-biases tasks [5], incidences of the conjunction fallacy [6, 7], and susceptibility to some behavioral biases [8], amongst others. From a simplified perspective, these studies suggest that people who tend to think deeply, as may be measured by the CRT, are able to make better decisions and solutions to problems. In the field of Physics Education, to which this paper belongs, such a perspective, although simplified, offers an attractive avenue to look into the possible correlation between the CRT and student performance on some standardized tests. The idea is that DPT, as it relates to the CRT, may be able to explain the way students approach and solve problems in Physics. This, in turn, may give us insight into the Physics learning process and, for us as educators, the ways by which we may be able to improve it.

Following a similar line of thinking, Wood, Galloway, and Hardy [9] (see also Refs. [10, 11]) investigated the question "Can dual processing theory explain physics students' performance on the Force Concept Inventory [FCI]?" They examined the relationship between student performance on the CRT and the FCI [12, 13] and found a moderately positive linear correlation between the two for both pretest and post test. The "findings indicate that students who are more likely to override the system 1 intuitive response and to engage in the more demanding cognitive reflection needed to answer the CRT question correctly are also more likely to score highly on the FCI, implying that similar cognitive processes account for at least some of the cognitive abilities needed for each test." [9]

This study intends to further probe the possible relationship between the CRT, DPT, and the FCI.
In particular, while not necessarily neglecting the FCI as a whole, we wish to compare student performance on the set of six Polarizing Force-Identification (PFI) questions (see Refs. [14, 15]) in the FCI, namely, questions 5, 11, 13, 18, 29, and 30, with their scores in the CRT. This subset of the FCI effectively asks only one basic question: identify the force(s) acting on a given body. Surprisingly, out of the five choices for each question, the majority of the students tend to select only two choices (the polarizing choices). One of these choices is the correct answer containing the right set of forces acting on the given body, while the other contains the right set of forces plus at least one erroneous force; see
Fig. 1. The inclusion of this erroneous force confuses students, causing the "polarization" of student responses.

Figure 1: In this sample question, one is asked to identify the force(s) acting on the sphere. Whereas most students may be able to identify forces i and ii correctly, the addition of the erroneous force iii may confuse them, causing "polarization" of student answers between choices (D) and (E).
The subjects of our study were 163 students in engineering, chemistry, and physics degree programs offered by the University of the Philippines Los Baños. All of these students had already passed the prescribed calculus-based mechanics course and were taking another fundamental physics course under the author of this work when they sat for the FCI and CRT. The two tests were administered as pen-and-paper tests, and students were given incentives to take the tests seriously. Data gathering took place at the University of the Philippines Los Baños from Academic Years 2017-2018 to 2018-2019.

To minimize possible bias in the CRT scores, we opted to remove results for students who were not taking the test for the first time. The remaining 163 test results for the FCI and CRT were subjected to item analysis with the help of ZipGrade [19] and a spreadsheet application. Common statistical parameters such as the mean, standard error, and Pearson coefficient were calculated to find possible meaningful relationships between scores in the FCI as a whole, the set of PFI questions, and the CRT.

Table 1 shows the distribution of students with respect to CRT scores ranging from zero out of three (all wrong answers) to three out of three (perfect score). About half (47%) of the students got a perfect score in the test while only one tenth (10%) got the lowest score. With 43% scoring one or two, the skewed frequency distribution drives the average to 2.1/3 (close to that of Ref. [9], 2.3, with similar cohorts). This is about one and a half times higher than that in Ref. [18] covering more than 40,000 students. The difference may be due to a stronger inclination of our subjects toward mathematics, as suggested by their chosen degree programs in engineering, chemistry, and physics, compared to the population with more varied interests investigated in Ref. [18]. In Ref.
[20], the authors found a moderately positive correlation between cognitive reflection and numeric ability.

The majority of the students who got the correct answers performed some sort of short mathematical calculation on paper. Such students constitute the majority of our test subjects based on Table 2. For the bat-and-ball problem, for instance, the majority of the nearly 80% who got the correct answer approached the problem by solving some form of algebraic equation(s); some did try trial-and-error to fit the conditions of the problem. For the lilypad problem, a geometric sequence, either from day 1 to day 48 or backwards from day 48, can be found in the student solutions with correct answers. For the machine problem, short solutions involving the use of ratio and proportion can be found on student papers. Admittedly, a few students who got the wrong answers made some marks on paper in addition to the final answer. However, the "seriousness" of these scribbles is not on par with that of students who got the correct answer(s). In the lilypad problem, for instance, one can find '48/2' yielding 24 days, the intuitive but wrong answer.

CRT Score                     0/3        1/3        2/3        3/3        Mean
Percent Students (No.)        10% (16)   22% (36)   21% (34)   47% (77)   2.1
Table 1: Percent (and number) of students who got CRT scores from 0/3 (all wrong) to 3/3 (perfect score).
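As a consistency check, the reported mean of 2.1 can be reproduced from the counts in Table 1; the short sketch below (a minimal calculation of our own, using the counts from the table) does the arithmetic:

```python
# Reproduce the mean CRT score in Table 1 from the number of students
# at each score (0/3 through 3/3).
counts = {0: 16, 1: 36, 2: 34, 3: 77}
n = sum(counts.values())  # total number of students
mean = sum(score * k for score, k in counts.items()) / n
print(n, round(mean, 1))  # 163 2.1
```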
CRT Problem                   bat and ball   machine    lilypad
Percent Students (No.)        79% (128)      59% (96)   68% (111)
Table 2: Percent (and number) of students who answered each CRT [4] question correctly. Problems 1, 2, and 3 are labeled bat and ball, machine, and lilypad, respectively.

It is worth noting that students were not required to present a mathematical solution, nor any other form of solution, to the CRT; only the final answers were required in the closely supervised test. This was made clear in the instructions explained in class before students could start the tests. Any mathematical solution (be it serious or simply light scribbles) in the form of algebraic manipulation, ratio and proportion, or geometric sequence, was of their own volition. We find that the majority of students who got the wrong answer in any of the CRT problems simply gave the wrong intuitive answers (i.e., 10 cents, 100 minutes, and 24 days for the bat-and-ball, machine, and lilypad problems, respectively) without any supplemental solution at all, or tried to perform some light mathematical scribbles on paper only to end up with or confirm the same wrong intuitive answers; e.g., writing 1:1:1 for the machine problem and then yielding the intuitive but wrong answer of 100 minutes. The intuitive answers account for 86%, 69%, and 54% of all the wrong answers in the bat-and-ball, machine, and lilypad problems, respectively; these are the majority of wrong answers, as in Ref. [9]. On the other hand, among the correct student responses are clear cases with erasure marks covering the wrong intuitive answers right beside the correct one. This indicates the transition from System 1 to System 2, consistent with the aim of the CRT.

Having said this, there is a possibility that one could have actually activated System 2, as may be evident in their scratch work, for a given CRT question, without finding either the correct answer or the intuitive but wrong answer. This may then weigh against the accuracy of the CRT score as a measure of the tendency to activate System 2.
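The reflective (System 2) solutions described above, as against the intuitive (System 1) answers of 10 cents, 100 minutes, and 24 days, can be sketched as follows for the three standard CRT items [4]:

```python
# Bat-and-ball: a bat and a ball cost $1.10 and the bat costs $1.00 more
# than the ball. Intuition says 10 cents, but then the total would be
# $0.10 + $1.10 = $1.20; solving ball + (ball + 1.00) = 1.10 gives 5 cents.
ball = (1.10 - 1.00) / 2
print(round(ball, 2))  # 0.05

# Machine problem: 5 machines make 5 widgets in 5 minutes, i.e., each
# machine makes one widget per 5 minutes, so 100 machines also need only
# 5 minutes for 100 widgets (not the intuitive 100 minutes).
minutes = 5 * 100 / 100
print(minutes)  # 5.0

# Lilypad problem: the patch doubles daily and covers the lake on day 48,
# so it covered half the lake one doubling earlier (day 47, not day 24).
day_half = 48 - 1
print(day_half)  # 47
```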
Assuming that students who got the correct answers in the CRT activated System 2 and those who gave the intuitive but wrong answers (10 cents, 100 minutes, 24 days) activated only System 1, this could possibly account for a maximum of 14%, 31%, and 46% of all the wrong answers in the bat-and-ball, machine, and lilypad problems, respectively. However, consistent with the immediately preceding two paragraphs, we find this group of students to be somewhat of a polar opposite to the group of students who got the correct answer(s): most students who got the wrong but non-intuitive answers did not write any form of solution at all. Some scribbled a one-liner numeric calculation (e.g., '100/5' for the machine problem and '48/4' for the lilypad problem) which may hardly be considered good evidence of activating System 2. For the three CRT problems, we find only three cases (two for the bat-and-ball problem, one for the machine problem, and zero for the lilypad problem) with some tractable short algebraic calculation corresponding to wrong and non-intuitive answers. This gives us confidence to hold on to the raw CRT score as our measure of the level of cognitive reflection, at least as far as the scope of this study is concerned.

Having discussed the CRT results, let us now look into student performance in the FCI with respect to the CRT. Figure 2 shows the distribution of student scores for the two tests. For CRT scores ranging from 1 to 3, the FCI mean score exhibits an upward trend suggestive of a good linear correlation with the former for the mentioned range. However, the FCI mean score for the CRT score of zero, in addition to the errors associated with the mean, somewhat smears this prospect for a possible good positive linear correlation. The Pearson coefficient for the CRT-FCI scores turns out to be 0.24 in the approximate range [0.09, 0.38] (computed through the Fisher transformation) at the 95% confidence interval.
Assuming the same normality condition, this is consistent with the results in Ref. [9] for the pretest, r = 0.38 in the approximate range [0.23, 0.51], and post test, r = 0.32 in the approximate range [0.17, 0.46], at the same confidence interval; see Ref. [21] about comparing correlation coefficients. (To make the confidence interval for r narrower at the same base value of the correlation coefficient, a considerably larger sample size would be needed.)
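The confidence intervals quoted above follow from the Fisher z-transformation; a minimal sketch (the function name is ours) that reproduces the quoted range for n = 163 is:

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson coefficient r from a sample of
    size n, via the Fisher transformation z = atanh(r) with standard
    error 1/sqrt(n - 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# CRT vs. FCI as a whole: r = 0.24, n = 163 students.
lo, hi = pearson_ci(0.24, 163)
print(round(lo, 2), round(hi, 2))  # 0.09 0.38
```

Since the half-width in z-space scales as 1/sqrt(n - 3), narrowing the interval at the same base value of r requires a correspondingly larger sample.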
We provide a two-fold explanation for this finding. Firstly, if the CRT is a good measure of the tendency to shift from System 1 to System 2 within the context of DPT, it means that we cannot rule out the possibility that one of the contributing factors in gaining a high score in the FCI is a high level of cognitive reflection. The FCI questions each come with five choices, four of which are incorrect; many of the incorrect choices invoke student misconceptions or misleading preconceptions [22, 23] in fundamental classical mechanics. Many of these misconceptions or preconceptions, in turn, are from "common-sense" everyday experiences that constitute their intuition. For instance, students may have a predisposition to decide right away from "common sense" that heavier objects tend to fall faster than a lighter object of the same size and shape. A low level of cognitive reflection can drive a student to select intuitive answers that are wrong. This may then lead, as one of the contributing factors, to a low FCI score.

Secondly, while a shift from System 1 to System 2 may contribute to the more "serious" drive to find the correct answer to an FCI question, this may not be enough. After System 2 is activated, some incorrect choices may be discarded after a short but relatively deeper reflection compared to System 1. However, there could remain further hurdles, the other incorrect answers, before a student could pinpoint the correct answer. This means that even if all students start out by activating System 2 without necessarily passing through System 1 (similar to when encountering a math problem involving, say, the extraction of a square root), a deep understanding of physics still matters in arriving at the correct answer.

Figure 2: Distribution of FCI scores with respect to CRT scores. The error bars are set one standard error of the mean above and below the mean.
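The error bars in Fig. 2 are standard errors of the mean; for any subgroup of FCI scores (the list below is illustrative, not the study's data), they can be computed as:

```python
import math
import statistics

def mean_and_sem(scores):
    """Mean and standard error of the mean (sample stdev / sqrt(n))."""
    m = statistics.mean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return m, sem

# Hypothetical FCI scores of students sharing one CRT score:
group = [12, 18, 22, 9, 15, 20, 17, 25]
m, sem = mean_and_sem(group)
print(round(m, 2), round(sem, 2))  # 17.25 1.85
```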
We identified and elaborated on the six PFI questions in the FCI in our earlier study presented in Ref. [14]. Back then, our sample size was only about 50 (international) students. Although relatively small, the pattern for the PFI questions was sharp enough to be worthy of publication. Figure 3 confirms the existence of these PFI questions, now with a much larger sample size of 163 students, triple that of the former study. As can be seen in the figure, the majority of the students, for the six force-identification questions, effectively choose only two (polarizing choices) out of the five choices. Except possibly for the identified PFI question 3, where the number of students who chose letter B is similar to that for letter D (the correct answer), the six bar graphs are consistent with the result in Ref. [14].

Figure 3: Distribution of answers for the six PFI questions (Questions 5, 11, 13, 18, 29, and 30 in the FCI). Of the five choices, with X meaning no answer, the majority of the students chose the two polarizing choices: one containing the correct answer and the other being effectively a superset of the correct answer.
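The "polarization" visible in Fig. 3 can be quantified as the fraction of responses captured by the two most popular choices; a sketch with made-up response counts (not the study's data):

```python
from collections import Counter

def polarization(answers, k=2):
    """Fraction of all responses falling on the k most popular choices;
    values near 1 for k = 2 flag a polarizing item."""
    counts = Counter(answers)
    top = sum(n for _, n in counts.most_common(k))
    return top / len(answers)

# Illustrative response pattern for a PFI-like item (163 students):
sample = ["D"] * 70 + ["E"] * 75 + ["A"] * 8 + ["B"] * 6 + ["C"] * 4
print(round(polarization(sample), 2))  # 0.89
```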
With the PFI questions at hand, let us look into their possible relationship with the CRT. Figure 4 shows the distribution of student scores for the set of PFI questions with respect to the CRT score. The graph shown is effectively a subset of that shown in Fig. 2 involving all the questions in the FCI. For the PFI questions with respect to the CRT, the error bars are wider. There seems to be a rising trend between the PFI mean score and the CRT score from a CRT score of 0 up to 2. However, the downshift at a CRT score equal to 3 seems to have spoiled this possible trend. All in all, we get a correlation coefficient of r = 0.096 in the approximate range [-0.06, 0.25] at the 95% confidence interval.

The base value of the Pearson coefficient is small, but its range tells us that we cannot simply set aside cognitive reflection in view of student performance in the PFI questions. The identification of the forces acting on a given body, as simple as it may sound, is one of the most basic skills necessary in the study of dynamics. Yet even the acquisition of this very basic skill is plagued by misconceptions or misleading preconceptions, directly or indirectly from everyday experiences, that form our intuition. Instances of these concepts include the requirement of a force to sustain the motion of a body (in a vacuum) and the existence of a centrifugal force (in an inertial frame). When confronted with questions asking for the set of forces acting on a given body, a low level of cognitive reflection may lead to the inclusion of intuitive but misleading forces. Deep thinkers, on the other hand, may rule out these wrong forces, effectively narrowing down the set of prospective correct answers.

Figure 4:
Distribution of scores for the set of PFI questions in the FCI with respect to CRT scores. The errorbars are set one standard error of the mean above and below the mean.
Having said this, our range of the Pearson coefficient is evidently in the lower half of its absolute spectrum from 0 to 1. Similar to that for the FCI as a whole, we contend that on top of the ability to shift from System 1 to System 2 is the need for a deep understanding of Physics concepts or ideas to pinpoint the right set of forces acting on a given body. In other words, we see a high level of cognitive reflection as some sort of initial "push" needed to submerge one into a sea of thought; complemented by will, natural talent, and/or industry, it may lead students to acquire the right understanding or decision in solving physics problems, be it as basic as force identification or as complex as flying a real rocket.

Before we leave this sub-section, we take cognizance of the downshifted range of the Pearson coefficient for PFI score vs. CRT score with respect to that for FCI score (as a whole) vs. CRT score; that is, from [0.09, 0.38] to [-0.06, 0.25]. Although the intervals still overlap, we take the liberty of accounting for the possible difference (hopefully to be resolved in future studies). Figure 5 shows the distribution of the number of students who got the correct answer for each FCI question. Based on the figure, our students found FCI item numbers 5, 11, 13, 14, 17, 18, 21, 25, 26, and 30 to be the top 10 most challenging FCI questions. Of the six-item set of PFI questions, five belong to these top 10 questions. We may see, then, that on average the subjects of our study found the set of PFI questions to be more challenging than the rest of the FCI questions, and this possibly caused the downward shift in the correlation coefficient. We are inclined to think that cognitive reflection stands as an important contributing factor in student performance in the set of PFI questions and in the FCI as a whole. When students activate System 2, this is where the other factors associated with deep understanding of physics come into play.
The influence of these other factors becomes more evident with the level of difficulty faced by students in answering a given set of physics problems.
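The "lowest 10" identification in Fig. 5 amounts to sorting items by their number of correct answers; a sketch with hypothetical per-item counts (the study's actual counts are in the figure):

```python
# Hypothetical number of correct answers per FCI item (item -> count);
# only a subset of the 30 items is shown for brevity.
correct_counts = {5: 30, 11: 25, 13: 28, 14: 35, 17: 33, 18: 22,
                  21: 36, 25: 34, 26: 37, 29: 60, 30: 27, 1: 120, 2: 95}

# The ten most challenging items are those with the fewest correct answers.
hardest = sorted(correct_counts, key=correct_counts.get)[:10]
print(hardest)  # items listed in increasing order of correct answers
```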
Figure 5: Distribution of the number of students who got the correct answer for each question in the FCI. The top 10 (out of 30) most challenging problems based on student performance are identified in the graph as the 'lowest 10'. Of these 10 questions, five belong to the set of PFI questions.

The operation of the human mind is still one of the most complex processes, far from our complete comprehension. From the perspective of a Physics Educator, there is a need to understand it so as to optimize the student learning experience. But as the goal post of complete understanding is still far beyond the horizon, we are delighted by large strides leading to this end. Dual Process Theory may be seen as one of these strides, telling us of a significant simplification employing two systems of thought processes: one that is intuitive (System 1) and the other analytic (System 2). The Cognitive Reflection Test is a good measuring instrument of cognitive reflection, indicating the tendency to shift from System 1 to System 2. The use of the CRT offers an attractive avenue to look into the relationship between cognitive reflection and student performance in Physics tests such as the FCI.

In this study, we looked into the idea that students who can easily transition from System 1 to System 2, or start right away with System 2 in DPT, may be able to perform better in the set of polarizing force-identification questions in the FCI and in the FCI as a whole. The result of our analysis of tests involving 163 students is suggestive that cognitive reflection is one of the contributing factors involved in student performance in the FCI and the set of PFI questions. Our insight is that a high level of cognitive reflection may enable students to cross out seemingly correct choices in the test that are normally part of our intuition from everyday experience or "common sense" but otherwise misleading. Complementary to cognitive reflection are other factors associated with deep understanding in physics. These become more evident with the level of difficulty of a given set of physics problems. We find that the range of the correlation coefficient between CRT score and PFI score is downshifted with respect to that for CRT score and FCI score. The possible difference may be attributable to the set of PFI questions being more challenging on average compared to the rest of the FCI questions.

Looking ahead, with all its efforts, this study has only covered a small (but significant) portion of student learning and performance in physics in relation to cognitive reflection and DPT. From here, we foresee future studies involving further tests, hopefully with a larger number of participants. Other Physics inventories or tests related to scientific reasoning ability (see Refs. [24, 25]) may be explored to find correlations with the level of cognitive reflection. Considering the CRT, a higher-resolution test (see the proposed expanded CRT in Ref. [26]), in tandem with an expanded PFI questionnaire, may be used to better resolve differences in the level of cognitive reflection. Regarding the administration of the CRT, the use of "mild subterfuge" as in Ref. [9] may be employed and studied to see its effect on the correlation between cognitive reflection and the FCI or any other standardized Physics Test.
References

[1] W. De Neys (editor). Dual Process Theory 2.0. Routledge, an imprint of the Taylor & Francis Group, 2018.
[2] K. E. Stanovich and R. F. West. Individual Differences in Reasoning: Implications for the Rationality Debate? Cambridge University Press, 2001.
[3] D. Kahneman. Thinking, Fast and Slow. Nota, 2013.
[4] S. Frederick. Cognitive Reflection and Decision Making. Journal of Economic Perspectives, (4), 25–42, 2005.
[5] M. E. Toplak, et al. The Cognitive Reflection Test as a Predictor of Performance on Heuristics-and-Biases Tasks. Memory & Cognition, (7), 1275–1289, 2011.
[6] J. Oechssler, et al. Cognitive Abilities and Behavioral Biases. Journal of Economic Behavior & Organization, (1), 147–152, 2009.
[7] J. M. Liberali, et al. Individual Differences in Numeracy and Cognitive Reflection, with Implications for Biases and Fallacies in Probability Judgment. Journal of Behavioral Decision Making, (4), 361–381, 2011.
[8] E. I. Hoppe and D. J. Kusterer. Behavioral Biases and Cognitive Reflection. SSRN Electronic Journal, 2009.
[9] A. K. Wood, et al. Can Dual Processing Theory Explain Physics Students' Performance on the Force Concept Inventory? Physical Review Physics Education Research, (2), 2016.
[10] M. Kryjevskaia, M. R. Stetzer, and N. Grosz. Answer first: Applying the heuristic-analytic theory of reasoning to examine student intuitive thinking in the context of physics. Phys. Rev. ST Phys. Educ. Res., 10, 020109, 2014.
[11] C. R. Gette, et al. Probing Student Reasoning Approaches through the Lens of Dual-Process Theories: A Case Study in Buoyancy. Physical Review Physics Education Research, (1), 2018.
[12] D. Hestenes, M. Wells, and G. Swackhamer. Force Concept Inventory. Phys. Teach., (3), 141–158, 1992.
[13] A. Savinainen and P. Scott. The Force Concept Inventory: a tool for monitoring student learning. Phys. Educ.
[14] A. L. Alinea and W. Naylor. Physics Education, (2), 210–217, 2015.
[15] A. L. Alinea and W. Naylor. Gender Gap and Polarisation of Physics on Global Courses. Physics Education, (2), 2017.
[16] R. R. Hake. Interactive-engagement versus traditional methods: a six-thousand-student survey of mechanics test data for introductory physics courses. Am. J. Phys., 1998.
[17] Behavior Research Methods, (5), 1953–1959, 2017.
[18] P. Brañas-Garza, et al. Cognitive Reflection Test: Whom, How, When. Journal of Behavioral and Experimental Economics.
[19] ZipGrade (test-grading application).
[20] Frontiers in Psychology, 2015.
[21] K. L. Wuensch (2019). Comparing Correlation Coefficients, Slopes, and Intercepts. http://core.ecu.edu/psyc/wuenschk/docs30/CompareCorrCoeff.pdf, retrieved Mar 15, 2020.
[22] I. A. Halloun and D. Hestenes. Common Sense Concepts about Motion. American Journal of Physics, (11), 1056–1065, 1985.
[23] M. Finegold and P. Gorsky. Learning about Forces: Simulating the Outcomes of Pupils' Misconceptions. Instructional Science, (3), 251–261, 1988.
[24] V. P. Coletta and J. A. Phillips. Interpreting FCI scores: Normalized gain, preinstruction scores, and scientific reasoning ability. American Journal of Physics, 1172, 2005.
[25] S. Ates and E. Cataloglu. The Effects of Students' Reasoning Abilities on Conceptual Understandings and Problem-Solving Skills in Introductory Mechanics. European Journal of Physics, (6), 1161–1171, 2007.
[26] M. E. Toplak, R. F. West, and K. E. Stanovich. Assessing miserly information processing: An expansion of the Cognitive Reflection Test. Thinking & Reasoning, 20, 2014.