Does transitioning to online classes mid-semester affect conceptual understanding?
Emanuela Carleschi, Anna Chrysostomou, Alan S. Cornell, Wade Naylor
Emanuela Carleschi, Anna Chrysostomou, Alan S. Cornell
Department of Physics, University of Johannesburg, PO Box 524, Auckland Park 2006, South Africa. E-mail: [email protected], [email protected], [email protected]
Wade Naylor
Immanuel Lutheran College, PO Box 5025, Maroochydore 4558, Australia. E-mail: [email protected]
Abstract.
The Force Concept Inventory (FCI) can be used as an assessment tool to measure the gains in a cohort of students. In this study it was given to first-year mechanics students (N = 256) at the University of Johannesburg, pre- and post-mechanics lectures. From these results we examine the effect of switching mid-semester from traditional classes to online classes, as imposed by the COVID-19 lockdown in South Africa. Overall gains and student perspectives indicate no appreciable difference in gain when benchmarked against previous studies using this assessment tool. When compared with 2019 grades, the 2020 semester grades do not appear to have been greatly affected. Furthermore, initial statistical analyses also indicate a gender difference in mean gains in favour of females at the 95% significance level (for paired data, N = 48). A survey given to students also indicated that most students were aware of their conceptual performance in physics, and that the main constraint on their studies was the difficulties associated with being online. As such, the change in pedagogy and the stresses of lockdown were not found to be suggestive of a depreciation of FCI gains and grades.

Keywords: Physics Education; Large Cohort Courses; Online Teaching; Force Concept Inventory; FCI.
1. Introduction
Many studies through the years have advocated for various changes to teaching pedagogy away from the so-called traditional lecturing pedagogy, such as flipped classrooms and peer-assessment [1, 2, 3, 4, 5, 6], to name but a few. Many of these studies have used assessment tools, such as the Force Concept Inventory (FCI) [7, 8, 9], to assess the efficacy of these changes, where such drastic changes have been evaluated over many years of study. The FCI is now considered a de facto standard for assessing the conceptual basis of Newtonian mechanics, with a gain of approximately 25% [10, 11] between tests taken before and after an introductory mechanics course serving as a benchmark for the standard performance of a course's approach. We can therefore ask what happens when the pedagogical approach is forcibly changed during the semester, with no forewarning to either the lecturers or the students.

Whilst this study has been conducted with only one year's worth of data, whereas, as mentioned above, analyses of such changes in pedagogy are usually conducted over several years, the 2020 lockdown experienced in South Africa as a result of the COVID-19 pandemic presented us with a unique opportunity to test the resilience of this assessment tool, as well as the effects such sudden changes could have on the usual gains in a first-year mechanics course.
As such, our study was conducted with a very diverse group of first-year students enrolled in the University of Johannesburg (UJ), Faculty of Engineering and the Built Environment (FEBE), whose demographics (average 2015-2019) are as follows: African 92.8%; White 3.8%; Indian 2.3%; Coloured 1.1% [12]. Yet for such a diverse background, the gains appear to have remained comparable to the benchmarked studies of the last three decades.

Given the necessary switch to online platforms, such as Blackboard [13], we are also able to study the engagement of students within this online learning environment (through their attendance and marks on continuous assessments). This also includes the time taken to complete various assessment tasks.

It may be considered somewhat surprising that even with the above-mentioned change in pedagogy, we found no appreciable change in the FCI gain across this course (see later comments). In seeking to break this down we shall look at a number of factors, including the previously studied gender gap [14, 15, 16, 17, 18, 19, 20], where previous studies have found a gender difference in favour of males in student performance on standardised assessments such as the FCI. However, our results indicate this does not seem to be the case here. Another aspect at play here could be increased peer scaffolding (along the lines of Mazur's peer evaluation [21]), as there had been an increased reliance on discussion groups with peers due to the COVID lockdown. This persisted into the second semester, as was highlighted by the perspectives of the mechanics lecturers, where a range of discussion boards, interactive tutorials, and WhatsApp groups were still used. As presented in Table 1, the average semester 2 marks for both 2019 and 2020, which include a combination of coursework, practicals and exams, were not appreciably different. For the course throughput, as explained in the table caption, a slightly lower semester mark was used in 2020 to gain entrance to the final exam.

Given these motivations, our paper is presented as follows: in section 2 we detail the methodology of our study; in section 3 we present the analysis tools and techniques used, along with supporting appendices; and we conclude in section 4.

                                                            2019    2020
no. of enrolled students (excluding cancellations)           349     404
average semester mark (%), all students                       54      63
average semester mark (%), only students qualified for exam   58      66
% students qualified for written exam                         75      93
average exam mark (%)                                         48      50
average course mark (%), all students                         45      55
average course mark (%), only students qualified for exam     53      58
course throughput (%)                                         49    67.5
Table 1:
Comparison of the 2019 and 2020 marks for engineering physics 1 in the second semester. Note that: 1) the course throughput is calculated after the main exam only, excluding the results of the supplementary exams; 2) in 2020 the entrance requirement for the exam was lowered to 30% for the theory part of the course (instead of the usual 40% in 2019 and previous years) in light of the COVID-19 pandemic.
2. Methodology
The subjects for our testing were the 2020 first-year cohort of engineering students at the University of Johannesburg. The class consisted of approximately 400 students. The course was initially taught as a traditional lecture-based course, with a weekly online assessment, fortnightly tutorials and fortnightly practicals (these being done in person in groups of approximately 30 students, with graduate students acting as tutors and demonstrators for the practicals). The academic year had begun in early February of 2020, and the pre-mechanics-course FCI test was conducted in late February on N = 256 students.† The end result of the FCI is characterised by the normalised gain G [10]:

G = (⟨%S_f⟩ − ⟨%S_i⟩) / (100 − ⟨%S_i⟩),   (1)

where %S_f and %S_i are the final and initial scores, respectively. We found that for this cohort (for paired data, N = 48) the average gain was 24%. We will comment further on these results in Sec. 3.1.

First we should note that South Africa was placed in a hard lockdown in mid-March 2020, and teaching was switched within the period of a few weeks to a purely online format. Lectures were replaced with recorded video content, and online platforms for engagement with students were employed (such as consultations using BlackBoard Collaborate Ultra, and WhatsApp discussion groups).

† From a cohort of 400 students, 256 sat the FCI, where for the pre- and post-test cases there were 144 and 166 students, respectively. After removing several redundant attempts, there were 256 data subjects in total. Those who took both tests (paired) numbered 48.
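The normalised gain of Eq. (1) is straightforward to compute from class-average pre- and post-test percentages. A minimal sketch in Python follows (the study itself used R and a spreadsheet); the score arrays are illustrative only, not the study's data:

```python
import numpy as np

def normalised_gain(pre_pct, post_pct):
    """Hake's normalised gain, Eq. (1):
    G = (<%S_f> - <%S_i>) / (100 - <%S_i>),
    where <.> denotes the class average of percentage scores."""
    pre_mean = np.mean(pre_pct)
    post_mean = np.mean(post_pct)
    return (post_mean - pre_mean) / (100.0 - pre_mean)

# Illustrative paired scores, in percent (not the study's data)
pre = np.array([30.0, 40.0, 25.0, 45.0])
post = np.array([50.0, 55.0, 45.0, 60.0])
print(round(normalised_gain(pre, post), 3))  # → 0.269
```

Note that G is computed from the class means, so it is bounded above by 1 (everyone reaching 100%) and can be compared directly across cohorts with different starting averages.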
As the easing of the lockdown occurred towards the end of the mechanics course, a post-FCI test could be administered to a smaller voluntary group of students, of which N = 48 had done the pre-test.

Note that online teaching continued in a similar manner into the second semester (where the physics syllabus covers electromagnetism and optics), where one of the authors was a lecturer for this course. As such, continued performance could also be assessed for these students over the entirety of their first-year physics studies, and an online survey could be conducted with these students on their perspectives of the course and their feelings of success or failure. It should also be noted that the authors of this study took an active part in the running, teaching and tutoring of the students. Hence, we can also provide the perspectives of the lecturers of this course, and those of the tutors involved, including the stresses of changing the teaching format mid-semester.

The methodology employed to unpack this collected data relied primarily on standard statistical parameters, including the mean, standard deviation, percent differences, p-values for the t-test difference of means, and correlations, computed through R [22, 23] and a spreadsheet.‡ Using our data from this cohort, in Sec. 3 we shall investigate:

(i) student performance from pre-test to post-test, including an analysis of their performance in the pre- and post-tests via a question breakdown, see Sec. 3.1,
(ii) the existence of a polarisation effect in 6 particular questions [25], see Sec. 3.2,
(iii) a possible gender difference in the FCI for paired data, see Sec. 3.3,
(iv) and a discussion of student and staff surveys, where, as is usual, we used a Likert scale [26], see Sec. 3.4.
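The core of this statistical toolkit, namely a difference-of-means t-test together with a non-parametric cross-check of the kind used in Appendix A, can be sketched as follows. This is a Python/SciPy rendering of the workflow (the study itself used R [22, 23] and a spreadsheet), and the data are synthetic, with group sizes and mean gains merely chosen to resemble the paired cohort:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic gains for two groups (illustrative only, not the study's data)
group_a = rng.normal(loc=0.38, scale=0.15, size=16)
group_b = rng.normal(loc=0.17, scale=0.15, size=32)

# Welch's t-test: difference of means, unequal variances assumed
t_stat, p_t = stats.ttest_ind(group_a, group_b, equal_var=False)

# Non-parametric check on medians: two-sample Wilcoxon rank-sum
# (implemented in SciPy as the Mann-Whitney U test)
u_stat, p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"Welch t = {t_stat:.2f}, p = {p_t:.3f}")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_u:.3f}")
```

The non-parametric test is worth running alongside the t-test whenever the groups are small or possibly non-normal, since it compares ranks (medians) rather than means.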
3. Results & Analyses
In this section we first present some analyses of the distribution of the scores for students in the pre- and post-tests (N = 256), starting in the left panel of Fig. 1, where the right panel is for the paired data (N = 48).‡

‡ Interested readers who wish to familiarise themselves with the basics of statistical analysis, including the t-test, correlation, ANOVA, and measures of variation (e.g. the standard deviation and standard error of the mean), among others, may want to consult Refs. [22, 24].
Figure 1:
A frequency histogram for the pre- and post-test data in total (N = 256) and for the paired data (N = 48).

                     Mean   SD    Min.  Max.
pre-test ALL         34.3   15.2    3    80
pre-test PAIRED      34.7   17.2    3    80
post-test ALL        44.1   22.8    0   100
post-test PAIRED     50.8   22.4   10   100

Table 2:
Mean, standard deviation (SD), minimum and maximum % marks as displayed in Fig. 1. The paired data consisted of a subset (N = 48) of the N = 256 who sat either the pre- or post-test.

The difference in distributions (pre- compared to post-test) shows well-defined shifts indicating a gain, particularly for the paired data. The results for the pre- and post-test scores can be seen in Table 2. The gain of G = 0.24 was checked with a t-test (N = 48) and was not due to random fluctuations at the 95% confidence level (p-value < 0.05).§

§ The means for the pre- and post-test groups were compared using an independent-samples t-test, and the difference was found to be not due to random fluctuations at the 95% confidence level (p-value < 0.05).
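Summary statistics of the kind reported in Table 2 reduce to a few lines of code. A sketch in Python (the marks below are illustrative, not the study's data):

```python
import numpy as np

def summarise(scores):
    """Mean, sample standard deviation, min and max of % marks (cf. Table 2)."""
    scores = np.asarray(scores, dtype=float)
    return {
        "mean": scores.mean(),
        "sd": scores.std(ddof=1),  # ddof=1 gives the sample SD
        "min": scores.min(),
        "max": scores.max(),
    }

marks = [3, 20, 35, 50, 80]  # hypothetical percentage marks
print(summarise(marks))
```

The `ddof=1` choice matters for small samples such as the paired group: NumPy's default (`ddof=0`) is the population SD, which underestimates the spread of a sample.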
Figure 2:
Moderate positive correlation between pre- and post-tests, for paired data N = 48. Best linear fit: PostTest = 27.4 + 0.66*PreTest, r = 0.504.

This relationship between pre- and post-test scores can also be seen in Fig. 2, where the correlation was found to be moderate and positive. It should be noted that the gain from the pre-test mean, as compared to the post-test mean, is not used to determine the gains [10], although we have performed a question-by-question breakdown. In terms of numbers, Pearson's correlation coefficient in Fig. 2 gives a moderate positive correlation of r = 0.504, with p-value < α at the 95% significance level (α = 0.05).
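The correlation and best-fit line of Fig. 2 correspond to an ordinary least-squares regression of post-test on pre-test marks. A sketch in Python with SciPy (the paired data themselves are not reproduced here, so the arrays below are illustrative; the study's actual fit was PostTest = 27.4 + 0.66*PreTest with r = 0.504):

```python
import numpy as np
from scipy import stats

# Illustrative paired marks in percent (not the study's data)
pre = np.array([10, 25, 30, 40, 55, 60, 75], dtype=float)
post = np.array([35, 40, 55, 50, 65, 80, 70], dtype=float)

# linregress returns slope, intercept, Pearson r and its p-value in one call
fit = stats.linregress(pre, post)
print(f"PostTest = {fit.intercept:.1f} + {fit.slope:.2f}*PreTest")
print(f"r = {fit.rvalue:.3f}, p = {fit.pvalue:.4f}")
```

For a simple linear regression, the p-value reported by `linregress` tests the null hypothesis of zero slope, which is equivalent to testing r = 0.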
[Two panels showing, per FCI question: the % of correct answers in the pre- and post-tests, their difference, and markers for questions where the majority chose the right answer; top panel for all students, bottom panel for paired students.]
Figure 3:
Top: Correct answer analysis for all the students (N = 256) who wrote pre- and post-tests. Bottom: Correct answer analysis for the paired students (N = 48) who wrote pre- and post-tests.

This may indicate that students answer from their mechanics course rather than empirical evidence gathered from daily life. Finally, poorer post-test performance in these more conceptual questions may demonstrate that students are not confident in their ability to apply their knowledge to unfamiliar scenarios. This may be a consequence of superficial learning or a dependence on preconceived ideas rather than physics. The presence or development of misconceptions may also have come into play. Additionally, we note that these questions might indicate an issue with language ability, a concern shared by other studies conducted in regions where English is not the first language of most students [25, 27].

Another way to investigate individual questions has recently been explored in Refs. [20, 25, 28], where a breakdown of the type of response to individual questions can reveal a polarisation between the correct answer and one predominantly incorrect answer.¶ In Figs. 4 and 5 we can see the effect of the polarising questions 5, 11, 13, 18, 29 and 30 in the FCI, e.g., see Ref. [25]. A similar pattern emerges for the cohort at UJ, where there clearly appears to be a subset of questions for which, aside from the correct answer, there is another polarising choice, and apart from Q18 in the "paired" data, Fig. 5, we find the same dominant incorrect (polarised) response [25].

It may be that certain misconceptions drive this polarisation. For example, consider question 5, where the dominant answer of C can be read from the pre-test data of Figs. 4 and 5. This answer claims that the motion of a ball is driven by gravity as well as "a force in the direction of motion", which indicates the common misconception that motion requires an active force. The presence of a force in the direction of motion is favoured also in answers 11C, 13C, and 18D, as well as implied in 30E. Though there is a general decrease from pre- to post-test in the selection of these erroneous answers, the post-test data of Figs. 4 and 5 suggest that these misconceptions can be difficult to alleviate, as we have found at UJ.

Such observations connect well to the work of Bani-Salameh [30] and others; we shall interrogate these ideas in greater depth in a future work, see Sec. 4. It can certainly be inferred that there are subsets of incorrect answers where misconceptions in students' understanding consistently lead to the same kind of wrong answer [25].

Group           Female         Male           Difference
                n = 16         n = 32
                Mean (SD)      Mean (SD)      (Male-Female)
Pre-test (%)    31.0 (12.3)    37.0 (19.4)     6.0
Post-test (%)   56.9 (25.2)    47.6 (20.5)    -9.3
Gain (%)        38.0           17.3           -20.7

Table 3:
Paired FCI results (in percent) for female and male participants (N = 48).

In this section we analyse the differences between the male and female participants who sat both the pre- and post-tests (paired). In Table 3 the means for the female and male participants are presented, and it clearly appears that the female participants performed better in the FCI than the male participants. Interestingly, as has also

¶ See Refs. [29, 27, 30, 31] for a discussion of other ways to analyse and interpret individual question responses in the FCI.

Figure 4:
Distribution of student answers for Questions 5, 11, 13, 18, 29 and 30, pre- (red) and post- (blue), for all N = 256 students. NA stands for "no answer", while the five options are labelled from A to E.

been found by Alinea & Naylor [20], the table shows that although male participants did better in the pre-test, female participants had a higher average in the post-test.

To further verify these results, given that the average number of participants in each group was 24, we have performed multiple statistical tests in Appendix A to confirm whether this difference in means is statistically significant. From this we have found that at the 95% significance level we can reject the null hypothesis. We emphasise that besides parametric tests for normal distributions, we also performed a non-parametric Wilcoxon test, which indicated a statistical difference in the medians; see App. A.

In comparison to Fig. 2, in Fig. 6 the correlations for the female and male groups separately were found to be mild and positive (p-values < 0.05). This can be compared to the combined correlation (independent of gender, N = 48) of r = 0.504 with p-value < α.

The apparently higher gain for the female part of the cohort might be due to professor-to-student gender matching: as found by Sadler and Tai [32] (also see Adams et al [33]), such matching was second only to the quality of the high-school physics course in predicting students' performance in college physics. It may be worth mentioning that at UJ, during the 2020 academic year, a female instructor was the senior academic for the mechanics course. As we discuss in Sec. 4, we will leave these preliminary results for a follow-up work.

Figure 5:
Paired distribution of student answers for questions 5, 11, 13, 18, 29 and 30, pre- (red) and post- (blue) tests, for the paired group (N = 48). The same conventions are used as in Fig. 4.

[Scatter plot of post-test (%) against pre-test (%) marks, with points distinguished by gender (female/male).]
Figure 6:
Correlations for combined scores in terms of gender.

In this section we briefly remark on student and staff surveys conducted at UJ in 2020 after the sitting of the FCI post-test. In this survey we used a Likert scale [26] for most of the data collection. From Adams et al. [33], interviews revealed that the use of a five-point scale in the survey, as opposed to a three-point scale, was important: students expressed that agree vs strongly agree (and disagree vs strongly disagree) was an important distinction, and that without the two levels of agree and disagree they would have chosen neutral more often.

One month after the FCI post-test was administered, students were asked to complete a feedback form detailing their experience with the FCI assessment tool. This survey was primarily designed to gauge student perspectives compared with the FCI test mark obtained (student numbers served as the only identifier, to preserve anonymity). As the survey was not compulsory (and this was during a hard lockdown), a rather small sample (22) of the 256 students opted to take the survey. Of these respondents, only 8 had taken either the pre- or the post-test, while the other 14 had written both. Already a pattern could be seen in terms of students that sat both tests, "paired", compared with those who sat only the pre- or post-test.

The average mark obtained by the students who sat both tests was 10.80 for the pre-test and 12.57 for the post-test (out of 30). When asked to rate their performance, most chose 3 (x̄ = 2.67) and 2 (x̄ = 2.14) on the Likert scale for the pre- and post-tests, respectively. Only one response exceeded 3: a student who had received 40% for the pre-test ranked their performance as 4. These results suggest that, for the most part, students are aware of their shortfalls and do not overestimate their ability.

Three survey questions directly inquired about the students' experience regarding the shift to online learning. The first, asking if students felt they engaged with lecturers/tutors/classmates more through digital platforms than through standard classes, was met with a mixed response: most selected 3, while ∼18% chose "strongly agree" and ∼23% chose "strongly disagree". Similarly, for the remaining questions, ∼18% selected "strongly agree". Given the South African context in general, and the UJ context specifically (where typically one third of the enrolled students are from Quintile 1 and 2 schools‖, see page 76 in Ref. [35], indicating a diffuse level of poverty), and that data is among the most expensive in the world [36], the difficulties of online learning can be especially pronounced.
‖ Public schools in South Africa are classified in so-called Quintiles, from 1 to 5. Quintile 1 schools comprise the poorest 20% of schools, Quintile 2 schools the next poorest 20%, and so on. Government funding is dispensed to public schools according to their Quintile classification, with the aim of redressing poverty and inequality in education, see Ref. [34].
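Summarising five-point Likert responses of the kind collected here amounts to tabulating counts per level and a mean per item. A sketch in Python, with made-up responses (the study's raw survey data are not reproduced here):

```python
from collections import Counter

# Hypothetical responses to one survey item on a 1-5 Likert scale
responses = [3, 2, 4, 3, 1, 5, 3, 2, 4, 3]

counts = Counter(responses)           # frequency of each level 1..5
mean = sum(responses) / len(responses)

for level in range(1, 6):
    print(f"{level}: {counts.get(level, 0)}")
print(f"mean = {mean:.2f}")
```

Reporting the full count distribution alongside the mean is the safer practice, since Likert data are ordinal and a mean alone can hide a polarised (bimodal) response pattern such as the "strongly agree"/"strongly disagree" split seen above.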
4. Concluding remarks
In this article we have used the Force Concept Inventory (FCI) to look at the conceptual understanding of a large cohort of physics/engineering students at the University of Johannesburg (UJ) during the 2020 academic year. Mid-semester, UJ went into lockdown and students switched from a traditional lecture format to online platforms, yet this led to the very informative scenario in which we found no overall drop in gains (G = 0.24).** This was further established through the comparison of 2019 and 2020 semester marks at UJ (see Table 1), where we found no appreciable drop in marks.

In Sec. 3.2 we looked at a subset of questions where a polarisation of choices occurred, in that either the correct answer or one main incorrect answer dominated the post-test responses [25]. We found that the students at UJ followed a very similar pattern to the data found in Refs. [25, 20, 28]. The importance of these questions relates to the fact that they ask the student to understand certain particular concepts in physics, such as circular motion and motion requiring a force.

In Sec. 3.3 we looked at a possible out-performance by female students in the overall gain in the FCI. As was also found by Alinea & Naylor [20], although the male group started with a higher average pre-test score, their gain was less. As mentioned earlier, the main course lecturer was female, which may point to professor gender matching in this cohort [32]. Although we rigorously checked that the difference in means was statistically significant (at the 95% confidence level, see App. A), we will report on a larger cohort, inclusive of 2021, in a forthcoming work.

The article has raised some questions, such as why the general performance of the group of paired students was higher than that of those who took only one of the pre- or post-tests. This is often attributed to the fact that diligent students are more likely to take both pre- and post-tests, and indeed this group of students was also more likely to take the survey. Usually, overall gains are taken only from paired data, which are then used to compare to other cohorts and institutions. However, the question of using "unpaired" pre- and/or post-test data sets in some form has not really been investigated in the literature (see, however, Ref. [27]), and we intend to comment more on this issue in future work.

As for other possible directions to investigate, besides extending the FCI to further years, which also appears to be disrupted by more COVID lockdowns, we intend to look at matriculation results in order to establish correlations between the FCI and high-school exit grades in physics, maths and English. In the case of UJ, the first languages of the students enrolled in the UJ FEBE are (average 2015-2019): English 14.3%; isiXhosa 5.9%; Afrikaans 1.5%; Other 78.3% [12].

** In terms of physics performance during COVID, see Refs. [38, 39, 40].
Acknowledgements
EC and ASC are supported in part by the National Research Foundation of South Africa (NRF). AC is grateful for the support of the National Institute for Theoretical Physics (NITheP), South Africa. We would like to thank all staff and students who took part in this study. WN would like to thank Margaret Marshman, University of the Sunshine Coast (USC), for useful discussions. The authors are also grateful to Allan L. Alinea (University of the Philippines) for his useful comments.
A. Statistical Analyses of Differences in Gender Means
In this appendix we look at the differences in gender means for the paired data (N = 48, comprising 16 female participants and 32 male participants). We found that the mean of the gains for female participants (µ_FG = 0.38) was greater than the mean for male participants (µ_MG = 0.17). To test whether this difference is statistically significant at the 95% level (α = 0.05), we performed the following:

(a) A t-test for independent samples (and unequal variances) gave a p-value < α for a single tail (directional difference µ_FG − µ_MG > 0).
(b) A two-way ANOVA gave an F-statistic of 4.511 with p-value < α.
(c) A linear regression analysis also led to an F-statistic of 4.511 and p-value < α.
(d) A non-parametric two-sample Wilcoxon test led to a statistical difference in the medians, with a W-statistic of W = 341 and p-value < α.

It may be worth mentioning that the two-way ANOVA with replication was unbalanced, as the two group sizes were different (16 and 32, respectively). However, we were able to double-check the results obtained by converting gender to a dichotomous variable (Female = 1, Male = 0) and using a linear regression. At the 95% significance level (α = 0.05) we can reject the null hypothesis whenever the p-value < α. The t-test, two-way ANOVA and linear regression analysis of items (a)-(c) all agree at the 95% significance level, and suggest that there is a statistically significant difference between the means of the female and male participants. This was further confirmed in item (d), where we performed a non-parametric test (using medians) and found p < α. These findings indicate a real difference in gender (for this group), with female participants having better gains than male participants, even though male participants started with a higher average pre-test score.

References