Towards digitalisation of summative and formative assessments in academic teaching of statistics
Nils Schwinning, Michael Striewe, Till Massing, Christoph Hanck, Michael Goedicke
University of Duisburg-Essen, Essen, Germany
{nils.schwinning, michael.striewe, till.massing, christoph.hanck, michael.goedicke}@uni-due.de

Abstract—Web-based systems for assessment or homework are commonly used in many different domains. Several studies show that these systems can have positive effects on learning outcomes. Many research efforts have also made these systems quite flexible with respect to different item formats and exercise styles. However, there is still a lack of support for complex exercises in several domains at university level. Although there are systems that allow for quite sophisticated operations for generating exercise contents, there is less support for using similar operations for evaluating students' input and for generating feedback. This paper elaborates on filling this gap in the specific case of statistics. We present both the conceptual requirements for this specific domain and a fully implemented solution. Furthermore, we report on using this solution for formative and summative assessments in lectures with large numbers of participants at a big university.
I. INTRODUCTION
Using computer assisted assessment (CAA) to face growing numbers of participants in university courses is a well-known concept. Especially web-based homework systems are very popular given decreasing numbers of teaching assistants that are able to grade homework manually. Studies show that using web-based homework instead of paper-based concepts does not lead to decreases in students' performances [8], [6], [4]. On the contrary, a majority of studies report a positive effect of using computer-assisted instruction in the classroom specifically for the domain of statistics [11].

Besides being useful for instruction and homework, some systems are even able to perform summative assessments such as tests or exams, which reduces the effort of required manual grading even further. Many systems offer the classical digital exercise formats such as multiple choice, fill-in or drop-down. Examples include Moodle (https://moodle.org/), LonCapa (http://lon-capa.org/) and Stack (https://stack.maths.ed.ac.uk/demo/), among others. Sometimes these formats can be combined to create an exercise consisting of several subtasks, or the systems allow authors to create exercises with variable content by using randomly created elements.

However, courses in higher statistics need to use more sophisticated exercise designs that cannot be easily transferred to such a system. Many of the typical assessments in the area consist of open tasks where students have to choose a certain strategy in order to solve the task correctly. A very important question is how these exercises can be digitalized without losing quality, which means asking questions without implicitly revealing the solution strategy. Therefore, the system to be used needs to allow a most flexible exercise design that enables authors to react adequately to submissions. This applies both to giving detailed feedback messages and to using wrong solutions for further calculations.
Furthermore, additional advantages of CAA can be used to help students increase their learning outcome. In particular, it would be beneficial to use parameterized exercises which give students the chance to work on the same problem several times. To be able to do all this, many complex statistical functions are needed. In order to avoid the effort of implementing these in an existing system, it seems reasonable to consider the use of a tool that already offers this functionality. The programming language R is free software widely used in the area of statistics, as it provides many built-in functions. Besides, its architecture makes it highly extensible, which means that users can install packages written by others or even write packages with their own functions. All these factors make it very attractive to use R together with CAA to realise a concept that allows the digitalisation of exercises for higher statistics.

This paper presents an approach to combine the requirements worked out above into an exercise type that allows us to digitalize exercises for higher mathematics using the example of statistics and thus to support university courses with large numbers of participants. To do this, we have made some major improvements to the e-assessment system JACK, which is in use at the University of Duisburg-Essen. A central new feature is the interaction with the statistical programming language R in exercises. This means that R cannot only be used for creating random parameters at the beginning of an exercise, but we can also send submissions to it and use it for evaluations. Furthermore, it is possible to evaluate submitted expressions with R and use these values in subsequent tasks of an exercise. We use the package RSERVE (https://rforge.net/Rserve/), which sets up a TCP/IP server that can be easily used from various programming languages such as JAVA or C/C++, to connect our system to R. Through this setup the approach can be transferred to various mathematical subdomains using other external systems with minimal effort.
The concept is in heavy use in statistics lectures with large numbers of participants at our university and is used for both formative and summative assessment types.

The remainder of this paper is organized as follows: Section II provides a brief review of existing approaches for the same problem and reports on their individual drawbacks in our context. Section III discusses the features that we equipped our exercise type with in order to use it in introductory courses at big universities, followed by some examples. A short description of the technical realization is given in Section IV. Experiences from practical use of the system are reported in Section V before the paper is closed with conclusions and future work in Section VI.

II. RELATED WORK
A. Tool Support for Teaching Statistics
One of the arguments for using electronic systems in the statistics classroom is to create higher involvement of students and thus to stimulate learning. A dedicated experiment on using a personal response system (PRS) as an additional tool in a statistics lecture is presented in [15] and reports both benefits and drawbacks. One specific limitation mentioned is the fact that formal exams in the course use open questions, while the PRS is based on multiple-choice questions.

A way to overcome this gap is to make use of another argument in favour of digitalisation of exercises: Tool support can be used to generate different variants of the same (open-ended) exercise. The most direct way to do this for exercises in higher statistics is to use R in combination with the package exams. This package is able to generate exercises from templates, producing output in PDF, HTML, or specific formats for some common e-learning systems [5], [16]. While this allows authors to use the full power of R, it comes with two major drawbacks with respect to our goals: First, the results are static, so a student attempting a specific exercise in an e-learning system will always get the same contents and hence has no benefit from using R in the generation process. Second, R is not involved in the evaluation of student input and hence also not involved in the generation of feedback or hints.

A different approach which is able to use the same power of calculations both for exercise content generation and evaluation was realized by the CAMPUS project [7] almost 20 years ago. It is entirely based on Microsoft Excel spreadsheets and incorporates the use of the
RAND() function to generate random instances of exercises. While this is an elegant solution that is useful both for homework and exam situations, it is naturally limited to the calculation capacities of Excel. These are significantly lower than those of R and are thus not entirely sufficient for exercises in higher statistics. Moreover, the approach requires all students to have access to Excel, and teachers to apply a lot of security measures to protect the spreadsheets against malicious attempts.
B. General CAA Systems for Mathematics
To overcome the limitations of tool support solely in exercise generation, CAA systems for mathematics can be used. A well-known CAA system for mathematics is ACTIVEMATH. It allows to generate variable content within exercises based on randomization [9], [3]. Randomization is realized by drawing a value from a set of admissible values for each variable, where admissible values may be numbers (both from sets and intervals), expressions, functions and alike. Further processing of the input is possible via the Computer Algebra System (CAS) used within ACTIVEMATH, but not by invoking an external system such as R. The same is true for the systems ALICE [12] and MAPLE TA [2], [1], which are based on MAPLE as a CAS and thus naturally limited to the features offered by that system. Consequently, there is no CAA system which already supports the requirements sketched in the introduction to full extent.

WIRIS offers components to be embedded into learning environments instead of being a CAA system on its own. This could be a promising approach to add the required features to an existing system. The combination of its quiz engine and CAS makes exercises programmable to allow for randomization. In particular, loops and conditional statements can be used to generate random content repeatedly until certain desired properties are met. While this allows the exercise designer to add virtually any functions not offered by the CAS directly, one loses the performance and quality benefits of specialized computer algebra systems and specialized software like R. The same is true for tools executing assessment items described in some standard format, like QTIWORKS (https://webapps.ph.ed.ac.uk/qtiworks/) for items defined in the QTI 2.1 format. In these cases, the combination of QTI 2.1 and the particular tool does not allow to use external software like R.

A general framework for math assessments supported by CAS is offered by the CABLE framework [10]. While it allows to use virtually any CAS or external system to generate variable values for exercise variants and to evaluate the student's input, its evaluation is limited to algebraic expressions. Consequently, it offers similar support as the tools for generating paper-based exercises, but only little support in automated evaluation and feedback. In particular, correctness of a solution is determined by testing whether the difference between an input and the model solution is zero. Hence exercises that cannot be assessed this way cannot be designed in the CABLE framework.
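This difference-is-zero style of correctness checking can be illustrated with a minimal numerical sketch. The Python code below is an illustration of the general idea, not CABLE's actual implementation (which works on algebraic expressions via a CAS); all names are ours:

```python
import math
import random

def equivalent(student, model, points=None, tol=1e-9):
    """Accept a submission iff the difference between the student's
    expression and the model solution evaluates to (numerically)
    zero at several sample points -- the CABLE-style criterion."""
    if points is None:
        points = [random.uniform(-5, 5) for _ in range(20)]
    return all(abs(student(x) - model(x)) < tol for x in points)

model = lambda x: math.sin(x) ** 2
ok    = lambda x: 1 - math.cos(x) ** 2   # equivalent rewriting: accepted
wrong = lambda x: math.cos(x) ** 2       # not equivalent: rejected

assert equivalent(ok, model)
assert not equivalent(wrong, model)
```

Such a criterion accepts any algebraically equivalent rewriting, but it also shows the limitation the text mentions: exercises whose correctness cannot be expressed as "difference equals zero" fall outside this scheme.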
III. THE E-ASSESSMENT SYSTEM JACK

As a general tool for e-assessment and automated tutoring, exercises in JACK are not limited to the domain of mathematics. Instead, exercises may contain multiple choice, fill-in, and drop-down elements for receiving input from the students. Thus a minimal exercise consists of some sort of task description, at least one element that receives input from students, and at least one feedback message. However, JACK offers some options for a more sophisticated exercise design and more detailed feedback messages, in particular for exercises with mathematical content, including the support of LaTeX, input with a formula editor and an evaluation of solutions using computer algebra systems. We will discuss these options in the following subsections.
1) Parameterization of exercises:
As a basic feature of JACK, tasks may be generic by using variables as placeholders. These placeholders are filled in dynamically, so the exercise presents different content to the student every time it is attempted. There is no limitation in where to use these variables, so the task description may vary, the number or content of multiple choice or drop-down options may vary, the expected correct results may vary, and so on. While this is of no immediate use for a single visit on a single exercise, it is very helpful when students work with the tutoring system for a longer time. In this case, they can work on the same exercise more than once, receiving different values within that exercise. Thus exercises remain challenging for a longer period of time. Moreover, it possibly helps students to understand the abstract concepts and encourages them to talk to each other about solution strategies instead of plain solutions. Parameterization even offers another benefit during summative assessments, as it helps to avoid plagiarism between students. Using R to parameterize exercises enables us to use the wide range of statistical functions the system offers. In particular, data sets on the basis of arbitrary random variables can be generated, which is a very useful feature. Moreover, JACK allows the use of plots created with R. Such a plot can even be based on a randomly generated data set, which introduces another type of variable element in an exercise: image variables.
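The parameterization workflow can be sketched in a few lines. The following Python snippet is a simplified stand-in for the R-backed mechanism described above (JACK delegates the actual drawing of parameters and data to R; all function and field names here are illustrative): one random draw feeds the task text, the data set, and the sample solution alike, so every attempt sees a different but internally consistent instance.

```python
import random
import statistics

def make_exercise_instance(seed):
    """Draw parameters and a data set, then derive the expected answers.

    The same random draw is used for the task text, the data, and the
    sample solution, so each attempt gets a fresh consistent instance.
    """
    rng = random.Random(seed)
    mu = rng.randint(5, 15)      # illustrative parameters for the task text
    sigma = rng.randint(1, 4)
    data = [rng.gauss(mu, sigma) for _ in range(30)]  # the "R-generated" sample
    return {
        "task": f"The sample below was drawn from N({mu}, {sigma}^2). "
                "Compute its sample mean and sample standard deviation.",
        "data": data,
        "solution": {
            "mean": statistics.mean(data),
            "sd": statistics.stdev(data),
        },
    }

# Two visits to the same exercise yield different instances:
a = make_exercise_instance(seed=1)
b = make_exercise_instance(seed=2)
assert a["data"] != b["data"]
```

In JACK itself the seed management and evaluation happen server-side; the point of the sketch is only that parameters, data, and expected answers must all be derived from one draw.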
2) Feedback Options:
As a very important consequence of splitting an exercise into steps, the student may receive detailed individual feedback for each step. A feedback in general consists of a score and a feedback message. According to the typology of Tunstall and Gipps [14], JACK thus provides both evaluative and descriptive feedback: The score is an integer number from a fixed range, based on a grading scheme provided by the author. Hence it is an evaluative feedback that provides a judgment and tells the student whether he was right or wrong. The feedback message may contain arbitrary content, including dynamically created graphics. In particular, it can refer to the student's input, reuse values from previous steps, and involve any kind of calculation. Hence it is a descriptive feedback that refers to the student's achievements and may provide guidance on how to improve a wrong solution.

We consider the latter kind of feedback as one of the central features of a tutoring system. It is intended to help the students comprehend where they made mistakes or where they were correct. To be able to do so, JACK has to understand the semantics of a solution. For this reason, CAS are used to evaluate solutions. On one hand, the CAS can verify the correctness of a solution, even if there are infinitely many correct ones. On the other hand, it is able to locate errors precisely and to compare a student's solution with a standard solution. This enables authors to provide granular feedback, evaluative as well as descriptive. Section IV explains in more detail how external systems can be used to provide feedback.

Let us consider an example exercise to illustrate how the system is able to give feedback. In the first step of the exercise the student has to compute the cumulative distribution function F(x) for a random variable given by its density function

f(x) = \frac{1}{\pi} \cdot \frac{k}{k^2 + (x - m)^2},

where k and m are randomly generated integers.
The right way to solve this exercise is to compute the integral

F(x) = \int_{-\infty}^{x} f(t)\,dt = \frac{1}{\pi} \cdot \arctan\left(\frac{x - m}{k}\right) + \frac{1}{2}.

Figure 1. Screen capture of the example exercise.
As we can see in Figure 1, the student submits his solution with the help of a formula editor, which provides many trigonometric and hyperbolic functions and thus allows him to enter the arctan function into the input text field. In case the submitted solution is correct, the system tells the student so and takes him to step 2 of the exercise, where he has to compute a quantile of the distribution. In addition to that, the author of the exercise has created a series of feedback messages for incorrect solutions. In that case, the student can use the feedback to improve his submission and redo the step. The feedback messages are the following:

1) The system checks whether the submitted solution depends on the required variable x. If this is not the case, a feedback message is given, telling the student to use x.

2) The system checks whether the arctangent function was used and provides a feedback message otherwise.

3) Most students probably compute the integral by substituting s = (t - m)/k or similar. It can then happen that they forget the factor k when replacing dt by ds. This would lead to the solution F(x) = \frac{1}{\pi k} \arctan\left(\frac{x - m}{k}\right) + \frac{1}{2k}. So in case this solution is submitted, the system will provide a feedback message telling the student to recheck his substitution.

4) If 1), 2) and 3) do not apply, the system checks whether the correct integration constant was used and provides a new step in case it is unequal to 1/2. The system recapitulates how the integration constant is determined and lets the student recompute it.

5) In case the arctangent function was used and the integration constant is correct, the system checks if the argument of the arctangent function is equal to (x - m)/k. A feedback message is given otherwise.

6) When none of the checks mentioned in 1)-5) applies, a default message is displayed, telling the student that his solution is wrong.

A student who is not able to solve step 1 of the exercise can ask for a hint by clicking the button provided for this purpose.
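An ordered rule cascade like the one above can be sketched compactly. The following Python code is an illustration, not JACK's implementation: the submission is modeled as a callable plus two syntactic flags, only a subset of the checks is shown, and the anticipated wrong solution is compared numerically at a few sample points.

```python
import math

def model_F(x, k, m):
    """Model solution: F(x) = arctan((x-m)/k)/pi + 1/2."""
    return math.atan((x - m) / k) / math.pi + 0.5

def wrong_F(x, k, m):
    """Anticipated error: factor k lost in the substitution step."""
    return math.atan((x - m) / k) / (math.pi * k) + 0.5 / k

def close(f, g, k, m, tol=1e-9):
    """Numeric agreement at a few sample points stands in for a CAS check."""
    return all(abs(f(x, k, m) - g(x, k, m)) < tol for x in (-3.7, 0.2, 5.1))

def feedback(sub, uses_x, uses_arctan, k, m):
    """Return the first feedback rule that fires, mirroring checks 1)-6)."""
    if not uses_x:
        return "Your solution has to depend on the variable x."
    if not uses_arctan:
        return "Think about which antiderivative involves the arctangent."
    if close(sub, wrong_F, k, m):
        return "Recheck your substitution: did you replace dt by k*ds?"
    if close(sub, model_F, k, m):
        return "Correct!"
    return "Your solution is wrong."

k, m = 2, 1
assert feedback(model_F, True, True, k, m) == "Correct!"
assert "substitution" in feedback(wrong_F, True, True, k, m)
```

The essential design point carried over from the exercise above is that specific, anticipated mistakes are tested before the generic correctness check, so the most informative message always wins.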
Authors can add multiple hints to a step, which are shown separately each time the student clicks the button. For step 1 of this exercise the author has supplied three different hints. The first hint is very basic and recapitulates how to determine the cumulative distribution function from the density function. The second hint reminds the student how he can find the integration constant, and the third hint tells him that arctan(x) is an antiderivative of 1/(1 + x^2). If none of these hints helps the student to find the correct answer, he can use the skip button to move forward to step 2 of the exercise. The system then reveals the solution and tells the student how to find it.

A. Further Examples
We will now present some examples from the lectures we used the system in, to illustrate the steps that were necessary to convert them from paper-based exercises into digital ones.

The example in Figure 2 introduces a more complex exercise type, dealing with hypothesis testing. In a paper-based test, exercises like this one would require a longer answer including several arguments and results from calculations. Hence it is not feasible to ask just for a single final result in an electronic version of this exercise. Instead, we decompose the exercise into five stages in JACK. On a technical level, each stage defines a stand-alone exercise, containing a task, feedback messages, hints, etc. Of course, these tasks are strongly related to each other by sharing the same context. The decomposition into several stages enables us to provide detailed feedback messages for each single task and to react differently to several possible flaws in each of these stages.

In the first stage we draw a sample of raw data and ask in a drop-down menu whether the question calls for a left, right or two tailed alternative hypothesis. Here, we use the possibility to draw more than one number from R at once for the sample. In case the exercise is used for summative assessment, it is advisable to give partial credit for follow-up mistakes if only stage 1 is wrong. Depending on the answer in this stage ("right tailed" is the correct solution) the exercise is path dependent. This means that even if the first stage is answered incorrectly, the user can proceed on this subpath, performing e.g. a left tailed test.

We then ask for the distribution of the test statistic, assuming the data to be i.i.d. normal, with some common distributions offered in the drop-down menu (question 2), and for the degrees of freedom of the null distribution (question 3). Stage 3 is only visible if the student selects the Student t distribution in stage 2.
If the normal distribution is chosen, the user is directed to stage 4, as it then makes no sense to ask for the degrees of freedom. The choice of stages is not restricted to using drop-down menus. Hence we can also proceed differently after stage 3, depending on the value typed into the input field by the student. Notably, experience has shown that students do not always submit numeric solutions. They may, for example, enter a "big O" instead of a zero. Hence, we also have to define a fallback stage in case the submission is not numeric, because otherwise we cannot make any computations with it in the next stage.

In stages 4 and 5 we finally ask for results from calculations, using two strategies to allow minor deviations in the actual numbers: First, we ask for the result with a precision of four decimals and thus allow to omit minor rounding differences. Second, we configure the exercise to allow a corridor of input around the precise correct solution. So if -1.2672 is correct as in stage 4, input in the range from -1.2682 to -1.2662 is accepted as correct. The size of the range can be adjusted individually for each stage.

In the sample session shown in Figure 2, the student skipped stage 4 and thus JACK shows the correct result instead of giving credits or other feedback. Fallback stages as mentioned above are necessary for such cases as well, as we cannot proceed with user input if a stage is skipped.

In summary, this example demonstrates how an exercise that offers a great amount of freedom in its paper-based version can be transformed into a digital version offering almost the same amount of freedom for the students. In particular, feedback can be given for flaws in different stages of the exercise, and students can continue also with wrong answers or even if they are not able to answer a particular stage at all.

Another example, using the graphics capabilities of R, is shown in Figure 3.
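The corridor strategy and the non-numeric fallback can be sketched together in a few lines. This Python snippet is illustrative only; in JACK these rules are part of the declarative exercise configuration rather than code, and the comma handling is our own assumption about tolerant input parsing.

```python
def grade_numeric(raw, correct, corridor=0.001):
    """Accept a submission if it parses as a number and lies within the
    configured corridor around the correct value; otherwise route it to
    a fallback stage (e.g. when a student types 'O' instead of 0)."""
    try:
        value = float(raw.replace(",", "."))  # tolerate decimal commas
    except (ValueError, AttributeError):
        return "fallback"                     # non-numeric input
    return "correct" if abs(value - correct) <= corridor else "wrong"

assert grade_numeric("-1.2672", -1.2672) == "correct"
assert grade_numeric("-1.2665", -1.2672) == "correct"   # inside the corridor
assert grade_numeric("-1.3", -1.2672) == "wrong"
assert grade_numeric("don't know", -1.2672) == "fallback"
```

Routing to "fallback" instead of simply scoring zero is the key point: a non-numeric value must never reach a later stage that tries to calculate with it.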
The plot used in the task description is created dynamically by R, based on random data following some pre-defined distribution. The same random data is also used to compute the correct solution. There is not much freedom in the exercise, as it just asks for some simple numbers. Nevertheless, the exercise becomes more interesting due to digitalization, as we are able to produce virtually hundreds or thousands of different plots. This includes drawing random data several times using the same distribution and parameters as well as drawing random data using different parameters.
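A plot-based exercise of this kind boils down to: draw a data set once, render it, and derive the asked-for numbers from the very same draw. The Python sketch below omits the actual plotting (done by R in JACK) and uses illustrative names; the correct answers here are bin counts a student would read off a histogram.

```python
import random

def histogram_exercise(seed, n=200, bins=(0, 2, 4, 6, 8, 10)):
    """Draw random data once; both the (omitted) plot and the correct
    answers -- here, histogram bin counts -- are derived from it."""
    rng = random.Random(seed)
    data = [10 * rng.random() for _ in range(n)]   # uniform on [0, 10)
    counts = [sum(lo <= x < hi for x in data)
              for lo, hi in zip(bins, bins[1:])]
    return data, counts

_, c1 = histogram_exercise(seed=7)
_, c2 = histogram_exercise(seed=8)
assert sum(c1) == 200   # every observation falls in exactly one bin
assert c1 != c2         # a fresh draw yields a different plot and solution
```

Because plot and solution share one draw, regenerating the exercise is as cheap as changing the seed, which is what makes "hundreds or thousands of different plots" practical.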
IV. TECHNICAL REALIZATION
The general framework for generic exercises in JACK, as presented already in previous publications, needed no specific extensions to be used with sophisticated exercises for higher statistics. However, the general architecture of JACK (as presented in [13]) was not yet completely prepared for the flexible integration and very frequent use of external systems like R. In particular, it is not feasible to simply start a new R process on the server every time exercise content needs to be generated or student input needs to be evaluated. This would first imply a large overhead on system load with respect to starting and stopping processes, and second would make it difficult to run more complex commands for feedback generation, such as creating plot images based both on exercise parameters and student input.

To avoid these difficulties, the architecture of JACK has been extended as follows: An instance of RSERVE is deployed alongside JACK on the server, which accepts multiple connections at once and is able to process several commands issued over one of the connections one after another. Each connection is associated with a dedicated workspace both in memory and on the disk, which keeps a persistent state as long as the connection is not closed. When a student starts to work on an exercise, a new connection to RSERVE is opened specifically for this student and this exercise and kept open until the student leaves the exercise. Hence there is less overhead for starting and stopping R processes. Furthermore, results from creating exercise parameter values or processing student input can be stored in the workspace and reused later on in the same exercise when processing further input or creating more parameter values. Once the student leaves the exercise, the connection is closed and all workspace content is deleted. Consequently, the student can start with completely new parameter values when trying the same exercise again.

Figure 2. Screen capture of a hypothesis testing exercise realized in five stages.
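The connection/workspace lifecycle just described can be mimicked with a small session manager. The Python class below is a sketch of the bookkeeping only; the real implementation speaks the Rserve protocol and holds actual R workspaces, and all names here are illustrative.

```python
class WorkspacePool:
    """One persistent workspace per (student, exercise) pair, opened on
    first access and discarded when the student leaves -- mirroring how
    JACK keeps one Rserve connection open per exercise session."""

    def __init__(self):
        self._sessions = {}

    def workspace(self, student, exercise):
        # Reuse the existing workspace so later stages can read values
        # stored by earlier stages (parameters, intermediate results).
        return self._sessions.setdefault((student, exercise), {})

    def close(self, student, exercise):
        # Leaving the exercise drops all state; the next visit starts
        # with freshly generated parameter values.
        self._sessions.pop((student, exercise), None)

pool = WorkspacePool()
ws = pool.workspace("alice", "ex1")
ws["k"] = 2                                        # stored in stage 1 ...
assert pool.workspace("alice", "ex1")["k"] == 2    # ... reused in stage 2
pool.close("alice", "ex1")
assert "k" not in pool.workspace("alice", "ex1")   # fresh state on re-entry
```

The design choice worth noting is the key: scoping state to (student, exercise) rather than to the student alone is what lets one student work on several exercises concurrently without the workspaces interfering.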
V. USAGE SCENARIOS AND EVALUATION
The University of Duisburg-Essen offers two large lectures for statistics that are supported with JACK. These are the courses for descriptive and inductive statistics held by the faculty of economics. Both are attended by up to 700 students per semester. The concept of using CAA is the same in both courses. We use formative and summative assessments during the semester and even offered electronic exams as a voluntary alternative to the normal exam.

The formative assessments are intended to replace the classical paper-based exercises that students often have to do in traditional university courses with mathematical content. Summative assessments during the semester are offered as small tests every other week. Students can gain bonus points for the final exam in each of the tests. These summative assessments shall motivate them to start learning earlier in the semester. The tests are taken from home, which enables the participants to work on them collaboratively and to use all kinds of resources to help them. Therefore, the bonus points are only added to the results of those students who have passed the exam. Two electronic exams are offered at the end of the semester and complement two paper-based exams. Students are free to choose which type of exam they prefer. Experiences show that both types of exams are equally accepted by the students.

Figure 3. Screen capture of an exercise using automated plot generation.
Table I. Usage figures for the exercises created for the use in the lectures of descriptive and inductive statistics.
The different ways of using exercises for different purposes have a direct effect on the exercise design. Exercises created for the formative assessment type need to be provided with detailed hints and feedback messages in order to guarantee a good learning outcome. Stages that are not solved correctly either have to be repeated, or the task can be skipped manually.

Exercises used in summative assessments need to fulfil different requirements than those used in formative assessments. Hints and feedback messages are not shown to the students during the test, so it is not necessary to create them. However, repetitions of stages should not be allowed in test exercises, since this behaviour could reveal information about the correctness of the submission. Furthermore, we have to make sure that the exercises are able to deal with consequential errors when students have to use their input from previous stages for calculations in the following ones. This feature needs to be used very carefully, because experience shows that submissions may not be of the expected type, which can make it impossible to use them in further stages. For example, doing calculations with user input is impossible if a student just entered "don't know" instead of a number. To handle these cases, the exercise author can define a so-called fallback stage that is used instead.

In addition to the requirements for summative assessments, exercises created for the electronic exams also require a lot of testing and considerations on how to grade them. We have to spend a good amount of time on predicting possible erroneous submissions that are still worth some points.

The teachers of the two lectures and their assistants have to create the exercises themselves, as they are the only ones having the required domain knowledge. We offer small workshops where we teach them how to deal with the system. Experiences show that they can start creating exercises themselves very quickly.
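Grading with consequential errors means re-deriving the expected answer of a later stage from whatever the student actually submitted earlier. The Python sketch below illustrates this for a right-tailed test decision; it is our own simplified stand-in, not JACK's grading configuration, and the stage layout, critical value and tolerances are assumed for the example.

```python
def grade_with_follow_up(stage4_input, stage5_input, t_correct, crit=1.645):
    """Stage 5 asks for the test decision based on the statistic from
    stage 4.  Full credit for stage 5 is given whenever the decision is
    consistent with the student's *own* stage-4 value, so a stage-4
    error is not punished twice.  Non-numeric input hits the fallback."""
    try:
        t_student = float(stage4_input)
    except ValueError:
        return {"stage4": 0, "stage5": 0, "note": "fallback stage"}
    stage4_ok = abs(t_student - t_correct) <= 0.001
    expected_decision = "reject" if t_student > crit else "do not reject"
    stage5_ok = stage5_input == expected_decision
    return {"stage4": int(stage4_ok), "stage5": int(stage5_ok)}

# Wrong statistic, but a decision consistent with it: stage 5 still scores.
r = grade_with_follow_up("2.10", "reject", t_correct=1.23)
assert r == {"stage4": 0, "stage5": 1}
```

The try/except branch is exactly the situation described in the text: once a submission is not of the expected type, no later calculation with it is possible, so a fallback result has to be produced instead.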
Consequently, a large exercise pool has been created for both lectures, even though creating a well-tested complex exercise can take up to 8 hours of time. Table I gives a brief summary of the created exercises for the different assessment types and their usage numbers. Especially shortly before the summative assessments we can observe a lot of traffic on the system, which shows that students take the opportunity to practice. As most of the exercises for the formative assessments are parameterized, students tend to do them more than once. Furthermore, the participation numbers in the electronic exams are equal to those of the paper-based exams. As students are free to choose which exam they would like to take, this shows the acceptance of the system among them. The data we received shows that those students who worked the most during the semester appear to have the best performances in the exams. Therefore, we intend to use the learning behaviour as a predictor for the outcome in the exam. However, a detailed analysis is subject to further research. Nevertheless, all together we can conclude that our approach is heading in the right direction.
VI. CONCLUSIONS AND FUTURE WORK
In this paper we introduced a flexible exercise type that allows the digitalisation of complex exercises used in academic teaching. We worked out and implemented the requirements that have to be fulfilled in order to be able to do this properly and without loss of quality. To illustrate how the concept can work in practice, we used the example of statistics. In particular, we connected our system to the domain-specific software R and created a large number of exercises. Most of these exercises are parameterized, which we consider another huge benefit of using CAA. The fact that the exercises we created are used within courses for statistics at our university shows that our concept is feasible. However, the presented approach can be applied to any other domain using complex exercise designs. The architecture of our system allows us to easily connect it to other expert systems comparable to R. We have seen how such a system can be used to evaluate submissions.

In the domain of statistics we see further challenges that need to be overcome. The mentioned courses also teach students how to use R to perform complex computations. A new exercise type that we are working on will be able to grade these small R programs automatically and to offer detailed feedback messages. We will integrate this new exercise type into our concept of supporting the courses with CAA.

With respect to the technical realization, there is only one further action planned so far: The RSERVE instance should be moved to a separate server, so that it can be used from different JACK frontend or backend instances at the same time. While this would require a more sophisticated load balancing concept on this dedicated server to avoid overloading it with requests, it would make it much easier to make new R features available for all JACK instances at the same time. Load balancing could happen by using DOCKER to spawn several instances of RSERVE if necessary.
REFERENCES

[1] Software Solutions to Enhance Statistics Education. Maplesoft Technical Whitepaper, 2015.
[2] Bill Blyth and Aleksandra Labovic. Using Maple to implement e-learning integrated with computer aided assessment. International Journal of Mathematical Education in Science and Technology, 40(7):975–988, 2009.
[3] Giorgi Goguadze. ActiveMath – Generation and Reuse of Interactive Exercises using Domain Reasoners and Automated Tutorial Strategies. PhD thesis, Universität des Saarlandes, 2011.
[4] Tolga Gok. Comparison of student performance using web- and paper-based homework in large enrollment introductory physics courses. International Journal of the Physical Sciences, 6(15):3778–3784, August 2011.
[5] Bettina Grün and Achim Zeileis. Automatic generation of exams in R. Journal of Statistical Software, 29(1):1–14, 2009.
[6] Sherry Herron, Rex Gandy, Ningjun Ye, and Nasser Syed. A Comparison of Success and Failure Rates between Computer-Assisted and Traditional College Algebra Sections. Journal of Computers in Mathematics and Science Teaching, 31(3):249–258, July 2012.
[7] Neville Hunt. Computer-aided assessment in statistics: the CAMPUS project. Research in Learning Technology, 6(2), 1998.
[8] Journal of Computers in Mathematics and Science Teaching, 29(3):233–246, August 2010.
[9] Erica Melis, Eric Andres, Jochen Büdenbender, Adrian Frischauf, Giorgi Goguadze, Paul Libbrecht, Martin Pollet, and Carsten Ullrich. ActiveMath: A Generic and Adaptive Web-Based Learning Environment. International Journal of Artificial Intelligence in Education, 12(12):385–407, 2001.
[10] Laura Naismith and Christopher J. Sangwin. Computer Algebra Based Assessment of Mathematics Online. In Proceedings of the 8th Annual CAA Conference, pages 235–242, Loughborough University, UK, 2004.
[11] Giovanni W. Sosa, Dale E. Berger, Amanda T. Saw, and Justin C. Mary. Effectiveness of computer-assisted instruction in statistics. Review of Educational Research, 81(1):97–128, 2011.
[12] Neil Strickland. Alice Interactive Mathematics. MSOR Connections, 2(1):27–30, 2002.
[13] Michael Striewe. An architecture for modular grading and feedback generation for complex exercises. Science of Computer Programming, 129:35–47, 2016.
[14] Pat Tunstall and Caroline Gipps. Teacher Feedback to Young Children in Formative Assessment: a typology. British Educational Research Journal, 22(4):389–404, 1996.
[15] Ernst Wit. Who wants to be… The use of a personal response system in statistics teaching.