Gradeer: An Open-Source Modular Hybrid Grader

Benjamin S. Clegg∗, Maria-Cruz Villa-Uriol∗, Phil McMinn∗ and Gordon Fraser†
∗University of Sheffield, †University of Passau
Abstract—Automated assessment has been shown to greatly simplify the process of assessing students' programs. However, manual assessment still offers benefits to both students and tutors. We introduce Gradeer, a hybrid assessment tool, which allows tutors to leverage the advantages of both automated and manual assessment. The tool features a modular design, allowing new grading functionality to be added. Gradeer directly assists manual grading, by automatically loading code inspectors, running students' programs, and allowing grading to be stopped and resumed in place at a later time. We used Gradeer to assess an end-of-year assignment for an introductory Java programming course, and found that its hybrid approach offers several benefits.
I. INTRODUCTION
The demand for Computer Science and Software Engineering education has continued to increase over recent years, with educational institutions seeing larger cohorts of students enrolled in such courses [1]. As technology further advances, future generations of students will drive this demand further, with universities and schools facing several challenges in teaching a growing number of students. One of these challenges is the assessment of a large number of students' solutions to programming tasks. Assessment is particularly important, since it both has the ability to further students' development through the provision of detailed feedback, and serves to measure a student's understanding of a topic.

Automated grading and feedback techniques offer several benefits in assessing large numbers of students. Their automated nature allows users to perform other tasks while grading is executed. It is also often much quicker to run a series of automated processes than to manually assess individual students' solution programs. This is especially important for courses with large numbers of students, where manual assessment would consume too much time, and manual feedback could be provided too late to be of relevance to students' learning. In addition, automated feedback allows for a large amount of feedback to be generated, and providing more pieces of automated feedback has been shown to improve students' performance [2]. Automated grading is also more consistent than manual grading, especially if students' solutions are assessed manually by multiple people [3], which would likely be necessary to improve assessment times.

There are, however, some issues with the use of automated assessment alone. There is a significant initial time cost of using automated assessment, with the need to either develop or configure a tool before assessment can be performed. Additionally, with the exception of test-based systems, tutors may find it difficult to adapt an automated assessment system to meet their requirements [4]. Similarly, there is a wide range of unique automated assessment approaches [5]–[10], some of which may be suited to certain tasks, but which would require a significant degree of effort to combine into one grading tool. Automated assessment also lacks some of the benefits of manual approaches. Manual assessment has the ability to capture aspects of grading that are hard to automate, such as the usefulness of variable names, or the appearance of a GUI. There is also evidence that manually provided feedback is of greater benefit to students' performance than automatically generated feedback [11].

In this paper, we introduce Gradeer, a hybrid modular grading system, with the goal of providing the benefits of both approaches while mitigating their challenges (Section II). We used Gradeer to assess an end-of-year assignment for an introductory programming course (Section III). We found that the tool's hybrid approach allowed for the use of a large number of consistent automated assessment criteria, and aided in the provision of detailed manual feedback to students. Gradeer also provides a degree of automation to assist tutors in manual assessment, such as automatically launching students' programs and code inspectors. We found that these features saved us a considerable amount of time when manually assessing students' solutions. The modular nature of grading components allows a variety of automated grading techniques to be used in conjunction with one another, while minimising the effort required to combine their results. Gradeer is available on GitHub under the GPLv3 license, which allows users to write their own extensions and integrations for the tool [12].

II. THE Gradeer GRADING TOOL
Gradeer is an assessment tool which provides tutors with the benefits of both automated and manual assessment in a single package. The tool achieves this using a modular design, allowing a user to choose how to assess a programming task using simple configuration files, or even to define their own modules for specific purposes. To allow for manual assessment, Gradeer is designed to be used by tutors on personal computers, where the user can interact with the program via a CLI. It is, however, possible for Gradeer to be integrated with a GUI or web interface. Gradeer is implemented in Java, and allows for the assessment of Java programs. Wider language support is planned for future versions of the tool. This section describes our design of Gradeer, alongside some of its benefits.
A. Checks
We designed Gradeer with a focus on modular grading components, called checks, each of which represents a single grading criterion. Different types of checks are currently implemented, defining how a criterion's base score (a decimal value between zero and one) can be determined for a given process and student's solution. Various checks of different types can be used together in a single run of Gradeer, constructing a mark scheme to assess several learning outcomes. For example, users can configure Gradeer to use multiple checks to run various test suites, perform static analysis, and manually assess several aspects of a solution. Users configure their checks in JSON files. Users can also implement new checks to add the functionality of unique and domain-specific grading tools.
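As an illustration of such a bespoke check, consider the following minimal sketch. Note that the Check supertype, the Solution type, and the method names shown are placeholder assumptions for the purpose of illustration, not necessarily Gradeer's actual extension API.

    // Illustrative sketch of a user-defined check. Check, Solution, and
    // getSourceFiles() are assumed names, not necessarily Gradeer's real API.
    public class TodoCommentCheck extends Check {
        // Awards a full base score only if no source file still contains
        // a leftover TODO comment.
        @Override
        public double calculateBaseScore(Solution solution) {
            boolean hasTodo = solution.getSourceFiles().stream()
                    .anyMatch(file -> file.readContents().contains("TODO"));
            return hasTodo ? 0.0 : 1.0;
        }
    }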
One currently implemented type of check is the TestSuiteCheck, which executes a given JUnit test class on a student's solution via Apache Ant [13], then calculates a score as the proportion of tests that pass. Tutors can assess individual learning outcomes by grouping tests that evaluate the same outcome into one class.

We also implemented check types for two static analysis tools, Checkstyle and PMD [14], [15], in order to automatically assess the code quality of students' solutions. Such checks search the output of their respective tool for a user-defined rule violation. The number of violations in each source file of a solution is recorded and used to compute a base score. Users can also define a minimum and maximum number of violations, which yield base scores of one and zero, respectively.
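Concretely, this mapping can be sketched as follows; the linear interpolation between the two user-defined bounds is our reading of the behaviour described above, rather than Gradeer's documented implementation.

    // Sketch: convert a violation count into a base score, given the
    // user-defined bounds. Counts at or below minViolations yield 1.0,
    // counts at or above maxViolations yield 0.0; the linear scaling
    // in between is an assumption.
    public static double baseScore(int violations, int minViolations, int maxViolations) {
        if (violations <= minViolations)
            return 1.0;
        if (violations >= maxViolations)
            return 0.0;
        return 1.0 - (double) (violations - minViolations)
                   / (maxViolations - minViolations);
    }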
To support manual assessment, we have implemented a ManualCheck type, which displays a user-defined prompt and score limit to the user when executed. This check then parses numeric input from the user and normalises it to a score in the range of zero to one. For example, the following response would produce a base score of 0.6:

    How informative are the variable names?
    (0 = very poor, 10 = excellent)
    > 6
Each check has an associated weight: a score multiplication factor that allows a check to have a greater or smaller impact on each solution's overall grade, as discussed in Section II-B4. This weight can be defined by the user. In addition, each check has associated feedback to provide to a student for their solution. For most checks, this feedback is determined by mapping a base score to one of several feedback values that have been pre-defined by the user. For example, the above manual check may provide students with feedback for the base score bs:
• 0.8 ≤ bs ≤ 1.0: "Your variable names are informative."
• 0.5 ≤ bs < 0.8: "Some of your variable names could be more informative."
• 0.0 ≤ bs < 0.5: "Most of your variable names could be more informative."
Manual checks can also read text input from the user, allowing for additional feedback to be provided on an individual basis. For example, a tutor may enter "a is not an informative variable name; leftMotor would be better."
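This mapping amounts to a lookup over user-defined score bands. A minimal sketch follows, using the illustrative thresholds above; the FeedbackBands class and its methods are hypothetical, not part of Gradeer's API.

    import java.util.TreeMap;

    // Sketch: map a base score to one of several pre-defined feedback
    // messages. The band thresholds mirror the example above.
    public class FeedbackBands {
        // Maps the lower bound of each band to its feedback message.
        private final TreeMap<Double, String> bands = new TreeMap<>();

        public FeedbackBands() {
            bands.put(0.0, "Most of your variable names could be more informative.");
            bands.put(0.5, "Some of your variable names could be more informative.");
            bands.put(0.8, "Your variable names are informative.");
        }

        public String feedbackFor(double baseScore) {
            // floorEntry selects the band whose lower bound is closest
            // below (or equal to) the base score.
            return bands.floorEntry(baseScore).getValue();
        }
    }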
B. Execution

Figure 1 shows an overview of Gradeer's execution process.
1) Compilation & Check Loading: First, Gradeer compiles every student's solution and every model solution (Section II-B2). At this stage, any solutions which do not compile are flagged as such. These solutions are reported to the tutor for review, and are excluded from further execution. Next, Gradeer loads every check defined in the JSON files. The tool also compiles the test classes that are provided by the user. If enabled, Gradeer automatically generates a test suite check for each test class which does not have a matching check already defined by the user.
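The compile-and-flag step can be approximated with the standard javax.tools API, as in the following sketch; this is an illustrative stand-in, not Gradeer's actual implementation (which, for tests at least, drives execution through Apache Ant).

    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    // Sketch: compile a solution's source files, returning false so the
    // solution can be flagged and excluded if compilation fails.
    public static boolean compiles(String... sourcePaths) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // run(...) returns 0 on success; non-zero indicates a compile error.
        return compiler.run(null, null, null, sourcePaths) == 0;
    }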
2) Model Solution Execution: The user can supply a set of one or more model solutions: entirely correct solutions to the programming task being assessed. Users can choose to use multiple model solutions to define different correct implementations of the programming task. In order to identify and remove invalid checks, Gradeer executes every check on each provided model solution. Checks which attain a base score of less than one on any of the model solutions are considered to be invalid, and are removed; they falsely claim that a model solution is partly or completely incorrect. This prevents invalid checks from being used in the assessment of students' solutions, preventing students from unfairly losing or gaining grades, or being given inaccurate feedback. For example, uncompilable test classes will not pass on any solutions, so their checks are removed. The names of invalid checks are stored in a file for the tutor to review and correct.
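This validation rule amounts to a simple filter over the configured checks, sketched below with Check and Solution again as placeholder types rather than Gradeer's real API.

    import java.util.List;
    import java.util.stream.Collectors;

    // Sketch: keep only checks that award a full base score on every
    // model solution; all others are invalid and must be reviewed.
    public static List<Check> validChecks(List<Check> checks,
                                          List<Solution> modelSolutions) {
        return checks.stream()
                .filter(c -> modelSolutions.stream()
                        .allMatch(m -> c.calculateBaseScore(m) >= 1.0))
                .collect(Collectors.toList());
    }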
3) Solution Grading (for each Student's Solution):

a) Pre-checks: In order for some checks to function properly, a series of pre-checks are executed on each solution. For example, checks for Checkstyle and PMD require pre-checks which execute their corresponding static analysis tool on the solution under test and store its output in memory.

b) Solution Inspection: To support effective manual grading, Gradeer includes a solution inspector which can perform two processes, as configured by the user. The first executes a student's solution in a separate thread before running any manual checks. This allows the user to interact with the solution, and to observe its user interface, which may be relevant to the rubric of manual checks. The solution execution thread is closed following the completion of every manual check on a given solution. The second opens each of the solution's source files in an external user-defined text editor, such as Atom. This allows the user to perform manual code inspection, for example to determine the quality of variable names or comments. The solution inspector removes the need for the user to manually run a student's solution to interact with it, or open its source files to inspect it, saving time.
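Both inspector processes can be sketched with Java's standard process and thread APIs; the entry-point class name and the editor command below are assumptions for illustration.

    import java.io.File;
    import java.io.IOException;

    // Sketch of the solution inspector: run the student's program on a
    // background thread, and open its sources in an external editor.
    // The Main class name and the "atom" command are assumed placeholders.
    public static void inspect(File solutionDir) throws IOException {
        // 1. Launch the solution so the tutor can interact with it.
        new Thread(() -> {
            try {
                new ProcessBuilder("java", "-cp", solutionDir.getPath(), "Main")
                        .inheritIO().start().waitFor();
            } catch (IOException | InterruptedException ignored) {
            }
        }).start();

        // 2. Open each source file in the configured text editor.
        File[] sources = solutionDir.listFiles((dir, name) -> name.endsWith(".java"));
        if (sources != null)
            for (File src : sources)
                new ProcessBuilder("atom", src.getPath()).start();
    }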
c) Checks: The final step of a solution's grading process is to run every check on it. In order to improve execution time, Gradeer runs automated checks in parallel by default. Manual checks are only executed in the main thread, however, as they require user input, and running them in parallel could otherwise result in race conditions. In order to prevent some JUnit checks from taking too long to execute, Gradeer has a user-configurable global test timeout; any tests that take longer than this time are treated as failing. This is particularly important, since some students' solutions may contain bugs that prevent them from halting, such as incorrect loop conditions.
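The combination of parallel execution and a global timeout maps naturally onto java.util.concurrent, as in the sketch below; Check, Solution, and recordScore are placeholder names, and treating a timed-out check as failing follows the behaviour described above.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;

    // Sketch: run automated checks in parallel, treating any check that
    // exceeds the global timeout as failing (base score zero).
    public static void runChecks(List<Check> checks, Solution solution,
                                 long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        List<Future<Double>> results = new ArrayList<>();
        for (Check check : checks)
            results.add(pool.submit(() -> check.calculateBaseScore(solution)));
        for (int i = 0; i < checks.size(); i++) {
            try {
                checks.get(i).recordScore(solution,
                        results.get(i).get(timeoutSeconds, TimeUnit.SECONDS));
            } catch (TimeoutException | ExecutionException | InterruptedException e) {
                results.get(i).cancel(true);
                checks.get(i).recordScore(solution, 0.0); // treat as failing
            }
        }
        pool.shutdown();
    }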
Gradeer has a userconfigurable global test timeout, where any tests that takelonger than this time are treated as failing. This is particularly odelSolutionsUnit TestsCheckConfigsStudents'Solutions CompilerCompilerCompiler PreprocessorsPre-checksPreprocessorsPreprocessorsPreprocessorsCheckGenerators CheckExecutor ValidChecks SolutionInspectorStateRestoration StoredCheckResults Grade &FeedbackCSVsGradeCalculator& OutputWriterChecks CheckExecutorCheckResultsPreprocessorsPre-checksStudents'SolutionsModelSolutionsCompilation & Check Loading Model Solution Execution Solution Grading Output
Fig. 1: Overview of
Gradeer’s flow of execution. The dotted areas indicate different phases of the execution. Waved boxes arefiles, parallelograms are internal data, and regular boxes are processes.important, since some students’ solutions may contain bugs thatprevent them from halting, such as incorrect loop conditions.
4) Output: After executing every check on every solution, Gradeer stores the appropriate grades and feedback for each solution in various CSV files. The final grade of each solution is stored in one CSV file. This grade is calculated by:
\[
\text{Grade}(s) = \frac{\sum_{c \in C} w(c) \cdot \text{BaseScore}(c, s)}{\sum_{c \in C} w(c)}
\]

where s is the student's solution, C is the set of enabled checks, and w(c) is the weight of check c.
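As a hypothetical worked example, consider a solution assessed by a test suite check with weight 3 and base score 0.5, and a manual check with weight 1 and base score 0.9:

\[
\text{Grade}(s) = \frac{3 \cdot 0.5 + 1 \cdot 0.9}{3 + 1} = \frac{2.4}{4} = 0.6
\]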
Similarly, the combined feedback of each solution across all checks is also stored in a CSV file. Gradeer also stores a CSV file with the individual base scores and feedback of every check for each solution. This file also includes the weight of each check. This allows for final changes to be made in spreadsheet software if absolutely necessary. For example, the user can post-process the students' grades by adjusting the checks' weights, and recalculating the final grades in the same manner as Gradeer. Users can also gather valuable information on students' performance for the grading criteria, facilitating the provision of group feedback to the entire student cohort.
C. State Restoration
Following the completion of checks on a solution, Gradeer stores the results and feedback for every check in a JSON file. When Gradeer is executed with such files present, it uses them to restore these check results for every applicable solution, and skips the corresponding checks when processing these solutions again. This has numerous advantages:
• A tutor can effectively pause the grading process and come back to it at a later time. This is particularly advantageous when using manual checks, as programming tasks with many students' solutions can take hours to manually assess. State restoration allows this arduous process to be split into more manageable marking sessions.
• Assessment tasks can be allocated to multiple users, such as TAs. Tutors can adjust users' Gradeer configurations to use different solutions or checks. By allocating different manual checks to different users, grading can be completed more quickly without reducing consistency. By merging the users' JSON files and re-running Gradeer, the final grades and feedback can be generated.
• If Gradeer halts unexpectedly, perhaps due to a wider system error, the user's grading progress is not lost.
• Tutors can either directly modify the result files to adjust the results of individual checks, or delete them outright to re-assess a solution. Running Gradeer again will update the final output files (as described in Section II-B4). Tutors can also choose to add new checks after initial executions of the tool to capture additional assessment requirements.

III. CASE STUDY
In this section, we discuss our application of Gradeer in an end-of-year introductory Java programming assignment with 171 students' solutions.
A. The Assignment
The assignment required students to parse a series of structured input files into a provided data structure, then implement a set of methods that query this data. The assignment also required students to plot graphs using this data in a GUI using Java's Swing library. A primary goal of the assignment was to provide students with experience in working on a multi-faceted project with codependent systems, which are more akin to real software than the simpler introductory programs used earlier in the course. As an end-of-year assessment, the assignment had a fairly wide span of learning outcomes. These learning outcomes included the use of polymorphism, bespoke data structures, the choice and use of various Java Collections, text manipulation, GUI programming, algorithm design, and the use of good quality code and programming style.

We first determined the overall assignment specification, then focused on creating a model solution that captured this specification. We then created a set of grading unit tests, ensuring that they were valid and that the model solution passed each of them. Following this, we duplicated the model solution to create a skeleton project, from which we removed the classes and methods that students were to implement.
B. Release
We distributed the skeleton project to students. We also provided the students with a set of input data files that were to be read by their implemented parsers. These data files were a subset of those that we later used when grading the assignment. Around a week after we released the assignment, we also provided students with a set of public tests. We designed these tests to ensure that students' code included the basic functionality of the assignment. This provided students with a degree of feedback as they worked on the assignment, and dissuaded students from submitting solutions which were not compatible with our grading environment, such as those including incorrect class names.
C. Check Configuration
We configured Gradeer to use 45 checks:
• 26 test suite checks (each check executed one unit test),
• six PMD checks,
• six Checkstyle checks, and
• seven manual checks (for GUI functionality and subjective aspects of code review, such as variable names).
By using these checks together, we were able to use Gradeer to assess all of our learning outcomes. The manual checks were important in this regard, since the design of the GUI and some aspects of code quality cannot be fully graded automatically.
D. Assessment
While Gradeer supports the use of all types of checks in a single execution, we split the checks across two separate execution configurations: one for automated checks and one for manual checks. This was necessary since we anticipated that some solutions would be problematic, containing issues that would prevent compilation or execution. As such, running manual checks on some of these solutions would have been a waste of effort if the solutions could not be executed properly. By splitting the checks we were able to first compile the students' solutions and run the automated checks to identify any problematic solutions, and to assess the working solutions. We identified 48 problematic solutions. We repaired these solutions where possible so that they could still be graded with Gradeer, but added a penalty for doing so when post-processing the grades. We repeated the automated grading for these repaired solutions. However, 11 of the solutions could not be repaired due to severe issues. We wrote individual feedback for each of these solutions to explain the nature of these problems. Finally, we re-executed Gradeer with only the manual checks on every working and repaired solution. Table I shows the average amount of time that various aspects of running the assessment with Gradeer took for each applicable solution, alongside the time taken to manage problematic solutions.

TABLE I: Average time to perform each assessment task on each applicable solution.
Assessment Task              Average Time Per Solution
Compilable Solutions
  Compilation                ∼…
  …                          ∼…
Problematic Solutions
  Problem Identification     ∼…
  …                          ∼10 minutes

Once we completed grading the assignments, we performed some post-processing on the results. In particular, we added some more specific feedback and adjusted the weights of some of the checks. Providing the additional feedback revealed the possible benefit of being able to add specific feedback when running Gradeer, leading us to later implement the ability to add user-entered feedback for manual checks. We also provided more detailed and general feedback to the entire student cohort using the distribution of solutions' base scores for individual checks. In addition, we used this check performance data to adjust the checks' weights. For example, we found that the scores of some checks would vary considerably between solutions, such as a PMD check for cyclomatic complexity, for which approximately half of the solutions achieved a low base score. In such cases, we increased the check's weight, as it better differentiated students' solutions. However, we attempted to maintain similar total weights between the broader groups of learning outcomes, such as overall correctness and code quality, to assess students in a well-rounded manner.

E. Benefits of Gradeer
We found that Gradeer's hybrid grading approach provided several benefits when assessing this programming assignment:

1) Fast Manual Assessment: Gradeer provides a particular benefit in allowing for quick manual assessment. This is mostly due to Gradeer's solution inspector, which automatically executes students' solutions, and displays their source files in a text editor. Without this feature, a tutor must manually open the correct directory, enter a command to run the solution, and open the source files, before beginning the manual assessment. By removing the need to follow these steps for every solution, Gradeer removes a significant bottleneck in manual grading.
2) Automated Grading: By using automated grading wherever possible, we were able to reduce the number of manual checks. For example, we used some static analysis checks to evaluate the style of students' solution programs, such as ensuring that they used camel case formatting in variable names. By using these checks, the tutor did not have to look for these issues when performing the manual code inspection. Similarly, the use of unit tests to assess the correctness of some elements of the program removed the need for the tutor to identify faults in these elements manually. An additional benefit of automated grading is that the checks are applied consistently across solutions: any two students' solutions which have the same faults will be assessed in exactly the same way.
3) High Quality Feedback: We found that Gradeer was capable of providing useful feedback to students. While automated checks only provide simple feedback, the large number of these checks gave students a very wide range of feedback; they could gain a good understanding of where they succeeded and where they can improve. This is supported by Falkner et al.'s findings that students' performance improves as more pieces of automated feedback are provided [2]. This feedback is further augmented by Gradeer's support for manual checks, the scores of which we used to determine which of several pieces of feedback to give to a student. The ability to provide manual feedback at runtime in the current version of Gradeer supports this even further.
4) Reusability: In the past, we typically used unique autograding scripts for each assessment. Developing these scripts is a time-consuming process, and may involve the repeated effort of implementing similar functionality across multiple assessments. Conversely, Gradeer can be reused in different assessments, only requiring modifications to simple configuration files.
F. Challenges
When assessing the assignment, we found that uncompilable solutions introduced the greatest time cost. Around 48 of the 171 solutions initially could not be compiled or executed, due to missing files, syntax errors, or modifications to files that should have been left unmodified. It is possible that such problems could be mitigated by preventing students from uploading broken solutions, such as by integrating Gradeer with the solution upload system, and reporting to students if an issue is detected.

Running the automated checks did take a considerable amount of time, partly because the version of Gradeer that we used for this assessment did not support multithreading. After implementing multithreading, we observed a considerably reduced execution time. Finally, although configuring Gradeer requires less effort than writing a unique grading script, some tutors may be dissuaded by not understanding its internal functionality. Providing tests may increase tutors' confidence in such tools.

IV. RELATED WORK
Some existing automated grading tools also feature modular assessment elements [16]. For example, Nexus's assessment components are implemented as Docker micro-services [17], and Web-CAT uses modular plug-ins [18], [19]. JACK and ArTEMiS both use multiple software components that can be split across multiple servers, and interchanged to support different grading functionalities [20], [21]. These tools are designed to be used as scalable web services, which can be beneficial for large courses and MOOCs. Such approaches do have considerable advantages, and may allow tutors to view students' source code, but tutors cannot run and interact with students' solutions directly, which limits their ability to perform manual assessment. By contrast, Gradeer specifically accommodates manual assessment.

It is not uncommon for assessment tools to take a "semi-automatic" approach, with support for user intervention and manual assessment alongside automated processes [22]. Web-CAT allows tutors to manually inspect students' source code, and provide feedback or additional grades [19]. Praktomat grants TAs the ability to provide manual feedback by adding comments to students' code [23]. It also allows TAs to add manual scores for learning outcomes. JACK enables tutors to provide manual corrections for generated grades, and manual feedback [20]. Jackson's grading tool displays the contents of a solution's files before reading the user's input to determine the scores of a series of manual assessment elements [24]. While these tools have provisions for manual assessment, none of them automate the process of launching students' programs for tutors to interact with them. This may be problematic, as the bottleneck of manually running each solution is still present when evaluating user interaction. Gradeer's solution inspector removes this bottleneck entirely. Gradeer also combines the results of automated and manual checks into a single grade, without additional user intervention.

V. CONCLUSIONS AND FUTURE WORK
In this paper we have presented Gradeer, a modular grading tool to support both the automated and manual assessment of students' programs. We have also discussed our experiences in using the tool to assess an end-of-year assignment for an introductory programming course. We find that Gradeer can effectively support tutors in providing quality feedback to students, while maintaining a low time cost of assessment. Gradeer also provides tutors with detailed data on students' performance, which can be used to inform and improve teaching quality, future assessment design, and feedback. Gradeer is available at https://github.com/ben-clegg/gradeer [12].

In our future work, we will extend our evaluation of Gradeer, by comparing the time saved using our solution inspector versus manually running each solution, and by surveying more end users. We plan to improve Gradeer by enhancing its modularity, further separating check modules from the rest of the system, and modularising other components (such as pre-checks and language-specific functionality). We also intend to add web integration to the tool, to inform students when they have submitted solutions with significant problems.
REFERENCES

[2] N. Falkner, R. Vivian, D. Piper, and K. Falkner, "Increasing the effectiveness of automated assessment by increasing marking granularity and feedback units," in SIGCSE 2014 – Proceedings of the 45th ACM Technical Symposium on Computer Science Education, pp. 9–14, 2014.
[3] I. Albluwi, "A Closer Look at the Differences between Graders in Introductory Computer Science Exams," IEEE Transactions on Education, vol. 61, pp. 253–260, Aug. 2018.
[4] H. Keuning, J. Jeuring, and B. Heeren, "Towards a systematic review of automated feedback generation for programming exercises — Extended Version," tech. rep., Utrecht University, 2016.
[5] X. Liu, S. Wang, P. Wang, and D. Wu, "Automatic Grading of Programming Assignments: An Approach Based on Formal Semantics," in Proceedings of the 41st International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), pp. 126–137, 2019.
[6] D. Insa and J. Silva, "Automatic assessment of Java code," Computer Languages, Systems and Structures, vol. 53, pp. 59–72, 2018.
[7] R. Singh, S. Gulwani, and A. Solar-Lezama, "Automated feedback generation for introductory programming assignments," Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), vol. 48, pp. 15–26, Jun. 2013.
[8] S. Parihar, Z. Dadachanji, P. K. Singh, R. Das, A. Karkare, and A. Bhattacharya, "Automatic Grading and Feedback using Program Repair for Introductory Programming Courses," in Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, 2017.
[9] B. C. Wünsche, T. Suselo, W. Van Der Mark, Z. Chen, K. C. Leung, A. Luxton-Reilly, L. Shaw, D. Dimalen, and R. Lobb, "Automatic assessment of OpenGL computer graphics assignments," in Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, pp. 81–86, 2018.
[10] S. Sridhara, B. Hou, J. Lu, and J. DeNero, "Fuzz Testing Projects in Massive Courses," in Proceedings of the Third (2016) ACM Conference on Learning @ Scale – L@S '16, pp. 361–367, ACM Press, 2016.
[11] A. Leite and S. A. Blanco, "Effects of human vs. automatic feedback on students' understanding of AI concepts and programming style," in Proceedings of the 51st ACM Technical Symposium on Computer Science Education (SIGCSE '20), pp. 44–50, Association for Computing Machinery, Feb. 2020.
[12] B. S. Clegg, "Gradeer Repository." [Online; accessed 2020-10-18] https://github.com/ben-clegg/gradeer.
[13] The Apache Software Foundation, "Apache Ant." [Online; accessed 2020-10-16] https://ant.apache.org/.
[14] Checkstyle, "Checkstyle." [Online; accessed 2020-10-16] https://checkstyle.sourceforge.io/.
[15] PMD, "PMD." [Online; accessed 2020-10-16] https://pmd.github.io/.
[16] S. Zschaler, S. White, K. Hodgetts, and M. Chapman, "Modularity for Automated Assessment: A Design-Space Exploration," in Software Engineering 18, 2018.
[18] S. H. Edwards and M. A. Pérez-Quiñones, "Web-CAT: Automatically grading programming assignments," in Proceedings of the Conference on Integrating Technology into Computer Science Education, ITiCSE, (New York, New York, USA), p. 328, ACM Press, 2008.
[19] S. H. Edwards, "What is Web-CAT? – Web-CAT." [Online; accessed 2020-10-15] http://web-cat.org/projects/Web-CAT/WhatIsWebCat.html.
[20] M. Goedicke, M. Striewe, and M. Balz, "Computer aided assessments and programming exercises with JACK," tech. rep., 2008.
[21] S. Krusche and A. Seitz, "ArTEMiS – An Automatic Assessment Management System for Interactive Learning," in Proceedings of the 49th ACM Technical Symposium on Computer Science Education – SIGCSE '18, (New York, New York, USA), pp. 284–289, ACM Press, Feb. 2018.
[22] D. M. Souza, K. R. Felizardo, and E. F. Barbosa, "A systematic literature review of assessment tools for programming assignments," Proceedings – 2016 IEEE 29th Conference on Software Engineering Education and Training, CSEE&T 2016, pp. 147–156, Apr. 2016.
[23] J. Breitner, M. Hecker, and G. Snelting, "Der Grader Praktomat," Autom. Bewertung der Program. Digit., 2017.
[24] D. Jackson, "A Semi-Automated Approach to Online Assessment," in Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, 2000.