Juho Leinonen | Researchain

Publication

Featured research published by Juho Leinonen.

koli calling international conference on computing education research | 2015

Identification of programmers from typing patterns

Krista Longi; Juho Leinonen; Henrik Nygren; Joni Salmi; Arto Klami; Arto Vihavainen

Being able to identify the user of a computer solely based on their typing patterns can lead to improvements in plagiarism detection, provide new opportunities for authentication, and enable novel guidance methods in tutoring systems. However, at the same time, if such identification is possible, new privacy and ethical concerns arise. In our work, we explore methods for identifying individuals from typing data captured by a programming environment as these individuals are learning to program. We compare the identification accuracy of automatically generated user profiles, ranging from the average amount of time that a user needs between keystrokes to the amount of time that it takes for the user to press specific pairs of keys (digraphs). We also explore the effect of data quantity and different acceptance thresholds on the identification accuracy, and analyze how the accuracy changes when identifying individuals across courses. Our results show that, while the identification accuracy varies depending on data quantity and the method, identification of users based on their programming data is possible. These results indicate that there is potential in using this method, for example, in identification of students taking exams, and that such data raises privacy concerns that should be addressed.
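
As an illustration of the digraph-based profiles and acceptance thresholds mentioned in the abstract above, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the (key, timestamp) event format, and the mean-absolute-difference distance are all assumptions chosen for brevity.

```python
from collections import defaultdict
from statistics import mean


def digraph_profile(events):
    """Map each key pair (digraph) to its mean latency in milliseconds.

    `events` is a list of (key, timestamp_ms) tuples in typing order
    (a hypothetical log format, not one defined by the paper).
    """
    latencies = defaultdict(list)
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        latencies[(k1, k2)].append(t2 - t1)
    return {pair: mean(values) for pair, values in latencies.items()}


def profile_distance(a, b):
    """Mean absolute latency difference over the digraphs both profiles share."""
    shared = a.keys() & b.keys()
    if not shared:
        return float("inf")
    return sum(abs(a[p] - b[p]) for p in shared) / len(shared)


def identify(sample, known_profiles, threshold_ms):
    """Return the closest known user, or None if nobody is within the threshold."""
    best_user, best_dist = None, float("inf")
    for user, profile in known_profiles.items():
        dist = profile_distance(sample, profile)
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user if best_dist <= threshold_ms else None
```

In this sketch, lowering `threshold_ms` makes the identifier reject more samples instead of guessing, which mirrors the trade-off the abstract describes between acceptance thresholds and identification accuracy.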

learning at scale | 2017

Preventing Keystroke Based Identification in Open Data Sets

Juho Leinonen; Petri Ihantola; Arto Hellas

Large-scale courses such as Massive Open Online Courses (MOOCs) can be a great data source for researchers. Ideally, the data gathered on such courses should be openly available to all researchers, so that studies could be easily replicated and novel studies on existing data could be conducted. However, very fine-grained data such as source code snapshots can contain hidden identifiers. For example, distinct typing patterns that identify individuals can be extracted from such data. Hence, simply removing explicit identifiers such as names and student numbers is not sufficient to protect the privacy of the users who have supplied the data. At the same time, removing all keystroke information would decrease the value of the shared data significantly. In this work, we study how keystroke data from a programming context could be modified to prevent keystroke latency based identification whilst still retaining information that can be used, for example, to infer programming experience. We investigate the degree of anonymization required to render identification of students based on their typing patterns unreliable. Then, we study whether the modified keystroke data can still be used to infer the programming experience of the students, as a case study of whether the anonymized typing patterns have retained at least some informative value. We show that it is possible to modify data so that keystroke latency based identification is no longer accurate, but the programming experience of the students can still be inferred, i.e. the data still has value to researchers. In a broader context, our results indicate that information and anonymity are not necessarily mutually exclusive.
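
The abstract above does not spell out how the keystroke data is modified. One possible modification, assumed here purely for illustration, is to coarsen each latency into fixed-width buckets so that fine-grained timing differences between individuals disappear while coarse differences in typing speed remain. The function name and the bucket width are hypothetical.

```python
def coarsen_latencies(latencies_ms, bucket_ms):
    """Round each latency down to the start of its bucket.

    Larger buckets hide more of the fine-grained timing that could
    identify an individual, at the cost of losing more detail.
    """
    if bucket_ms <= 0:
        raise ValueError("bucket_ms must be positive")
    return [(latency // bucket_ms) * bucket_ms for latency in latencies_ms]


# Two typists with similar speed but distinct fine-grained timing,
# and one clearly slower typist (made-up numbers).
fast_a = [95, 110, 130]
fast_b = [80, 120, 140]
slow = [410, 450, 480]

# With 100 ms buckets the two fast typists become indistinguishable,
# while the gap to the slow typist is still visible.
print(coarsen_latencies(fast_a, 100))
print(coarsen_latencies(fast_b, 100))
print(coarsen_latencies(slow, 100))
```

Whether a coarse speed signal of this kind is enough to infer programming experience is the question the paper studies; the sketch only shows how identifying detail and aggregate information can be separated.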

integrating technology into computer science education | 2018

Crowdsourcing programming assignments with CrowdSorcerer

Nea Pirttinen; Vilma Kangas; Irene Nikkarinen; Henrik Nygren; Juho Leinonen; Arto Hellas

Small automatically assessed programming assignments are an often used resource for learning programming. Creating sufficiently large amounts of such assignments is, however, time consuming. As a consequence, offering large quantities of practice assignments to students is not always possible. CrowdSorcerer is an embeddable open-source system that students and teachers alike can use for creating and evaluating small automatically assessed programming assignments. While creating programming assignments, the students also write simple input-output tests, and are gently introduced to the basics of testing. Students can also evaluate the assignments of others and provide feedback on them, which exposes them to code written by others early in their education. In this article we describe both the CrowdSorcerer system and our experiences in using the system in a large undergraduate programming course. Moreover, we discuss the motivation for crowdsourcing course assignments and present some usage statistics.
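
To make the idea of a simple input-output test concrete, here is a small Python sketch. It does not reflect CrowdSorcerer's actual interface or test format; `passes_io_test` and `double_number` are hypothetical names, and the approach (feeding text to standard input and comparing standard output) is only one way such a test could work.

```python
import io
import sys
from contextlib import redirect_stdout


def passes_io_test(solution, stdin_text, expected_stdout):
    """Run `solution` with `stdin_text` as its input and compare what it prints."""
    original_stdin = sys.stdin
    sys.stdin = io.StringIO(stdin_text)
    captured = io.StringIO()
    try:
        with redirect_stdout(captured):
            solution()
    finally:
        # Always restore the real standard input, even if the solution fails.
        sys.stdin = original_stdin
    return captured.getvalue().strip() == expected_stdout.strip()


def double_number():
    """Example student solution: read an integer and print it doubled."""
    n = int(input())
    print(n * 2)
```

A student-written test is then just an input paired with the expected output, for example `passes_io_test(double_number, "21\n", "42\n")`.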

integrating technology into computer science education | 2018

Taxonomizing features and methods for identifying at-risk students in computing courses

Arto Hellas; Petri Ihantola; Andrew Petersen; Vangel V. Ajanovski; Mirela Gutica; Timo Hynninen; Antti Knutas; Juho Leinonen; Chris H. Messom; Soohyun Nam Liao

Since computing education began, we have sought to learn why students struggle in computer science and how to identify these at-risk students as early as possible. Due to the increasing availability of instrumented coding tools in introductory CS courses, the amount of direct observational data of student working patterns has increased significantly in the past decade, leading to a flurry of attempts to identify at-risk students using data mining techniques on code artifacts. The goal of this work is to produce a systematic literature review to describe the breadth of work being done on the identification of at-risk students in computing courses. In addition to the review itself, which will summarize key areas of work being completed in the field, we will present a taxonomy (based on data sources, methods, and contexts) to classify work in the area.

koli calling international conference on computing education research | 2017

Thought crimes and profanities whilst programming

Juho Leinonen; Arto Hellas

Where should we draw the line of inappropriate conduct on a course that is given freely to anyone? If an individual starts profusely swearing during a lecture, they will most likely be expelled from the class or even from the course. But what if they do it outside the lecture amongst their classmates, amongst a group of anonymous individuals, or by themselves? In this article, we study how students use profanities in source code when they are completing programming assignments on a massive open online course (MOOC). We examine how common it is to curse in source code as well as whether specific assignments incite more cursing than others. Additionally, we investigate differences between participants with regard to cursing. Our results indicate that a considerable number of participants write curse words whilst programming, but most clean their code for the final submission. The data also shows that there are different degrees of profanity in use, ranging from quite inoffensive words to offensive racial slurs. Finally, we discuss options that one may take when individuals who swear are identified, starting from rescinding their right to study.
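
The finding that most participants clean their code before the final submission implies comparing intermediate snapshots against the submitted version. A minimal Python sketch of such a comparison follows. The word list is a mild placeholder, and the function names and whole-word matching rule are assumptions, not the method used in the study.

```python
import re

# Placeholder word list for illustration only; a real study would need
# a curated list covering different degrees of profanity.
MILD_WORDS = frozenset({"darn", "heck"})


def profanities_in(source, word_list=MILD_WORDS):
    """Return the listed words that appear as whole alphabetic tokens in `source`."""
    tokens = re.findall(r"[a-z]+", source.lower())
    return {token for token in tokens if token in word_list}


def cleaned_before_submission(snapshots, final_submission, word_list=MILD_WORDS):
    """Return words seen in some snapshot but absent from the final submission."""
    seen = set()
    for snapshot in snapshots:
        seen |= profanities_in(snapshot, word_list)
    return seen - profanities_in(final_submission, word_list)
```

Because the sketch matches whole alphabetic tokens, a listed word embedded in a longer identifier (for example in camelCase) is not detected; handling such cases would need identifier splitting.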

koli calling international conference on computing education research | 2017

Identification based on typing patterns between programming and free text

Petrus Peltola; Vilma Kangas; Nea Pirttinen; Henrik Nygren; Juho Leinonen

Identifying people based on their typing has been studied successfully in multiple different contexts. Previous research has shown that identification is possible based on writing predetermined texts such as passwords, free text such as essays, as well as source code. In this work, we study typing pattern based identification when the text format and writing environment change. We replicate two earlier studies which suggested that typing profile identification works with programming data, and that it can be applied in programming exam settings with decent results. Then, we examine how the identification accuracy changes when the user profiles are built using data from programming, and the identification is conducted on data from writing free text. Our results show that the identification accuracy is indeed high within the context of programming data, but drops when identifying essay typists based on typing profiles built from their programming data.

international conference informatics schools | 2017

Adolescent and Adult Student Attitudes Towards Progress Visualizations

Onni Aarne; Petrus Peltola; Antti Leinonen; Juho Leinonen; Arto Hellas

Keeping students motivated for the duration of a course is easier said than done. Contextualizing student efforts with learning progress visualizations can help maintain engagement. However, progress can be visualized in many different ways. So far very little research has been done into which types of visualizations are most effective, and how different contexts affect the effectiveness of visualizations. We compare the effects of two different progress visualizations in an introductory programming course. Preliminary results show that older students prefer a visualization that emphasizes long-term progress, whereas younger students prefer a visualization that highlights progress within a single week. Additionally, students perform better and are more motivated when their visualization matches their age group’s preferred visualization. Possible explanations and implications are discussed.

technical symposium on computer science education | 2016