Efthimia Aivaloglou | Researchain

Publication

Featured research published by Efthimia Aivaloglou.

international computing education research workshop | 2016

How Kids Code and How We Know: An Exploratory Study on the Scratch Repository

Efthimia Aivaloglou; Felienne Hermans

Block-based programming languages like Scratch, Alice and Blockly are becoming increasingly common as introductory languages in programming education. There is substantial research showing that these visual programming environments are suitable for teaching programming concepts. But what do people do when they use Scratch? In this paper we explore the characteristics of Scratch programs. To this end we have scraped the Scratch public repository and retrieved 250,000 projects. We present an analysis of these projects in three different dimensions. Initially, we look at the types of blocks used and the size of the projects. We then investigate complexity, used abstractions and programming concepts. Finally, we detect code smells such as large scripts, dead code and duplicated code blocks. Our results show that 1) most Scratch programs are small, although Scratch programs consisting of over 100 sprites exist, 2) programming abstraction concepts like procedures are not commonly used and 3) Scratch programs do suffer from code smells including large scripts and unmatched broadcast signals.

international conference on program comprehension | 2016

Do code smells hamper novice programming? A controlled experiment on Scratch programs

Felienne Hermans; Efthimia Aivaloglou

Recently, block-based programming languages like Alice, Scratch and Blockly have become popular tools for programming education. There is substantial research showing that block-based languages are suitable for early programming education. But can block-based programs be smelly too? And does that matter to learners? In this paper we explore the code smells metaphor in the context of the block-based programming language Scratch. We conduct a controlled experiment with 61 novice Scratch programmers, in which we divide the novices into three groups. One third receive a non-smelly program, while the other groups receive a program suffering from the Duplication or the Long Method smell respectively. All subjects then perform the same comprehension tasks on their program, after which we measure their time and correctness. The results of the experiment show that code smells indeed influence performance: subjects working on the programs exhibiting code smells perform significantly worse, but the smells do not affect the time subjects need. Investigating different types of tasks in more detail, we find that Long Method mainly decreases system understanding, while Duplication decreases the ease with which subjects modify Scratch programs.

international workshop on software clones | 2017

Software clones in scratch projects: on the presence of copy-and-paste in computational thinking learning

Gregorio Robles; Jesús Moreno-León; Efthimia Aivaloglou; Felienne Hermans

Computer programming is being introduced in schools worldwide as part of a movement that promotes Computational Thinking (CT) skills among young learners. In general, learners use visual, block-based programming languages to acquire these skills, with Scratch being one of the most popular ones. Similar to professional developers, learners also copy and paste their code, resulting in duplication. In this paper we present the findings of correlating the assessment of the CT skills of learners with the presence of software clones in over 230,000 projects obtained from the Scratch platform. Specifically, we investigate i) if software cloning is a widespread practice in Scratch projects, ii) if the presence of code cloning is independent of the programming mastery of learners, iii) if code cloning can be found more frequently in Scratch projects that require specific skills (such as parallelism or logical thinking), and iv) if learners who have the skills to avoid software cloning really do so. The results show that i) software cloning can be commonly found in Scratch projects, that ii) it becomes more frequent as learners work on projects that require advanced skills, that iii) no CT dimension is to be found more related to the absence of software clones than others, and iv) that learners, even if they potentially know how to avoid cloning, still copy and paste frequently. The insights from this paper could be used by educators and learners to determine when it is pedagogically more effective to address software cloning, by educational programming platform developers to adapt their systems, and by learning assessment tools to provide better evaluations.

ieee international conference on software analysis evolution and reengineering | 2016

Evaluating Automatic Spreadsheet Metadata Extraction on a Large Set of Responses from MOOC Participants

Sohon Roy; Felienne Hermans; Efthimia Aivaloglou; Jos Winter; Arie van Deursen

Spreadsheets are popular end-user computing applications, and one reason behind their popularity is that they offer a large degree of freedom to their users regarding the way they can structure their data. However, this flexibility also makes spreadsheets difficult to understand. Textual documentation can address this issue, yet for supporting automatic generation of textual documentation, an important prerequisite is to extract metadata inside spreadsheets. It is a challenge, though, to distinguish between data and metadata due to the lack of universally accepted structural patterns in spreadsheets. Two existing approaches for automatic extraction of spreadsheet metadata were not evaluated on large datasets consisting of user inputs. Hence in this paper, we describe the collection of a large number of user responses regarding identification of spreadsheet metadata from participants of a MOOC. We describe the use of this large dataset to understand how users identify metadata in spreadsheets, and to evaluate two existing approaches of automatic metadata extraction from spreadsheets. The results, obtained from insights about user perception of metadata, provide us with directions to follow in order to improve metadata extraction approaches. We also learn on which types of spreadsheet patterns the existing approaches perform well and on which they perform poorly, and thus which problem areas to focus on in order to improve.

ieee international conference on software analysis evolution and reengineering | 2016

Spreadsheets are Code: An Overview of Software Engineering Approaches Applied to Spreadsheets

Felienne Hermans; Bas Jansen; Sohon Roy; Efthimia Aivaloglou; Alaaeddin Swidan; David Hoepelman

Spreadsheets can be considered to be the world's most successful end-user programming language. In fact, one could say spreadsheets are programs. This paper starts with a comparison of spreadsheets to software: spreadsheets are similar in terms of application domains, expressive power and maintainability problems. We then reflect upon what makes spreadsheets successful: liveness, directness and an easy deployment environment seem to contribute largely to their success. Being a programming language, several techniques from software engineering can be applied to spreadsheets. We present an overview of such research directions, including spreadsheet testing, reverse engineering, smell detection, clone detection and refactoring. Finally, open challenges and future plans for the domain of spreadsheet software engineering are presented.

source code analysis and manipulation | 2015

A grammar for spreadsheet formulas evaluated on two large datasets

Efthimia Aivaloglou; David Hoepelman; Felienne Hermans

Spreadsheets are ubiquitous in the industrial world and often perform a role similar to other computer programs, which makes them interesting research targets. However, there does not exist a reliable grammar that is concise enough to facilitate formula parsing and analysis and to support research on spreadsheet codebases. This paper presents a grammar for spreadsheet formulas that is compatible with the spreadsheet formula language, is compact enough to feasibly implement with a parser generator, and produces parse trees aimed at further manipulation and analysis. We evaluate the grammar against more than one million unique formulas extracted from the well known EUSES and Enron spreadsheet datasets, successfully parsing 99.99%. Additionally, we utilize the grammar to analyze these datasets and measure the frequency of usage of language features in spreadsheet formulas. Finally, we identify smelly constructs and uncommon cases in the syntax of formulas.

IEEE Transactions on Learning Technologies | 2016

Can Learners be Earners? Investigating a Design to Enable MOOC Learners to Apply their Skills and Earn Money in an Online Market Place

Guanliang Chen; Dan Davis; Markus Krause; Efthimia Aivaloglou; Claudia Hauff; Geert-Jan Houben

Massive Open Online Courses (MOOCs) aim to educate the world. More often than not, however, MOOCs fall short of this goal: a majority of learners are already highly educated (with a Bachelor’s degree or more) and come from specific parts of the (developed) world. Learners from developing countries without a higher degree are underrepresented, though desired, in MOOCs. One reason for those learners to drop out of a course can be found in their financial realities and the subsequent limited amount of time they can dedicate to a course besides earning a living. If we could pay learners to take a MOOC, this hurdle would largely disappear. With MOOCs, this leads to the following fundamental challenge: How can learners be paid at scale? Ultimately, we envision a recommendation engine that recommends to learners tasks from online market places such as Upwork or witmart that are relevant to the course content of the MOOC. In this manner, the learners learn and earn money. To investigate the feasibility of this vision, in this paper, we explored to what extent (1) online market places contain tasks relevant to a specific MOOC, and (2) learners are able to solve real-world tasks correctly and with sufficient quality. Finally, based on our experimental design, we were also able to investigate the impact of real-world bonus tasks in a MOOC on the general learner population.

symposium on visual languages and human-centric computing | 2015

Detecting problematic lookup functions in spreadsheets

Felienne Hermans; Efthimia Aivaloglou; Bas Jansen

Spreadsheets are used heavily in many business domains around the world. They are easy to use and as such enable end-user programmers to build and maintain all sorts of reports and analyses. In addition to using spreadsheets for modeling and calculation, spreadsheets are often also used for creating reports and dashboards: combining data from different sources and creating overviews. For this, lookup functions can be used: they search for a value in a range and return a corresponding row or column. Lookup functions are common: according to recent research the VLOOKUP is the fifth most common Excel function. In this paper we investigate the use of lookup functions in more detail. We analyze lookup functions within the newly released Enron spreadsheet corpus. The results show that 1) a minority of 43% of lookup formulas use the default setting where an approximate match may be returned, 2) 77% of approximate matches are used unnecessarily and 3) 23% of approximate lookups are problematic: they search over unsorted ranges, while this is specifically advised against in the specification, and might lead to wrong results.
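
To illustrate why an approximate lookup over an unsorted range can return a wrong result, here is a minimal Python sketch. It is not taken from the paper and does not reproduce Excel's actual implementation: it assumes, for illustration only, that an approximate match behaves like a binary search for the last key less than or equal to the searched value. The function names `approximate_lookup` and `exact_lookup` and the sample data are hypothetical.

```python
from bisect import bisect_right


def approximate_lookup(key, keys, values):
    """Simplified model of an approximate-match lookup (an assumption,
    not Excel's real algorithm): binary-search for the last entry in
    `keys` that is <= `key`. Only meaningful if `keys` is sorted ascending."""
    i = bisect_right(keys, key)
    if i == 0:
        return None  # key is smaller than every entry
    return values[i - 1]


def exact_lookup(key, keys, values):
    """Exact-match lookup: linear scan, works regardless of ordering."""
    for k, v in zip(keys, values):
        if k == key:
            return v
    return None


# Hypothetical lookup tables: same pairs, different row order.
sorted_keys, sorted_vals = [10, 20, 30, 40], ["a", "b", "c", "d"]
unsorted_keys, unsorted_vals = [30, 10, 40, 20], ["c", "a", "d", "b"]

print(approximate_lookup(20, sorted_keys, sorted_vals))      # b (correct)
print(approximate_lookup(20, unsorted_keys, unsorted_vals))  # a (wrong row)
print(exact_lookup(20, unsorted_keys, unsorted_vals))        # b (correct)
```

Under this model the binary search silently lands on the wrong row when the keys are unsorted, while an exact match is unaffected by row order, which matches the kind of problem the abstract describes.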

Journal of Software: Evolution and Process | 2017

Parsing Excel formulas: A grammar and its application on 4 large datasets

Efthimia Aivaloglou; David Hoepelman; Felienne Hermans

Spreadsheets are popular end-user programming tools, especially in the industrial world. This makes them interesting research targets. However, there does not exist a reliable grammar that is concise enough to facilitate formula parsing and analysis and to support research on spreadsheet codebases. This paper presents a grammar for spreadsheet formulas that can successfully parse 99.99% of more than 8 million unique formulas extracted from 4 spreadsheet datasets. Our grammar is compatible with the spreadsheet formula language, recognizes the spreadsheet formula elements that are required for supporting spreadsheet research, and produces parse trees aimed at further manipulation and analysis. Additionally, we use the grammar to analyze the characteristics of the formulas of the 4 datasets in 3 different dimensions: complexity, functionality, and data utilization. Our results show that (1) most Excel formulas are simple, although formulas with more than 50 functions or operations exist, (2) almost all formulas use data from other cells, which is often not local, and (3) a surprising number of referring mechanisms are used by less than 1% of the formulas.

international conference on software engineering | 2017