
Stefan Ruseti

Politehnica University of Bucharest



Publication

Featured research published by Stefan Ruseti.

Procedia Computer Science | 2017

Sentence selection with neural networks using string kernels

Mihai Masala; Stefan Ruseti; Traian Rebedea

In recent years, there have been several advancements in question answering systems. These were achieved both due to the availability of a greater number of datasets, some of them significantly larger in size than any of the existing corpora, and to the recent advancements in deep learning for text classification. In this paper, we explore the improvements achieved by employing neural networks using the features computed by a string kernel for sentence/answer selection. We have validated this approach using two different standard corpora used as benchmarks in question answering and we have found a significant improvement over string kernels and other unsupervised methods for sentence selection.
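To make the idea of string-kernel features concrete, here is a minimal sketch of one common string kernel, the character n-gram spectrum kernel, used to rank candidate sentences against a question. The abstract does not say which kernel the authors used or how the scores feed the neural network, so the kernel choice, the n-gram size, and every function name below are assumptions made for illustration only.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Count the character n-grams of a lowercased string."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(a, b, n=3):
    """Spectrum kernel: dot product of the two n-gram count vectors."""
    counts_a, counts_b = char_ngrams(a, n), char_ngrams(b, n)
    return sum(count * counts_b[gram] for gram, count in counts_a.items())


def normalized_kernel(a, b, n=3):
    """Kernel value scaled to [0, 1] so long strings do not dominate."""
    k_aa = spectrum_kernel(a, a, n)
    k_bb = spectrum_kernel(b, b, n)
    if k_aa == 0 or k_bb == 0:
        return 0.0
    return spectrum_kernel(a, b, n) / math.sqrt(k_aa * k_bb)


def rank_candidates(question, candidates, n=3):
    """Order candidate sentences by kernel similarity to the question.

    This is the unsupervised baseline; in a supervised setup the kernel
    values (e.g. for several n) would instead be fed to a classifier.
    """
    return sorted(candidates,
                  key=lambda c: normalized_kernel(question, c, n),
                  reverse=True)
```

In a setup like the one the abstract describes, kernel values computed for several n-gram sizes could serve as the input feature vector of a small neural network rather than being used directly for ranking.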

International Conference on Advanced Learning Technologies | 2017

Unlocking the Power of Word2Vec for Identifying Implicit Links

Gabriel-Marius Gutu; Mihai Dascalu; Stefan Ruseti; Traian Rebedea; Stefan Trausan-Matu

This paper presents research on using Word2Vec for determining implicit links in multi-participant Computer-Supported Collaborative Learning chat conversations. Word2Vec is a powerful semantic model, and one of the newest in Natural Language Processing, used for computing text cohesion and similarity between documents. This research considers cohesion scores in terms of the strength of the semantic relations established between two utterances: the higher the score, the stronger the similarity between the two utterances. An implicit link is established based on cohesion to the most similar previous utterance, within an imposed window. Three similarity formulas were used to compute the cohesion score: an unnormalized score, a normalized score with distance, and Mihalcea's formula. Our corpus of conversations incorporated explicit references provided by authors, which were used for validation. A window of 5 utterances and a 1-minute time frame provided the highest detection rate both for exact matching and matching of a block of continuous utterances belonging to the same speaker. Moreover, the unnormalized score correctly identified the largest number of implicit links.
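The window-based linking step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: cosine similarity over precomputed utterance vectors stands in for the paper's three cohesion formulas (which the abstract does not define), the utterance window and the time frame are assumed to apply jointly, and the function and parameter names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_implicit_link(utterances, index, window=5, max_seconds=60.0):
    """Return the index of the most similar previous utterance, or None.

    utterances: list of (timestamp_in_seconds, vector) in chat order.
    Only the `window` utterances before `index` that are also at most
    `max_seconds` older are considered (assumption: both limits apply).
    """
    current_time, current_vec = utterances[index]
    best_index, best_score = None, float("-inf")
    for j in range(max(0, index - window), index):
        timestamp, vec = utterances[j]
        if current_time - timestamp > max_seconds:
            continue  # too old to be a plausible referent
        score = cosine(current_vec, vec)
        if score > best_score:
            best_index, best_score = j, score
    return best_index
```

The defaults mirror the best-performing setting reported in the abstract (5 utterances, 1 minute); in practice the vectors would come from a Word2Vec model, for example by averaging the word vectors of each utterance.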

European Conference on Technology Enhanced Learning | 2017

ReaderBench: A Multi-lingual Framework for Analyzing Text Complexity

Mihai Dascalu; Gabriel-Marius Gutu; Stefan Ruseti; Ionut Cristian Paraschiv; Philippe Dessus; Danielle S. McNamara; Scott A. Crossley; Stefan Trausan-Matu

Assessing textual complexity is a difficult, but important endeavor, especially for adapting learning materials to students’ and readers’ levels of understanding. With the continuous growth of information technologies spanning through various research fields, automated assessment tools have become reliable solutions to automatically assessing textual complexity. ReaderBench is a text processing framework relying on advanced Natural Language Processing techniques that encompass a wide range of text analysis modules available in a variety of languages, including English, French, Romanian, and Dutch. To our knowledge, ReaderBench is the only open-source multilingual textual analysis solution that provides unified access to more than 200 textual complexity indices including: surface, syntactic, morphological, semantic, and discourse specific factors, alongside cohesion metrics derived from specific lexicalized ontologies and semantic models.

Artificial Intelligence in Education | 2017

ReaderBench Learns Dutch: Building a Comprehensive Automated Essay Scoring System for Dutch Language

Mihai Dascalu; Wim Westera; Stefan Ruseti; Stefan Trausan-Matu; Hub Kurvers

Automated Essay Scoring has gained a wider applicability and usage with the integration of advanced Natural Language Processing techniques, which enabled in-depth analyses of discourse in order to capture the specificities of written texts. In this paper, we introduce a novel Automatic Essay Scoring method for the Dutch language, built within the ReaderBench framework, which encompasses a wide range of textual complexity indices, as well as an automated segmentation approach. Our method was evaluated on a corpus of 173 technical reports automatically split into sections and subsections, thus forming a hierarchical structure on which textual complexity indices were subsequently applied. The stepwise regression model explained 30.5% of the variance in students’ scores, while a Discriminant Function Analysis predicted with substantial accuracy (75.1%) whether they are high or low performance students.

Artificial Intelligence in Education | 2018

Identifying Implicit Links in CSCL Chats Using String Kernels and Neural Networks

Mihai Masala; Stefan Ruseti; Gabriel Gutu-Robu; Traian Rebedea; Mihai Dascalu; Stefan Trausan-Matu

Chat conversations between more than two participants are often used in Computer Supported Collaborative Learning (CSCL) scenarios because they enhance collaborative knowledge sharing and sustain creativity. However, multi-participant chats are more difficult to follow and analyze due to the complex ways in which different discussion threads and topics can interact. This paper introduces a novel method based on neural networks for detecting implicit links that uses features computed with string kernels and word embeddings. In contrast to previous experiments with an accuracy of 33%, we obtained a considerable increase to 44% for the same dataset. Our method represents an alternative to more complex deep neural networks that cannot be properly used due to overfitting on limited training data.

Meeting of the Association for Computational Linguistics | 2016

Using Embedding Masks for Word Categorization

Stefan Ruseti; Traian Rebedea; Stefan Trausan-Matu

Word embeddings are widely used nowadays for many NLP tasks. They reduce the dimensionality of the vocabulary space, but most importantly they should capture (part of) the meaning of words. The new vector space used by the embeddings allows computation of semantic distances between words, while some word embeddings also permit simple vector operations (e.g. summation, difference) resembling analogical reasoning. This paper proposes a new operation on word embeddings aimed at capturing categorical information by first learning and then applying an embedding mask for each analyzed category. Thus, we conducted a series of experiments related to categorization of words based on their embeddings. Several classical approaches were compared together with the one introduced in the paper, which uses different embedding masks learnt for each category.

Artificial Intelligence Methodology Systems Applications | 2016

Expressing Sentiments in Game Reviews

Ana Secui; Maria-Dorinela Sirbu; Mihai Dascalu; Scott A. Crossley; Stefan Ruseti; Stefan Trausan-Matu

Opinion mining and sentiment analysis are important research areas of Natural Language Processing (NLP), and tools built on them have become viable alternatives for automatically extracting the affective information found in texts. Our aim is to build an NLP model to analyze gamers’ sentiments and opinions expressed in a corpus of 9750 game reviews. A Principal Component Analysis using sentiment analysis features explained 51.2% of the variance of the reviews and provides an integrated view of the major sentiment and topic related dimensions expressed in game reviews. A Discriminant Function Analysis based on the emerging components classified game reviews into positive, neutral and negative ratings with 55% accuracy.

Intelligent Tutoring Systems | 2018

Scoring Summaries Using Recurrent Neural Networks

Stefan Ruseti; Mihai Dascalu; Amy M. Johnson; Danielle S. McNamara; Renu Balyan; Kathryn S. McCarthy; Stefan Trausan-Matu

Summarization enhances comprehension and is considered an effective strategy to promote and enhance learning and deep understanding of texts. However, summarization is seldom implemented by teachers in classrooms because the manual evaluation requires a lot of effort and time. Although the need for automated support is stringent, there are only a few shallow systems available, most of which rely on basic word/n-gram overlaps. In this paper, we introduce a hybrid model that uses state-of-the-art recurrent neural networks and textual complexity indices to score summaries. Our best model achieves over 55% accuracy for a 3-way classification that measures the degree to which the main ideas from the original text are covered by the summary. Our experiments show that the writing style, represented by the textual complexity indices, together with the semantic content grasped within the summary are the best predictors, when combined. To the best of our knowledge, this is the first work of its kind that uses RNNs for scoring and evaluating summaries.

European Conference on Information Retrieval | 2018

Improving Deep Learning for Multiple Choice Question Answering with Candidate Contexts

Bogdan Nicula; Stefan Ruseti; Traian Rebedea

Deep learning solutions have been widely used lately for improving question answering systems, especially as the amount of training data has increased. However, these solutions have been developed for specific tasks, when both the question and the candidate answers are long enough for the deep learning models to provide a better text representation and a more complex similarity function. For multiple choice questions that have short answers, information retrieval solutions are still largely used. In this paper we propose a novel deep learning model that determines the correct answer by combining the representation of each question-candidate answer pair with candidate contexts extracted from Wikipedia using a search engine.

Artificial Intelligence in Education | 2018

Predicting Question Quality Using Recurrent Neural Networks

Stefan Ruseti; Mihai Dascalu; Amy M. Johnson; Renu Balyan; Kristopher J. Kopp; Danielle S. McNamara; Scott A. Crossley; Stefan Trausan-Matu

This study assesses the extent to which machine learning techniques can be used to predict question quality. An algorithm based on textual complexity indices was previously developed to assess question quality to provide feedback on questions generated by students within iSTART (an intelligent tutoring system that teaches reading strategies). In this study, 4,575 questions were coded by human raters based on their corresponding depth, classifying questions into four categories: 1-very shallow to 4-very deep. Here we propose a novel approach to assessing question quality within this dataset based on Recurrent Neural Networks (RNNs) and word embeddings. The experiments evaluated multiple RNN architectures using GRU, BiGRU and LSTM cell types of different sizes, and different word embeddings (i.e., FastText and Glove). The most precise model achieved a classification accuracy of 81.22%, which surpasses the previous prediction results using lexical sophistication complexity indices (accuracy = 41.6%). These results are promising and have implications for the future development of automated assessment tools within computer-based learning environments.
