Marius Popescu
University of Bucharest
Publication
Featured research published by Marius Popescu.
International Conference on Neural Information Processing | 2013
Ian J. Goodfellow; Dumitru Erhan; Pierre Carrier; Aaron C. Courville; Mehdi Mirza; Ben Hamner; Will Cukierski; Yichuan Tang; David Thaler; Dong-Hyun Lee; Yingbo Zhou; Chetan Ramaiah; Fangxiang Feng; Ruifan Li; Xiaojie Wang; Dimitris Athanasakis; John Shawe-Taylor; Maxim Milakov; John Park; Radu Tudor Ionescu; Marius Popescu; Cristian Grozea; James Bergstra; Jingjing Xie; Lukasz Romaszko; Bing Xu; Zhang Chuang; Yoshua Bengio
The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results of the competitions. We provide suggestions for organizers of future challenges and some comments on what kind of knowledge can be gained from machine learning competitions.
Neural Networks | 2015
Ian J. Goodfellow; Dumitru Erhan; Pierre Carrier; Aaron C. Courville; Mehdi Mirza; Benjamin Hamner; William Cukierski; Yichuan Tang; David Thaler; Dong-Hyun Lee; Yingbo Zhou; Chetan Ramaiah; Fangxiang Feng; Ruifan Li; Xiaojie Wang; Dimitris Athanasakis; John Shawe-Taylor; Maxim Milakov; John Park; Radu Tudor Ionescu; Marius Popescu; Cristian Grozea; James Bergstra; Jingjing Xie; Lukasz Romaszko; Bing Xu; Zhang Chuang; Yoshua Bengio
The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results of the competitions. We provide suggestions for organizers of future challenges and some comments on what kind of knowledge can be gained from machine learning competitions.
Empirical Methods in Natural Language Processing | 2014
Radu Tudor Ionescu; Marius Popescu; Aoife Cahill
A common approach in text mining tasks such as text categorization, authorship identification or plagiarism detection is to rely on features like words, part-of-speech tags, stems, or some other high-level linguistic features. In this work, an approach that uses character n-grams as features is proposed for the task of native language identification. Instead of doing standard feature selection, the proposed approach combines several string kernels using multiple kernel learning. Kernel Ridge Regression and Kernel Discriminant Analysis are independently used in the learning stage. The empirical results obtained in all the experiments conducted in this work indicate that the proposed approach achieves state-of-the-art performance in native language identification, reaching an accuracy that is 1.7% above the top scoring system of the 2013 NLI Shared Task. Furthermore, the proposed approach has an important advantage in that it is language independent and linguistic theory neutral. In the cross-corpus experiment, the proposed approach shows that it can also be topic independent, improving the state-of-the-art system by 32.3%.
International Conference on Computational Linguistics | 2010
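To illustrate the idea of using character n-grams as features through a string kernel, here is a minimal Python sketch of a spectrum-style kernel. It is only an assumption about what such a kernel can look like: the function names are hypothetical, and the exact kernels of the paper, the multiple kernel learning step, and the Kernel Ridge Regression and Kernel Discriminant Analysis classifiers are not reproduced.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, p):
    """Count every contiguous character p-gram in text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))


def spectrum_kernel(s, t, p):
    """Inner product of the p-gram count vectors of s and t."""
    counts_s, counts_t = char_ngrams(s, p), char_ngrams(t, p)
    # Iterate over the smaller profile; missing p-grams count as zero.
    if len(counts_s) > len(counts_t):
        counts_s, counts_t = counts_t, counts_s
    return sum(c * counts_t[gram] for gram, c in counts_s.items())


def blended_kernel(s, t, p_min, p_max):
    """Sum of spectrum kernels over a range of p-gram lengths (illustrative)."""
    return sum(spectrum_kernel(s, t, p) for p in range(p_min, p_max + 1))


def normalized(kernel, s, t, *args):
    """Scale a kernel value so that every string has self-similarity 1."""
    denom = sqrt(kernel(s, s, *args) * kernel(t, t, *args))
    return kernel(s, t, *args) / denom if denom else 0.0
```

A kernel matrix built from such pairwise values can then be passed to any kernel-based learner; no words, stems, or part-of-speech tags are needed, which is what makes the features language independent.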
Cristian Grozea; Marius Popescu
Determining the direction of plagiarism (who plagiarized whom in a given pair of documents) is one of the most interesting problems in the field of automatic plagiarism detection. We present here an approach that extends the Encoplot method, which won the 1st international competition on plagiarism detection in 2009. We have tested it on a large-scale corpus of artificial plagiarism, with good results.
Pattern Recognition Letters | 2015
Radu Tudor Ionescu; Marius Popescu
We describe an efficient algorithm to compute the PQ kernel in the dual form. We provide open source implementations of the PQ kernel in Matlab and C/C++. We leverage the use of the PQ kernel for large vocabularies. We present extensive object recognition experiments using various kernels. Computer vision researchers have developed various learning methods based on the bag of words model for image related tasks, including image categorization and image retrieval. In this model, images are represented as histograms of visual words from a vocabulary that is obtained by clustering local image descriptors. Next, a classifier is trained on the data. Most often, the learning method is a kernel-based one. Various kernels, such as the linear kernel, the intersection kernel, the χ² kernel or the Jensen-Shannon kernel, can be plugged into the kernel method. Recent results indicate that the novel PQ kernel of Ionescu and Popescu [8] seems to improve the accuracy over most of the state-of-the-art kernels. The PQ kernel is inspired by a set of rank correlation statistics specific to ordinal data, which are based on counting concordant and discordant pairs between two variables. This paper describes an algorithm to compute the PQ kernel in O(n log n) time, based on merge sort. Matlab and C/C++ implementations are provided for future development and use at http://pq-kernel.herokuapp.com. Extensive object recognition experiments are conducted to compare the PQ kernel with other state-of-the-art kernels on two benchmark data sets. The PQ kernel has the best results on both data sets, even when a spatial pyramid representation is used. In conclusion, the PQ kernel can be used to obtain a better pairwise similarity between visual word histograms, which, in turn, improves the object recognition accuracy of the bag of visual words system.
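The merge sort idea can be sketched as follows. This is not the authors' implementation; it is a generic illustration, in Python, of how concordant pairs (P) and discordant pairs (Q) between two histograms can be counted in O(n log n) time, in the way commonly used for rank correlation statistics. All function names are hypothetical, and because the abstract does not state how P and Q are combined into the kernel value, the sketch stops at the two counts.

```python
def count_inversions(values):
    """Return (sorted copy, number of pairs i < j with values[i] > values[j])."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, inv_left = count_inversions(values[:mid])
    right, inv_right = count_inversions(values[mid:])
    merged, inversions = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # ties are not inversions
            merged.append(left[i])
            i += 1
        else:                            # right[j] jumps over the rest of left
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def tied_pairs(sorted_keys):
    """Number of pairs with equal keys, given the keys in sorted order."""
    total, run = 0, 1
    for prev, cur in zip(sorted_keys, sorted_keys[1:]):
        if cur == prev:
            run += 1
        else:
            total += run * (run - 1) // 2
            run = 1
    return total + run * (run - 1) // 2


def concordant_discordant(x, y):
    """Count concordant (P) and discordant (Q) pairs in O(n log n)."""
    n = len(x)
    pairs = sorted(zip(x, y))            # sort by x, break ties by y
    _, q = count_inversions([b for _, b in pairs])
    ties_x = tied_pairs([a for a, _ in pairs])
    ties_y = tied_pairs(sorted(y))
    ties_xy = tied_pairs(pairs)
    # Pairs tied in neither variable, minus the discordant ones.
    p = n * (n - 1) // 2 - ties_x - ties_y + ties_xy - q
    return p, q


def brute_force(x, y):
    """O(n^2) reference count, useful for checking the fast version."""
    p = q = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            p += sign > 0
            q += sign < 0
    return p, q
```

Sorting lexicographically by (x, y) guarantees that pairs tied in x never appear as strict inversions in y, so the inversion count equals Q even when the histograms contain many equal bins.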
Artificial Intelligence Review | 2008
Florentina Hristea; Marius Popescu; Monica Dumitrescu
This paper aims to fully present a new word sense disambiguation method that was introduced in Hristea and Popescu (Fundam Inform 91(3–4):547–562, 2009) and has so far been tested in the case of adjectives (Hristea and Popescu in Fundam Inform 91(3–4):547–562, 2009) and verbs (Hristea in Int Rev Comput Softw 4(1):58–67, 2009). We hereby extend the method to the case of nouns and draw conclusions regarding its performance with respect to all these parts of speech. The method lies at the border between unsupervised and knowledge-based techniques. It performs unsupervised word sense disambiguation based on an underlying Naïve Bayes model, while using WordNet as a knowledge source for feature selection. The performance of the method is compared to that of previous approaches that rely on completely different feature sets. Test results for all involved parts of speech show that feature selection using a WordNet-type knowledge source is more effective in disambiguation than local-type features (like part-of-speech tags) are.
Computational Linguistics | 2016
Radu Tudor Ionescu; Marius Popescu; Aoife Cahill
The most common approach in text mining classification tasks is to rely on features like words, part-of-speech tags, stems, or some other high-level linguistic features. Recently, an approach that uses only character p-grams as features has been proposed for the task of native language identification (NLI). The approach obtained state-of-the-art results by combining several string kernels using multiple kernel learning. Despite the fact that the approach based on string kernels performs so well, several questions about this method remain unanswered. First, it is not clear why such a simple approach can compete with far more complex approaches that take words, lemmas, syntactic information, or even semantics into account. Second, although the approach is designed to be language independent, all experiments to date have been on English. This work is an extensive study that aims to systematically present the string kernel approach and to clarify the open questions mentioned above. A broad set of native language identification experiments were conducted to compare the string kernels approach with other state-of-the-art methods. The empirical results obtained in all of the experiments conducted in this work indicate that the proposed approach achieves state-of-the-art performance in NLI, reaching an accuracy that is 1.7% above the top scoring system of the 2013 NLI Shared Task. Furthermore, the results obtained on both the Arabic and the Norwegian corpora demonstrate that the proposed approach is language independent. In the Arabic native language identification task, string kernels show an increase of more than 17% over the best accuracy reported so far. The results of string kernels on Norwegian native language identification are also significantly better than the state-of-the-art approach.
In addition, in a cross-corpus experiment, the proposed approach shows that it can also be topic independent, improving the state-of-the-art system by 32.3%. To gain additional insights about the string kernels approach, the features selected by the classifier as being more discriminating are analyzed in this work. The analysis also offers information about localized language transfer effects, since the features used by the proposed model are p-grams of various lengths. The features captured by the model typically include stems, function words, and word prefixes and suffixes, which have the potential to generalize over purely word-based features. By analyzing the discriminating features, this article offers insights into two kinds of language transfer effects, namely, word choice (lexical transfer) and morphological differences. The goal of the current study is to give a full view of the string kernels approach and shed some light on why this approach works so well.
International Conference on Neural Information Processing | 2012
Liviu P. Dinu; Radu Tudor Ionescu; Marius Popescu
This paper introduces a new distance measure for images, called Local Patch Dissimilarity. The measure is inspired by rank distance, which is a distance measure for strings, and is based on patches. There are many other patch-based techniques used in image processing. Patches contain contextual information and have advantages in terms of generalization. An algorithm that computes the Local Patch Dissimilarity between two images is presented in this work. Experiments show that the extension of rank distance to images gives very good results in image classification, more precisely in handwritten digit recognition.
Fundamenta Informaticae | 2009
Florentina Hristea; Marius Popescu
The present paper extends a new word sense disambiguation method [9] to the case of adjectives. The method lies at the border between unsupervised and knowledge-based techniques. It performs unsupervised word sense disambiguation based on an underlying Naive Bayes model, while using WordNet as a knowledge source for feature selection. The proposed extension of the disambiguation method makes ample use of the WordNet semantic relations that are typical of adjectives. Its performance is compared to that of previous approaches that rely on completely different feature sets. Test results show that feature selection using a WordNet-type knowledge source is more effective in the disambiguation of adjective senses than local-type features (like part-of-speech tags) are.
Archive | 2016
Radu Tudor Ionescu; Marius Popescu
Knowledge Transfer between Computer Vision and Text Mining