Manuel Cebrián | Researchain

Publication

Featured researches published by Manuel Cebrián.

IEEE Transactions on Information Theory | 2007

The Normalized Compression Distance Is Resistant to Noise

Manuel Cebrián; Manuel Alfonseca; Alfonso Ortega

This correspondence studies the influence of noise on the normalized compression distance (NCD), a measure that uses compressors to compute the degree of similarity of two files. This influence is approximated by a first-order differential equation which gives rise to a complex effect, which in turn explains the fact, observed by other authors, that the NCD may give values greater than 1. The model is tested experimentally and shows good agreement. Finally, the influence of noise on the clustering of files of different types is explored, finding that the NCD performs well even in the presence of quite high noise levels.
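
As a rough illustration of the measure this abstract discusses, the NCD of two byte strings x and y is commonly defined as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C gives the compressed length. The sketch below assumes zlib as the compressor and uses a hypothetical `add_noise` helper that overwrites bytes at random; neither choice is taken from the paper, whose compressors and noise model may differ.

```python
import random
import zlib


def compressed_len(data: bytes) -> int:
    """Length of data after zlib compression (stand-in compressor, an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy, cxy = compressed_len(x), compressed_len(y), compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def add_noise(data: bytes, rate: float, seed: int = 0) -> bytes:
    """Hypothetical noise model: overwrite each byte with a random one with probability rate."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) if rng.random() < rate else b for b in data)


if __name__ == "__main__":
    text = b"the quick brown fox jumps over the lazy dog " * 50
    for rate in (0.0, 0.1, 0.5, 1.0):
        print(rate, round(ncd(text, add_noise(text, rate)), 3))
```

Because real compressors are imperfect, values slightly above 1 can appear for unrelated inputs, which is the effect the abstract refers to.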

IEEE Signal Processing Magazine | 2012

Modeling Dynamical Influence in Human Interaction: Using data to make better inferences about influence within social systems

Wei Pan; Wen Dong; Manuel Cebrián; Taemie Jung Kim; James H. Fowler; Alex Paul Pentland

How can we model influence between individuals in a social system, even when the network of interactions is unknown? In this article, we review the literature on the “influence model,” which utilizes independent time series to estimate how much the state of one actor affects the state of another actor in the system. We extend this model to incorporate dynamical parameters that allow us to infer how influence changes over time, and we provide three examples of how this model can be applied to simulated and real data. The results show that the model can recover known estimates of influence, it generates results that are consistent with other measures of social networks, and it allows us to uncover important shifts in the way states may be transmitted between actors at different points in time.

Principles and Practice of Constraint Programming | 2003

Redundant modeling for the QuasiGroup Completion Problem

Iván Dotú; Alvaro del Val; Manuel Cebrián

The Quasigroup Completion Problem (QCP) is a very challenging benchmark among combinatorial problems, and the focus of much recent interest in the area of constraint programming. [5] reports that QCPs of order 40 could not be solved by pure constraint programming approaches, but could sometimes be solved by hybrid approaches combining constraint programming with mixed integer programming techniques from operations research. In this paper, we show that the pure constraint satisfaction approach can solve many problems of order 45 in the transition phase, which corresponds to the peak of difficulty. Our solution combines a number of known ideas (the use of redundant modeling [3] with primal and dual models of the problem connected by channeling constraints [13]) with some novel aspects, as well as a new and very effective value ordering heuristic.
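
To make the problem statement concrete, here is a minimal sketch of quasigroup completion as plain backtracking search. It is not the paper's solver: the function name `complete_quasigroup`, the `col_of` and `row_of` tables (a loose analogue of dual models kept in sync with the primal grid), and the first-fail cell ordering are all assumptions made for illustration.

```python
from typing import List, Optional

Grid = List[List[Optional[int]]]


def complete_quasigroup(grid: Grid) -> Optional[Grid]:
    """Fill the None cells so every row and column holds each of 0..n-1 once."""
    n = len(grid)
    grid = [row[:] for row in grid]  # work on a copy
    # Dual views: col_of[r][v] is the column holding value v in row r,
    # row_of[c][v] is the row holding value v in column c (None if unplaced).
    col_of = [[None] * n for _ in range(n)]
    row_of = [[None] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            v = grid[r][c]
            if v is not None:
                if col_of[r][v] is not None or row_of[c][v] is not None:
                    return None  # the given clues already clash
                col_of[r][v], row_of[c][v] = c, r

    def candidates(r: int, c: int) -> List[int]:
        return [v for v in range(n) if col_of[r][v] is None and row_of[c][v] is None]

    def search() -> bool:
        empty = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is None]
        if not empty:
            return True
        # First-fail ordering: branch on the cell with the fewest candidates.
        r, c = min(empty, key=lambda rc: len(candidates(*rc)))
        for v in candidates(r, c):
            # Assign in the primal grid and mirror the choice in both dual views.
            grid[r][c], col_of[r][v], row_of[c][v] = v, c, r
            if search():
                return True
            grid[r][c], col_of[r][v], row_of[c][v] = None, None, None
        return False

    return grid if search() else None
```

A call such as `complete_quasigroup([[0, None, None], [None, 0, None], [None, None, None]])` returns a completed Latin square, or `None` when no completion exists. A search this naive would not be expected to reach the orders mentioned in the abstract.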

IEEE Transactions on Evolutionary Computation | 2009

Towards the Validation of Plagiarism Detection Tools by Means of Grammar Evolution

Manuel Cebrián; Manuel Alfonseca; Alfonso Ortega

Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been developed to help with the tedious and time-consuming task of detecting plagiarism, little has been done to assess their quality, because determining the real authorship of the whole submission corpus is practically impossible for markers. In this paper, we present a grammar evolution technique which generates benchmarks for testing plagiarism detection tools. Given a programming language, our technique generates a set of original solutions to an assignment, together with a set of plagiarisms of the former set which mimic the basic plagiarism techniques performed by students. The authorship of the submission corpus is predefined by the user, providing a base for the assessment and further comparison of copy-catching tools. We give empirical evidence of the suitability of our approach by studying the behavior of one advanced plagiarism detection tool (AC) on four benchmarks coded in APL2, generated with our technique.

Principles and Practice of Constraint Programming | 2008

Protein Structure Prediction with Large Neighborhood Constraint Programming Search

Iván Dotú; Manuel Cebrián; Pascal Van Hentenryck; Peter Clote

Protein structure prediction is regarded as a highly challenging problem both for the biology and for the computational communities. Many approaches have been developed in recent years, moving to increasingly complex lattice models or even off-lattice models. This paper presents a Large Neighborhood Search (LNS) to find the native state for the Hydrophobic-Polar (HP) model on the Face Centered Cubic (FCC) lattice or, in other words, a self-avoiding walk on the FCC lattice having a maximum number of H-H contacts. The algorithm starts with a tabu-search algorithm, whose solution is then improved by a combination of constraint programming and LNS. This hybrid algorithm improves earlier approaches in the literature over several well-known instances and demonstrates the potential of constraint-programming approaches for ab initio methods.
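
The objective described in this abstract can be sketched as a scoring function: given an HP sequence and a candidate walk on the FCC lattice, count the H-H pairs that sit on neighbouring lattice points without being consecutive in the chain. The code below covers only that evaluation step, not the tabu search or LNS; the function names and the coordinate convention (integer points whose 12 neighbours differ by ±1 in exactly two coordinates) are assumptions made for illustration.

```python
from itertools import product
from typing import List, Tuple

Point = Tuple[int, int, int]

# The 12 nearest-neighbour offsets of the FCC lattice:
# two coordinates are +1 or -1 and the remaining one is 0.
FCC_NEIGHBOURS = [v for v in product((-1, 0, 1), repeat=3)
                  if sorted(map(abs, v)) == [0, 1, 1]]


def offset(p: Point, q: Point) -> Point:
    """Vector from lattice point p to lattice point q."""
    return tuple(b - a for a, b in zip(p, q))


def is_self_avoiding_walk(coords: List[Point]) -> bool:
    """True if consecutive points are FCC neighbours and no point repeats."""
    steps_ok = all(offset(p, q) in FCC_NEIGHBOURS for p, q in zip(coords, coords[1:]))
    return steps_ok and len(set(coords)) == len(coords)


def hh_contacts(sequence: str, coords: List[Point]) -> int:
    """Count H-H pairs that are lattice neighbours but not adjacent in the chain."""
    count = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                if offset(coords[i], coords[j]) in FCC_NEIGHBOURS:
                    count += 1
    return count
```

A search procedure would then look for the self-avoiding walk that maximizes `hh_contacts` for a given sequence.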

arXiv: Information Theory | 2008

Evaluating the Impact of Information Distortion on Normalized Compression Distance

Ana Granados; Manuel Cebrián; David Camacho; Francisco de Borja Rodríguez

Congress on Evolutionary Computation | 2007

A simple genetic algorithm for music generation by means of algorithmic information theory

Manuel Alfonseca; Manuel Cebrián; Alfonso Ortega

Recent large-scale experiments have shown that the normalized information distance, an algorithmic information measure, is among the best similarity metrics for melody classification. This paper proposes the use of this distance as a fitness function which may be used by genetic algorithms to automatically generate music in a given pre-defined style. The minimization of this distance of the generated music to a set of musical guides makes it possible to obtain computer-generated music which recalls the style of a certain human author. The recombination operator plays an important role in this problem and thus several variations are tested to fine-tune the genetic algorithm for this application. The superiority of the relative pitch envelope over other music parameters, such as the lengths of the notes, led us to develop a simplified algorithm that nevertheless obtains interesting results.
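
The idea of using a compression-based distance as a fitness function can be sketched as follows. This is a toy, not the paper's algorithm: melodies are assumed to be byte strings, the normalized information distance is approximated by an NCD computed with zlib, and the operators in the hypothetical `evolve` function (truncation selection, one-point recombination, point mutation) are generic choices.

```python
import random
import zlib
from typing import List


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, with zlib as an assumed compressor."""
    def c(data: bytes) -> int:
        return len(zlib.compress(data, 9))
    return (c(x + y) - min(c(x), c(y))) / max(c(x), c(y))


def fitness(melody: bytes, guides: List[bytes]) -> float:
    """Mean NCD from a candidate melody to the guide pieces (lower is better)."""
    return sum(ncd(melody, g) for g in guides) / len(guides)


def evolve(guides: List[bytes], length: int, pop_size: int = 20,
           generations: int = 30, seed: int = 0) -> bytes:
    """Toy genetic algorithm that minimizes the mean NCD to the guides."""
    rng = random.Random(seed)
    pop = [bytes(rng.randrange(256) for _ in range(length)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, guides))
        parents = pop[: pop_size // 2]  # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point recombination
            child = bytearray(a[:cut] + b[cut:])
            child[rng.randrange(length)] = rng.randrange(256)  # point mutation
            children.append(bytes(child))
        pop = parents + children
    return min(pop, key=lambda m: fitness(m, guides))
```

Swapping in a different recombination operator only requires changing the two lines that build `child`, which is where variations like those mentioned in the abstract would be tried.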

Genetic and Evolutionary Computation Conference | 2007

Automatic generation of benchmarks for plagiarism detection tools using grammatical evolution

Manuel Cebrián; Manuel Alfonseca; Alfonso Ortega

Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been implemented to help with the tedious and time-consuming task of detecting plagiarism, little has been done to assess their quality, because determining the original subset of the whole solution set is practically impossible for graders. In this article we present a Grammatical Evolution technique which generates benchmarks. Given a programming language, our technique generates a set of original solutions to an assignment, together with a set of plagiarisms of the former set which mimic the way in which students act. The phylogeny of the coded solutions is predefined, providing a base for the evaluation of the performance of copy-catching tools. We give empirical evidence of the suitability of our approach by studying the behavior of one state-of-the-art detection tool (AC) on four benchmarks coded in APL2, generated with this technique.

Journal of Statistical Mechanics: Theory and Experiment | 2006

Detecting translations of the same text and data with common source

Kostadin Koroutchev; Manuel Cebrián

Compression-based similarity distances have the main drawback of needing the same coding scheme for the objects to be compared. In some situations, there exists significant similarity with no literal shared information: text translations, different coding schemes, etc. To overcome this problem, we present a similarity measure that compares the redundancy structure of the data extracted by means of a Lempel–Ziv compression scheme. Each text is represented as a graph in which vertices are text positions and edges represent shared information; with our measure, two texts are similar if they have the same referential topology when compressed. In this paper we give empirical evidence and a phenomenological explanation that this new measure is a robust indicator, detecting similarity between data coded in different languages. We also consider textual data without any structure, but with a common source, and find that we can detect such data and distinguish this situation from the previous one.

Information Theory Workshop | 2008

Contextual information retrieval based on algorithmic information theory and statistical outlier detection

Ramírez Martínez; Manuel Cebrián; F. de Borja Rodriguez; David Camacho

This work presents an Information Retrieval technique based on algorithmic information theory (using the normalized compression distance), statistical data outlier detection, and a novel database structure. The paper shows how they can all be integrated to retrieve information from generic databases using long text-based queries. Two important problems are addressed. On the one hand, we analyze and try to solve the detection of a particular case of false positives: when the distance between two documents is outlyingly low but there is no actual similarity. On the other hand, we propose a way to structure the database such that the similarity distance estimation scales well with the length of the query. All design choices are justified with an experimental evaluation.
