Julián Urbano | Researchain

Publication

Featured research published by Julián Urbano.

Intelligent Information Systems | 2013

The neglected user in music information retrieval research

Markus Schedl; Arthur Flexer; Julián Urbano

Personalization and context-awareness are highly important topics in research on Intelligent Information Systems. In the fields of Music Information Retrieval (MIR) and Music Recommendation in particular, user-centric algorithms should ideally provide music that perfectly fits each individual listener in each imaginable situation and for each of her information or entertainment needs. Even though preliminary steps towards such systems have recently been presented at the “International Society for Music Information Retrieval Conference” (ISMIR) and at similar venues, this vision is still far from becoming a reality. In this article, we investigate and discuss literature on the topic of user-centric music retrieval and reflect on why the breakthrough in this field has not been achieved yet. Given the authors' different areas of expertise, we shed light on why this topic is a particularly challenging one, taking computer science and psychology points of view. Whereas the computer science aspect centers on the problems of user modeling, machine learning, and evaluation, the psychological discussion is mainly concerned with proper experimental design and interpretation of the results of an experiment. We further present our ideas on aspects crucial to consider when elaborating user-aware music retrieval systems.

Computer Standards & Interfaces | 2013

Named Entity Recognition: Fallacies, challenges and opportunities

Mónica Marrero; Julián Urbano; Sonia Sanchez-Cuadrado; Jorge Morato; Juan Miguel Gómez-Berbís

Named Entity Recognition serves as the basis for many other areas in Information Management. However, it is unclear what the meaning of Named Entity is, and yet there is a general belief that Named Entity Recognition is a solved task. In this paper we analyze the evolution of the field from a theoretical and practical point of view. We argue that the task is actually far from solved and show the consequences for the development and evaluation of tools. We discuss topics for further research with the goal of bringing the task back to the research scenario.

Intelligent Information Systems | 2013

Evaluation in Music Information Retrieval

Julián Urbano; Markus Schedl; Xavier Serra

The field of Music Information Retrieval has always acknowledged the need for rigorous scientific evaluations, and several efforts have set out to develop and provide the infrastructure, technology and methodologies needed to carry out these evaluations. The community has enormously gained from these evaluation forums, but we have reached a point where we are stuck with evaluation frameworks that do not allow us to improve as much and as well as we want. The community recently acknowledged this problem and showed interest in addressing it, though it is not clear what to do to improve the situation. We argue that a good place to start is again the Text IR field. Based on a formalization of the evaluation process, this paper presents a survey of past evaluation work in the context of Text IR, from the point of view of validity, reliability and efficiency of the experiments. We show the problems that our community currently has in terms of evaluation, point to several lines of research to improve it and make various proposals in that line.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013

On the measurement of test collection reliability

Julián Urbano; Mónica Marrero; Diego Martín

The reliability of a test collection is proportional to the number of queries it contains. But building a collection with many queries is expensive, so researchers have to find a balance between reliability and cost. Previous work on the measurement of test collection reliability relied on data-based approaches that contemplated random what-if scenarios, and provided indicators such as swap rates and Kendall tau correlations. Generalizability Theory was proposed as an alternative founded on analysis of variance that provides reliability indicators based on statistical theory. However, these reliability indicators are hard to interpret in practice, because they do not correspond to well-known indicators like Kendall tau correlation. We empirically established these relationships based on data from over 40 TREC collections, thus filling the gap in the practical interpretation of Generalizability Theory. We also review the computation of these indicators, and show that they are extremely dependent on the sample of systems and queries used, so much so that the required number of queries to achieve a certain level of reliability can vary by orders of magnitude. We discuss the computation of confidence intervals for these statistics, providing a much more reliable tool to measure test collection reliability. Reflecting upon all these results, we review a wealth of TREC test collections, arguing that they are possibly not as reliable as generally accepted and that the common choice of 50 queries is insufficient even for stable rankings.

Computer Music Modeling and Retrieval | 2010

Melodic similarity through shape similarity

Julián Urbano; Juan Llorens; Jorge Morato; Sonia Sanchez-Cuadrado

We present a new geometric model to compute the melodic similarity of symbolic musical pieces. Melodies are represented as splines in the pitch-time plane, and their similarity is computed as the similarity of their shape. The model is very intuitive and it is transposition and time scale invariant. We have implemented it with a local alignment algorithm over sequences of n-grams that define spline spans. An evaluation with the MIREX 2005 collections shows that the model performs very well, obtaining the best effectiveness scores ever reported for these collections. Three systems based on this new model were evaluated in MIREX 2010, and the three systems obtained the best results.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013

A comparison of the optimality of statistical significance tests for information retrieval evaluation

Julián Urbano; Mónica Marrero; Diego Martín

Previous research has suggested the permutation test as the theoretically optimal statistical significance test for IR evaluation, and advocated for the discontinuation of the Wilcoxon and sign tests. We present a large-scale study comprising nearly 60 million system comparisons showing that in practice the bootstrap, t-test and Wilcoxon test outperform the permutation test under different optimality criteria. We also show that actual error rates seem to be lower than the theoretically expected 5%, further confirming that we may actually be underestimating significance.

Information Retrieval | 2016

Test collection reliability: a study of bias and robustness to statistical assumptions via stochastic simulation

Julián Urbano

The number of topics that a test collection contains has a direct impact on how well the evaluation results reflect the true performance of systems. However, large collections can be prohibitively expensive, so researchers are bound to balance reliability and cost. This issue arises when researchers have an existing collection and they would like to know how much they can trust their results, and also when they are building a new collection and they would like to know how many topics it should contain before they can trust the results. Several measures have been proposed in the literature to quantify the accuracy of a collection to estimate the true scores, as well as different ways to estimate the expected accuracy of hypothetical collections with a certain number of topics. We can find ad-hoc measures such as Kendall tau correlation and swap rates, and statistical measures such as statistical power and indexes from generalizability theory. Each measure focuses on different aspects of evaluation, has a different theoretical basis, and makes a number of assumptions that are not met in practice, such as normality of distributions, homoscedasticity, uncorrelated effects and random sampling. However, how good these estimates are in practice remains a largely open question. In this paper we first compare measures and estimators of test collection accuracy and propose unbiased statistical estimators of the Kendall tau and tau AP correlation coefficients. Second, we detail a method for stochastic simulation of evaluation results under different statistical assumptions, which can be used for a variety of evaluation research where we need to know the true scores of systems. Third, through large-scale simulation from TREC data, we analyze the bias of a range of estimators of test collection accuracy. Fourth, we analyze the robustness to statistical assumptions of these estimators, in order to understand what aspects of an evaluation are affected by what assumptions and guide in the development of new collections and new measures. All the results in this paper are fully reproducible with data and code available online.

Exploiting Semantic Annotations in Information Retrieval | 2010

On the definition of patterns for semantic annotation

Mónica Marrero; Julián Urbano; Jorge Morato; Sonia Sanchez-Cuadrado

The semantic annotation of documents is an additional advantage for retrieval, as long as the annotations and their maintenance process scale well. Automatic or semi-automatic annotation tools help in this matter with the use of patterns. In this paper we analyze the advantages of creating these patterns with standard web languages, as well as the requirements they should meet. We adopt the Speech Recognition Grammar Specification, by the W3C, initially intended for speech recognition in the Web. Our objective is to achieve its full adaptation to the information extraction processes, exploiting its powerful recognition, reuse and flexibility capabilities.

ACM Symposium on Applied Computing | 2013

A study of COTS integration projects: product characteristics, organization, and life cycle models

Katerina Megas; Gabriella Belli; William B. Frakes; Julián Urbano; Reghu Anguswamy

We present a descriptive and exploratory study of factors that can affect the success of COTS-based systems. Based on a review of the literature and industrial experience, the choice of life cycle model and the amount of glueware required were hypothesized as the main factors in predicting project success. In this study we examined the relationship between different life cycle models and COTS integration project success. Two life cycle models were studied: the sequential model and the iterative model. Seven subjects from six industrial organizations responded to a survey providing data on 23 COTS integration projects. While there was variability between iterative and sequential projects on a variety of organizational and product factors, little difference was found between the life cycle models on the success criteria of projects (i.e. being on time, meeting requirements and being within budget). We found that projects that met two or three of the success criteria had significantly higher scores on project characteristics (organizational plus product) than those meeting none or just one.

Conference on Information and Knowledge Management | 2010

Crawling the web for structured documents

Julián Urbano; Juan Lloréns; Yorgos Andreadakis; Mónica Marrero

Structured Information Retrieval has gained a lot of interest in recent years, as this kind of information is becoming an invaluable asset for professional communities such as Software Engineering. Most of the research has focused on XML documents, with initiatives like INEX to bring together and evaluate new techniques focused on structured information. Although XML documents are the immediate choice, the Web is filled with several other types of structured information, which account for millions of other documents. These documents may be collected directly using standard Web search engines like Google and Yahoo, or following specific search patterns in online repositories like SourceForge. This demo describes a distributed and focused web crawler for any kind of structured documents, and we show with it how to exploit general-purpose resources to gather large amounts of real-world structured documents off the Web. This kind of tool could help build large test collections of other types of documents, such as Java source code for software-oriented search engines or RDF for semantic searching.
