Publication


Featured research published by Luigi Di Caro.


Proceedings of the Tenth International Workshop on Multimedia Data Mining | 2010

Emerging topic detection on Twitter based on temporal and social terms evaluation

Mario Cataldi; Luigi Di Caro; Claudio Schifanella

Twitter is a user-generated content system that allows its users to share short text messages, called tweets, for a variety of purposes, including daily conversations, URL sharing and news. Given its worldwide network of users of every age and social condition, it acts as a ground-level news portal whose principal advantage is its impressively short response time. In this paper we recognize this primary role of Twitter and propose a novel topic detection technique that retrieves, in real time, the most emergent topics expressed by the community. First, we extract the contents (sets of terms) of the tweets and model the term life cycle according to a novel aging theory intended to mine the emerging ones. A term can be defined as emerging if it occurs frequently in the specified time interval and was relatively rare in the past. Moreover, since the importance of a piece of content also depends on its source, we analyze the social relationships in the network with the well-known PageRank algorithm in order to determine the authority of the users. Finally, we leverage a navigable topic graph which connects the emerging terms with other semantically related keywords, allowing the detection of emerging topics under user-specified time constraints. We provide different case studies which show the validity of the proposed approach.
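The authority step described above can be sketched as a plain iterative PageRank over a follower graph. The adjacency representation and the toy network below are assumptions for illustration, not the paper's actual data model:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Iterative PageRank over a directed graph.

    `graph` maps each user to the list of users they point to; all
    targets are assumed to appear as keys of the dict as well."""
    nodes = list(graph)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u, targets in graph.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:  # dangling node: spread its rank uniformly
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
        rank = new_rank
    return rank

# toy network: b and c point to a, c also points to b
authority = pagerank({"a": [], "b": ["a"], "c": ["a", "b"]})
```

Users pointed to by many (or by authoritative) peers accumulate rank, and that rank can then be used to weight the terms they emit.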


Management of Emergent Digital Ecosystems | 2009

CoSeNa: a context-based search and navigation system

Mario Cataldi; Claudio Schifanella; K. Selçuk Candan; Maria Luisa Sapino; Luigi Di Caro

Most existing document and web search engines rely on keyword-based queries. To find matches, these queries are processed using retrieval algorithms that rely on word frequencies, topic recentness, document authority, and (in some cases) available ontologies. In this paper, we propose an innovative approach to exploring text collections using a novel keywords-by-concepts (KbC) graph, which supports navigation using domain-specific concepts as well as the keywords that characterize the text corpus. The KbC graph is a weighted graph, created by tightly integrating keywords extracted from documents and concepts obtained from domain taxonomies. Documents in the corpus are associated with the nodes of the graph based on evidence supporting contextual relevance; thus, the KbC graph supports contextually informed access to these documents. We also present the CoSeNa (Context-based Search and Navigation) system, which leverages the KbC model as the basis for document exploration and retrieval as well as contextually informed media integration.
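As a rough illustration of how such a weighted graph might be assembled, the toy sketch below combines document co-occurrence edges between keywords with edges from a domain taxonomy. The weighting scheme and data shapes are hypothetical, not CoSeNa's actual construction:

```python
from collections import Counter
from itertools import combinations

def build_kbc_graph(documents, taxonomy):
    """Toy keywords-by-concepts graph: keyword-keyword edges weighted by
    document co-occurrence, concept-keyword edges taken from a taxonomy.
    Edges are stored as sorted pairs so (a, b) and (b, a) coincide."""
    edges = Counter()
    for doc_terms in documents:
        for a, b in combinations(sorted(set(doc_terms)), 2):
            edges[(a, b)] += 1                        # co-occurrence weight
    for concept, keywords in taxonomy.items():
        for kw in keywords:
            edges[tuple(sorted((concept, kw)))] += 1  # taxonomy link
    return edges

docs = [["court", "ruling"], ["court", "law"], ["court", "ruling"]]
tax = {"LEGAL": ["court", "law"]}
g = build_kbc_graph(docs, tax)
```

Navigation then amounts to following the heaviest edges out of the node a user is currently inspecting.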


Knowledge Discovery and Data Mining | 2010

Analyzing the role of dimension arrangement for data visualization in Radviz

Luigi Di Caro; Vanessa Frias-Martinez; Enrique Frias-Martinez

The Radial Coordinate Visualization (Radviz) technique has been widely used to effectively evaluate the existence of patterns in highly dimensional data sets. A crucial aspect of this technique lies in the arrangement of the dimensions, which determines the quality of the resulting visualization. Dimension arrangement (DA) has been shown to be an NP-complete problem, and different heuristics have been proposed to solve it using optimization techniques. However, very little work has focused on understanding the relation between the arrangement of the dimensions and the quality of the visualization. In this paper we first present two variations of the DA problem: (1) a Radviz-independent approach and (2) a Radviz-dependent approach. We then describe the use of the Davies-Bouldin index to automatically evaluate the quality of a visualization, i.e., its visual usefulness. Our empirical evaluation is extensive and uses both real and synthetic data sets in order to evaluate our proposed methods and to fully understand the impact that parameters such as the number of samples, dimensions, or cluster separability have on the relation between the optimization algorithm and the visualization tool.
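For readers unfamiliar with Radviz, the projection itself is simple: each dimension becomes an anchor on the unit circle, and every record is pulled toward the anchors in proportion to its normalized values. A minimal sketch (the function name and data shapes are ours, not the paper's):

```python
import math

def radviz_project(rows, order):
    """Project normalized rows (values in [0, 1]) onto 2-D Radviz
    coordinates. `order` is the arrangement of the dimensions around
    the circle, which is exactly what the DA problem optimizes."""
    m = len(order)
    anchors = [(math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m))
               for j in range(m)]
    points = []
    for row in rows:
        weights = [row[d] for d in order]
        total = sum(weights) or 1.0      # avoid division by zero
        x = sum(w * ax for w, (ax, _) in zip(weights, anchors)) / total
        y = sum(w * ay for w, (_, ay) in zip(weights, anchors)) / total
        points.append((x, y))
    return points

# a record dominated by dimension 0 is pulled toward that anchor (1, 0);
# a record with equal values on all dimensions lands at the center
pts = radviz_project([{0: 1.0, 1: 0.0, 2: 0.0},
                      {0: 1.0, 1: 1.0, 2: 1.0}], order=[0, 1, 2])
```

Changing `order` moves the anchors and therefore reshapes the whole scatter, which is why the arrangement matters so much for cluster visibility.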


Artificial Intelligence and Law | 2016

Eunomos, a legal document and knowledge management system for the Web to provide relevant, reliable and up-to-date information on the law

Guido Boella; Luigi Di Caro; Llio Humphreys; Livio Robaldo; Piercarlo Rossi; Leendert W. N. van der Torre

This paper describes the Eunomos software, an advanced legal document and knowledge management system based on legislative XML and ontologies. We describe the challenges of legal research in an increasingly complex, multi-level and multi-lingual world, and how Eunomos helps users cut through the information overload to get the legal information they need in an organized and structured way, keeping track of the state of the relevant law on any given topic. Using NLP tools to semi-automate the lower-skill tasks makes this ambitious project a realistic commercial prospect, as it helps keep costs down while allowing greater coverage. We describe the core system from workflow and technical perspectives, and discuss applications of the system for various user groups.


Knowledge Discovery and Data Mining | 2008

Using tagFlake for condensing navigable tag hierarchies from tag clouds

Luigi Di Caro; K. Selçuk Candan; Maria Luisa Sapino

We present the tagFlake system, which supports semantically informed navigation within a tag cloud. tagFlake relies on TMine for organizing tags extracted from textual content into hierarchies suitable for navigation, visualization, classification, and tracking. TMine extracts the most significant tags/terms from text documents and maps them onto a hierarchy in such a way that descendant terms are contextually dependent on their ancestors within the given corpus of documents. This provides tagFlake with a mechanism for navigating the tag space and for classifying text documents based on the contextual structure captured by the created hierarchy. tagFlake is language neutral and unsupervised, since it does not rely on any natural language processing technique.
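A toy version of the contextual-dependence idea can be sketched by placing a term under a more frequent term whose documents largely contain it. The containment threshold and all names below are made up for illustration and are not TMine's actual algorithm:

```python
def build_term_hierarchy(doc_terms, terms):
    """Toy term hierarchy: term B is placed under term A when B's
    documents are mostly a subset of A's documents, i.e. B is
    contextually dependent on the more general A."""
    docs_of = {t: {i for i, d in enumerate(doc_terms) if t in d}
               for t in terms}
    parent = {}
    for b in terms:
        best = None
        for a in terms:
            if a == b or not docs_of[b]:
                continue
            containment = len(docs_of[b] & docs_of[a]) / len(docs_of[b])
            # candidate parent: strictly more frequent, covers most of b
            if containment >= 0.8 and len(docs_of[a]) > len(docs_of[b]):
                if best is None or len(docs_of[a]) < len(docs_of[best]):
                    best = a  # prefer the most specific qualifying parent
        parent[b] = best
    return parent

docs = [{"data", "mining"}, {"data", "mining"}, {"data"}, {"data", "viz"}]
tree = build_term_hierarchy(docs, ["data", "mining", "viz"])
```

In this toy corpus both "mining" and "viz" only ever occur alongside "data", so they become its children, while "data" itself has no parent.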


ACM Transactions on Intelligent Systems and Technology | 2013

Personalized emerging topic detection based on a term aging model

Mario Cataldi; Luigi Di Caro; Claudio Schifanella

Twitter is a popular microblogging service that acts as a ground-level news portal where people with different backgrounds, ages, and social conditions provide information about what is happening in front of their eyes. This characteristic makes Twitter probably the fastest information service in the world. In this article, we recognize this role of Twitter and propose a novel, user-aware topic detection technique that retrieves, in real time, the most emerging topics of discussion expressed by the community within the interests of specific users. First, we analyze the topology of Twitter, looking at how information spreads over the network and taking into account the authority/influence of each active user. Then, we make use of a novel term aging model to compute the burstiness of each term, and provide a graph-based method to retrieve the minimal set of terms that can represent the corresponding topic. Finally, since any user can have topic preferences inferable from the shared content, we leverage such knowledge to highlight the most emerging topics within her foci of interest. As evaluation, we provide several experiments together with a user study, proving the validity and reliability of the proposed approach.
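The burstiness intuition (frequent now, rare before) can be sketched as a comparison between a term's current authority-weighted frequency and a decayed historical baseline. This is a simplification of the paper's aging model, with hypothetical names and a made-up decay constant:

```python
def burstiness(nutrition_history, decay=0.6):
    """Toy aging model: compare a term's current 'nutrition'
    (authority-weighted frequency in the latest time slot) against a
    decayed average of its past slots. Positive values suggest the
    term is emerging; the real model in the paper is richer."""
    *past, current = nutrition_history
    if not past:
        return current
    baseline = 0.0
    weight = 1.0
    total_weight = 0.0
    for value in reversed(past):   # recent slots count more
        baseline += weight * value
        total_weight += weight
        weight *= decay
    baseline /= total_weight
    return current - baseline

# a term spiking now scores high; a term fading away scores negative
rising = burstiness([1, 1, 8])
fading = burstiness([8, 8, 1])
```

The per-slot "nutrition" values would themselves come from term frequencies weighted by the authority of the users who posted them.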


Intelligent Information Systems | 2014

Learning from syntax generalizations for automatic semantic annotation

Guido Boella; Luigi Di Caro; Alice Ruggeri; Livio Robaldo

Nowadays, there is a huge amount of textual data coming from online social communities like Twitter or encyclopedic data provided by Wikipedia and similar platforms. This Big Data era has created novel challenges that must be faced in order to make sense of large data storages, as well as to efficiently find specific information within them. In a more domain-specific scenario like the management of legal documents, the extraction of semantic knowledge can support domain engineers in finding relevant information more rapidly, and can provide assistance within the process of constructing application-based legal ontologies. In this work, we face the problem of automatically extracting structured knowledge to improve semantic search and ontology creation on textual databases. To achieve this goal, we propose an approach that first relies on well-known Natural Language Processing techniques like Part-Of-Speech tagging and syntactic parsing. Then, we transform this information into generalized features that aim at capturing the surrounding linguistic variability of the target semantic units. These new featured data are finally fed into a Support Vector Machine classifier that computes a model to automate the semantic annotation. We first tested our technique on the problem of automatically extracting semantic entities and involved objects within legal texts. We then focused on the identification of hypernym relations and definitional sentences, demonstrating the validity of the approach on different tasks and domains.
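The feature-generalization step (abstracting away the lexical forms around a target term so the classifier sees recurring syntactic shapes rather than specific words) can be sketched as follows. The tag set and window size are illustrative, and the downstream SVM training is omitted:

```python
def generalize_context(tokens, pos_tags, target_index, window=2):
    """Replace the words around a target term with their POS tags,
    keeping the target as a placeholder. Out-of-bounds positions are
    padded so every example has a fixed-length feature vector."""
    features = []
    for i in range(target_index - window, target_index + window + 1):
        if i == target_index:
            features.append("<TARGET>")
        elif 0 <= i < len(tokens):
            features.append(pos_tags[i])  # abstract away the word itself
        else:
            features.append("<PAD>")
    return features

tokens = ["the", "judge", "issued", "a", "ruling"]
pos = ["DET", "NOUN", "VERB", "DET", "NOUN"]
feats = generalize_context(tokens, pos, target_index=2)
```

Vectors like these (e.g. via one-hot encoding) would then be fed to an off-the-shelf SVM to learn which syntactic contexts mark a semantic unit.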


International Conference on Artificial Intelligence and Law | 2015

Linking legal open data: breaking the accessibility and language barrier in European legislation and case law

Guido Boella; Luigi Di Caro; Michele Graziadei; Loredana Cupi; Carlo Emilio Salaroglio; Llio Humphreys; Hristo Konstantinov; Kornel Marko; Livio Robaldo; Claudio Ruffini; Kiril Simov; Andrea Violato; Veli Stroetmann

In this paper we describe how the EUCases FP7 project is addressing the problem of lifting Legal Open Data to Linked Open Data in order to develop new applications for the legal information provision market, by enriching documents structurally (first of all with navigable references among legal texts) and semantically (with concepts from ontologies and classifications). We first describe the social and economic need for breaking the accessibility barrier in legal information in the EU, then we describe the technological challenges, and finally we explain how the EUCases project is addressing them through a combination of Human Language Technologies.


Rules and Rule Markup Languages for the Semantic Web | 2013

Semantic relation extraction from legislative text using generalized syntactic dependencies and support vector machines

Guido Boella; Luigi Di Caro; Livio Robaldo

In this paper we present a technique to automatically extract semantic knowledge from legislative text. Instead of using pattern-matching methods relying on lexico-syntactic patterns, we propose a technique which uses syntactic dependencies between terms extracted with a syntactic parser. The idea is that syntactic information is more robust than pattern-matching approaches when facing the length and complexity of the sentences. Relying on a manually annotated legislative corpus, we transform all the syntax surrounding the semantic information into abstract textual representations, which are then used to create a classification model by means of a standard Support Vector Machine system. In this work, we initially focus on three different semantic tags, achieving very high accuracy levels on two of them, demonstrating both the limits and the validity of the approach.


Scientometrics | 2012

The d-index: Discovering dependences among scientific collaborators from their bibliographic data records

Luigi Di Caro; Mario Cataldi; Claudio Schifanella

The evaluation of a researcher's work and its impact on the research community has been deeply studied in the literature through the definition of several measures, first among them the h-index and its variations. Although these measures represent valuable tools for analyzing researchers' outputs, they usually assume co-authorship to be a proportional collaboration between the parties, missing out their relationships and relative scientific influences. In this work, we propose the d-index, a novel measure that estimates the degree of dependence between authors and their research environment along their entire scientific publication history. We also present a web application that implements these ideas and provides a number of visualization tools for analyzing and comparing scientific dependences among all the scientists in the DBLP bibliographic database. Finally, relying on this web environment, we present case and user studies that highlight both the validity and the reliability of the proposed evaluation measure.
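The basic intuition behind a dependence measure of this kind can be sketched as an asymmetric co-authorship ratio: how much of one author's output was produced together with a specific collaborator. This is a deliberate simplification for illustration, not the published d-index formula:

```python
def dependence(papers, author, coauthor):
    """Toy dependence score: the fraction of `author`'s papers written
    together with `coauthor`. The score is asymmetric by construction,
    which is the key property the d-index builds on."""
    own = [p for p in papers if author in p]
    if not own:
        return 0.0
    shared = [p for p in own if coauthor in p]
    return len(shared) / len(own)

# A wrote 2 papers, both with B; B wrote 4 papers, only 2 with A
pubs = [{"A", "B"}, {"A", "B"}, {"B", "C"}, {"B"}]
d_ab = dependence(pubs, "A", "B")
d_ba = dependence(pubs, "B", "A")
```

Here A appears fully dependent on B while B is only partially dependent on A, which is exactly the asymmetry that a proportional co-authorship assumption would miss.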

Collaboration


Dive into Luigi Di Caro's collaborations.

Top Co-Authors

Livio Robaldo

University of Luxembourg

Llio Humphreys

University of Luxembourg
