Featured Research

Social And Information Networks

A Visual Analytics Framework for Contrastive Network Analysis

A common network analysis task is to compare two networks to identify unique characteristics of one network with respect to the other. For example, when comparing protein interaction networks derived from normal and cancer tissues, one essential task is to discover protein-protein interactions unique to cancer tissues. However, this task is challenging when the networks contain complex structural (and semantic) relations. To address this problem, we design ContraNA, a visual analytics framework that leverages both the power of machine learning for uncovering unique characteristics in networks and the effectiveness of visualization for understanding such uniqueness. The basis of ContraNA is cNRL, which integrates two machine learning schemes, network representation learning (NRL) and contrastive learning (CL), to generate a low-dimensional embedding that reveals the uniqueness of one network when compared to another. ContraNA provides an interactive visualization interface that helps analyze this uniqueness by relating embedding results to network structures and by explaining the features learned by cNRL. We demonstrate the usefulness of ContraNA with two case studies using real-world datasets. We also evaluate ContraNA through a controlled user study with 12 participants performing network comparison tasks. The results show that participants were able to effectively identify unique characteristics in complex networks and to interpret the results obtained from cNRL.
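As a rough illustration of the contrastive step, the sketch below applies contrastive PCA, one common instantiation of contrastive learning for this kind of comparison, to node-feature matrices such as an NRL step might produce. The abstract does not spell out cNRL's exact components, so the function, its alpha parameter, and the variable names are illustrative assumptions rather than ContraNA's implementation.

import numpy as np

def contrastive_pca(target, background, n_components=2, alpha=1.0):
    # Find directions with high variance in `target` but low variance in
    # `background`; alpha trades off the two covariance terms.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    c = t.T @ t / len(t) - alpha * (b.T @ b / len(b))
    eigvals, eigvecs = np.linalg.eigh(c)          # ascending eigenvalues
    directions = eigvecs[:, ::-1][:, :n_components]
    return t @ directions                         # embedding of target nodes

# e.g., embedding = contrastive_pca(cancer_node_features, normal_node_features)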

Social And Information Networks

A comparative analysis of knowledge acquisition performance in complex networks

Discovery processes have been an important topic in network science. The exploration of nodes can be understood as a knowledge acquisition process taking place on the network, where nodes represent concepts and edges are the semantic relationships between concepts. While some studies have analyzed the performance of the knowledge acquisition process in particular network topologies, here we performed a systematic performance analysis across well-known dynamics and topologies. Several interesting results have been found. Overall, all learning curves displayed the same shape, with different speed rates. We also found ambiguities in the feature space describing the learning curves, meaning that the same knowledge acquisition curve can be generated by different combinations of network topology and dynamics. A surprising example of such patterns is the learning curves obtained from random and Waxman networks: despite their very distinct global structures, several curves from the two models turned out to be similar. All in all, our results suggest that different learning strategies can lead to the same learning performance. From the network reconstruction point of view, however, this means that the learning curves of observed sequences should be combined with other sequence features if one aims to infer network topology from observed sequences.
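One common way to instantiate such a dynamic is a random walk whose learning curve counts the distinct concepts discovered over time. The minimal sketch below compares this curve on a random (Erdos-Renyi) graph and a Waxman graph, as in the text; the study's exact dynamics and parameters are not given here, so these choices are assumptions.

import random
import networkx as nx

def learning_curve(graph, steps=1000, seed=0):
    # Distinct nodes ("concepts") discovered after each random-walk step.
    rng = random.Random(seed)
    node = rng.choice(list(graph.nodes))
    seen, curve = {node}, [1]
    for _ in range(steps):
        neighbors = list(graph.neighbors(node))
        if not neighbors:          # dead end: stop the walk early
            break
        node = rng.choice(neighbors)
        seen.add(node)
        curve.append(len(seen))
    return curve

er = nx.erdos_renyi_graph(500, 0.02, seed=1)
wx = nx.waxman_graph(500, beta=0.4, alpha=0.1, seed=1)
print(learning_curve(er)[-1], learning_curve(wx)[-1])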

Social And Information Networks

A comparative study of bot detection techniques with an application to Covid-19 discourse on Twitter

Bot detection is an essential asset in a period when Online Social Networks (OSNs) are part of our lives. This task becomes even more relevant in crises, such as the Covid-19 pandemic, where there is an incipient risk of a proliferation of social bots, a possible source of misinformation. To address this issue, we compare different methods for automatically detecting social bots on Twitter using Data Selection. The techniques used to build the bot detection models include features such as tweet metadata and the Digital Fingerprint of Twitter accounts. In addition, we analyze the presence of bots in tweets from different periods of the first months of the Covid-19 pandemic, using the bot detection technique that best fits the scope of the task. Moreover, this work also analyzes aspects of the discourse of bots and humans, such as sentiment and hashtag usage.
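As a minimal illustration of the metadata-based route, the sketch below trains an off-the-shelf classifier on synthetic account-metadata features; the feature set, labels, and model choice are placeholders, not those actually compared in the study.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-ins for account metadata (e.g., follower count, friend
# count, statuses count, account age, default-profile flag).
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)  # placeholder labels: 1 = bot, 0 = human

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # chance-level on random data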

Social And Information Networks

A comparative study of similarity-based and GNN-based link prediction approaches

The task of inferring missing links in a graph based on its current structure is referred to as link prediction. Link prediction methods based on pairwise node similarity are well-established in the literature. They show good prediction performance in many real-world graphs, although they are heuristics and lack universal applicability. On the other hand, the success of neural networks for classification tasks in various domains has led researchers to study them on graphs. A neural network that can operate directly on a graph is termed a graph neural network (GNN). GNNs are able to learn hidden features from graphs, which can be used for the link prediction task. Link prediction based on GNNs has gained much attention from researchers due to its convincingly high performance in many real-world graphs. This appraisal paper studies several similarity-based and GNN-based link prediction approaches in the domain of homogeneous graphs, which consist of a single type of (attributed) nodes and a single type of pairwise links. We evaluate the studied approaches on several benchmark graphs with different properties from various domains.
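For the similarity-based side, the classic heuristics are easy to state directly. The short sketch below scores all non-edges of a standard test graph with common neighbors, Jaccard, and Adamic-Adar; GNN-based predictors are a larger topic and are not sketched here.

import networkx as nx

g = nx.karate_club_graph()
candidates = list(nx.non_edges(g))

# Three classic pairwise-similarity scores over all candidate node pairs.
common = {(u, v): len(list(nx.common_neighbors(g, u, v)))
          for u, v in candidates}
jaccard = {(u, v): s for u, v, s in nx.jaccard_coefficient(g, candidates)}
adamic_adar = {(u, v): s for u, v, s in nx.adamic_adar_index(g, candidates)}

# The highest-scoring pairs are the predicted missing links.
print(sorted(adamic_adar, key=adamic_adar.get, reverse=True)[:5])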

Social And Information Networks

A deep-learning model for evaluating and predicting the impact of lockdown policies on COVID-19 cases

To reduce the impact of the COVID-19 pandemic, most countries have implemented several counter-measures to control the spread of the virus, including school and border closures, shutting down public transport and workplaces, and restrictions on gatherings. In this research work, we propose a deep-learning prediction model for evaluating and predicting the impact of various lockdown policies on daily COVID-19 cases. This is achieved by first clustering countries with similar lockdown policies, then training a prediction model on the daily cases of the countries in each cluster along with data describing their lockdown policies. Once the model is trained, it can be used to evaluate several scenarios associated with lockdown policies and investigate their impact on the predicted COVID-19 cases. Our evaluation experiments, conducted on Qatar as a use case, show that the proposed approach achieves competitive prediction accuracy. Additionally, our findings highlight that lifting restrictions, particularly reopening schools and borders, would result in a significant increase in the number of cases during the study period.
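The pipeline shape (cluster countries by policy, then train one predictor per cluster) can be sketched as follows; the synthetic data, the KMeans clustering, and the small MLP stand-in for the paper's deep model are all assumptions for illustration.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Synthetic stand-ins: rows = countries, columns = encoded lockdown policies
# (school closing, border closing, transport, gatherings, workplace).
policies = rng.integers(0, 4, size=(30, 5)).astype(float)
daily_cases = rng.poisson(100.0, size=(30, 60)).astype(float)  # 60 days each

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(policies)

# One model per cluster: predict next-day cases from a 7-day case window
# plus the country's policy vector.
models = {}
for c in set(clusters):
    X, y = [], []
    for i in np.where(clusters == c)[0]:
        for t in range(7, 60):
            X.append(np.concatenate([daily_cases[i, t - 7:t], policies[i]]))
            y.append(daily_cases[i, t])
    models[c] = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500,
                             random_state=0).fit(np.array(X), np.array(y))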

Social And Information Networks

A framework for constructing a huge name disambiguation dataset: algorithms, visualization and human collaboration

We present a manually labeled Author Name Disambiguation (AND) dataset called WhoisWho, which consists of 399,255 documents and 45,187 distinct authors with 421 ambiguous author names. To label such a large amount of AND data with high accuracy, we propose a novel annotation framework in which humans and computers collaborate efficiently and precisely. Within the framework, we also propose an inductive disambiguation model to classify whether two documents belong to the same author. We evaluate the proposed method and other state-of-the-art disambiguation methods on WhoisWho. The experimental results show that: (1) our model outperforms other disambiguation algorithms on this challenging benchmark; and (2) the AND problem remains largely unsolved and requires more in-depth research. We believe that such a large-scale benchmark will bring great value to the author name disambiguation task. We also conduct several experiments showing that our annotation framework helps annotators produce accurate labels efficiently and effectively eliminates labeling errors made by human annotators.
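As a rough sketch of the pairwise formulation, the function below turns a document pair into a handful of plausible features (coauthor overlap, venue match, title-word overlap) that a binary same-author classifier could consume. The paper's inductive model and its actual feature set are not specified here, so everything below is an illustrative assumption.

def pair_features(doc_a, doc_b):
    # Illustrative features for a same-author classifier.
    coauthor_overlap = len(set(doc_a["coauthors"]) & set(doc_b["coauthors"]))
    same_venue = int(doc_a["venue"] == doc_b["venue"])
    words_a = set(doc_a["title"].lower().split())
    words_b = set(doc_b["title"].lower().split())
    title_jaccard = len(words_a & words_b) / max(len(words_a | words_b), 1)
    return [coauthor_overlap, same_venue, title_jaccard]

doc_a = {"coauthors": ["X. Chen"], "venue": "KDD",
         "title": "Name disambiguation at scale"}
doc_b = {"coauthors": ["X. Chen", "Y. Liu"], "venue": "KDD",
         "title": "Scalable name disambiguation"}
print(pair_features(doc_a, doc_b))  # features fed to a binary classifier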

Social And Information Networks

A guide to choosing and implementing reference models for social network analysis

Analyzing social networks is challenging. Key features of relational data require the use of non-standard statistical methods such as developing system-specific null, or reference, models that randomize one or more components of the observed data. Here we review a variety of randomization procedures that generate reference models for social network analysis. Reference models provide an expectation for hypothesis-testing when analyzing network data. We outline the key stages in producing an effective reference model and detail four approaches for generating reference distributions: permutation, resampling, sampling from a distribution, and generative models. We highlight when each type of approach would be appropriate and note potential pitfalls for researchers to avoid. Throughout, we illustrate our points with examples from a simulated social system. Our aim is to provide social network researchers with a deeper understanding of analytical approaches to enhance their confidence when tailoring reference models to specific research questions.
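As a concrete example of the permutation approach, a minimal sketch follows: it builds a node-label permutation reference model for one simple metric (the number of edges joining same-labeled nodes) on a standard test graph. The function names and the chosen metric are illustrative, not a prescription from the guide.

import random
import networkx as nx

def permutation_pvalue(graph, labels, metric, n_perm=1000, seed=0):
    # Compare the observed metric against its distribution under a
    # node-label permutation reference model.
    rng = random.Random(seed)
    observed = metric(graph, labels)
    nodes, shuffled = list(labels), list(labels.values())
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(metric(graph, dict(zip(nodes, shuffled))))
    return sum(v >= observed for v in null) / n_perm

def same_label_edges(graph, labels):
    # Example metric: edges whose endpoints carry the same label.
    return sum(labels[u] == labels[v] for u, v in graph.edges)

g = nx.karate_club_graph()
club = {n: g.nodes[n]["club"] for n in g}
print(permutation_pvalue(g, club, same_label_edges))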

Social And Information Networks

A multi-modal approach towards mining social media data during natural disasters -- a case study of Hurricane Irma

Streaming social media provides a real-time glimpse of extreme weather impacts. However, the volume of streaming data makes mining information a challenge for emergency managers, policy makers, and disciplinary scientists. Here we explore the effectiveness of data-driven approaches to mine and filter information from streaming social media data from Hurricane Irma's landfall in Florida, USA. We use 54,383 Twitter messages (out of 784K geolocated messages) from 16,598 users from Sept. 10-12, 2017 to develop four independent models that filter data for relevance: 1) a geospatial model based on forcing conditions at the place and time of each tweet, 2) an image classification model for tweets that include images, 3) a user model to predict the reliability of the tweeter, and 4) a text model to determine whether the text is related to Hurricane Irma. All four models are independently tested and can be combined to quickly filter and visualize tweets based on user-defined thresholds for each submodel. We envision that this type of filtering and visualization routine can be useful as a base model for data capture from noisy sources such as Twitter. The data can then be used by policy makers, environmental managers, emergency managers, and domain scientists interested in finding tweets with specific attributes to use during different stages of a disaster (e.g., preparedness, response, and recovery), or for detailed research.
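The combination step lends itself to a simple sketch: each tweet carries one relevance score per submodel, and user-defined thresholds decide which tweets pass the filter. The score names and threshold values below are illustrative assumptions, not the study's actual settings.

def filter_tweets(tweets, thresholds):
    # Keep a tweet only if every thresholded submodel score passes.
    return [t for t in tweets
            if all(t[m] >= thresholds[m] for m in thresholds)]

tweets = [{"geo": 0.9, "image": 0.2, "user": 0.8, "text": 0.7},
          {"geo": 0.4, "image": 0.9, "user": 0.6, "text": 0.9}]
print(filter_tweets(tweets, {"geo": 0.5, "user": 0.5, "text": 0.5}))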

Social And Information Networks

A network approach to expertise retrieval based on path similarity and credit allocation

With the increasing availability of online scholarly databases, publication records can be easily extracted and analysed. Researchers can promptly keep abreast of others' scientific production and, in principle, can select new collaborators and build new research teams. A critical factor one should consider when contemplating new potential collaborations is the possibility of unambiguously defining the expertise of other researchers. While some organisations have established database systems to enable their members to manually produce a profile, maintaining such systems is time-consuming and costly. Therefore, there has been a growing interest in retrieving expertise through automated approaches. Indeed, the identification of researchers' expertise is of great value in many applications, such as identifying qualified experts to supervise new researchers, assigning manuscripts to reviewers, and forming a qualified team. Here, we propose a network-based approach to the construction of authors' expertise profiles. Using the MEDLINE corpus as an example, we show that our method can be applied to a number of widely used data sets and outperforms other methods traditionally used for expertise identification.
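To make the idea of credit allocation concrete, here is a minimal, assumed sketch in which each of an author's documents distributes one unit of credit evenly across its topic terms (e.g., MeSH terms in MEDLINE). The paper's actual path-similarity and credit-allocation scheme is more elaborate, and the field names below are illustrative.

from collections import defaultdict

def expertise_profile(papers, author):
    # Each paper by the author spreads one unit of credit over its terms.
    profile = defaultdict(float)
    for p in papers:
        if author in p["authors"]:
            for term in p["terms"]:
                profile[term] += 1.0 / len(p["terms"])
    return dict(profile)

papers = [{"authors": ["A. Smith"], "terms": ["networks", "epidemics"]},
          {"authors": ["A. Smith", "B. Lee"], "terms": ["networks"]}]
print(expertise_profile(papers, "A. Smith"))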

Social And Information Networks

A note on (matricial and fast) ways to compute Burt's structural holes

In this note I derive simple formulas based on the adjacency matrix of a network to compute the measures associated with Ronald S. Burt's structural holes (effective size, redundancy, local constraint, and constraint). These formulas can help to interpret the measures and also to define naive algorithms for their computation based on matrix operations.
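As a small illustration of the matricial approach, the sketch below computes effective size for an undirected, unweighted graph using Borgatti's well-known simplification, effective size(i) = k_i - (A^3)_ii / k_i, where the diagonal of A^3 counts twice the ties among node i's neighbors. The note's own formulas, and its treatment of the other measures, may differ, so treat this as an assumed, minimal example.

import numpy as np

# Adjacency matrix of a small undirected graph (no isolated nodes assumed,
# so the division by the degree vector k is safe).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

k = A.sum(axis=1)                  # degree of each node
ties = np.diag(A @ A @ A) / 2      # ties among each node's neighbors
effective_size = k - 2 * ties / k  # Borgatti's simplification
print(effective_size)              # [1.     2.3333 1.     1.    ]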
