Marko A. Rodriguez
Los Alamos National Laboratory
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Marko A. Rodriguez.
PLOS ONE | 2009
Johan Bollen; Herbert Van de Sompel; Aric Hagberg; Luís M. A. Bettencourt; Ryan Chute; Marko A. Rodriguez; Lyudmila Balakireva
Background Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) that are issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, we investigate whether they can produce high-resolution, more current maps of science. Methodology Over the course of 2007 and 2008, we collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia. The resulting reference data set covers a significant part of world-wide use of scholarly web portals in 2006, and provides a balanced coverage of the humanities, social sciences, and natural sciences. A journal clickstream model, i.e. a first-order Markov chain, was extracted from the sequences of user interactions in the logs. The clickstream model was validated by comparing it to the Getty Research Institutes Architecture and Art Thesaurus. The resulting model was visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences. Conclusions Maps of science resulting from large-scale clickstream data provide a detailed, contemporary view of scientific activity and correct the underrepresentation of the social sciences and humanities that is commonly found in citation data.
Journal of Informetrics | 2010
Marko A. Rodriguez; Joshua Shinavier
Many, if not most network analysis algorithms have been designed specifically for single-relational networks; that is, networks in which all edges are of the same type. For example, edges may either represent “friendship,” “kinship,” or “collaboration,” but not all of them together. In contrast, a multi-relational network is a network with a heterogeneous set of edge labels which can represent relationships of various types in a single data structure. While multi-relational networks are more expressive in terms of the variety of relationships they can capture, there is a need for a general framework for transferring the many single-relational network analysis algorithms to the multi-relational domain. It is not sufficient to execute a single-relational network analysis algorithm on a multi-relational network by simply ignoring edge labels. This article presents an algebra for mapping multi-relational networks to single-relational networks, thereby exposing them to single-relational network analysis algorithms.
Journal of Informetrics | 2008
Marko A. Rodriguez; Alberto Pepe
This article presents a study that compares detected structural communities in a coauthorship network to the socioacademic characteristics of the scholars that compose the network. The coauthorship network was created from the bibliographic record of a multi-institution, interdisciplinary research group focused on the study of sensor networks and wireless communication. Four different community detection algorithms were employed to assign a structural community to each scholar in the network: leading eigenvector, walktrap, edge betweenness and spinglass. Socioacademic characteristics were gathered from the scholars and include such information as their academic department, academic affiliation, country of origin, and academic position. A Pearson’s χ2test, with a simulated Monte Carlo, revealed that structural communities best represent groupings of individuals working in the same academic department and at the same institution. A generalization of this result suggests that, even in interdisciplinary, multi-institutional research groups, coauthorship is primarily driven by departmental and institutional affiliation.
arXiv: Data Structures and Algorithms | 2010
Marko A. Rodriguez; Peter Neubauer
A graph is a structure composed of a set of vertices (i.e.nodes, dots) connected to one another by a set of edges (i.e.links, lines). The concept of a graph has been around since the late 19
conference on information and knowledge management | 2008
Marko A. Rodriguez; Johan Bollen
^\text{th}
ACM Transactions on Information Systems | 2009
Marko A. Rodriguez; Johan Bollen; Herbert Van de Sompel
century, however, only in recent decades has there been a strong resurgence in both theoretical and applied graph research in mathematics, physics, and computer science. In applied computing, since the late 1960s, the interlinked table structure of the relational database has been the predominant information storage and retrieval model. With the growth of graph/network-based data and the need to efficiently process such data, new data management systems have been developed. In contrast to the index-intensive, set-theoretic operations of relational databases, graph databases make use of index-free, local traversals. This article discusses the graph traversal pattern and its use in computing.
hawaii international conference on system sciences | 2007
Marko A. Rodriguez; Daniel J. Steinbock; Jennifer H. Watkins; Carlos Gershenson; Johan Bollen; Victor Grey; Brad deGraf
The peer-review process is the most widely accepted certification mechanism for officially accepting the written results of researchers within the scientific community. An essential component of peer-review is the identification of competent referees to review a submitted manuscript. This article presents an algorithm to automatically determine the most appropriate reviewers for a manuscript by way of a co-authorship network data structure and a relative-rank particle-swarm algorithm. This approach is novel in that it is not limited to a pre-selected set of referees, is computationally efficient, requires no human-intervention, and, in some instances, can automatically identify conflict of interest situations. A useful application of this algorithm would be to open commentary peer-review systems because it provides a weighting for each referee with respects to their expertise in the domain of a manuscript. The algorithm is validated using referee bid data from the 2005 Joint Conference on Digital Libraries.
acm/ieee joint conference on digital libraries | 2008
Johan Bollen; Herbert Van de Sompel; Marko A. Rodriguez
In spite of its tremendous value, metadata is generally sparse and incomplete, thereby hampering the effectiveness of digital information services. Many of the existing mechanisms for the automated creation of metadata rely primarily on content analysis which can be costly and inefficient. The automatic metadata generation system proposed in this article leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources. Because of its independence from content analysis, it can be applied to a wide variety of resource media types and is shown to be computationally inexpensive. The proposed method operates through two distinct phases. Occurrence and cooccurrence algorithms first generate an associative network of repository resources leveraging existing repository metadata. Second, using the associative network as a substrate, metadata associated with metadata-rich resources is propagated to metadata-poor resources by means of a discrete-form spreading activation algorithm. This article discusses the general framework for building associative networks, an algorithm for disseminating metadata through such networks, and the results of an experiment and validation of the proposed method using a standard bibliographic dataset.
acm/ieee joint conference on digital libraries | 2007
Johan Bollen; Marko A. Rodriguez; Herbert Van de Sompel
Smartocracy is a social software system for collective decision making. The system is composed of a social network that links individuals to those they trust to make good decisions and a decision network that links individuals to their voted-on solutions. Such networks allow a variety of algorithms to convert the link choices made by individual participants into specific decision outcomes. Simply interpreting the linkages differently (e.g. ignoring trust links, or using them to weight an individuals vote) provides a variety of outcomes fit for different decision making scenarios. This paper can discuss the Smartocracy network data structures, the suite of collective decision making algorithms currently supported, and the results of two collective decisions regarding the design of the system
acm/ieee joint conference on digital libraries | 2007
Marko A. Rodriguez; Johan Bollen; Herbert Van de Sompel
Scholarly usage data holds the potential to be used as a tool to study the dynamics of scholarship in real time, and to form the basis for the definition of novel metrics of scholarly impact. However, the formal groundwork to reliably and validly exploit usage data is lacking, and the exact nature, meaning and applicability of usage-based metrics is poorly understood. The MESUR project funded by the Andrew W. Mellon Foundation constitutes a systematic effort to define, validate and cross-validate a range of usage-based metrics of scholarly impact. MESUR has collected nearly 1 billion usage events as well as all associated bibliographic and citation data from significant publishers, aggregators and institutional consortia to construct a large-scale usage data reference set. This paper describes some major challenges related to aggregating and processing usage data, and discusses preliminary results obtained from analyzing the MESUR reference data set. The results confirm the intrinsic value of scholarly usage data, and support the feasibility of reliable and valid usage-based metrics of scholarly impact.