George D. Montanez
Carnegie Mellon University
Publications
Featured research published by George D. Montanez.
Conference on Information and Knowledge Management | 2014
George D. Montanez; Ryen W. White; Xiao Huang
Ownership and use of multiple devices such as desktop computers, smartphones, and tablets is increasing rapidly. Search is popular and people often perform search tasks that span device boundaries. Understanding how these devices are used and how people transition between them during information seeking is essential in developing search support for a multi-device world. In this paper, we study search across devices and propose models to predict aspects of cross-device search transitions. We characterize multi-device search across four device types, including aspects of search behavior on each device (e.g., topics of interest) and characteristics of device transitions. Building on the characterization, we learn models to predict various aspects of cross-device search, including the next device used for search. This enables many applications. For example, accurately forecasting the device used for the next query lets search engines proactively retrieve device-appropriate content (e.g., short documents for smartphones), while knowledge of the current device combined with device-specific topical interest models may assist in better query-sense disambiguation.
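As a rough illustration of the next-device prediction task described above, the Python sketch below fits a first-order transition model over hypothetical device logs. The device names, session format, and the choice of a simple Markov-style predictor are illustrative assumptions, not the models used in the paper.

from collections import Counter, defaultdict

def fit_transition_model(sessions):
    """sessions: list of device sequences, e.g. [["desktop", "smartphone"], ...]."""
    counts = defaultdict(Counter)
    for seq in sessions:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next_device(counts, current_device):
    """Return the most frequent next device observed after current_device."""
    if not counts[current_device]:
        return None
    return counts[current_device].most_common(1)[0][0]

sessions = [
    ["desktop", "smartphone", "tablet"],   # hypothetical cross-device sessions
    ["smartphone", "desktop"],
    ["desktop", "smartphone"],
]
model = fit_transition_model(sessions)
print(predict_next_device(model, "desktop"))   # -> "smartphone"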
Southeastern Symposium on System Theory | 2010
Winston Ewert; George D. Montanez; William A. Dembski; Robert J. Marks
Computer search often uses an oracle to determine the value of a proposed problem solution. Information is extracted from the oracle using repeated queries. Crafting a search algorithm to most efficiently extract this information is the job of the programmer. In many instances this is done using the programmer's experience and knowledge of the problem being solved. For the Hamming oracle, we have the ability to assess the performance of various search algorithms using the currency of query count. Of the search procedures considered, blind search performs the worst. We show that evolutionary algorithms, although better than blind search, are a relatively inefficient method of information extraction. An algorithm that methodically establishes and tracks the frequency of occurrence of alphabet characters performs even better. We also show that a search for the search for an optimal tree search, as suggested by our previous work, becomes computationally intensive.
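The query-count comparison can be made concrete with a small Python sketch: a Hamming oracle reports how many positions of a guess disagree with a hidden target, and a simple position-by-position probe recovers the target with far fewer queries than blind random guessing would be expected to use. The target string, alphabet, and probing scheme below are illustrative assumptions, not the specific algorithms analyzed in the paper.

import string

ALPHABET = string.ascii_uppercase + " "
TARGET = "METHINKS IT IS LIKE A WEASEL"   # hidden target (illustrative)

class HammingOracle:
    """Counts queries; each query returns the number of mismatched positions."""
    def __init__(self, target):
        self.target = target
        self.queries = 0
    def __call__(self, guess):
        self.queries += 1
        return sum(g != t for g, t in zip(guess, self.target))

def positionwise_search(oracle, length):
    """Fix one position at a time: a letter is correct when it lowers the mismatch count."""
    guess = [ALPHABET[0]] * length
    base = oracle("".join(guess))
    for i in range(length):
        for letter in ALPHABET:
            trial = guess[:i] + [letter] + guess[i + 1:]
            if oracle("".join(trial)) < base:
                guess[i] = letter
                base -= 1
                break
    return "".join(guess)

oracle = HammingOracle(TARGET)
result = positionwise_search(oracle, len(TARGET))
print(result, oracle.queries)   # recovers the target in roughly |alphabet| * length queries,
                                # versus an astronomically large expected count for blind search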
BIO-Complexity | 2010
George D. Montanez; Winston Ewert; William A. Dembski; Robert J. Marks
ev is an evolutionary search algorithm proposed to simulate biological evolution. As such, researchers have claimed that it demonstrates that a blind, unguided search is able to generate new information. However, analysis shows that any non-trivial computer search needs to exploit one or more sources of knowledge to make the search successful. Search algorithms mine active information from these resources, with some search algorithms performing better than others. We illustrate these principles in the analysis of ev. The sources of knowledge in ev include a Hamming oracle and a perceptron structure that predisposes the search towards its target. The original ev uses these resources in an evolutionary algorithm. Although the evolutionary algorithm finds the target, we demonstrate that a simple stochastic hill climbing algorithm uses the resources more efficiently.
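A minimal sketch of the kind of stochastic hill climber mentioned above, applied to a Hamming-style fitness: one position is mutated at a time and neutral or improving proposals are kept. The target, alphabet, and acceptance rule are illustrative assumptions; the ev perceptron structure itself is not modeled here.

import random
import string

ALPHABET = string.ascii_uppercase + " "
TARGET = "EV TARGET SEQUENCE"   # hypothetical target

def fitness(candidate):
    """Number of positions matching the target (higher is better)."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def stochastic_hill_climb(max_steps=100000):
    current = [random.choice(ALPHABET) for _ in TARGET]
    best = fitness(current)
    for step in range(max_steps):
        if best == len(TARGET):
            return "".join(current), step
        i = random.randrange(len(TARGET))       # mutate one random position
        proposal = current.copy()
        proposal[i] = random.choice(ALPHABET)
        f = fitness(proposal)
        if f >= best:                           # keep neutral or improving moves
            current, best = proposal, f
    return "".join(current), max_steps

print(stochastic_hill_climb())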
Proceedings of the Symposium | 2013
George D. Montanez; Robert J. Marks; Jorge Fernandez; John C. Sanford
There is growing evidence that much of the DNA in higher genomes is poly-functional, with the same nucleotide contributing to more than one type of code. Such poly-functional DNA should logically be multiply-constrained in terms of the probability of sequence improvement via random mutation. We describe a model relating the degree of poly-functionality to the degree of constraint on mutational improvement. We show that: a) the probability of beneficial mutation is inversely related to the degree that a sequence is already optimized for a given code; b) the probability of beneficial mutation drastically diminishes as the number of overlapping codes increases. The growing evidence for a high degree of optimization in biological systems, and the growing evidence for multiple levels of poly-functionality within DNA, both suggest that unambiguously beneficial mutations must be especially rare. The theoretical scarcity of beneficial mutations is compounded by the fact that most of the beneficial mutations that do arise should confer extremely small increments of improvement in terms of total biological function, making such mutations invisible to natural selection. Beneficial mutations that are below a population's selection threshold are effectively neutral in terms of selection, and so should be entirely unproductive from an evolutionary perspective. We conclude that beneficial mutations that are unambiguous (not deleterious at any level) and useful (subject to natural selection) should be extremely rare.
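A back-of-the-envelope version of point (b), written in LaTeX under an independence assumption that the paper's model does not rely on; the numbers are purely illustrative. If a mutation must avoid degrading each of k overlapping codes, and p_i denotes the probability that it is beneficial (or at least not deleterious) with respect to code i alone, then

P(\text{unambiguously beneficial}) \;\le\; \prod_{i=1}^{k} p_i ,

so, for example, p_i = 10^{-3} for each of k = 3 overlapping codes already bounds the probability by 10^{-9}, illustrating the sharp decline as the number of codes grows.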
Congress on Evolutionary Computation | 2013
George D. Montanez
According to the No Free Lunch theorems for search, when uniformly averaged over all possible search functions, every search algorithm has identical search performance for a wide variety of common performance metrics [1], [2], [3], [4]. Differences in performance can arise, however, between two algorithms when performance is measured over sets of functions that are not closed under permutation, such as sets consisting of a single function. Using uniform random sampling with replacement as a baseline, we ask how many functions exist such that a search algorithm has better expected performance than random sampling. We define favorable functions as those that allow an algorithm to locate a search target with higher probability than uniform random sampling with replacement, and we bound the proportion of favorable functions for stochastic search methods, including genetic algorithms. Using active information [5] as our divergence measure, we demonstrate that no more than 2^{-b} of all functions are favorable by b or more bits, for b ≥ 2 and reasonably sized search spaces (n ≥ 19). Thus, the proportion of functions for which an algorithm performs relatively well by a moderate degree is strictly bounded. Our results can be viewed as a statement of information conservation [6], [7], [1], [8], [5], since identifying a favorable function of b or more bits requires at least b bits of information, under the conditions given.
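In LaTeX, the quantities the abstract refers to can be written as follows (a paraphrase of the abstract, not the paper's formal statement). With p the per-query success probability of uniform random sampling with replacement and q(f) that of the algorithm on a particular function f, the active information is

I_{+}(f) \;=\; \log_2 \frac{q(f)}{p},

a function f is favorable by b or more bits when I_{+}(f) \ge b, and the stated bound on the proportion of such functions is

\Pr_{f \sim \mathrm{Uniform}}\bigl[\, I_{+}(f) \ge b \,\bigr] \;\le\; 2^{-b}, \qquad b \ge 2,\; n \ge 19.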
International Symposium on Neural Networks | 2017
George D. Montanez; Cosma Rohilla Shalizi
Spatio-temporal data is intrinsically high-dimensional, so unsupervised modeling is only feasible if we can exploit structure in the process. When the dynamics are local in both space and time, this structure can be exploited by splitting the global field into many lower-dimensional “light cones”. We review light cone decompositions for predictive state reconstruction, introducing three simple light cone algorithms. These methods allow for tractable inference of spatio-temporal data, such as full-frame video. The algorithms make few assumptions on the underlying process yet have good predictive performance and can provide distributions over spatio-temporal data, enabling sophisticated probabilistic inference.
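To make the light cone construction concrete, here is a small Python sketch that extracts past light cones from a one-dimensional spatio-temporal field. The cone depth, speed of influence, and zero-padding at the boundary are illustrative choices, and none of the paper's three predictive-state algorithms is reproduced here.

import numpy as np

def past_light_cones(field, depth=2, c=1):
    """For each cell (t, x) with t >= depth, return the flattened past light cone:
    all cells (t - d, x + dx) with 1 <= d <= depth and |dx| <= c * d."""
    T, X = field.shape
    cones, coords = [], []
    for t in range(depth, T):
        for x in range(X):
            cone = []
            for d in range(1, depth + 1):
                for dx in range(-c * d, c * d + 1):
                    xx = x + dx
                    cone.append(field[t - d, xx] if 0 <= xx < X else 0.0)
            cones.append(cone)
            coords.append((t, x))
    return np.array(cones), coords

field = np.random.rand(6, 8)            # toy 6-frame, 8-cell field
cones, coords = past_light_cones(field)
print(cones.shape)                       # (num_points, cone_dimension), here (32, 8)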
Congress on Evolutionary Computation | 2013
George D. Montanez
To bound the amount of information transmitted from a fitness map to a genetic algorithm population, we use a method suggested by Abu-Mostafa et al. [1] for measuring the information storage capacity of general forms of memory, representing the genetic algorithm as a communication channel. Our results show that a number of bits linear in the size of the search space can be stored in a fitness map, but on average only a logarithmic number of bits can be stored within a genetic algorithm population of bounded size and finite-precision representation. Our results place an upper bound on the rate at which information can be transmitted through, or generated by and later extracted from, a genetic algorithm under fairly general conditions.
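The logarithmic population bound can be motivated by a simple counting argument, written in LaTeX (my paraphrase of the capacity idea, not the paper's derivation): a population of N strings of length L over an alphabet A has at most |A|^{NL} distinguishable states, so its storage capacity C satisfies

C \;\le\; \log_2\bigl(|A|^{NL}\bigr) \;=\; N L \log_2 |A| \;=\; N \log_2 |\Omega|,

where \Omega = A^{L} is the search space; for a bounded population size N, this grows only logarithmically in |\Omega|.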
Computational Intelligence in Bioinformatics and Computational Biology | 2012
George D. Montanez; Young-Rae Cho
Recent advances in genome-wide identification of protein-protein interactions (PPIs) have produced an abundance of interaction data which gives insight into functional associations among proteins. However, it is known that the PPI datasets determined by high-throughput experiments or inferred by computational methods include an extremely large number of false positives. Using Gene Ontology (GO) and its annotations, we assess the reliability of PPIs by considering the semantic similarity of interacting proteins. Protein pairs with high semantic similarity are considered highly likely to share common functions and are therefore more likely to interact. We analyze the performance of existing semantic similarity measures in terms of functional consistency and propose a combined method that achieves improved performance over existing methods. The semantic similarity measures are applied to identify false positive PPIs. The classification results show that the combined hybrid method has higher accuracy than the other existing measures. Furthermore, the combined hybrid classifier predicts that 59.6% of the S. cerevisiae PPIs from the BioGRID database are false positives.
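As a toy illustration of annotation-based filtering (not one of the semantic similarity measures evaluated in the paper), the Python sketch below scores a protein pair by the Jaccard overlap of its GO term sets and flags functionally dissimilar pairs as candidate false positives. The annotations and the cutoff value are hypothetical.

def go_similarity(terms_a, terms_b):
    """Jaccard similarity between two sets of GO term identifiers."""
    a, b = set(terms_a), set(terms_b)
    return len(a & b) / len(a | b) if a | b else 0.0

annotations = {                        # hypothetical GO annotations
    "YAL001C": {"GO:0006351", "GO:0005634"},
    "YBR123W": {"GO:0006351", "GO:0005634", "GO:0003677"},
    "YCL045A": {"GO:0016020"},
}

def flag_false_positive(p1, p2, cutoff=0.2):
    """Flag an interaction when the annotated functions of the two proteins barely overlap."""
    return go_similarity(annotations[p1], annotations[p2]) < cutoff

print(flag_false_positive("YAL001C", "YBR123W"))   # False: shared annotations
print(flag_false_positive("YAL001C", "YCL045A"))   # True: no shared annotations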
National Conference on Artificial Intelligence | 2015
George D. Montanez; Saeed Amizadeh; Nikolay Laptev
Systems, Man and Cybernetics | 2017
George D. Montanez