Moshe Looks
Washington University in St. Louis
Publications
Featured research published by Moshe Looks.
genetic and evolutionary computation conference | 2005
Moshe Looks; Ben Goertzel; Cassio Pennachin
We describe an extension of the Bayesian Optimization Algorithm (BOA), a probabilistic model building genetic algorithm, to the domain of program tree evolution. The new system, BOA programming (BOAP), improves significantly on previous probabilistic model building genetic programming (PMBGP) systems in terms of the articulacy and open-ended flexibility of the models learned, and hence control over the distribution of instances generated. Innovations include a novel tree representation and a generalized program evaluation scheme.
ieee aerospace conference | 2007
Moshe Looks; Andrew Levine; G.A. Covington; Ronald Prescott Loui; John W. Lockwood; Young H. Cho
We are concerned with the general problem of concept mining: discovering useful associations, relationships, and groupings in large collections of data. Mathematical transformation algorithms have proven effective at reducing the content of multilingual, unstructured data into a vector that describes the content. Such methods are particularly desirable in fields undergoing information explosions, such as network traffic analysis, bioinformatics, and the intelligence community. In response, concept mining methodology is being extended to improve performance and permit hardware implementation, since traditional methods are not sufficiently scalable. Hardware-accelerated systems have proven effective at automatically classifying such content when topics are known in advance. Our complete system builds on our past work in this area, presented at the Aerospace 2005 and 2006 conferences, where we described a novel algorithmic approach for extracting semantic content from unstructured text document streams. However, there is an additional need within the intelligence community to cluster related sets of content without advance training. To allow this function to happen at high speed, we have implemented a system that hierarchically clusters streaming content. The method, streaming hierarchical partitioning, is designed to be implemented in hardware and to handle extremely high ingestion rates. As new documents are ingested, they are dynamically organized into a hierarchy, which has a fixed maximal size. Once this limit is reached, documents must be excreted at a rate equaling their ingestion. The choice of which documents to excrete is a point of interest; we present several autonomous heuristics for making it intelligently, as well as a proposal for incorporating user interaction to focus attention on concepts of interest. A related desideratum is robust accommodation of concept drift: gradual change in the distribution and content of the document stream over time.
Accordingly, we present and analyze experimental results for document streams evolving over time under several regimes. We also present current and proposed methods for concisely and informatively presenting derived content from streaming hierarchical clustering to the user for analysis. To support our claims of eventual hardware implementation and real-time performance at a high ingestion rate, we provide a detailed hardware-ready design, with asymptotic analysis and performance predictions. The system has been prototyped and tested on a Xeon processor as well as on a PowerPC embedded within a Xilinx Virtex2 FPGA. In summary, we describe a system designed to satisfy three primary goals: (1) real-time concept mining of high-volume data streams; (2) dynamic organization of concepts into a relational hierarchy; (3) adaptive reorganization of the concept hierarchy in response to evolving circumstances and user feedback.
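The fixed-capacity ingest-and-excrete behavior described above can be sketched in software. This is a minimal toy, not the paper's design: it keeps a flat set of clusters rather than a true hierarchy, uses cosine similarity over sparse term vectors, and applies an oldest-first eviction heuristic; all class and method names here are hypothetical.

```python
from collections import deque

class StreamingHierarchy:
    """Toy sketch of streaming partitioning with a fixed document
    capacity: ingest attaches each document to its most similar
    cluster (or starts a new one); once full, the oldest document
    is excreted for every new one ingested."""

    def __init__(self, capacity, threshold=0.5):
        self.capacity = capacity
        self.threshold = threshold
        self.clusters = []    # each cluster: list of (doc_id, term_vector)
        self.order = deque()  # ingestion order, drives eviction

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[t] * b.get(t, 0.0) for t in a)
        na = sum(v * v for v in a.values()) ** 0.5
        nb = sum(v * v for v in b.values()) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    def ingest(self, doc_id, vec):
        if len(self.order) >= self.capacity:
            self._evict()
        # attach to the most similar cluster seed, or start a new cluster
        best, best_sim = None, self.threshold
        for cluster in self.clusters:
            sim = self._cosine(vec, cluster[0][1])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            self.clusters.append([(doc_id, vec)])
        else:
            best.append((doc_id, vec))
        self.order.append(doc_id)

    def _evict(self):
        victim = self.order.popleft()  # oldest-first heuristic
        for cluster in self.clusters:
            cluster[:] = [(d, v) for d, v in cluster if d != victim]
        self.clusters = [c for c in self.clusters if c]

    def size(self):
        return len(self.order)
```

Because eviction happens before each ingest once the limit is hit, the structure's size stays pinned at the capacity regardless of stream length, which is the property the hardware design depends on.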
genetic and evolutionary computation conference | 2007
Moshe Looks
I present a new estimation-of-distribution approach to program evolution where distributions are not estimated over the entire space of programs. Rather, a novel representation-building procedure that exploits domain knowledge is used to dynamically select program subspaces over which to estimate. This leads to a system of demes consisting of alternative representations (i.e., program subspaces) that are maintained simultaneously and managed by the overall system. Meta-optimizing semantic evolutionary search (MOSES), a program evolution system based on this approach, is described, and its representation-building subcomponent is analyzed in depth. Experimental results are also provided for the overall MOSES procedure that demonstrate good scalability.
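As a rough intuition for the deme-management idea (emphatically not the actual MOSES algorithm), one can picture several exemplars, each defining a small local subspace that is searched independently, with the weakest deme replaced by the best candidate found. A toy bitstring version, with every name and parameter here hypothetical:

```python
import random

def neighborhood(exemplar, k, rng):
    """Stand-in for representation-building: sample k one-bit-flip
    neighbors of the exemplar (a toy subspace, not a program space)."""
    out = []
    for _ in range(k):
        i = rng.randrange(len(exemplar))
        flipped = '1' if exemplar[i] == '0' else '0'
        out.append(exemplar[:i] + flipped + exemplar[i + 1:])
    return out

def deme_search(score, seed, n_demes=3, rounds=5, k=8, rng=None):
    """Toy deme-management loop: keep several exemplars, locally
    optimize within each one's subspace, and replace the weakest
    deme with the best candidate found in the round."""
    rng = rng or random.Random(0)
    pool = [seed] * n_demes
    for _ in range(rounds):
        candidates = [c for e in pool for c in neighborhood(e, k, rng)]
        best = max(candidates, key=score)
        pool.sort(key=score)       # weakest deme first
        if score(best) > score(pool[0]):
            pool[0] = best
    return max(pool, key=score)
```

The point of the sketch is only the control structure: search never ranges over the whole space, just over the union of the demes' current subspaces.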
genetic and evolutionary computation conference | 2007
Moshe Looks
Generating a random sampling of program trees with specified function and terminal sets is the initial step of many program evolution systems. I present a theoretical and experimental analysis of the expected distribution of uniformly sampled programs, guided by algorithmic information theory. This analysis demonstrates that increasing the sample size is often an inefficient means of increasing the overall diversity of program behaviors (outputs). A novel sampling scheme (semantic sampling) is proposed that exploits semantics to heuristically increase behavioral diversity. An important property of the scheme is that no calls to the problem-specific fitness function are required. Its effectiveness at increasing behavioral diversity is demonstrated empirically for Boolean formulae. Furthermore, it is found to lead to statistically significant improvements in performance for genetic programming on parity and multiplexer problems.
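The core idea, rejecting samples whose behavior (truth table) duplicates one already seen, without ever consulting a fitness function, can be illustrated for Boolean formulae. The tree representation and generator below are illustrative assumptions, not the paper's implementation:

```python
import itertools
import random

def random_formula(depth, n_vars, rng):
    """Generate a random Boolean formula as a nested tuple tree."""
    if depth == 0 or rng.random() < 0.3:
        return ('var', rng.randrange(n_vars))
    op = rng.choice(['and', 'or', 'not'])
    if op == 'not':
        return ('not', random_formula(depth - 1, n_vars, rng))
    return (op, random_formula(depth - 1, n_vars, rng),
            random_formula(depth - 1, n_vars, rng))

def evaluate(tree, assignment):
    tag = tree[0]
    if tag == 'var':
        return assignment[tree[1]]
    if tag == 'not':
        return not evaluate(tree[1], assignment)
    a, b = evaluate(tree[1], assignment), evaluate(tree[2], assignment)
    return (a and b) if tag == 'and' else (a or b)

def behavior(tree, n_vars):
    """The formula's semantics: its full truth table."""
    return tuple(evaluate(tree, a)
                 for a in itertools.product([False, True], repeat=n_vars))

def semantic_sample(n, n_vars, depth=3, seed=0):
    """Rejection-sample formulas until n behaviorally distinct ones
    are found. Only semantics are consulted, never a fitness function."""
    rng = random.Random(seed)
    seen, out = set(), []
    while len(out) < n:
        t = random_formula(depth, n_vars, rng)
        b = behavior(t, n_vars)
        if b not in seen:
            seen.add(b)
            out.append(t)
    return out
```

A plain uniform sample of the same size would typically contain many syntactically different trees computing identical functions; the rejection step guarantees every member of the returned sample is behaviorally unique.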
genetic and evolutionary computation conference | 2007
Moshe Looks
A powerful heuristic allowing many optimization problems of interest to be solved quickly is to attempt decomposition – breaking problems down into smaller subproblems that may be solved independently. For example, the hierarchical Bayesian optimization algorithm (hBOA) [5] dynamically learns a problem decomposition in terms of solution parameters. The effectiveness of this approach hinges on the existence of some compact and reasonably correct decomposition in the space (of decompositions, not solutions). Difficulty arises when no such decomposition exists, or when an effective decomposition cannot be formulated directly as a model over solution parameters. In other words, how successfully an optimization algorithm can exploit near-decomposability depends on how clever an encoding has been chosen by humans to represent the problem. I posit that the characteristics of program spaces and the typically chaotic mapping from programs to outputs tend to scramble problems – even if the mapping from program outputs to fitness levels is nearly decomposable, the overall problem will not be (in terms of parameters of program spaces). MOSES (meta-optimizing semantic evolutionary search) [4] is a new estimation-of-distribution approach to program evolution. Distributions are not estimated over the entire space of programs. Rather, a novel representation-building procedure that exploits domain knowledge is used to dynamically select program subspaces over which to estimate. This leads to a system of demes consisting of alternative representations (i.e., program subspaces) that are maintained simultaneously and managed by the overall system. The hBOA is applied to learn new programs within
genetic and evolutionary computation conference | 2006
Moshe Looks
I introduce a generalization of probabilistic modeling and sampling for estimation-of-distribution algorithms (EDAs) that allows models to contain features: additional level(s) of abstraction defined in terms of the problem's base-level variables. I demonstrate how a simple feature class, variable-position motifs within fixed-length strings, may be exploited by a powerful EDA, the Bayesian optimization algorithm (BOA). Experimental results are presented where motifs are learned autonomously via a simple heuristic. The effectiveness of this feature-based BOA is demonstrated across a range of problems where such motifs are relevant.
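A variable-position motif feature of this kind is easy to state concretely: the feature fires iff the motif occurs at any offset of the fixed-length string, and each motif contributes one derived Boolean variable that a model builder could condition on. The helper names below are hypothetical, and no BOA integration is shown:

```python
def motif_feature(motif):
    """Variable-position motif: true iff `motif` occurs at any offset."""
    def feature(bitstring):
        return any(bitstring[i:i + len(motif)] == motif
                   for i in range(len(bitstring) - len(motif) + 1))
    return feature

def augment(population, motifs):
    """Append one derived Boolean variable per motif to each individual,
    exposing the feature level of abstraction to an EDA's model builder
    (a sketch, not the paper's mechanism)."""
    feats = [motif_feature(m) for m in motifs]
    return [ind + ''.join('1' if f(ind) else '0' for f in feats)
            for ind in population]
```

With the derived variables appended, an ordinary position-wise model can capture a dependency on "motif present anywhere" that no single base-level variable expresses.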
genetic and evolutionary computation conference | 2007
Moshe Looks; Ben Goertzel; Lucio Coelho; Mauricio Mudado; Cassio Pennachin
Many researchers have used supervised categorization algorithms, such as GP and SVMs, to analyze gene expression microarray data. Overall, the results in this area using SVMs have been stronger than those for GP. However, GP is sometimes preferable to SVMs because of the relative transparency of the models it produces. Studying the GP models themselves can indicate exactly how the classification is being performed, which can lead to biological insights. We ask here first whether the use of an alternate program evolution technique, MOSES (meta-optimizing semantic evolutionary search) [2], can improve GP's results in this domain (in terms of both accuracy and model simplicity), and second, whether MOSES might succeed in providing "important gene" lists with substantial biological relevance. Here we report results for two datasets: (1) distinguishing between types of lymphoma based on gene expression data [4]; and (2) classifying between young and old human brains [3]. Three issues are relevant to any classification approach to microarray analysis: (1) dealing with a huge number of problem variables; (2) dealing with noisy continuous data; (3) avoiding overfitting to the data. We dealt with (1) by selecting the 50 most-differentiating features to use in all experiments, (2) by treating gene expression levels as Boolean features determined by median-thresholding (which eliminates concerns regarding noise and scaling), and (3) by using TP + TN − s/2 as our fitness function, where s is the number of nodes in the classifier, TP is the number of true positives, and TN is the number of true negatives (i.e., high parsimony pressure). See [2] for details and justification, along with algorithm parameter settings (which were fixed across a variety of experiments). Results are presented in the
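The preprocessing and scoring choices named in the abstract are simple enough to transcribe directly; the helper names are mine, not the paper's:

```python
import statistics

def booleanize(expression_matrix):
    """Median-threshold each gene's expression values into a Boolean
    feature, per the abstract: above the gene's median -> True.
    Rows are samples, columns are genes."""
    n_genes = len(expression_matrix[0])
    medians = [statistics.median(row[g] for row in expression_matrix)
               for g in range(n_genes)]
    return [[row[g] > medians[g] for g in range(n_genes)]
            for row in expression_matrix]

def fitness(true_pos, true_neg, n_nodes):
    """The parsimony-pressured fitness TP + TN - s/2, where s is the
    node count of the candidate classifier."""
    return true_pos + true_neg - n_nodes / 2.0
```

The s/2 penalty means every two extra nodes in a classifier must buy at least one additional correct prediction, which is what makes the pressure toward small, readable models "high."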
field-programmable custom computing machines | 2006
Shobana Padmanabhan; Moshe Looks; Dan Legorreta; Young H. Cho; John W. Lockwood
Non-hierarchical k-means algorithms have previously been implemented in hardware, most frequently for image clustering. Here, we focus on hierarchical clustering of text documents based on document similarity. To our knowledge, this is the first work to present a hierarchical clustering algorithm designed for hardware implementation, and ours is the first hardware-accelerated implementation.
international joint conference on artificial intelligence | 2003
Weixiong Zhang; Ananda Rangan; Moshe Looks
international joint conference on artificial intelligence | 2005
Weixiong Zhang; Moshe Looks