Krysta M. Svore | Researchain

Publication

Featured research published by Krysta M. Svore.

Information Retrieval | 2010

Adapting boosting for information retrieval measures

Qiang Wu; Christopher J. C. Burges; Krysta M. Svore; Jianfeng Gao

We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. Our algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both train and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination for any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes an effective technique for model adaptation, and we give significantly improved results for a particularly pressing problem in web search—training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.
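The exact line search mentioned above can be illustrated with a small sketch. It assumes a single query, a combined score of the form (1 - alpha) * s1 + alpha * s2, and reciprocal rank as a stand-in metric; the function names are hypothetical and this is not the authors' implementation. The idea is that the ranking only changes at values of alpha where two documents swap order, so it is enough to evaluate the metric once per interval between those crossing points.

```python
def reciprocal_rank(ranked_labels):
    """Stand-in metric: 1 / position of the first relevant item."""
    for position, label in enumerate(ranked_labels, start=1):
        if label > 0:
            return 1.0 / position
    return 0.0

def rank_labels(scores, labels):
    """Return labels in descending score order (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]

def best_combination(s1, s2, labels, metric=reciprocal_rank):
    """Search alpha in [0, 1] for the score (1 - alpha) * s1 + alpha * s2."""
    crossings = set()
    n = len(s1)
    for i in range(n):
        for j in range(i + 1, n):
            d1 = s1[i] - s1[j]
            d2 = s2[i] - s2[j]
            if d1 != d2:
                # Documents i and j tie when d1 - alpha * (d1 - d2) == 0.
                alpha = d1 / (d1 - d2)
                if 0.0 < alpha < 1.0:
                    crossings.add(alpha)
    points = [0.0] + sorted(crossings) + [1.0]
    # The ranking is constant between crossings, so test one point per interval.
    candidates = [0.0, 1.0] + [(a + b) / 2 for a, b in zip(points, points[1:])]
    best_alpha, best_value = None, float("-inf")
    for alpha in candidates:
        combined = [(1 - alpha) * x + alpha * y for x, y in zip(s1, s2)]
        value = metric(rank_labels(combined, labels))
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value
```

For several queries, the same enumeration would pool crossing points across queries and average the metric; that extension is left out here.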

web search and data mining | 2011

Understanding temporal query dynamics

Anagha Kulkarni; Jaime Teevan; Krysta M. Svore; Susan T. Dumais

Web search is strongly influenced by time. The queries people issue change over time, with some queries occasionally spiking in popularity (e.g., earthquake) and others remaining relatively constant (e.g., youtube). The documents indexed by the search engine also change, with some documents always being about a particular query (e.g., the Wikipedia page on earthquakes is about the query earthquake) and others being about the query only at a particular point in time (e.g., the New York Times is only about earthquakes following major seismic activity). The relationship between documents and queries can also change as people's intent changes (e.g., people sought different content for the query earthquake before the Haitian earthquake than they did after). In this paper, we explore how queries, their associated documents, and the query intent change over the course of 10 weeks by analyzing query log data, a daily Web crawl, and periodic human relevance judgments. We identify several interesting features by which changes to query popularity can be classified, and show that the presence of these features, when accompanied by changes in result content, can be a good indicator of change in query intent.

Workshop on Data Mining for Computer Security (DMSEC), Melbourne, FL, November 19, 2003 | 2003

One Class Support Vector Machines for Detecting Anomalous Windows Registry Accesses

Katherine A. Heller; Krysta M. Svore; Angelos D. Keromytis; Salvatore J. Stolfo

We present a new Host-based Intrusion Detection System (IDS) that monitors accesses to the Microsoft Windows Registry using Registry Anomaly Detection (RAD). Our system uses a one class Support Vector Machine (OCSVM) to detect anomalous registry behavior by training on a dataset of normal registry accesses. It then uses this model to detect outliers in new (unclassified) data generated from the same system. Given the success of OCSVMs in other applications, we apply them to the Windows Registry anomaly detection problem. We compare our system to the RAD system using the Probabilistic Anomaly Detection (PAD) algorithm on the same dataset. Surprisingly, we find that PAD outperforms our OCSVM system due to properties of the hierarchical prior incorporated in the PAD algorithm. In the future, these properties may be used to develop an improved kernel and increase the performance of the OCSVM system.

international world wide web conferences | 2010

Classification-enhanced ranking

Paul N. Bennett; Krysta M. Svore; Susan T. Dumais

Many have speculated that classifying web pages can improve a search engine's ranking of results. Intuitively, results should be more relevant when they match the class of a query. We present a simple framework for classification-enhanced ranking that uses clicks in combination with the classification of web pages to derive a class distribution for the query. We then go on to define a variety of features that capture the match between the class distributions of a web page and a query, the ambiguity of a query, and the coverage of a retrieved result relative to a query's set of classes. Experimental results demonstrate that a ranker learned with these features significantly improves ranking over a competitive baseline. Furthermore, our methodology is agnostic with respect to the classification space and can be used to derive query classes for a variety of different taxonomies.
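As a rough illustration of deriving a query class distribution from clicks, the sketch below takes a click-weighted average of the class distributions of the clicked pages and then scores a page by the overlap of the two distributions. Both the weighting scheme and the dot-product match feature are assumptions made for this example, and the function names are hypothetical; the abstract does not specify the exact formulas.

```python
def query_class_distribution(clicks, page_classes):
    """Click-weighted average of page class distributions (assumed scheme).

    clicks:       {url: click_count} for one query
    page_classes: {url: {class_name: probability}}
    """
    totals = {}
    total_clicks = 0
    for url, count in clicks.items():
        dist = page_classes.get(url)
        if dist is None or count <= 0:
            continue  # skip pages without a classification or without clicks
        total_clicks += count
        for name, prob in dist.items():
            totals[name] = totals.get(name, 0.0) + count * prob
    if total_clicks == 0:
        return {}
    return {name: value / total_clicks for name, value in totals.items()}

def class_match(query_dist, page_dist):
    """One possible match feature: dot product of the two distributions."""
    return sum(p * page_dist.get(name, 0.0) for name, p in query_dist.items())
```

A query whose clicks go mostly to pages of one class ends up with a peaked distribution, while an ambiguous query ends up with a flatter one.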

IEEE Computer | 2006

A layered software architecture for quantum computing design tools

Krysta M. Svore; Alfred V. Aho; Andrew W. Cross; Isaac L. Chuang; Igor L. Markov

Compilers and computer-aided design tools are essential for fine-grained control of nanoscale quantum-mechanical systems. A proposed four-phase design flow assists with computations by transforming a quantum algorithm from a high-level language program into precisely scheduled physical actions.

international acm sigir conference on research and development in information retrieval | 2009

On the local optimality of LambdaRank

Pinar Donmez; Krysta M. Svore; Christopher J. C. Burges

A machine learning approach to learning to rank trains a model to optimize a target evaluation measure with respect to training data. Currently, existing information retrieval measures are impossible to optimize directly except for models with a very small number of parameters. The IR community thus faces a major challenge: how to optimize IR measures of interest directly. In this paper, we present a solution. Specifically, we show that LambdaRank, which smoothly approximates the gradient of the target measure, can be adapted to work with four popular IR target evaluation measures using the same underlying gradient construction. It is likely, therefore, that this construction is extendable to other evaluation measures. We empirically show that LambdaRank finds a locally optimal solution for mean NDCG@10, mean NDCG, MAP and MRR with a 99% confidence rate. We also show that the amount of effective training data varies with IR measure and that with a sufficiently large training set size, matching the training optimization measure to the target evaluation measure yields the best accuracy.
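For readers unfamiliar with the four measures named above, here is a minimal per-query sketch of each. It assumes the common 2^label - 1 gain with a log2 position discount for NDCG, and binary relevance for average precision and reciprocal rank; the paper's exact definitions may differ. The "mean" variants average these per-query values over all queries.

```python
import math

def dcg(gains, k=None):
    """Discounted cumulative gain over the top k graded labels (all if k is None)."""
    top = gains if k is None else gains[:k]
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(top))

def ndcg(gains, k=None):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0

def average_precision(rels):
    """Mean of precision values at each relevant position (binary labels)."""
    hits, total = 0, 0.0
    for position, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0

def reciprocal_rank(rels):
    """1 / position of the first relevant result, or 0 if there is none."""
    for position, rel in enumerate(rels, start=1):
        if rel:
            return 1.0 / position
    return 0.0
```

All four take the labels of one query's results in ranked order, which is why none of them is differentiable in the model scores: small score changes either leave the order unchanged or swap two results.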

Journal of Computer Security | 2005

A comparative evaluation of two algorithms for Windows Registry Anomaly Detection

Salvatore J. Stolfo; Frank Apap; Eleazar Eskin; Katherine A. Heller; Shlomo Hershkop; Andrew Honig; Krysta M. Svore

We present a component anomaly detector for a host-based intrusion detection system (IDS) for Microsoft Windows. The core of the detector is a learning-based anomaly detection algorithm that detects attacks on a host machine by looking for anomalous accesses to the Windows Registry. We present and compare two anomaly detection algorithms for use in our IDS system and evaluate their performance. One algorithm called PAD, for Probabilistic Anomaly Detection, is based upon a probability density estimation while the second uses the Support Vector Machine framework. The key idea behind the detector is to first train a model of normal Registry behavior on a Windows host, even when noise may be present in the training data, and use this model to detect abnormal Registry accesses. At run-time the model is used to check each access to the Registry in real-time to determine whether or not the behavior is abnormal and possibly corresponds to an attack. The system is effective in detecting the actions of malicious software while maintaining a low rate of false alarms. We show that the probabilistic anomaly detection algorithm exhibits better performance in accuracy and in computational complexity over the support vector machine implementation under three different kernel functions.

adversarial information retrieval on the web | 2007

Improving web spam classification using rank-time features

Krysta M. Svore; Qiang Wu; Christopher J. C. Burges; Aaswath Raman

In this paper, we study the classification of web spam. Web spam refers to pages that use techniques to mislead search engines into assigning them higher rank, thus increasing their site traffic. Our contributions are twofold. First, we find that the method of dataset construction is crucial for accurate spam classification and we note that this problem occurs generally in learning problems and can be hard to detect. In particular, we find that ensuring no overlapping domains between test and training sets is necessary to accurately test a web spam classifier. In our case, classification performance can differ by as much as 40% in precision when using non-domain-separated data. Second, we show rank-time features can improve the performance of a web spam classifier. Our paper is the first to investigate the use of rank-time features, and in particular query-dependent rank-time features, for web spam detection. We show that the use of rank-time and query-dependent features can lead to an increase in accuracy over a classifier trained using page-based content only.
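One simple way to guarantee that training and test sets share no domains is to assign each domain, rather than each page, to a split. The sketch below does this by hashing the host name. It assumes URLs carry a scheme, treats the full host (netloc) as the "domain", and uses hypothetical function names; the paper does not describe its splitting procedure at this level of detail.

```python
import hashlib
from urllib.parse import urlparse

def domain_of(url):
    """Host part of a URL, lowercased (assumes the URL includes a scheme)."""
    return urlparse(url).netloc.lower()

def domain_separated_split(urls, test_fraction=0.2):
    """Send every page of a given domain to the same side of the split."""
    train, test = [], []
    for url in urls:
        digest = hashlib.sha256(domain_of(url).encode("utf-8")).hexdigest()
        # Map the domain hash to a stable bucket in [0, 1).
        bucket = (int(digest, 16) % 1000) / 1000.0
        (test if bucket < test_fraction else train).append(url)
    return train, test
```

Because the bucket depends only on the domain, the split is deterministic and stays consistent when new pages from a known domain are added. The realized test fraction is only approximate when domains differ widely in page count.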

international acm sigir conference on research and development in information retrieval | 2010

How good is a span of terms?: exploiting proximity to improve web retrieval

Krysta M. Svore; Pallika H. Kanani; Nazan Khan

Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval accuracy. We build on existing research by incorporating novel ranking features based on flexible proximity terms with recent state-of-the-art machine learning ranking models. We introduce a method of determining the goodness of a set of proximity terms that takes advantage of the structured nature of web documents, document metadata, and phrasal information from search engine user query logs. We perform experiments on a large real-world Web data collection and show that using the goodness score of flexible proximity terms can improve ranking accuracy over state-of-the-art ranking methods by as much as 13%. We also show that we can improve accuracy on the hardest queries by as much as 9% relative to state-of-the-art approaches.
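To make the notion of a span of terms concrete, the sketch below finds the shortest window of a tokenized document that contains every query term, using a standard sliding-window scan. This is only a basic proximity primitive with a hypothetical function name; it is not the paper's goodness score, which the abstract does not define.

```python
def min_cover_span(doc_tokens, query_terms):
    """Return (start, end) token indices of the shortest window containing
    all query terms, or None if some term never occurs."""
    needed = set(query_terms)
    if not needed:
        return None
    counts = {}
    covered = 0
    best = None
    left = 0
    for right, token in enumerate(doc_tokens):
        if token in needed:
            counts[token] = counts.get(token, 0) + 1
            if counts[token] == 1:
                covered += 1
        # Shrink from the left while the window still covers every term.
        while covered == len(needed):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            left_token = doc_tokens[left]
            if left_token in needed:
                counts[left_token] -= 1
                if counts[left_token] == 0:
                    covered -= 1
            left += 1
    return best
```

A feature such as the window length, or its ratio to the number of query terms, could then be fed to a learned ranker alongside other signals.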

Proceedings of the National Academy of Sciences of the United States of America | 2017

Elucidating reaction mechanisms on quantum computers

Markus Reiher; Nathan Wiebe; Krysta M. Svore; Dave Wecker; Matthias Troyer

Significance: Our work addresses the question of compelling killer applications for quantum computers. Although quantum chemistry is a strong candidate, the lack of details of how quantum computers can be used for specific applications makes it difficult to assess whether they will be able to deliver on the promises. Here, we show how quantum computers can be used to elucidate the reaction mechanism for biological nitrogen fixation in nitrogenase, by augmenting classical calculation of reaction mechanisms with reliable estimates for relative and activation energies that are beyond the reach of traditional methods. We also show that, taking into account overheads of quantum error correction and gate synthesis, a modular architecture for parallel quantum computers can perform such calculations with components of reasonable complexity.

With rapid recent advances in quantum technology, we are close to the threshold of quantum devices whose computational powers can exceed those of classical supercomputers. Here, we show that a quantum computer can be used to elucidate reaction mechanisms in complex chemical systems, using the open problem of biological nitrogen fixation in nitrogenase as an example. We discuss how quantum computers can augment classical computer simulations used to probe these reaction mechanisms, to significantly increase their accuracy and enable hitherto intractable simulations. Our resource estimates show that, even when taking into account the substantial overhead of quantum error correction, and the need to compile into discrete gate sets, the necessary computations can be performed in reasonable time on small quantum computers. Our results demonstrate that quantum computers will be able to tackle important problems in chemistry without requiring exorbitant resources.
