Joel Coffman
University of Virginia
Publication
Featured research published by Joel Coffman.
conference on information and knowledge management | 2010
Joel Coffman; Alfred C. Weaver
With regard to keyword search systems for structured data, research during the past decade has largely focused on performance. Researchers have validated their work using ad hoc experiments that may not reflect real-world workloads. We illustrate the wide deviation in existing evaluations and present an evaluation framework designed to validate the next decade of research in this field. Our comparison of 9 state-of-the-art keyword search systems contradicts the retrieval effectiveness purported by existing evaluations and reinforces the need for standardized evaluation. Our results also suggest that there remains considerable room for improvement in this field. We found that many techniques cannot scale to even moderately-sized datasets that contain roughly a million tuples. Given that existing databases are considerably larger than this threshold, our results motivate the creation of new algorithms and indexing techniques that scale to meet both current and future workloads.
IEEE Transactions on Knowledge and Data Engineering | 2014
Joel Coffman; Alfred C. Weaver
Extending the keyword search paradigm to relational data has been an active area of research within the database and IR communities during the past decade. Many approaches have been proposed, but despite numerous publications, there remains a severe lack of standardization in the evaluation of proposed search techniques. This lack of standardization has produced contradictory results across evaluations, and the numerous discrepancies obscure the advantages offered by different approaches. In this paper, we present the most extensive empirical performance evaluation of relational keyword search techniques to appear in the literature to date. Our results indicate that many existing search techniques do not provide acceptable performance for realistic retrieval tasks. In particular, memory consumption precludes many search techniques from scaling beyond small data sets with tens of thousands of vertices. We also explore the relationship between execution time and factors varied in previous evaluations; our analysis indicates that most of these factors have relatively little impact on performance. In summary, our work confirms previous claims regarding the unacceptable performance of these search techniques and underscores the need for standardization in evaluations, standardization exemplified by the IR community.
international conference on green computing | 2010
Clinton Wills Smullen; Joel Coffman; Sudhanva Gurumurthi
Flash memory is now widely used in the design of solid-state disks (SSDs), which can sustain significantly higher I/O rates than even high-performance hard disks while using significantly less power. These characteristics make SSDs especially attractive for use in enterprise storage systems, and it is predicted that the use of SSDs will save 58,000 MWh/year by 2013. However, Flash-based SSDs are unable to reach peak performance on common enterprise data patterns such as log-file and metadata updates due to slow write speeds (an order of magnitude slower than reads) and the inability to do in-place updates. In this paper, we utilize an auxiliary, byte-addressable, non-volatile memory to design a general-purpose merge cache that significantly improves write performance. We also utilize simple read policies that further improve the performance of the SSD without adding significant overhead. Together, these policies reduce the average response time by more than 75%, making it possible to meet performance requirements with fewer drives.
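The core merge-cache idea can be illustrated in miniature: small updates are absorbed by a byte-addressable non-volatile buffer and folded into a single flash page program on eviction, rather than rewriting the page for every update. The page size and data structures below are illustrative assumptions, not the paper's actual design.

```python
PAGE_SIZE = 4096  # illustrative flash page size

class MergeCache:
    """Conceptual sketch: buffer small writes, merge them per page."""

    def __init__(self):
        self.pending = {}        # page number -> {byte offset: byte value}
        self.flash_writes = 0    # full-page programs issued to flash

    def write(self, addr, data: bytes):
        """Absorb a small write into the non-volatile buffer."""
        page, off = divmod(addr, PAGE_SIZE)
        updates = self.pending.setdefault(page, {})
        for i, b in enumerate(data):
            updates[off + i] = b

    def evict(self, page, flash_page: bytearray):
        """Merge all buffered updates for a page into one flash program."""
        for off, b in self.pending.pop(page, {}).items():
            flash_page[off] = b
        self.flash_writes += 1
```

Three small writes to the same page cost one flash program on eviction instead of three, which is the source of the reported write-performance improvement.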
conference on information and knowledge management | 2011
Joel Coffman; Alfred C. Weaver
Keyword search within databases has become a hot topic within the research community as databases store increasing amounts of information. Users require an effective method to retrieve information from these databases without learning complex query languages (viz. SQL). Despite the recent research interest, performance and search effectiveness have not received equal attention, and scoring functions in particular have become increasingly complex while providing only modest benefits with regard to the quality of search results. An analysis of the factors appearing in existing scoring functions suggests that some factors previously deemed critical to search effectiveness are at best loosely correlated with relevance. We consider a number of these different scoring factors and use machine learning to create a new scoring function that provides significantly better results than existing approaches. We simplify our scoring function by systematically removing the factors with the lowest weight and show that this version still outperforms the previous state of the art in this area.
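The simplification step described above — a weighted combination of scoring factors, pruned by dropping the lowest-weight factors — can be sketched generically. The factor names and weights below are invented for illustration; the paper's actual features and learned model differ.

```python
def score(factors, weights):
    """Score a candidate result as a weighted sum of its factor values."""
    return sum(weights[name] * value for name, value in factors.items())

def prune_lowest(weights, keep):
    """Keep only the `keep` factors with the largest absolute weight."""
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return dict(ranked[:keep])

# Hypothetical learned weights and factor values for one candidate result.
weights = {"tf": 0.9, "idf": 0.7, "result_size": -0.4, "depth": 0.05}
factors = {"tf": 2.0, "idf": 1.5, "result_size": 3.0, "depth": 4.0}

simplified = prune_lowest(weights, keep=3)  # drops the low-weight "depth"
```

Scoring with `simplified` changes the result only slightly, mirroring the observation that low-weight factors contribute little to effectiveness.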
Proceedings of the 5th ACM Workshop on Moving Target Defense - MTD '18 | 2018
Joel Coffman; Aurin Chakravarty; Joshua A. Russo; Andrew S. Gearhart
Software diversity is touted as a way to substantially increase the cost of cyber attacks by limiting an attacker's ability to reuse exploits across diversified variants of an application. Despite the number of diversity techniques that have been described in the research literature, little is known about their effectiveness. In this paper, we consider near-duplicate detection algorithms as a way to measure the static aspects of software diversity, viz., their ability to recognize variants of an application. Due to the widely varying results reported by previous studies, we describe a novel technique for measuring the similarity of applications that share libraries. We use this technique to systematically compare various near-duplicate detection algorithms and demonstrate their wide range in effectiveness, including for real-world tasks such as malware triage. In addition, we use these algorithms as a way to assess the relative strength of various diversity strategies, from recompilation with different compilers and optimization levels to techniques specifically designed to thwart exploit reuse. Our results indicate that even small changes to a binary disproportionately affect the similarity reported by near-duplicate detection algorithms. In addition, we observe a wide range in the effectiveness of various diversity strategies.
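One common family of near-duplicate detection techniques is byte n-gram "shingling" with Jaccard similarity: a score near 1.0 suggests two binaries are variants of the same program, while effective diversification drives the score down. This is a generic sketch of that family, not the specific algorithms evaluated in the paper.

```python
def shingles(data: bytes, n: int = 4) -> set:
    """Set of overlapping byte n-grams of the input."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# A one-byte change (illustrative stand-in for a diversified variant)
# perturbs every shingle that overlaps it, so similarity drops
# disproportionately to the size of the edit.
original = b"mov eax, 1; ret"
variant = b"mov ebx, 1; ret"
similarity = jaccard(shingles(original), shingles(variant))
```

The disproportionate drop from a single-byte edit echoes the paper's finding that even small binary changes strongly affect reported similarity.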
international conference on cloud computing | 2017
Bruce Benjamin; Joel Coffman; Hadi Esiely-Barrera; Kaitlin Farr; Dane Fichter; Daniel Genin; Laura J. Glendenning; Peter A. Hamilton; Shaku Harshavardhana; Rosalind Hom; Brianna Poulos; Nathan S. Reller
As cloud computing becomes increasingly pervasive, it is critical for cloud providers to support basic security controls. Although major cloud providers tout such features, relatively little is known in many cases about their design and implementation. In this paper, we describe several security features in OpenStack, a widely-used, open source cloud computing platform. Our contributions to OpenStack range from key management and storage encryption to guaranteeing the integrity of virtual machine (VM) images prior to boot. We describe the design and implementation of these features in detail and provide a security analysis that enumerates the threats that each mitigates. Our performance evaluation shows that these security features have an acceptable cost—in some cases, within the measurement error observed in an operational cloud deployment. Finally, we highlight lessons learned from our real-world development experiences from contributing these features to OpenStack as a way to encourage others to transition their research into practice.
Archive | 2017
Joel Coffman; Alfred C. Weaver
The benchmark for relational keyword search is a collection of data sets, queries, and relevance assessments designed to facilitate the evaluation of systems supporting keyword search in databases. The benchmark includes three separate data sets with fifty information needs (i.e., queries) for each data set and follows the traditional approach to evaluate keyword search systems (i.e., ad hoc retrieval) developed by the information retrieval (IR) research community.
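Ad hoc retrieval evaluations of the kind this benchmark supports score each system's ranked result list against relevance assessments. A standard metric is mean average precision (MAP); the sketch below is a generic illustration of that metric, not code distributed with the benchmark.

```python
def average_precision(ranked, relevant):
    """Average of precision values at each rank where a relevant result appears."""
    hits, total = 0, 0.0
    for rank, result in enumerate(ranked, start=1):
        if result in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """`runs` is a list of (ranked_results, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

With fifty information needs per data set, a system's MAP on one data set is the mean of fifty per-query average-precision values.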
international conference on management of data | 2010
Joel Coffman; Alfred C. Weaver
Archive | 2012
Alfred C. Weaver; Joel Coffman
technical symposium on computer science education | 2010
Joel Coffman; Alfred C. Weaver