Chris Mayfield | Researchain

Publication

Featured researches published by Chris Mayfield.

International Conference on Data Engineering | 2007

Indexing Uncertain Categorical Data

Sarvjeet Singh; Chris Mayfield; Sunil Prabhakar; Rahul Shah; Susanne E. Hambrusch

Uncertainty in categorical data is commonplace in many applications, including data cleaning, database integration, and biological annotation. In such domains, the correct value of an attribute is often unknown, but may be selected from a reasonable number of alternatives. Current database management systems do not provide a convenient means for representing or manipulating this type of uncertainty. In this paper we extend traditional systems to explicitly handle uncertainty in data values. We propose two index structures for efficiently searching uncertain categorical data, one based on the R-tree and another based on an inverted index structure. Using these structures, we provide a detailed description of the probabilistic equality queries they support. Experimental results using real and synthetic datasets demonstrate how these index structures can effectively improve the performance of queries through the use of internal probabilistic information.
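As a rough illustration of the inverted-index idea, the sketch below treats each uncertain value as a probability distribution over categories and scores equality as the sum of products of matching probabilities. The function names, the sample data, and the independence assumption behind that formula are illustrative choices, not details taken from the paper.

```python
from collections import defaultdict


def build_inverted_index(tuples):
    """Map each category to (tuple_id, probability) postings, highest probability first."""
    index = defaultdict(list)
    for tuple_id, distribution in tuples.items():
        for category, p in distribution.items():
            if p > 0:
                index[category].append((tuple_id, p))
    for postings in index.values():
        postings.sort(key=lambda entry: entry[1], reverse=True)
    return index


def equality_threshold_query(index, query, tau):
    """Return {tuple_id: Pr(query == tuple)} for tuples whose probability exceeds tau.

    Assumes the query value and the stored values are independent, so
    Pr(q == t) is the sum over categories of q[c] * t[c].
    """
    scores = defaultdict(float)
    for category, q in query.items():
        # Only postings for categories present in the query can contribute.
        for tuple_id, p in index.get(category, []):
            scores[tuple_id] += q * p
    return {tuple_id: s for tuple_id, s in scores.items() if s > tau}


# Hypothetical data: each tuple holds a distribution over problem categories.
tuples = {
    "t1": {"brake": 0.5, "tire": 0.5},
    "t2": {"brake": 0.9, "exhaust": 0.1},
    "t3": {"trans": 1.0},
}
index = build_inverted_index(tuples)
result = equality_threshold_query(index, {"brake": 1.0}, 0.6)
print(result)  # {'t2': 0.9}
```

Because only the postings lists of the query's categories are scanned, tuples such as `t3` that share no category with the query are never touched.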

International Conference on Data Engineering | 2008

Database Support for Probabilistic Attributes and Tuples

Sarvjeet Singh; Chris Mayfield; Rahul Shah; Sunil Prabhakar; Susanne E. Hambrusch; Jennifer Neville; Reynold Cheng

The inherent uncertainty of data present in numerous applications such as sensor databases, text annotations, and information retrieval motivates the need to handle imprecise data at the database level. Uncertainty can be at the attribute or tuple level and is present in both continuous and discrete data domains. This paper presents a model for handling arbitrary probabilistic uncertain data (both discrete and continuous) natively at the database level. Our approach leads to a natural and efficient representation for probabilistic data. We develop a model that is consistent with possible worlds semantics and closed under basic relational operators. This is the first model that accurately and efficiently handles both continuous and discrete uncertainty. The model is implemented in a real database system (PostgreSQL) and the effectiveness and efficiency of our approach are validated experimentally.
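Possible worlds semantics can be made concrete with a small brute-force sketch. It assumes tuple-level uncertainty with independent tuples, which is a simplification (the abstract's model is more general), and every name and value in it is hypothetical.

```python
from itertools import product


def possible_worlds(uncertain_tuples):
    """Yield (world, probability) pairs, assuming tuples exist independently."""
    for mask in product([False, True], repeat=len(uncertain_tuples)):
        probability = 1.0
        world = []
        for present, (row, p) in zip(mask, uncertain_tuples):
            probability *= p if present else 1.0 - p
            if present:
                world.append(row)
        yield world, probability


def query_probability(uncertain_tuples, predicate):
    """Probability that at least one tuple satisfying the predicate exists."""
    return sum(
        probability
        for world, probability in possible_worlds(uncertain_tuples)
        if any(predicate(row) for row in world)
    )


# Hypothetical sensor readings, each with an existence probability.
readings = [
    ({"sensor": "s1", "temp": 31}, 0.8),
    ({"sensor": "s2", "temp": 35}, 0.5),
    ({"sensor": "s3", "temp": 20}, 0.9),
]
p_hot = query_probability(readings, lambda row: row["temp"] > 30)
# Under independence this should agree with 1 - (1 - 0.8) * (1 - 0.5).
print(round(p_hot, 6))
```

Enumeration is exponential in the number of tuples, so it only serves to define the expected answer; a practical system would compute the same probability without materializing the worlds.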

International Conference on Management of Data | 2008

Orion 2.0: native support for uncertain data

Sarvjeet Singh; Chris Mayfield; Sagar Mittal; Sunil Prabhakar; Susanne E. Hambrusch; Rahul Shah

Orion is a state-of-the-art uncertain database management system with built-in support for probabilistic data as first class data types. In contrast to other uncertain databases, Orion supports both attribute and tuple uncertainty with arbitrary correlations. This enables the database engine to handle both discrete and continuous pdfs in a natural and accurate manner. The underlying model is closed under the basic relational operators and is consistent with Possible Worlds Semantics. We demonstrate how Orion simplifies the design and enhances the capabilities of two example applications: managing sensor data (continuous uncertainty) and inferring missing values (discrete uncertainty).

International Conference on Management of Data | 2010

ERACER: a database approach for statistical inference and data cleaning

Chris Mayfield; Jennifer Neville; Sunil Prabhakar

Real-world databases often contain syntactic and semantic errors, in spite of integrity constraints and other safety measures incorporated into modern DBMSs. We present ERACER, an iterative statistical framework for inferring missing information and correcting such errors automatically. Our approach is based on belief propagation and relational dependency networks, and includes an efficient approximate inference algorithm that is easily implemented in standard DBMSs using SQL and user defined functions. The system performs the inference and cleansing tasks in an integrated manner, using shrinkage techniques to infer correct values accurately even in the presence of dirty data. We evaluate the proposed methods empirically on multiple synthetic and real-world data sets. The results show that our framework achieves accuracy comparable to a baseline statistical method using Bayesian networks with exact inference. However, our framework has wider applicability than the Bayesian network baseline, due to its ability to reason with complex, cyclic relational dependencies.

ACM Transactions on Computing Education | 2014

Computational Thinking in Elementary and Secondary Teacher Education

Aman Yadav; Chris Mayfield; Ninger Zhou; Susanne E. Hambrusch; John T. Korb

Computational thinking (CT) is broadly defined as the mental activity for abstracting problems and formulating solutions that can be automated. In an increasingly information-based society, CT is becoming an essential skill for everyone. To ensure that students develop this ability at the K-12 level, it is important to provide teachers with adequate knowledge of CT and how to incorporate it into their teaching. This article describes a study on designing and introducing computational thinking modules and assessing their impact on preservice teachers’ understanding of CT concepts, as well as their attitude towards computing. Results demonstrate that introducing computational thinking into education courses can effectively influence preservice teachers’ understanding of CT concepts.

BMC Genomics | 2009

Analysis of regulatory protease sequences identified through bioinformatic data mining of the Schistosoma mansoni genome

David H. Bos; Chris Mayfield; Dennis J. Minchella

Background: New chemotherapeutic agents against Schistosoma mansoni, an etiological agent of human schistosomiasis, are a priority due to the emerging drug resistance and the inability of current drug treatments to prevent reinfection. Proteases have been under scrutiny as targets of immunological or chemotherapeutic anti-Schistosoma agents because of their vital role in many stages of the parasitic life cycle. Function has been established for only a handful of identified S. mansoni proteases, and the vast majority of these are the digestive proteases; very few of the conserved classes of regulatory proteases have been identified from Schistosoma species, despite their vital role in numerous cellular processes. To that end, we identified protease protein coding genes from the S. mansoni genome project and EST library. Results: We identified 255 protease sequences from five catalytic classes using predicted proteins of the S. mansoni genome. The vast majority of these show significant similarity to proteins in KEGG and the Conserved Domain Database. Proteases include calpains, caspases, cytosolic and mitochondrial signal peptidases, proteases that interact with ubiquitin and ubiquitin-like molecules, and proteases that perform regulated intramembrane proteolysis. Comparative analysis of classes of important regulatory proteases finds conserved active site domains, and where appropriate, signal peptides and transmembrane helices. Phylogenetic analysis provides support for inferring functional divergence among regulatory aspartic, cysteine, and serine proteases. Conclusion: Numerous proteases are identified for the first time in S. mansoni. We characterize important regulatory proteases and focus analysis on these proteases to complement the growing knowledge base of digestive proteases. This work provides a foundation for expanding knowledge of proteases in Schistosoma species and examining their diverse function and potential as targets for new chemotherapies.

Statistical and Scientific Database Management | 2008

Query Selectivity Estimation for Uncertain Data

Sarvjeet Singh; Chris Mayfield; Rahul Shah; Sunil Prabhakar; Susanne E. Hambrusch

Applications requiring the handling of uncertain data have led to the development of database management systems extending the scope of relational databases to include uncertain (probabilistic) data as a native data type. New automatic query optimizations having the ability to estimate the cost of execution of a given query plan, as available in existing databases, need to be developed. For probabilistic data this involves providing selectivity estimations that can handle multiple values for each attribute and also new query types with threshold values. This paper presents novel selectivity estimation functions for uncertain data and shows how these functions can be integrated into PostgreSQL to achieve query optimization for probabilistic queries over uncertain data. The proposed methods are able to handle both attribute- and tuple-uncertainty. Our experimental results show that our algorithms are efficient and give good selectivity estimates with low space-time overhead.
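To show what a selectivity estimate for a threshold query might look like, here is a minimal sketch using an equi-width histogram over the probabilities that tuples assign to one attribute value. This is a generic histogram technique chosen for illustration, not the estimation functions proposed in the paper, and all names and numbers are assumptions.

```python
def build_histogram(probabilities, buckets=10):
    """Count probabilities in equi-width buckets over the interval [0, 1]."""
    counts = [0] * buckets
    for p in probabilities:
        # A probability of exactly 1.0 falls into the last bucket.
        counts[min(int(p * buckets), buckets - 1)] += 1
    return counts


def estimate_selectivity(counts, tau):
    """Estimate the fraction of tuples with probability above tau.

    Assumes values are spread uniformly inside each bucket, so a bucket
    that straddles tau contributes only its share above the threshold.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    width = 1.0 / len(counts)
    matching = 0.0
    for i, count in enumerate(counts):
        low, high = i * width, (i + 1) * width
        if tau <= low:
            matching += count
        elif tau < high:
            matching += count * (high - tau) / width
    return matching / total


# Hypothetical probabilities that five tuples assign to one category.
histogram = build_histogram([0.05, 0.15, 0.55, 0.65, 0.95])
estimate = estimate_selectivity(histogram, 0.5)
print(round(estimate, 3))  # 0.6
```

The histogram is tiny compared with the data it summarizes, which is the kind of low space overhead a query optimizer needs from its statistics.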

Technical Symposium on Computer Science Education | 2014

Learning relational algebra by snapping blocks

Jason Gorman; Sebastian Gsell; Chris Mayfield

Relational algebra provides a theoretical foundation for how modern database management systems optimize and execute queries. Its main concepts are based on set theory and first order logic, which can be challenging for students to learn due to their abstract nature. This paper presents Bags, a new type of visual programming environment (inspired by Snap!) for the teaching of relational operations and data analysis. Students formulate algebraic queries by snapping together graphical blocks that represent data sets and relational operators, resulting in an interactive visualization of the underlying concepts. The outcomes of this work will not only enhance university-level database courses, but also provide an engaging computational thinking resource for K-12 teachers in content areas outside of science and engineering.
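The relational operators that students compose can be sketched in a few lines of code. The version below models a relation as a list of dictionaries and is only a textual stand-in for the idea of chaining operators; it is not how Bags is implemented, and the sample relations are hypothetical.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    if not left or not right:
        return []
    shared = [a for a in left[0] if a in right[0]]
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right
        if all(l_row[a] == r_row[a] for a in shared)
    ]


# Hypothetical relations for a course database.
students = [{"sid": 1, "name": "Ada"}, {"sid": 2, "name": "Bob"}]
enrolled = [
    {"sid": 1, "course": "DB"},
    {"sid": 1, "course": "OS"},
    {"sid": 2, "course": "DB"},
]

# Names of students enrolled in DB, built by nesting the three operators.
db_students = project(
    select(natural_join(students, enrolled), lambda row: row["course"] == "DB"),
    ["name"],
)
print(db_students)  # [{'name': 'Ada'}, {'name': 'Bob'}]
```

Each operator takes relations and returns a relation, which is what lets queries be assembled piece by piece, much like snapping blocks together.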

Integrating Technology into Computer Science Education | 2016

Results from a Survey of Faculty Adoption of Process Oriented Guided Inquiry Learning (POGIL) in Computer Science

Helen H. Hu; Clifton Kussmaul; Brian Knaeble; Chris Mayfield; Aman Yadav

This paper presents an analysis of CS faculty perceptions of the benefits of POGIL, the obstacles to POGIL adoption, and opportunities for professional development. Participants strongly agreed that with POGIL, students are more engaged and active, develop communication and teamwork skills, and have better learning outcomes. The largest perceived obstacle was lack of preparation time; other obstacles included availability of relevant POGIL activities and pressure to cover more content. Participants expressed a desire for further training and mentoring beyond workshops. Our data analysis also considers bivariate associations and interactions. The results should help to improve professional development for CS faculty adopting evidence-based strategies, and thereby help more CS students to be successful.

Technical Symposium on Computer Science Education | 2014

Guided inquiry learning in context: perspectives on POGIL in CS

Helen H. Hu; Matthew Lang; Clif Kussmaul; Chris Mayfield; Tammy Pirmann

Process oriented guided inquiry learning (POGIL) is an active, student-centered approach to teaching/learning [6]. In a POGIL classroom, students work in small teams on inquiry-based activities that guide students to discover concepts. These activities are designed to align with the learning cycle [8] and include elements that are designed to additionally develop process skills (e.g., teamwork, conflict resolution, written and oral communication, etc.). The role of the instructor in a POGIL classroom is to facilitate student discovery, rather than to deliver lecture. The POGIL approach was developed and refined within the physical sciences and its success in general and organic chemistry courses has been documented in a variety of university contexts. In particular, POGIL classes contain fewer failing grades and withdrawals [9] and result in a higher degree of mastery [5] than traditional classes. Because of its success, the approach has begun to be adopted by the computer science community and has generated increasing interest and activity at SIGCSE ([4], [3], [7], [2]). Though the POGIL approach is well-documented, there is no single way to implement a POGIL classroom. The purpose of this panel is to examine the varying challenges to adopting POGIL in different institutional contexts and to explore how POGIL has been implemented in a wide variety of computer science classrooms. In addition to giving a brief overview of the POGIL approach, panel members will
