Dawn Wilkins | Researchain

Publication

Featured research published by Dawn Wilkins.

Cancer Cell | 2002

Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling

Eng Juh Yeoh; Mary E. Ross; Sheila A. Shurtleff; W. Kent Williams; Divyen H. Patel; Rami Mahfouz; Fred G. Behm; Susana C. Raimondi; Mary V. Relling; Anami R. Patel; Cheng Cheng; Dario Campana; Dawn Wilkins; Xiaodong Zhou; Jinyan Li; Huiqing Liu; Ching-Hon Pui; William E. Evans; Clayton W. Naeve; Limsoon Wong; James R. Downing

Treatment of pediatric acute lymphoblastic leukemia (ALL) is based on the concept of tailoring the intensity of therapy to a patient's risk of relapse. To determine whether gene expression profiling could enhance risk assignment, we used oligonucleotide microarrays to analyze the pattern of genes expressed in leukemic blasts from 360 pediatric ALL patients. Distinct expression profiles identified each of the prognostically important leukemia subtypes, including T-ALL, E2A-PBX1, BCR-ABL, TEL-AML1, MLL rearrangement, and hyperdiploid >50 chromosomes. In addition, another ALL subgroup was identified based on its unique expression profile. Examination of the genes comprising the expression signatures provided important insights into the biology of these leukemia subgroups. Further, within some genetic subgroups, expression profiles identified those patients that would eventually fail therapy. Thus, the single platform of expression profiling should enhance the accurate risk stratification of pediatric ALL patients.

acm southeast regional conference | 2010

A comparison of a graph database and a relational database: a data provenance perspective

Chad Vicknair; Michael Macias; Zhendong Zhao; Xiaofei Nan; Yixin Chen; Dawn Wilkins

Relational databases have been around for many decades and are the database technology of choice for most traditional data-intensive storage and retrieval applications. Retrievals are usually accomplished using SQL, a declarative query language. Relational database systems are generally efficient unless the data contains many relationships requiring joins of large tables. Recently there has been much interest in data stores that do not use SQL exclusively, the so-called NoSQL movement. Examples are Google's BigTable and Facebook's Cassandra. This paper reports on a comparison of one such NoSQL graph database called Neo4j with a common relational database system, MySQL, for use as the underlying technology in the development of a software system to record and query data provenance information.
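As a rough illustration of why provenance queries lean on joins, the sketch below answers one lineage question (what was this item derived from?) two ways: by walking an in-memory graph, and by a recursive self-join over an edge table. It uses Python's standard `sqlite3` module as a stand-in and is not Neo4j or MySQL; the table name `derived_from`, the sample items, and both function names are hypothetical.

```python
import sqlite3
from collections import deque

# Hypothetical provenance edges: (derived_item, source_item).
EDGES = [
    ("report", "table"),
    ("table", "raw_a"),
    ("table", "raw_b"),
    ("raw_b", "sensor"),
]


def ancestors_graph(edges, start):
    """Graph-style lookup: follow 'derived from' edges breadth-first."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, []).append(parent)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def ancestors_sql(edges, start):
    """Relational-style lookup: the edge table is joined to itself once per level."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE derived_from (child TEXT, parent TEXT)")
    con.executemany("INSERT INTO derived_from VALUES (?, ?)", edges)
    rows = con.execute(
        """
        WITH RECURSIVE lineage(item) AS (
            SELECT parent FROM derived_from WHERE child = ?
            UNION
            SELECT d.parent
            FROM derived_from AS d JOIN lineage AS l ON d.child = l.item
        )
        SELECT item FROM lineage
        """,
        (start,),
    ).fetchall()
    con.close()
    return {row[0] for row in rows}
```

Both functions return the same ancestor set for a given item; the relational version pays for each additional level of lineage with another pass over the edge table.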

Journal of the ACM | 1996

How many queries are needed to learn?

Lisa Hellerstein; Krishnan Pillaipakkamnatt; Vijay Raghavan; Dawn Wilkins

We investigate the query complexity of exact learning in the membership and (proper) equivalence query model. We give a complete characterization of concept classes that are learnable with a polynomial number of polynomial-sized queries in this model. We give applications of this characterization, including results on learning a natural subclass of DNF formulas, and on learning with membership queries alone. Query complexity has previously been used to prove lower bounds on the time complexity of exact learning. We show a new relationship between query complexity and time complexity in exact learning: If any “honest” class is exactly and properly learnable with polynomial query complexity, but not learnable in polynomial time, then P ≠ NP. In particular, we show that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for $\Sigma_4^p$.
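For readers new to the query model, a textbook toy case (not taken from the paper) may help: a monotone conjunction over n Boolean variables can be learned exactly with at most n + 1 proper equivalence queries and no membership queries. In the sketch below, the brute-force oracle and every function name are assumptions made for illustration; the oracle enumerates all 2^n assignments, so it is only usable for small n.

```python
from itertools import product


def make_equivalence_oracle(target_vars, n):
    """Build an oracle for a hidden monotone conjunction over n variables.

    The oracle takes a hypothesis (a set of variable indices, read as their
    conjunction) and returns a counterexample assignment, or None if the
    hypothesis agrees with the target on every assignment.
    """
    def oracle(hypothesis_vars):
        for x in product((0, 1), repeat=n):
            target_value = all(x[i] for i in target_vars)
            hypothesis_value = all(x[i] for i in hypothesis_vars)
            if target_value != hypothesis_value:
                return x
        return None

    return oracle


def learn_monotone_conjunction(oracle, n):
    """Exact learning with equivalence queries only.

    Start from the most specific hypothesis (all n variables). The hypothesis
    always contains every target variable, so each counterexample is a
    positive example; any variable set to 0 in it cannot be in the target.
    """
    hypothesis = set(range(n))
    queries = 0
    while True:
        queries += 1
        counterexample = oracle(hypothesis)
        if counterexample is None:
            return hypothesis, queries
        hypothesis = {i for i in hypothesis if counterexample[i] == 1}
```

Each counterexample removes at least one variable from the hypothesis, which is where the n + 1 bound on the number of queries comes from.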

technical symposium on computer science education | 2000

Evaluating individuals in team projects

Dawn Wilkins; Pamela B. Lawhead

In 1999, most computer science students participate in at least one group project in some class prior to graduation. However, assessing individual student contributions to a group project is a difficult task faced by instructors of these classes. In this paper, we have compiled a wide range of assessment instruments, and identified situations where they can be effective. This paper is a compilation of potential evaluation strategies. No comparison is made among the many strategies, nor are particular techniques ranked above or below others. The goal is simply to provide a wide range of potential team evaluation techniques. Since each technique evaluates a particular characteristic and different team project courses have different goals, it is up to the instructor to choose the techniques that best evaluate the individual in light of the course goals.

BMC Bioinformatics | 2006

Improving the Performance of SVM-RFE to Select Genes in Microarray Data

Yuanyuan Ding; Dawn Wilkins

Background: Recursive Feature Elimination is a common and well-studied method for reducing the number of attributes used for further analysis or development of prediction models. The effectiveness of the RFE algorithm is generally considered excellent, but the primary obstacle in using it is the amount of computational power required. Results: Here we introduce a variant of RFE which employs ideas from simulated annealing. The goal of the algorithm is to improve the computational performance of recursive feature elimination by eliminating chunks of features at a time with as little effect on the quality of the reduced feature set as possible. The algorithm has been tested on several large gene expression data sets. The RFE algorithm is implemented using a Support Vector Machine to assist in identifying the least useful gene(s) to eliminate. Conclusion: The algorithm is simple and efficient and generates a set of attributes that is very similar to the set produced by RFE.
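The idea of eliminating features in chunks can be sketched in a few lines. The code below is a toy under stated assumptions and not the authors' implementation: the linear SVM is a bias-free hinge-loss model trained by full-batch subgradient descent, features are ranked by squared weight, and the `halving` schedule (drop half of what remains, then one at a time) is a hypothetical stand-in for the paper's annealing-inspired schedule. The data set is synthetic.

```python
def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Toy linear SVM: full-batch subgradient descent on the regularized hinge loss."""
    n = len(X[0])
    w = [0.0] * n
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # only margin violators contribute to the subgradient
                for j in range(n):
                    grad[j] -= yi * xi[j] / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def halving(remaining_count):
    """Hypothetical schedule: drop half of the remaining features, at least one."""
    return max(1, remaining_count // 2)


def chunked_rfe(X, y, keep, chunk_size):
    """Recursive feature elimination that drops chunk_size(r) features per round.

    Returns the surviving column indices and the number of training rounds.
    """
    remaining = list(range(len(X[0])))
    rounds = 0
    while len(remaining) > keep:
        sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(sub, y)
        rounds += 1
        # Rank by squared weight, smallest first (ties keep column order).
        order = sorted(range(len(remaining)), key=lambda k: w[k] ** 2)
        drop = min(chunk_size(len(remaining)), len(remaining) - keep)
        dropped = {remaining[k] for k in order[:drop]}
        remaining = [j for j in remaining if j not in dropped]
    return remaining, rounds


# Synthetic data: columns 0 and 1 follow the label, columns 2-5 are uncorrelated with it.
Y = [1, 1, -1, -1]
X = [
    [1.0, 0.5, 1.0, 1.0, 0.5, 2.0],
    [1.0, 0.5, -1.0, -1.0, -0.5, -2.0],
    [-1.0, -0.5, 1.0, -1.0, 0.5, -2.0],
    [-1.0, -0.5, -1.0, 1.0, -0.5, 2.0],
]
```

Calling `chunked_rfe(X, Y, keep=1, chunk_size=halving)` retrains fewer times than the one-at-a-time baseline `chunk_size=lambda r: 1`; that saving in training rounds is the computational motivation the abstract describes.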

international conference on artificial intelligence and law | 1997

The effectiveness of machine learning techniques for predicting time to case disposition

Dawn Wilkins; Krishnan Pillaipakkamnatt

One of the difficult tasks in the court system is the scheduling of the entities involved at the various stages of the criminal justice system. These include judges, jurors, witnesses, defendants, attorneys and courtrooms. In this paper we examine the feasibility of using machine learning techniques for the task of predicting the elapsed time between the arrest of an offender and the final disposition of his or her case. Accurate prediction of time to case disposition will aid in the resolution of conflicts that arise in the scheduling of the above entities. Using a pre-existing dataset called Offender Based Transaction Statistics (1990) and two well-known learning algorithms, we show that there is scope for the use of such techniques.

conference on learning theory | 1993

Learning μ-branching programs with queries

Vijay Raghavan; Dawn Wilkins

We show that the class of μ-branching programs can be exactly learned in O(n^5) time using only O(n) equivalence queries and O(n ) membership queries, but neither type of query alone is sufficient for polynomial time learning.

BMC Bioinformatics | 2012

Implementation of multiple-instance learning in drug activity prediction

Gang Fu; Xiaofei Nan; Haining Liu; Ronak Y. Patel; Pankaj R. Daga; Yixin Chen; Dawn Wilkins; Robert J. Doerksen

Background: In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; and a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. Methods: We encoded the 3-dimensional structures using pharmacophore fingerprints, which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features. Results: The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. The proposed approach produced the best predictive models for one data set and second best predictive models for the rest of the data sets, based on the external validations. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. 10 out of 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach. Conclusions: The proposed approach was proven not to suffer from overfitting and to be highly competitive with classical predictive models, so it is very powerful for drug activity prediction. The approach was also validated as a useful method for pursuit of bioactive conformers.

IEEE Transactions on Nanobioscience | 2012

Combined Rule Extraction and Feature Elimination in Supervised Classification

Sheng Liu; Ronak Y. Patel; Pankaj R. Daga; Haining Liu; Gang Fu; Robert J. Doerksen; Yixin Chen; Dawn Wilkins

There are a vast number of biology-related research problems involving a combination of multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.

BMC Bioinformatics | 2009

Graph ranking for exploratory gene data analysis

Cuilan Gao; Xin Dang; Yixin Chen; Dawn Wilkins
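The bag-of-conformers rule and the instance-based embedding described in the multiple-instance learning abstract above (Fu et al., BMC Bioinformatics 2012) can be sketched briefly. This is a minimal illustration under assumptions, not the authors' pipeline: Tanimoto dissimilarity is one plausible choice (the abstract does not name its four measures), the minimum-over-bag embedding is an assumed form of instance-based embedding, the 1-norm SVM step is omitted, and all function names are hypothetical.

```python
def tanimoto_dissimilarity(a, b):
    """1 - |a AND b| / |a OR b| for two equal-length bit strings such as '1100'."""
    both = sum(1 for p, q in zip(a, b) if p == "1" and q == "1")
    either = sum(1 for p, q in zip(a, b) if p == "1" or q == "1")
    return 0.0 if either == 0 else 1.0 - both / either


def bag_is_active(instance_flags):
    """Multiple-instance rule: a molecule (bag) is active iff at least one conformer is."""
    return any(instance_flags)


def embed_bag(bag, references):
    """Map a bag of fingerprints to one fixed-length vector.

    Feature k is the smallest dissimilarity between any conformer in the bag
    and reference fingerprint k, so a bag sits close to a reference when at
    least one of its conformers does.
    """
    return [
        min(tanimoto_dissimilarity(instance, ref) for instance in bag)
        for ref in references
    ]
```

For example, `embed_bag(["1100", "0011"], ["1100", "0110"])` yields one feature per reference fingerprint; a sparse classifier trained on such vectors would then select the reference conformers that matter.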
