Vinhthuy Phan
University of Memphis
Publications
Featured research published by Vinhthuy Phan.
Carcinogenesis | 2009
Quynh T. Tran; Lijing Xu; Vinhthuy Phan; Shirlean Goodwin; Mohammed Mostafizur Rahman; Victor X. Jin; Carrie Hayes Sutter; Bill D. Roebuck; Thomas W. Kensler; E. Olusegun George; Thomas R. Sutter
3H-1,2-dithiole-3-thione (D3T) and its analogues 4-methyl-5-pyrazinyl-3H-1,2-dithiole-3-thione (OLT) and 5-tert-butyl-3H-1,2-dithiole-3-thione (TBD) are chemopreventive agents that block or diminish early stages of carcinogenesis by inducing activities of detoxication enzymes. While OLT has been used in clinical trials, TBD has been shown to be more efficacious and possibly less toxic than OLT in animals. Here, we utilize a robust and high-resolution chemical genomics procedure to examine the pharmacological structure–activity relationships of these compounds in livers of male rats by microarray analyses. We identified 226 differentially expressed genes that were common to all treatments. Functional analysis identified the relation of these genes to glutathione metabolism and the nuclear factor, erythroid derived 2-related factor 2 pathway (Nrf2) that is known to regulate many of the protective actions of dithiolethiones. OLT and TBD were shown to have similar efficacies and both were weaker than D3T. In addition, we identified 40 genes whose responses were common to OLT and TBD, yet distinct from D3T. As inhibition of cytochrome P450 (CYP) has been associated with the effects of OLT on CYP expression, we determined the half maximal inhibitory concentration (IC50) values for inhibition of CYP1A2. The rank order of inhibitor potency was OLT ≫ TBD ≫ D3T, with IC50 values estimated as 0.2, 12.8 and >100 μM, respectively. Functional analysis revealed that OLT and TBD, in addition to their effects on CYP, modulate liver lipid metabolism, especially fatty acids. Together, these findings provide new insight into the actions of clinically relevant and lead dithiolethione analogues.
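For readers unfamiliar with IC50 estimation, the sketch below fits a four-parameter logistic (Hill) dose-response curve to inhibition data. It is a generic illustration with synthetic numbers, not the analysis pipeline used in the study.

```python
# Minimal sketch: estimating an IC50 by fitting a four-parameter
# logistic (Hill) curve to synthetic inhibition data. Illustrative
# only; not the study's actual analysis.
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, bottom, top, ic50, slope):
    """Four-parameter logistic dose-response curve."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** slope)

# Hypothetical CYP1A2 activity (% of control) at inhibitor concentrations (uM).
conc = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0])
activity = np.array([98.0, 95.0, 72.0, 40.0, 18.0, 7.0, 3.0])

params, _ = curve_fit(hill, conc, activity, p0=[0.0, 100.0, 0.2, 1.0])
print(f"estimated IC50 = {params[2]:.2f} uM")
```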
International Conference on DNA Computing | 2006
Max H. Garzon; Vinhthuy Phan; Sujoy Sinha Roy; Andrew Neel
Encoding and processing information in DNA-, RNA- and other biomolecule-based devices is a core requirement for DNA-based computing, with potentially important applications. Recent experimental and theoretical advances have produced and tested new methods to obtain large code sets of oligonucleotides to support virtually any kind of application. We report results of a tour de force: an exhaustive search producing code sets that are arguably comparable in size to maximal sets while guaranteeing high quality, as measured by the minimum Gibbs energy between any pair of code words and other criteria. The method is constructive and directly produces the actual composition of the sets, unlike their in vitro counterparts. The sequences allow a quantitative characterization of their composition. We also present a new technique to generate code sets with more stringent constraints on their possible interactions under a variety of conditions, as measured by the Gibbs energies of duplex formation. The results predict close agreement with known in vitro results for 20-mers. Consequences of these results include bounds on the capacity of DNA for information storage and processing in wet tubes for a given oligo length, as well as many other applications where specific and complex self-directed assembly of large numbers of components may be required.
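The abstract does not spell out the search procedure itself. As a rough illustration of the codeword-screening idea, the sketch below greedily builds a code set, admitting a strand only if a crude cross-hybridization score against itself and every accepted word stays below a threshold; the score is a toy stand-in for the Gibbs-energy computations the paper relies on.

```python
# Toy sketch of code-set construction by screening: accept a word only
# if it does not hybridize strongly with itself or any accepted word.
# crosshyb_score is a crude stand-in for a Gibbs-energy measure: it
# counts Watson-Crick complementary positions in a single ungapped
# antiparallel alignment.
from itertools import product

COMP = {"A": "T", "T": "A", "C": "G", "G": "C"}

def crosshyb_score(s, t):
    return sum(1 for a, b in zip(s, reversed(t)) if COMP[a] == b)

def build_code(n, threshold):
    code = []
    for word in ("".join(w) for w in product("ACGT", repeat=n)):
        if crosshyb_score(word, word) >= threshold:
            continue  # too self-complementary
        if all(crosshyb_score(word, c) < threshold for c in code):
            code.append(word)
    return code

print(len(build_code(6, 5)))  # tiny n; the paper's 20-mer search is vastly larger
```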
Natural Computing | 2009
Vinhthuy Phan; Max H. Garzon
Finding a large set of single DNA strands that do not crosshybridize to themselves and/or to their complements is an important problem in DNA computing, self-assembly, and DNA memories. We describe a theoretical framework to analyze this problem, gauge its computational difficulty, and provide nearly optimal solutions. In this framework, codeword design is reduced to finding large sets of strands maximally separated in a DNA space, and the size of such sets depends on the geometry of these metric spaces. We show that codeword design is NP-complete under any single reasonable measure that approximates the Gibbs energy, thus practically excluding the possibility of an efficient procedure for finding maximal sets. Second, we extend a technique known as shuffling to provide a construction that yields provably nearly maximal codes. Third, we propose a filtering process that removes strands forming pairs with low Gibbs energies, as approximated by the nearest-neighbor model. Together, these steps produce large codes of high thermodynamic quality. The proposed framework can be used to gain an understanding of the Gibbs energy landscapes for DNA strands on which much of DNA computing and self-assembly is based.
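The nearest-neighbor model mentioned above approximates duplex stability by summing stacking free energies over adjacent base pairs. A minimal sketch follows, with approximate SantaLucia-style delta-G values (kcal/mol at 37 °C) shown purely for illustration:

```python
# Minimal sketch of the nearest-neighbor approximation: the Gibbs energy
# of a strand bound to its perfect complement is roughly the sum of
# stacking energies over consecutive dinucleotides. Parameter values are
# approximate SantaLucia-style numbers, for illustration only; a real
# filter would add initiation terms and handle mismatched alignments.
NN_DG = {
    "AA": -1.00, "TT": -1.00, "AT": -0.88, "TA": -0.58,
    "CA": -1.45, "TG": -1.45, "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28, "GA": -1.30, "TC": -1.30,
    "CG": -2.17, "GC": -2.24, "GG": -1.84, "CC": -1.84,
}

def duplex_dg(strand):
    """Approximate delta-G (kcal/mol) of strand paired with its complement."""
    return sum(NN_DG[strand[i:i + 2]] for i in range(len(strand) - 1))

# A filtering pass would discard strands whose pairings are too stable
# (delta-G below a cutoff) against any other word in the candidate code.
for w in ["ACGTACGTAC", "ATATATATAT", "GCGCGCGCGC"]:
    print(w, round(duplex_dg(w), 2))
```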
Bioinformatics | 2001
Vinhthuy Phan; Steven Skiena
MOTIVATION: A realistic approach to sequencing by hybridization must deal with realistic sequencing errors. The results of such a method should also apply to similar sequencing tasks. RESULTS: We provide the first algorithms for interactive sequencing by hybridization that are robust in the presence of hybridization errors. Under a strong error model allowing both positive and negative hybridization errors without repeated queries, we demonstrate accurate and efficient reconstruction with error rates up to 7%. Under the weaker traditional error model of Shamir and Tsur (Proceedings of the Fifth International Conference on Computational Molecular Biology (RECOMB-01), pp. 269-277, 2000), we obtain accurate reconstructions with up to 20% false-negative hybridization errors. Finally, we establish theoretical bounds on the performance of the sequential probing algorithm of Skiena and Sundaram (J. Comput. Biol., 2, 333-353, 1995) under the strong error model. AVAILABILITY: Freely available upon request. CONTACT: [email protected].
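To make the setting concrete, here is a toy version of sequential probing: grow the reconstruction one base at a time by asking an oracle whether each one-base extension occurs in the target. The oracle below answers without error; handling the false positives and negatives discussed above, which requires redundancy and backtracking, is exactly what the paper addresses.

```python
# Toy sketch of interactive sequencing by sequential probing. A real
# "oracle" is a hybridization experiment; this one answers exactly,
# so the error-tolerance machinery from the paper is omitted.
TARGET = "ACGTTGCAACGT"  # hypothetical sequence, known only to the oracle

def oracle(query):
    return query in TARGET

def reconstruct(seed, length):
    s = seed
    while len(s) < length:
        for base in "ACGT":
            if oracle(s + base):
                s += base
                break
        else:
            break  # no extension confirmed; with errors, backtracking is needed
    return s

print(reconstruct("ACGT", len(TARGET)))
```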
BMC Bioinformatics | 2011
Sujoy Sinha Roy; Kevin Heinrich; Vinhthuy Phan; Michael W. Berry; Ramin Homayouni
Background: Identification of transcription factors (TFs) responsible for modulation of differentially expressed genes is a key step in deducing gene regulatory pathways. Most current methods identify TFs by searching for the presence of DNA binding motifs in the promoter regions of co-regulated genes. However, this strategy may not always be useful, as presence of a motif does not necessarily imply a regulatory role. Conversely, motif presence may not be required for a TF to regulate a set of genes. Therefore, it is imperative to include functional (biochemical and molecular) associations, such as those found in the biomedical literature, in algorithms for identification of putative regulatory TFs that might be explicitly or implicitly linked to the genes under investigation. Results: In this study, we present a Latent Semantic Indexing (LSI) based text mining approach for identification and ranking of putative regulatory TFs from microarray-derived differentially expressed genes (DEGs). Two LSI models were built using different term weighting schemes to devise pair-wise similarities between 21,027 mouse genes annotated in the Entrez Gene repository. Amongst these genes, 433 were designated TFs in the TRANSFAC database. The LSI-derived TF-to-gene similarities were used to calculate TF literature enrichment p-values and rank the TFs for a given set of genes. We evaluated our approach using five different publicly available microarray datasets focusing on the TFs Rel, Stat6, Ddit3, Stat5 and Nfic. In addition, for each of the datasets, we constructed gold-standard sets of TFs known to be functionally relevant to the study in question. Receiver Operating Characteristic (ROC) curves showed that the log-entropy LSI model outperformed the tf-normal LSI model and a benchmark co-occurrence based method for four out of five datasets, as well as motif searching approaches, in identifying putative TFs. Conclusions: Our results suggest that our LSI based text mining approach can complement existing approaches used in systems biology research to decipher gene regulatory networks by providing putative lists of ranked TFs that might be explicitly or implicitly associated with sets of DEGs derived from microarray experiments. In addition, unlike motif searching approaches, LSI based approaches can reveal TFs that may indirectly regulate genes.
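As a minimal sketch of the LSI machinery described above, with a tiny synthetic term-by-gene matrix standing in for the 21,027-gene literature corpus, one can log-entropy weight the matrix, take a truncated SVD, and rank candidate TFs by cosine similarity to the centroid of a DEG set. The gene and TF indices below are arbitrary placeholders.

```python
# Minimal sketch of LSI-based TF ranking: log-entropy weighting,
# truncated SVD, and cosine similarity in the reduced space. The
# random matrix is a synthetic stand-in for literature term counts.
import numpy as np

def log_entropy(counts):
    """Log-entropy weighting: log(1+f) scaled by (1 - term entropy)."""
    p = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1e-12)
    with np.errstate(divide="ignore", invalid="ignore"):
        ent = np.nansum(p * np.log(p), axis=1) / np.log(counts.shape[1])
    global_w = 1.0 + ent  # ent is negative, so this equals 1 - H/log(n)
    return np.log1p(counts) * global_w[:, None]

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(50, 12)).astype(float)  # terms x genes

W = log_entropy(counts)
U, S, Vt = np.linalg.svd(W, full_matrices=False)
k = 5
G = Vt[:k].T * S[:k]                                   # gene coordinates
G /= np.linalg.norm(G, axis=1, keepdims=True) + 1e-12  # unit length for cosine

deg_idx = [0, 1, 2]      # hypothetical DEG columns
tf_idx = [8, 9, 10, 11]  # hypothetical TF columns
centroid = G[deg_idx].mean(axis=0)
scores = G[tf_idx] @ centroid
print(sorted(zip(tf_idx, scores), key=lambda x: -x[1]))
```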
International Journal of Nanotechnology and Molecular Computation | 2009
Max H. Garzon; Vinhthuy Phan; Andrew Neel
DNA has been re-discovered and explored in the last decade as a “smart glue” for self-assembly from the “bottom-up”, at nanoscales through mesoscales to micro- and macro-scales. These applications require an unprecedented degree of precision in placing atom-scale components. Finding large sets of probes to serve as anchors for such applications has thus been explored in the last few years through several methods. We describe results of a tour de force: an exhaustive search producing large codes that are (nearly) maximal sets while guaranteeing high quality, as measured by the minimum Gibbs energy between any pair of code words, and other criteria. We also present a quantitative characterization of the sets for sizes up to 20-mers and show how critical building blocks can be extracted to produce codes of very high quality for larger lengths by probabilistic combinations, for which an exhaustive search is out of reach.
International Conference on DNA Computing | 2006
Kiranchand V. Bobba; Andrew Neel; Vinhthuy Phan; Max H. Garzon
Memories are a fundamental challenge in computing, particularly if they are to store large amounts of interrelated data based on content and be queried associatively to retrieve information useful to the owners of the storage, such as self-assembled DNA structures, cells, and biological organisms. New methods to encode large data sets compactly on DNA chips have recently been proposed in (Garzon & Deaton, 2004) [6]. The method consists of shredding the data into short oligonucleotides and pouring it over a DNA chip with spots populated by copies of a basis set of noncrosshybridizing strands. In this paper, we probe the capacity of these memories in terms of their ability to discern semantic relationships and discriminate information in complex contexts in two applications, as opposed to their raw capacity to store volumes of uncorrelated data. First, we show that DNA memories can be designed to store information about English texts so that they can “conduct a conversation” about their content with an interlocutor who wants to learn about the subject contained in the memories. In this preliminary approach, the results are competitive with, if not better than, state-of-the-art methods in conventional artificial intelligence. In a second application in biology, we show how a biomolecular computing analysis based on similar techniques can be used to re-design DNA microarrays in order to increase their sensitivity to the level required for successful discrimination of conditions that may escape detection by standard methods. Finally, we briefly discuss the scalability of the common technique to large amounts of data given recent advances in the design of noncrosshybridizing DNA oligo sets, as well as other applications in bioinformatics and medical diagnosis.
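The shredding idea lends itself to a toy illustration: break a text into short overlapping fragments mapped onto a DNA alphabet, store the fragment set, and answer a query by how strongly its own shreds overlap the stored set. Everything below (the byte-level encoding, the fragment length, the overlap score) is a hypothetical stand-in for actual hybridization against a noncrosshybridizing basis chip.

```python
# Toy sketch of a content-addressable "DNA memory": shred a text into
# overlapping k-mers over a DNA alphabet and score queries by shred
# overlap. The encoding and score are illustrative stand-ins for
# hybridization chemistry.
def to_dna(text, alphabet="ACGT"):
    return "".join(alphabet[b % 4] for b in text.encode("utf-8"))

def shred(dna, k=8):
    return {dna[i:i + k] for i in range(len(dna) - k + 1)}

memory = shred(to_dna("dna memories store content associatively"))

def recall_score(query):
    q = shred(to_dna(query))
    return len(q & memory) / max(len(q), 1)

print(recall_score("memories store content"))  # high overlap
print(recall_score("unrelated query text"))    # much lower overlap
```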
Journal of Bioinformatics and Computational Biology | 2009
Vinhthuy Phan; E. Olusegun George; Quynh T. Tran; Shirlean Goodwin; Sridevi Bodreddigari; Thomas R. Sutter
Post hoc assignment of patterns determined by all pairwise comparisons in microarray experiments with multiple treatments has proven useful in assessing treatment effects. We propose the use of transitive directed acyclic graphs (tDAGs) to represent these patterns and show that this representation can be useful in clustering treatment effects, annotating existing clustering methods, and analyzing sample sizes. Advantages of this approach include: (1) a unique and descriptive meaning for each cluster in terms of how genes respond to all pairs of treatments; (2) insensitivity of the observed patterns to the number of genes analyzed; and (3) a combinatorial perspective on the sample-size problem, obtained by observing the rate of contractible tDAGs as the number of replicates increases. The advantages and overall utility of the method in elaborating drug structure-activity relationships are exemplified in a controlled study with real and simulated data.
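A small sketch of the tDAG representation may help: for one gene, each significant pairwise call "treatment a is lower than treatment b" becomes a directed edge, the edge set is closed under transitivity, and genes sharing the resulting pattern form one cluster. The treatments echo the dithiolethione study above, but the per-gene calls are synthetic.

```python
# Minimal sketch of tDAG-based clustering: a gene's pattern is the
# transitive closure of its significant pairwise comparisons, and genes
# with identical patterns share a cluster. Comparison calls are synthetic.
from itertools import product
from collections import defaultdict

def transitive_closure(edges, nodes):
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b, c in product(nodes, repeat=3):
            if (a, b) in closure and (b, c) in closure and (a, c) not in closure:
                closure.add((a, c))
                changed = True
    return frozenset(closure)

treatments = ["ctrl", "D3T", "OLT", "TBD"]
gene_edges = {  # hypothetical significant calls (lower -> higher)
    "gene1": {("ctrl", "D3T"), ("ctrl", "OLT")},
    "gene2": {("ctrl", "D3T"), ("D3T", "OLT")},
    "gene3": {("ctrl", "OLT"), ("ctrl", "D3T")},
}

clusters = defaultdict(list)
for gene, edges in gene_edges.items():
    clusters[transitive_closure(edges, treatments)].append(gene)
for pattern, genes in clusters.items():
    print(sorted(pattern), genes)  # gene1 and gene3 share a pattern
```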
Workshop on Algorithms and Data Structures | 2003
Vinhthuy Phan; Steven Skiena; Pavel Sumazin
The design of heuristics for NP-hard problems is perhaps the most active area of research in the theory of combinatorial algorithms. However, practitioners more often resort to local-improvement heuristics such as gradient-descent search, simulated annealing, tabu search, or genetic algorithms. Properly implemented, local-improvement heuristics can lead to short, efficient programs that yield reasonable solutions. Designers of efficient local-improvement heuristics must make several crucial decisions, including the choice of neighborhood and heuristic for the problem at hand. We are interested in developing a general methodology for predicting the quality of local-neighborhood operators and heuristics, given a time budget and a solution evaluation function.
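As a concrete instance of the kind of comparison this methodology targets, the sketch below runs the same local-improvement search with two different neighborhood operators under a fixed evaluation budget and compares the final costs. The toy number-partitioning instance and both operators are illustrative choices, not the paper's benchmarks.

```python
# Minimal sketch: compare two neighborhood operators under the same
# evaluation budget on a toy number-partitioning instance.
import random

random.seed(1)
nums = [random.randrange(1, 1000) for _ in range(40)]

def cost(signs):
    return abs(sum(s * n for s, n in zip(signs, nums)))

def flip_one(signs):       # neighborhood 1: flip one sign
    s = signs[:]
    i = random.randrange(len(s))
    s[i] = -s[i]
    return s

def swap_pair(signs):      # neighborhood 2: swap two positions
    s = signs[:]
    i, j = random.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def hill_climb(neighbor, budget=5000):
    cur = [random.choice((-1, 1)) for _ in nums]
    best = cost(cur)
    for _ in range(budget):
        cand = neighbor(cur)
        c = cost(cand)
        if c <= best:
            cur, best = cand, c
    return best

for op in (flip_one, swap_pair):
    print(op.__name__, hill_climb(op))
```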
Algorithm Engineering and Experimentation | 2002
Vinhthuy Phan; Pavel Sumazin; Steven Skiena
When faced with a combinatorial optimization problem, practitioners often turn to black-box search heuristics such as simulated annealing and genetic algorithms. In black-box optimization, the problem-specific components are limited to functions that (1) generate candidate solutions, and (2) evaluate the quality of a given solution. A primary reason for the popularity of black-box optimization is its ease of implementation. The basic simulated annealing search algorithm can be implemented in roughly 30–50 lines of any modern programming language, not counting the problem-specific local-move and cost-evaluation functions. This search algorithm is so simple that it is often rewritten from scratch for each new application rather than being reused.
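To illustrate the 30-50 line claim, here is a compact simulated-annealing core in the black-box style the abstract describes: the generic loop touches the problem only through a move function and a cost function. The TSP-style instance, cooling schedule, and parameter values are illustrative.

```python
# Minimal black-box simulated annealing: the core loop knows nothing
# about the problem beyond a move() and a cost() function.
import math
import random

def anneal(initial, move, cost, t0=10.0, alpha=0.995, steps=20000):
    cur, cur_cost = initial, cost(initial)
    best, best_cost = cur, cur_cost
    t = t0
    for _ in range(steps):
        cand = move(cur)
        c = cost(cand)
        # Accept improvements always, worsenings with Boltzmann probability.
        if c < cur_cost or random.random() < math.exp((cur_cost - c) / t):
            cur, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = cand, c
        t *= alpha
    return best, best_cost

# Problem-specific components for a small random TSP tour.
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(30)]

def tour_len(tour):
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour):
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

best, length = anneal(list(range(len(pts))), two_opt, tour_len)
print(round(length, 3))
```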