Sing-Hoi Sze
University of Southern California
Publication
Featured research published by Sing-Hoi Sze.
Research in Computational Molecular Biology | 1997
Sing-Hoi Sze; Pavel A. Pevzner
Recently, Gelfand, Mironov and Pevzner (1996) proposed a spliced alignment approach to gene recognition that provides 99% accurate recognition of human genes if a related mammalian protein is available. However, even 99% accurate gene predictions are insufficient for automated sequence annotation in large-scale sequencing projects and therefore have to be complemented by experimental gene verification. One hundred percent accurate gene predictions would lead to a substantial reduction of experimental work on gene identification. Our goal is to develop an algorithm that either predicts an exon assembly with accuracy sufficient for sequence annotation or warns a biologist that the accuracy of a prediction is insufficient and further experimental work is required. We study suboptimal and error-tolerant spliced alignment problems as the first steps towards such an algorithm, and report an algorithm which provides 100% accurate recognition of human genes in 37% of cases (if a related mammalian protein is available). In 52% of genes, the algorithm predicts at least one exon with 100% accuracy.
Pacific Symposium on Biocomputing | 2001
Sing-Hoi Sze; Mikhail S. Gelfand; Pavel A. Pevzner
Recognition of regulatory sites in unaligned DNA sequences is an old and well-studied problem in computational molecular biology. Recently, large-scale expression studies and comparative genomics brought this problem into a spotlight by generating a large number of samples with unknown regulatory signals. Here we develop algorithms for recognition of signals in corrupted samples (where only a fraction of sequences contain sites) with biased nucleotide composition. We further benchmark these and other algorithms on several bacterial and archaeal sites in a setting specifically designed to imitate the situations arising in comparative genomics studies.
Bioinformatics | 1998
Sing-Hoi Sze; Mikhail A. Roytberg; Mikhail S. Gelfand; Andrey A. Mironov; Tatiana V. Astakhova; Pavel A. Pevzner
Motivation: Gene annotation is the final goal of gene prediction algorithms. However, these algorithms frequently make mistakes and therefore the use of gene predictions for sequence annotation is hardly possible. As a result, biologists are forced to conduct time-consuming gene identification experiments by designing appropriate PCR primers to test cDNA libraries or applying RT-PCR, exon trapping/amplification, or other techniques. This process frequently amounts to guessing PCR primers on top of unreliable gene predictions and frequently leads to wasted experimental effort.
Results: The present paper proposes a simple and reliable algorithm for experimental gene identification which bypasses the unreliable gene prediction step. Studies of the performance of the algorithm on a sample of human genes indicate that an experimental protocol based on the algorithm's predictions achieves accurate gene identification with relatively few PCR primers. Predictions of PCR primers may be used for exon amplification in preliminary mutation analysis during an attempt to identify a gene responsible for a disease. We propose a simple approach to find a short region from a genomic sequence that with high probability overlaps with some exon of the gene. The algorithm is enhanced to find one or more segments that are probably contained in the translated region of the gene and can be used as PCR primers to select appropriate clones in cDNA libraries by selective amplification. The algorithm is further extended to locate a set of PCR primers that uniformly cover all translated regions and can be used for RT-PCR and further sequencing of (unknown) mRNA.
Journal of Computational Biology | 1997
Sing-Hoi Sze; Pavel A. Pevzner
Intelligent Systems in Molecular Biology | 2000
Pavel A. Pevzner; Sing-Hoi Sze
Science | 2002
Victoria V. Lunyak; Robert W. Burgess; Gratien G. Prefontaine; Charles A. Nelson; Sing-Hoi Sze; Josh Chenoweth; Phillip Schwartz; Pavel A. Pevzner; Christopher K. Glass; Gail Mandel; Michael G. Rosenfeld
Intelligent Systems in Molecular Biology | 2002
Steffen Heber; Max A. Alekseyev; Sing-Hoi Sze; Haixu Tang; Pavel A. Pevzner
Genomics | 1998
Guorong Xu; Sing-Hoi Sze; Cheng-Pin Liu; Pavel A. Pevzner; Norman Arnheim
Pacific Symposium on Biocomputing | 2002
Sing-Hoi Sze; Mikhail S. Gelfand; Pavel A. Pevzner
Archive | 2000
Pavel A. Pevzner; Sing-Hoi Sze