Publication


Featured research published by Pegah Tootoonchi Afshar.


Proceedings of the National Academy of Sciences of the United States of America | 2013

Characterization of the human ESC transcriptome by hybrid sequencing

Kin Fai Au; Vittorio Sebastiano; Pegah Tootoonchi Afshar; Jens Durruthy Durruthy; Lawrence Lee; Brian A. Williams; Harm van Bakel; Eric E. Schadt; Renee Reijo-Pera; Jason G. Underwood; Wing Hung Wong

Significance: Isoform identification and discovery are an important goal for transcriptome analysis because the majority of human genes express multiple isoforms with context- and tissue-specific functions. Better annotation of isoforms will also benefit downstream analysis such as expression quantification. Current RNA-Seq methods based on short-read sequencing are not reliable for isoform discovery. In this study we developed a new method based on the combined analysis of short reads and long reads generated, respectively, by second- and third-generation sequencing and applied this method to obtain a comprehensive characterization of the transcriptome of the human embryonic stem cell. The results showed that a large gain in sensitivity and specificity can be achieved with this strategy.

Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs). We report 8,084 RefSeq-annotated isoforms detected as full-length and an additional 5,459 isoforms predicted through statistical inference. Over one-third of these are novel isoforms, including 273 RNAs from gene loci that have not previously been identified. Further characterization of the novel loci indicates that a subset is expressed in pluripotent cells but not in diverse fetal and adult tissues; moreover, their reduced expression perturbs the network of pluripotency-associated genes. Results suggest that gene identification, even in well-characterized human cell lines and tissues, is likely far from complete.


Nucleic Acids Research | 2015

Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing

Jason L. Weirather; Pegah Tootoonchi Afshar; Tyson A. Clark; Elizabeth Tseng; Linda S. Powers; Jason G. Underwood; Joseph Zabner; Jonas Korlach; Wing Hung Wong; Kin Fai Au

We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites, and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from MCF-7 breast cancer cells. Compared with the existing tools, IDP-fusion detects fusion genes with higher precision and a very low false positive rate. The results show that IDP-fusion will be useful for unraveling the complexity of multiple fusion splices and fusion isoforms within tumorigenesis-relevant fusion genes.


Nature Communications | 2017

Gaining comprehensive biological insight into the transcriptome by performing a broad-spectrum RNA-seq analysis

Sayed Mohammad Ebrahim Sahraeian; Marghoob Mohiyuddin; Robert Sebra; Hagen Tilgner; Pegah Tootoonchi Afshar; Kin Fai Au; Narges Bani Asadi; Mark Gerstein; Wing Hung Wong; Michael Snyder; Eric E. Schadt; Hugo Y. K. Lam

RNA-sequencing (RNA-seq) is an essential technique for transcriptome studies, and hundreds of analysis tools have been developed since its debut. Although recent efforts have attempted to assess the latest available tools, they have not evaluated the analysis workflows comprehensively to unleash the power within RNA-seq. Here we conduct an extensive study analysing a broad spectrum of RNA-seq workflows. Surpassing the expression analysis scope, our work also includes assessment of RNA variant-calling, RNA editing and RNA fusion detection techniques. Specifically, we examine both short- and long-read RNA-seq technologies, 39 analysis tools resulting in ~120 combinations, and ~490 analyses involving 15 samples with a variety of germline, cancer and stem cell data sets. We report the performance and propose a comprehensive RNA-seq analysis protocol, named RNACocktail, along with a computational pipeline achieving high accuracy. Validation on different samples reveals that our proposed protocol could help researchers extract more biologically relevant predictions by broad analysis of the transcriptome.

RNA-seq is widely used for transcriptome analysis. Here, the authors analyse a wide spectrum of RNA-seq workflows and present a comprehensive analysis protocol named RNACocktail as well as a computational pipeline leveraging the widely used tools for accurate RNA-seq analysis.


Genome Biology | 2015

An ensemble approach to accurately detect somatic mutations using SomaticSeq

Li Tai Fang; Pegah Tootoonchi Afshar; Aparna Chhibber; Marghoob Mohiyuddin; Yu Fan; John C. Mu; Greg Gibeling; Sharon Barr; Narges Bani Asadi; Mark Gerstein; Daniel C. Koboldt; Wenyi Wang; Wing Hung Wong; Hugo Y. K. Lam

SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated.
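The ensemble idea in this abstract can be illustrated with a toy example. The sketch below is not the SomaticSeq implementation: it assumes three hypothetical callers (the abstract describes five callers and over 70 features per site), uses only each caller's yes/no vote as a feature, and trains a plain AdaBoost over one-feature decision stumps. All names (`train_adaboost`, `predict`, `sites`, `labels`) are invented for illustration.

```python
import math

def train_adaboost(X, y, rounds=3):
    """Train AdaBoost over one-feature decision stumps.

    X: list of binary feature vectors (1 = a hypothetical caller reported the site).
    y: labels, +1 for a true mutation and -1 for a false call.
    Returns a list of (alpha, feature_index, polarity) stumps.
    """
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        for j in range(len(X[0])):
            for polarity in (1, -1):
                # The stump predicts +polarity when feature j is set, -polarity otherwise.
                preds = [polarity if row[j] else -polarity for row in X]
                err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, polarity, preds)
        err, j, polarity, preds = best
        # Clamp the weighted error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, polarity))
        # Increase the weight of misclassified sites, then renormalise.
        weights = [w * math.exp(-alpha * p * t) for w, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, row):
    """Weighted vote of all stumps; returns +1 or -1."""
    score = sum(alpha * (pol if row[j] else -pol) for alpha, j, pol in model)
    return 1 if score >= 0 else -1

# Toy data (made up): a site is a true mutation when at least two of three callers agree.
sites = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1),
         (0, 0, 1), (0, 1, 0), (1, 0, 0), (0, 0, 0)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

model = train_adaboost(sites, labels, rounds=3)
ensemble_correct = sum(predict(model, s) == t for s, t in zip(sites, labels))
best_single_correct = max(
    sum((1 if s[j] else -1) == t for s, t in zip(sites, labels)) for j in range(3)
)
print(ensemble_correct, best_single_correct)
```

On this toy data no single caller labels every site correctly, while the boosted combination does, which mirrors the abstract's claim in miniature only; it says nothing about performance on real sequencing data.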


International Conference on Transparent Optical Networks | 2010

European and American research toward next-generation optical access networks

Leonid G. Kazovsky; Claus Popp Larsen; Dirk Breuer; Anders Gavler; Mikhail Popov; Kun Wang; Gunnar Jacobsen; Erik Weis; Christoph Lange; Shing-Wa Wong; She-Hwa Yen; Vinesh Gudla; Pegah Tootoonchi Afshar

Next-generation optical access networks will deliver substantial benefits to consumers including a dedicated high-QoS access to bit rates of hundreds of Megabits per second. They must also deliver significant benefits to network owners/operators to justify the needed infrastructure investment expected to reach billions of Euros. Benefits to network owners/operators are expected to include reduced total cost of ownership, due to higher reliability, lower energy consumption, better flexibility and efficiency, and a smaller number of sites needed to support network operations. This paper will describe recent progress toward that goal including R&D efforts in Europe under the FP7 projects ALPHA and OASE and in the US at Stanford University under the SUCCESS and DAN projects.


Nucleic Acids Research | 2017

COSINE: non-seeding method for mapping long noisy sequences

Pegah Tootoonchi Afshar; Wing Hung Wong

Third generation sequencing (TGS) technologies are highly promising, but the long and noisy reads from TGS are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases given the similarity over distributions of their short k-mers (k = 3–4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.
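To make the idea of comparing short k-mer distributions concrete, here is a minimal sketch. It is not the COSINE algorithm itself: the abstract does not specify the similarity measure or how windows are chosen, so this example assumes cosine similarity between k-mer count vectors of two whole sequences. The function names and the example sequences are hypothetical.

```python
from collections import Counter
import math

def kmer_profile(seq, k=3):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine_similarity(p, q):
    """Cosine of the angle between two k-mer count vectors."""
    dot = sum(count * q[kmer] for kmer, count in p.items())
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # a sequence shorter than k has no k-mers
    return dot / (norm_p * norm_q)

# Made-up sequences: a reference stretch, a copy with one substitution, and an unrelated stretch.
reference = "ACGTACGTTAGC"
noisy_read = "ACGTACCTTAGC"
unrelated = "GGGGGGGGGGGG"

sim_noisy = cosine_similarity(kmer_profile(reference), kmer_profile(noisy_read))
sim_unrelated = cosine_similarity(kmer_profile(reference), kmer_profile(unrelated))
print(sim_noisy, sim_unrelated)
```

Because a single substitution disturbs only the few k-mers that overlap it, the noisy copy keeps a high similarity to the reference while the unrelated stretch scores zero. That tolerance to scattered errors is the intuition the abstract appeals to, under the assumptions stated above.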


Scientific Reports | 2015

Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods

John C. Mu; Pegah Tootoonchi Afshar; Marghoob Mohiyuddin; Xi Chen; Jian Li; Narges Bani Asadi; Mark Gerstein; Wing Hung Wong; Hugo Y. K. Lam

A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools.


Cancer Research | 2015

Abstract LB-306: An ensemble approach to accurately detect somatic mutations via adaptive boosting

Li Tai Fang; Pegah Tootoonchi Afshar; John C. Mu; Narges Bani Asadi; Wing Hung Wong; Hugo Y. K. Lam

Identifying somatic mutations is a key analysis in cancer research. The challenge lies in the impure and heterogeneous nature of the tumor samples. Oftentimes, an algorithm works well for one tumor but poorly for another. Here, we present an ensemble approach that integrates multiple algorithms and demonstrate its performance and high accuracy with validation from both synthetic data and real data. Our approach incorporates state-of-the-art callers including MuTect, SomaticSniper, VarScan2, JointSNVMix2, and VarDict for somatic mutation detection. Each of these algorithms has its unique strength, capable of detecting variants that are missed by some others. The call sets are combined based on 70 independent sequencing and genomic features, which are then used by an adaptively boosted decision tree learner. The learner is trained with sophisticated simulated data to discriminate true mutations from very noisy data of the tumor samples. In our latest submission to the ICGC-TCGA DREAM Mutation Calling Challenge (the Challenge), our approach obtained an unprecedented somatic SNV detection accuracy of 97.1% with a recall of 94.2% and a precision of 99.9%. The synthetic data was a tumor-normal pair of samples with 30x sequencing depth each. The tumor sample was synthesized by spiking in a whole spectrum of variants ranging from SNVs/Indels to SVs, resulting in an SNV variant allele frequency (VAF) of 25%. We further validated our approach with “in silico titration”. The titration mixed two different real genomes at different proportions with validated ground truths to generate different sample conditions, ranging from the simplest case where the normal and tumor were pure to the more challenging case where the tumor and normal tissues were cross-contaminated. At VAFs of 50%, 25%, and 15%, our approach achieved an accuracy of 95.7%, 92.5%, and 85.3% respectively based on cross validation, consistent with the results from the Challenge.

Finally, we validated our approach with three widely used and published cancer datasets, obtained from TCGA and EGA, including a whole-genome sequenced malignant melanoma cell line, a whole-genome sequenced chronic lymphocytic leukemia cell line, and a whole-exome sequenced colon adenocarcinoma patient sample with experimentally validated somatic mutations. Our approach was trained on the data from the Challenge and applied to the aforementioned samples to measure its accuracy. Our results showed that we achieved a recall of 98.9%, 89.1% and 87.9% respectively. Although precision on real data cannot be measured without a comprehensive whole-genome experimental validation, our call sets were smaller than those of all other methods considered, suggesting that our approach has the highest precision among them. We extended our study of the above three validation approaches, namely synthetic genomes, in silico titration, and real samples, to compare against all five individual callers for accuracy performance. We found that our approach had the highest accuracy when compared to any individual caller. To conclude, our approach is shown to have high accuracy in different types and conditions of tumor samples and by far the best in its class.

Citation Format: Li Tai Fang, Pegah T. Afshar, John C. Mu, Narges Bani Asadi, Wing H. Wong, Hugo Y. K. Lam. An ensemble approach to accurately detect somatic mutations via adaptive boosting. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr LB-306. doi:10.1158/1538-7445.AM2015-LB-306


Optical Fiber Communication Conference | 2010

Demonstration of energy conserving TDM-PON with sleep mode ONU using fast clock recovery circuit

Shing-Wa Wong; She-Hwa Yen; Pegah Tootoonchi Afshar; Shinji Yamashita; Leonid G. Kazovsky


IET Optoelectronics | 2011

Challenges in next-generation optical access networks: addressing reach extension and security weaknesses

Leonid G. Kazovsky; Shing-Wa Wong; Vinesh Gudla; Pegah Tootoonchi Afshar; S.-H. Yen; Shinji Yamashita; Ying Yan
