Publication


Featured research published by Faraz Faghri.


PLOS Biology | 2015

Big Data: Astronomical or Genomical?

Zachary D. Stephens; Skylar Y. Lee; Faraz Faghri; Roy H. Campbell; ChengXiang Zhai; Miles Efron; Ravishankar K. Iyer; Michael C. Schatz; Saurabh Sinha; Gene E. Robinson

Genomics is a Big Data science and is going to get much bigger, very soon, but it is not known whether the needs of genomics will exceed other Big Data domains. Projecting to the year 2025, we compared genomics with three other major generators of Big Data: astronomy, YouTube, and Twitter. Our estimates show that genomics is a “four-headed beast”—it is either on par with or the most demanding of the domains analyzed here in terms of data acquisition, storage, distribution, and analysis. We discuss aspects of new technologies that will need to be developed to rise up and meet the computational challenges that genomics poses for the near future. Now is the time for concerted, community-wide planning for the “genomical” challenges of the next decade.


Proceedings of the Workshop on Secure and Dependable Middleware for Cloud Monitoring and Management | 2012

Failure scenario as a service (FSaaS) for Hadoop clusters

Faraz Faghri; Sobir Bazarbayev; Mark Overholt; Reza Farivar; Roy H. Campbell; William H. Sanders

As the use of cloud computing resources grows in academic research and industry, so does the likelihood of failures that catastrophically affect the applications being run on the cloud. For that reason, cloud service providers as well as cloud applications need to expect failures and shield their services accordingly. We propose a new model called Failure Scenario as a Service (FSaaS). FSaaS will be utilized across the cloud for testing the resilience of cloud applications. In an effort to provide both Hadoop service and application vendors with the means to test their applications against the risk of massive failure, we focus our efforts on the Hadoop platform. We have generated a series of failure scenarios for certain types of jobs. Customers will be able to choose specific scenarios based on their jobs to evaluate their systems.


Neurobiology of Aging | 2017

NeuroChip, an updated version of the NeuroX genotyping platform to rapidly screen for variants associated with neurological diseases

Cornelis Blauwendraat; Faraz Faghri; Lasse Pihlstrøm; Joshua T. Geiger; Alexis Elbaz; Suzanne Lesage; Jean-Christophe Corvol; Patrick May; Aude Nicolas; Yevgeniya Abramzon; Natalie A. Murphy; J. Raphael Gibbs; Mina Ryten; Raffaele Ferrari; Jose Bras; Rita Guerreiro; Julie Williams; Rebecca Sims; Steven Lubbe; Dena Hernandez; Kin Mok; Laurie Robak; Roy H. Campbell; Ekaterina Rogaeva; Bryan J. Traynor; Ruth Chia; Sun Ju Chung; John Hardy; Alexis Brice; Nicholas W. Wood

Genetics has proven to be a powerful approach in neurodegenerative diseases research, resulting in the identification of numerous causal and risk variants. Previously, we introduced the NeuroX Illumina genotyping array, a fast and efficient genotyping platform designed for the investigation of genetic variation in neurodegenerative diseases. Here, we present its updated version, named NeuroChip. The NeuroChip is a low-cost, custom-designed array containing a tagging variant backbone of about 306,670 variants complemented with a manually curated custom content comprised of 179,467 variants implicated in diverse neurological diseases, including Alzheimer's disease, Parkinson's disease, Lewy body dementia, amyotrophic lateral sclerosis, frontotemporal dementia, progressive supranuclear palsy, corticobasal degeneration, and multiple system atrophy. The tagging backbone was chosen because of the low cost and good genome-wide resolution; the custom content can be combined with other backbones, like population or drug development arrays. Using the NeuroChip, we can accurately identify rare variants and impute over 5.3 million common SNPs from the latest release of the Haplotype Reference Consortium. In summary, we describe the design and usage of the NeuroChip array and show its capability for detecting rare pathogenic variants in numerous neurodegenerative diseases. The NeuroChip has a more comprehensive and improved content, which makes it a reliable, high-throughput, cost-effective screening tool for genetic research and molecular diagnostics in neurodegenerative diseases.


bioRxiv | 2018

Parkinson's disease genetics: identifying novel risk loci, providing causal insights and improving estimates of heritable risk.

Michael A. Nalls; Cornelis Blauwendraat; Costanza Vallerga; Karl Heilbron; Sara Bandres-Ciga; Diana Chang; Manuela Tan; Demis Kia; Alastair J. Noyce; Angli Xue; Jose Bras; Emily Young; Ranier von Coelln; Javier Simón-Sánchez; Claudia Schulte; Manu Sharma; Lynne Krohn; Lasse Pihlstrøm; Ari Siitonen; Hirotaka Iwaki; Hampton Leonard; Faraz Faghri; J. Raphael Gibbs; Dena Hernandez; Sonja W. Scholz; Juan A. Botía; María Rodríguez Martínez; Jean-Christophe Corvol; Suzanne Lesage; Joseph Jankovic

We performed the largest genetic study of Parkinson's disease to date, involving analysis of 11.4M SNPs in 37.7K cases, 18.6K 'proxy-cases' and 1.4M controls, discovering 39 novel risk loci. In total, we identified 92 putative independent genome-wide significant signals including 53 at previously published loci. Next, we dissected risk within these loci, identifying 22 candidate independent risk variants in close proximity to one another representing multiple risk signals in one locus (20 variants proximal to known risk loci). We then employed tests of causality within a Mendelian randomization framework to infer functional genomic consequences for genes within loci of interest in concert with protein-centric network analyses to nominate likely candidates for follow-up investigation. This report also shows expression network signatures of PD loci to be heavily brain enriched and different in comparison to Alzheimer's disease. We also used risk scoring methods to improve genetic predictions of disease risk, and show that GWAS signals explain 11-15% of the heritable risk of PD at thresholds below genome-wide significance. Additionally, these data also suggest genetic correlations relating to risk overlapping with brain morphology, smoking status and educational attainment. Further analyses of smoking initiation and cognitive performance relating to PD risk in more comprehensive datasets show complex etiological links between PD risk and these traits. These data in sum provide the most comprehensive understanding of the genetic architecture of PD to date, revealing a large number of additional loci, and demonstrating that there remains a considerable genetic component of this disease that has not yet been discovered.

We performed the largest genome-wide association study of PD to date, involving the analysis of 7.8M SNPs in 37.7K cases, 18.6K UK Biobank proxy-cases, and 1.4M controls. We identified 90 independent genome-wide significant signals across 78 loci, including 38 independent risk signals in 37 novel loci. These variants explained 26-36% of the heritable risk of PD. Tests of causality within a Mendelian randomization framework identified putatively causal genes for 70 risk signals. Tissue expression enrichment analysis suggested that signatures of PD loci were heavily brain-enriched, consistent with specific neuronal cell types being implicated from single cell expression data. We found significant genetic correlations with brain volumes, smoking status, and educational attainment. In sum, these data provide the most comprehensive understanding of the genetic architecture of PD to date by revealing many additional PD risk loci, providing a biological context for these risk factors, and demonstrating that a considerable genetic component of this disease remains unidentified.


bioRxiv | 2018

Analysis and Prediction of Unplanned Intensive Care Unit Readmission using Recurrent Neural Networks with Long Short-Term Memory

Yu-Wei Lin; Yuqian Zhou; Faraz Faghri; Michael J Shaw; Roy H. Campbell

Background Unplanned readmission of a hospitalized patient is an extremely undesirable outcome as the patient may have been exposed to additional risks. The rates of unplanned readmission are, therefore, regarded as an important performance indicator for the medical quality of a hospital and healthcare system. Identifying high-risk patients likely to suffer from readmission before release benefits both the patients and the medical providers. The emergence of machine learning to detect hidden patterns in complex, multi-dimensional datasets provides unparalleled opportunities to develop an efficient discharge decision-support system for physicians.
Methods and Findings We used supervised machine learning approaches for ICU readmission prediction. We applied these methods to comprehensive, longitudinal clinical data from MIMIC-III to predict the ICU readmission of patients within 30 days of their discharge. We used Recurrent Neural Networks (RNN) with Long Short-Term Memory (LSTM), which allowed us to incorporate the multivariate features of EHRs and capture sudden fluctuations in chart event features (e.g. glucose and heart rate) that are significant in time series with temporal dependencies. Traditional static models cannot properly capture these fluctuations, whereas our proposed deep neural network based model can. We incorporate multiple types of features, including chart events, demographics, and ICD9 embeddings. Our machine learning models identify ICU readmissions at a higher sensitivity (0.742) and an improved Area Under the Curve (0.791) compared with traditional methods. We also illustrate the importance of each portion of the features and different combinations of the models to verify the effectiveness of the proposed model.
Conclusion Our manuscript highlights the ability of machine learning models to improve ICU decision-making accuracy and is a real-world example of precision medicine in hospitals. These data-driven results enable clinicians to make assisted decisions within their patient cohorts. This knowledge could have immediate implications for hospitals by improving the detection of possible readmission. We anticipate that machine learning models will improve patient counseling, hospital administration, allocation of healthcare resources and ultimately individualized clinical care.


bioRxiv | 2018

Predicting onset, progression, and clinical subtypes of Parkinson disease using machine learning

Faraz Faghri; Sayed Hadi Hashemi; Hampton Leonard; Sonja W. Scholz; Roy H. Campbell; Michael A. Nalls; Andrew Singleton

Background The clinical manifestations of Parkinson disease are characterized by heterogeneity in age at onset, disease duration, rate of progression, and constellation of motor versus nonmotor features. Due to these variable presentations, counseling of patients about their individual risks and prognosis is limited. There is an unmet need for predictive tests that facilitate early detection and characterization of distinct disease subtypes as well as improved, individualized predictions of the disease course. The emergence of machine learning to detect hidden patterns in complex, multi-dimensional datasets provides unparalleled opportunities to address this critical need. Methods and Findings We used unsupervised and supervised machine learning approaches for subtype identification and prediction. We used machine learning methods on comprehensive, longitudinal clinical data from the Parkinson Disease Progression Marker Initiative (PPMI) (n=328 cases) to identify patient subtypes and to predict disease progression. The resulting models were validated in an independent, clinically well-characterized cohort from the Parkinson Disease Biomarker Program (PDBP) (n=112 cases). Our analysis distinguished three distinct disease subtypes with highly predictable progression rates, corresponding to slow, moderate and fast disease progressors. We achieved highly accurate projections of disease progression four years after initial diagnosis with an average Area Under the Curve of 0.93 (95% CI: 0.96 ± 0.01 for PDvec1, 0.87 ± 0.03 for PDvec2, and 0.96 ± 0.02 for PDvec3). We have demonstrated robust replication of these findings in the independent validation cohort. Conclusions These data-driven results enable clinicians to deconstruct the heterogeneity within their patient cohorts. This knowledge could have immediate implications for clinical trials by improving the detection of significant clinical outcomes that might have been masked by cohort heterogeneity. 
We anticipate that machine learning models will improve patient counseling, clinical trial design, allocation of healthcare resources and ultimately individualized clinical care.


bioRxiv | 2018

Genetic variability and potential effects on clinical trial outcomes: perspectives in Parkinson's disease

Hampton Leonard; Cornelis Blauwendraat; Lynne Krohn; Faraz Faghri; Hirotaka Iwaki; Glen Furgeson; Aaron G. Day-Williams; David J. Stone; Andrew Singleton; Michael A. Nalls; Ziv Gan-Or

Background Improper randomization in clinical trials can result in the failure of the trial to meet its primary end-point. The last ∼10 years have revealed that common and rare genetic variants are an important disease factor and sometimes account for a substantial portion of disease risk variance. However, the burden of common genetic risk variants is not often considered in the randomization of clinical trials and can therefore lead to additional unwanted variance between trial arms. We simulated clinical trials to estimate false negative and false positive rates and investigated differences in single variants and mean genetic risk scores (GRS) between trial arms to investigate the potential effect of genetic variance on clinical trial outcomes at different sample sizes. Methods Single variant and genetic risk score analyses were conducted in a clinical trial simulation environment using data from 5851 Parkinson’s Disease patients as well as two simulated virtual cohorts based on public data. The virtual cohorts included a GBA variant cohort and a two variant interaction cohort. Data were resampled at different sizes (n = 200-5000 for the Parkinson’s Disease cohort; n = 50-800 and n = 50-2000 for the virtual cohorts) for 1000 iterations and randomly assigned to the two arms of a trial. False negative and false positive rates were estimated using simulated clinical trials, and percent difference in genetic risk score and allele frequency was calculated to quantify disparity between arms. Findings Significant genetic differences between the two arms of a trial are found at all sample sizes. Approximately 90% of the iterations had at least one statistically significant difference in individual risk SNPs between each trial arm. Approximately 10% of iterations had a statistically significant difference between trial arms in polygenic risk score mean or variance. 
For significant iterations at sample size 200, the average percent difference for mean GRS between trial arms was 130.87%, decreasing to 29.87% as sample size reached 5000. In the GBA only simulations we see an average 18.86% difference in GRS scores between trial arms at n = 50, decreasing to 3.09% as sample size reaches 2000. Balancing patients by genotype reduced mean percent difference in GRS between arms to 36.71% for the main cohort and 2.00% for the GBA cohort at n = 200. When adding a drug effect to the simulations, we found that unbalanced genetics with an effect on the chosen measurable clinical outcome can result in high false negative rates among trials, especially at small sample sizes. At a sample size of n = 50 and a targeted drug effect of −0.5 points in UPDRS per year, we discovered 33.9% of trials resulted in false negatives. Interpretations Our data support the hypothesis that within genetically unmatched clinical trials, particularly those below 1000 participants, heterogeneity could confound true therapeutic effects as expected. This is particularly important in the changing environment of drug approvals. Clinical trials should undergo pre-trial genetic adjustment or, at the minimum, post-trial adjustment and analysis for failed trials. Clinical trial arms should be balanced on genetic risk variants, as well as cumulative variant distributions represented by GRS, in order to ensure the maximum reduction in trial arm disparities. The reduction in variance after balancing allows smaller sample sizes to be utilized without risking the large disparities between trial arms witnessed in typical randomized trials. As the cost of genotyping will likely be far less than greatly increasing sample size, genetically balancing trial arms can lead to more cost-effective clinical trials as well as better outcomes.
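The resampling procedure described in this abstract (draw a sample of a given size, randomly split it into two trial arms, repeat for many iterations, and measure the percent difference in mean GRS between arms) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the synthetic Gaussian cohort, the function names `make_synthetic_cohort` and `simulate_arm_disparity`, sampling without replacement, and the symmetric percent-difference formula are all hypothetical choices that the abstract does not specify.

```python
import random
import statistics


def make_synthetic_cohort(size, seed=0):
    """Build a hypothetical cohort of genetic risk scores (GRS).

    Assumption: scores are drawn from a Gaussian; real GRS data would
    come from genotyped patients, which this sketch does not model.
    """
    rng = random.Random(seed)
    return [rng.gauss(1.0, 0.5) for _ in range(size)]


def simulate_arm_disparity(scores, sample_size, iterations=1000, seed=0):
    """Return the percent difference in mean GRS between two randomly
    assigned trial arms, one value per iteration."""
    rng = random.Random(seed)
    half = sample_size // 2
    diffs = []
    for _ in range(iterations):
        # random.sample returns items in random selection order, so
        # slicing the result is itself a random assignment to arms.
        sample = rng.sample(scores, sample_size)
        mean_a = statistics.fmean(sample[:half])
        mean_b = statistics.fmean(sample[half:])
        # Assumed metric: absolute difference relative to the average
        # magnitude of the two arm means, expressed as a percentage.
        denom = (abs(mean_a) + abs(mean_b)) / 2
        diffs.append(100 * abs(mean_a - mean_b) / denom if denom else 0.0)
    return diffs


if __name__ == "__main__":
    cohort = make_synthetic_cohort(5851)  # cohort size taken from the abstract
    for n in (200, 1000, 5000):
        diffs = simulate_arm_disparity(cohort, n)
        print(n, round(statistics.fmean(diffs), 2))
```

With a fixed cohort, the average disparity between arms in this sketch shrinks as the sample size grows, which is the qualitative pattern the abstract reports; the absolute numbers depend entirely on the assumed score distribution and metric.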


the internet of things | 2016

World of Empowered IoT Users

Sayed Hadi Hashemi; Faraz Faghri; Paul Rausch; Roy H. Campbell


arXiv: Distributed, Parallel, and Cluster Computing | 2017

Toward Scalable Machine Learning and Data Mining: the Bioinformatics Case.

Faraz Faghri; Sayed Hadi Hashemi; Mohammad Babaeizadeh; Michael A. Nalls; Saurabh Sinha; Roy H. Campbell


Archive | 2017

Decentralized User-Centric Access Control using PubSub over Blockchain.

Sayed Hadi Hashemi; Faraz Faghri; Roy H. Campbell

Collaboration

Top Co-Authors

Michael A. Nalls, National Institutes of Health
Cornelis Blauwendraat, National Institutes of Health
Hampton Leonard, National Institutes of Health
Andrew Singleton, National Institutes of Health
Dena Hernandez, National Institutes of Health
Hirotaka Iwaki, National Institutes of Health
J. Raphael Gibbs, National Institutes of Health
Sonja W. Scholz, National Institutes of Health