Publication


Featured research published by Karim Oualkacha.


Genetic Epidemiology | 2013

Adjusted Sequence Kernel Association Test for Rare Variants Controlling for Cryptic and Family Relatedness

Karim Oualkacha; Zari Dastani; Rui Li; Pablo Cingolani; Tim D. Spector; Christopher J. Hammond; J. Brent Richards; Antonio Ciampi; Celia M. T. Greenwood

Recent progress in sequencing technologies makes it possible to identify rare and unique variants that may be associated with complex traits. However, the results of such efforts depend crucially on the use of efficient statistical methods and study designs. Although family‐based designs might enrich a data set for familial rare disease variants, most existing rare variant association approaches assume independence of all individuals. We introduce here a framework for association testing of rare variants in family‐based designs. This framework is an adaptation of the sequence kernel association test (SKAT) which allows us to control for family structure. Our adjusted SKAT (ASKAT) combines the SKAT approach and the factored spectrally transformed linear mixed models (FaST‐LMMs) algorithm to capture family effects based on a LMM incorporating the realized proportion of the genome that is identical by descent between pairs of individuals, and using restricted maximum likelihood methods for estimation. In simulation studies, we evaluated type I error and power of this proposed method and we showed that regardless of the level of the trait heritability, our approach has good control of type I error and good power. Since our approach uses FaST‐LMM to calculate variance components for the proposed mixed model, ASKAT is reasonably fast and can analyze hundreds of thousands of markers. Data from the UK twins consortium are presented to illustrate the ASKAT methodology.
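
As a rough illustration of the kind of statistic that SKAT-type tests build on, the sketch below computes the standard unadjusted SKAT score statistic, Q = Σ_j w_j S_j², for independent individuals under an intercept-only null model. It is not the ASKAT procedure described above, which additionally models family structure through a linear mixed model. The function name `skat_statistic` and the convention that the weights are supplied directly are assumptions made for this example.

```python
def skat_statistic(genotypes, phenotype, weights):
    """Unadjusted SKAT-style score statistic (illustrative sketch only).

    genotypes: list of rows, one per individual, each a list of allele counts
    phenotype: list of continuous trait values, one per individual
    weights:   list of non-negative variant weights w_j (assumed given)
    """
    n = len(phenotype)
    # Intercept-only null model: residuals are deviations from the mean.
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]

    q = 0.0
    for j, w in enumerate(weights):
        # Score for variant j: genotype-weighted sum of null residuals.
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += w * score_j ** 2
    return q


# Toy data: 4 individuals, 2 rare variants (hypothetical values).
G = [[0, 1], [1, 0], [0, 0], [1, 1]]
y = [1.0, 2.0, 3.0, 6.0]
Q = skat_statistic(G, y, weights=[1.0, 2.0])
```

In practice the statistic is compared against a mixture of chi-square distributions to obtain a P-value; that step, and the family adjustment, are omitted here.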


European Journal of Human Genetics | 2016

A method for analyzing multiple continuous phenotypes in rare variant association studies allowing for flexible correlations in variant effects.

Jianping Sun; Karim Oualkacha; Vincenzo Forgetta; Hou-Feng Zheng; J. Brent Richards; Antonio Ciampi; Celia M. T. Greenwood

For region-based sequencing data, power to detect genetic associations can be improved through analysis of multiple related phenotypes. With this motivation, we propose a novel test to detect association simultaneously between a set of rare variants, such as those obtained by sequencing in a small genomic region, and multiple continuous phenotypes. We allow arbitrary correlations among the phenotypes and build on a linear mixed model by assuming the effects of the variants follow a multivariate normal distribution with a zero mean and a specific covariance matrix structure. In order to account for the unknown correlation parameter in the covariance matrix of the variant effects, a data-adaptive variance component test based on score-type statistics is derived. As our approach can calculate the P-value analytically, the proposed test procedure is computationally efficient. Broad simulations and an application to the UK10K project show that our proposed multivariate test is generally more powerful than univariate tests, especially when there are pleiotropic effects or highly correlated phenotypes.


Statistical Applications in Genetics and Molecular Biology | 2012

Principal Components of Heritability for High Dimension Quantitative Traits and General Pedigrees

Karim Oualkacha; Aurelie Labbe; Antonio Ciampi; Marc-André Roy; Michel Maziade

For many complex disorders, genetically relevant disease definition is still unclear. For this reason, researchers tend to collect large numbers of items related directly or indirectly to the disease diagnosis. Since the measured traits may not all be influenced by genetic factors, researchers are faced with the problem of choosing which traits or combinations of traits to consider in linkage analysis. To combine items, one can subject the data to a principal component analysis. However, when family data are collected, principal component analysis does not take family structure into account. In order to deal with these issues, Ott & Rabinowitz (1999) introduced the principal components of heritability (PCH), which capture the familial information across traits by calculating linear combinations of traits that maximize heritability. The calculation of the PCHs is based on the estimation of the genetic and the environmental components of variance. In the genetic context, the standard estimators of the variance components are Lange's maximum likelihood estimators, which require complex numerical calculations. The objectives of this paper are the following: i) to review some standard strategies available in the literature to estimate variance components for unbalanced data in mixed models; ii) to propose an ANOVA method for a genetic random effect model to estimate the variance components, which can be applied to general pedigrees and high dimensional family data within the PCH framework; iii) to elucidate the connection between PCH analysis and Linear Discriminant Analysis. We use computer simulations to show that the proposed method has asymptotic properties similar to those of Lange's method when the number of traits is small, and we study the efficiency of our method when the number of traits is large. A data analysis involving schizophrenia and bipolar quantitative traits is finally presented to illustrate the PCH methodology.


Frontiers in Genetics | 2016

Gene Coexpression Analyses Differentiate Networks Associated with Diverse Cancers Harboring TP53 Missense or Null Mutations.

Kathleen Oros Klein; Karim Oualkacha; Marie-Hélène Lafond; Sahir Bhatnagar; Patricia N. Tonin; Celia M. T. Greenwood

In a variety of solid cancers, missense mutations in the well-established TP53 tumor suppressor gene may lead to the presence of a partially-functioning protein molecule, whereas mutations affecting the protein encoding reading frame, often referred to as null mutations, result in the absence of p53 protein. Both types of mutations have been observed in the same cancer type. As the resulting tumor biology may be quite different between these two groups, we used RNA-sequencing data from The Cancer Genome Atlas (TCGA) from four different cancers with poor prognosis, namely ovarian, breast, lung and skin cancers, to compare the patterns of coexpression of genes in tumors grouped according to their TP53 missense or null mutation status. We used Weighted Gene Coexpression Network analysis (WGCNA) and a new test statistic built on differences between groups in the measures of gene connectivity. For each cancer, our analysis identified a set of genes showing differential coexpression patterns between the TP53 missense- and null mutation-carrying groups that was robust to the choice of the tuning parameter in WGCNA. After comparing these sets of genes across the four cancers, one gene (KIR3DL2) consistently showed differential coexpression patterns between the null and missense groups. KIR3DL2 is known to play an important role in regulating the immune response, which is consistent with our observation that this gene's strongly correlated partners implicated many immune-related pathways. Examining mutation-type-related changes in correlations between sets of genes may provide new insight into tumor biology.


BMC Proceedings | 2018

CpG-set association assessment of lipid concentration changes and DNA methylation

Kaiqiong Zhao; Lai Jiang; Kathleen Oros Klein; Celia M. T. Greenwood; Karim Oualkacha

Epigenome association studies that test a large number of methylation sites suffer from stringent multiple-testing corrections. This study’s goals were to investigate region-based associations between DNA methylation sites and lipid-level changes in response to the treatment with fenofibrate in the GAW20 data and to investigate whether improvements in power could be obtained by taking into account correlations between DNA methylation at neighboring cytosine-phosphate-guanine (CpG) sites. To this end, we applied both a recently developed block-based data-dimension-reduction approach and a region-based variance-component (VC) linear mixed model to GAW20 data. We compared analyses of unrelated individuals with familial data. The region-based VC approach using unrelated (independent) individuals identified the gene LGALS9C as significantly associated with changes in triglycerides. However, univariate tests of individual CpG sites yielded no valid statistically significant results.


PLOS ONE | 2017

Specific expression of novel long non-coding RNAs in high-hyperdiploid childhood acute lymphoblastic leukemia

Mathieu Lajoie; Simon Drouin; Maxime Caron; Pascal St-Onge; Manon Ouimet; Romain Gioia; Marie-Hélène Lafond; Ramon Vidal; Chantal Richer; Karim Oualkacha; Arnaud Droit; Daniel Sinnett

Pre-B cell childhood acute lymphoblastic leukemia (pre-B cALL) is a heterogeneous disease involving many subtypes typically stratified using a combination of cytogenetic and molecular-based assays. These methods, although widely used, rely on the presence of known chromosomal translocations, which is a limiting factor. There is therefore a need for robust, sensitive, and specific molecular biomarkers unaffected by such limitations that would allow better risk stratification and consequently better clinical outcome. In this study we performed a transcriptome analysis of 56 pre-B cALL patients to identify expression signatures in different subtypes. In both protein-coding and long non-coding RNAs (lncRNA), we identified subtype-specific gene signatures distinguishing pre-B cALL subtypes, particularly in t(12;21) and hyperdiploid cases. The genes up-regulated in pre-B cALL subtypes were enriched in bivalent chromatin marks in their promoters. LncRNAs are a new and under-studied class of transcripts. The subtype-specific nature of lncRNAs suggests they may be suitable clinical biomarkers to guide risk stratification and targeted therapies in pre-B cALL patients.


BMC Proceedings | 2014

Pathway analysis for genetic association studies: to do, or not to do? That is the question

Line Dufresne; Karim Oualkacha; Vincenzo Forgetta; Celia M. T. Greenwood

In Genetic Analysis Workshop 18 data, we used a 3-stage approach to explore the benefits of pathway analysis in improving a model to predict 2 diastolic blood pressure phenotypes as a function of genetic variation. At stage 1, gene-based tests of association in family data of approximately 800 individuals found over 600 genes associated at p < 0.05 for each phenotype. At stage 2, networks and enriched pathways were estimated with Cytoscape for genes from stage 1, separately for the 2 phenotypes, and the overlap between the networks was then examined. This overlap identified 4 enriched pathways; 3 of these pathways appear to interact and are likely candidates for playing a role in hypertension. At stage 3, using 157 maximally unrelated individuals, partial least squares regression was used to find associations between diastolic blood pressure and single-nucleotide polymorphisms in genes highlighted by the pathway analyses. However, we saw no improvement in the adjusted cross-validated R². Although our pathway-motivated regressions did not improve prediction of diastolic blood pressure, merging gene networks did identify several plausible pathways for hypertension.


Statistical Methods in Medical Research | 2018

Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies.

Maxime Turgeon; Karim Oualkacha; Antonio Ciampi; Hanane Miftah; Golsa Dehghan; Brent W. Zanke; Andrea Lessa Benedet; Pedro Rosa-Neto; Celia M. T. Greenwood; Aurelie Labbe

The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, to further test for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks a linear combination of outcomes in an optimal manner, by maximising the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has unfortunately received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, i.e. conceptually simple and free of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.
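
The core idea in the abstract above, finding the linear combination of outcomes that maximises the proportion of variance explained by a covariate, can be sketched as a generalized eigenvalue problem. The example below is a minimal illustration for a single covariate and a low-dimensional outcome matrix, using only NumPy. It is not the authors' implementation: the function name `pcev_sketch` is hypothetical, and the high-dimensional strategy and testing procedures mentioned in the abstract are not covered.

```python
import numpy as np


def pcev_sketch(Y, x):
    """Return (w, h) for outcomes Y (n x p) and a single covariate x (n,).

    w: weights of the linear combination Y @ w
    h: proportion of the variance of Y @ w explained by x
    Assumes n is comfortably larger than p so the residual matrix is
    positive definite (an assumption of this sketch).
    """
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    Yc = Y - Y.mean(axis=0)          # centre outcomes (absorbs the intercept)
    xc = x - x.mean()

    # Regress every outcome on x, then split variation into model + residual.
    beta = xc @ Yc / (xc @ xc)       # one slope per outcome
    fitted = np.outer(xc, beta)
    resid = Yc - fitted
    V_M = fitted.T @ fitted          # variation explained by x
    V_R = resid.T @ resid            # residual variation

    # Solve V_M w = lam * V_R w by whitening with the Cholesky factor of V_R.
    L = np.linalg.cholesky(V_R)
    A = np.linalg.solve(L, np.linalg.solve(L, V_M).T)   # L^{-1} V_M L^{-T}
    A = (A + A.T) / 2.0              # enforce symmetry against round-off
    eigvals, eigvecs = np.linalg.eigh(A)
    lam = eigvals[-1]                # largest eigenvalue
    w = np.linalg.solve(L.T, eigvecs[:, -1])
    return w, lam / (1.0 + lam)


# Simulated example (hypothetical data): two of three outcomes depend on x.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=200)
Y_demo = rng.normal(size=(200, 3))
Y_demo[:, 0] += 0.5 * x_demo
Y_demo[:, 1] -= 0.3 * x_demo
w_demo, h_demo = pcev_sketch(Y_demo, x_demo)
```

By construction, the returned proportion is at least as large as the R² obtained by regressing any single outcome on the covariate, which is the sense in which the combined component is optimal.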


BMC Proceedings | 2018

Investigating potential causal relationships between SNPs, DNA methylation and HDL

Lai Jiang; Kaiqiong Zhao; Kathleen Oros Klein; Angelo J. Canty; Karim Oualkacha; Celia M. T. Greenwood

Using data on 680 patients from the GAW20 real data set, we conducted Mendelian randomization (MR) studies to explore the causal relationships between methylation levels at selected probes (cytosine-phosphate-guanine sites [CpGs]) and high-density lipoprotein (HDL) changes (ΔHDL) using single-nucleotide polymorphisms (SNPs) as instrumental variables. Several methods were used to estimate the causal effects at CpGs of interest on ΔHDL, including a newly developed method that we call constrained instrumental variables (CIV). CIV performs automatic SNP selection while providing estimates of causal effects adjusted for possible pleiotropy, when the potentially pleiotropic phenotypes are measured. For CpGs in or near the 10 genes identified as associated with ΔHDL using a family-based VC-score test, we compared CIV to Egger regression and the two-stage least squares (TSLS) method. All 3 approaches selected at least 1 CpG in 2 genes (RNMT;C18orf19 and C6orf141) as showing a causal relationship with ΔHDL.
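
Of the three estimators compared above, two-stage least squares is the simplest to illustrate. The sketch below uses a single instrument, a single exposure and plain Python; the function names and the toy numbers are hypothetical and are not taken from the GAW20 analysis. The CIV and Egger regression methods are not shown.

```python
def simple_ols(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(instrument, exposure, outcome):
    """TSLS causal-effect estimate with one instrument (illustrative)."""
    # Stage 1: predict the exposure (e.g. methylation) from the instrument (e.g. a SNP).
    a0, a1 = simple_ols(instrument, exposure)
    predicted = [a0 + a1 * z for z in instrument]
    # Stage 2: regress the outcome on the predicted exposure.
    _, causal_effect = simple_ols(predicted, outcome)
    return causal_effect


# Toy data: u is an unmeasured confounder uncorrelated with the instrument z.
z = [0, 0, 1, 1, 2, 2]
u = [1, -1, 1, -1, 1, -1]
x = [zi + ui for zi, ui in zip(z, u)]            # exposure
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]    # outcome, true effect = 2

tsls_estimate = two_stage_least_squares(z, x, y)
_, naive_estimate = simple_ols(x, y)
```

In this toy setting the naive regression of the outcome on the exposure is biased upward by the confounder, while the instrument-based estimate recovers the simulated effect because the instrument is unrelated to the confounder.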


Immunity, inflammation and disease | 2017

Performance of an allele-level multi-locus HLA genotype imputation tool in hematopoietic stem cell donors from Quebec

Abdelhakim Ferradji; Yasmin D'Souza; Chee Loong Saw; Karim Oualkacha; Lucie Richard; Ruth Sapir-Pichhadze

Donor‐recipient HLA compatibility is an important determinant of transplant outcomes. Allele‐group to allele‐level imputations help assign HLA genotypes when allele‐level genotypes are not available during donor selection.
