Sangita Kulathinal
University of Helsinki
Publication
Featured research published by Sangita Kulathinal.
PLOS ONE | 2008
Kaisa Silander; Mervi Alanne; Kati Kristiansson; Olli Saarela; Samuli Ripatti; Kirsi Auro; Juha Karvanen; Sangita Kulathinal; Matti Niemelä; Pekka Ellonen; Erkki Vartiainen; Pekka Jousilahti; Janna Saarela; Kari Kuulasmaa; Alun Evans; Markus Perola; Veikko Salomaa; Leena Peltonen
Background: Cardiovascular disease (CVD) incidence, complications and burden differ markedly between women and men. Although the distribution of lifestyle factors varies between the genders, these factors do not fully explain the differences in CVD incidence, which suggests the existence of gender-specific genetic risk factors. We aimed to estimate whether the genetic risk profiles of coronary heart disease (CHD), ischemic stroke and the composite end-point of CVD differ between the genders. Methodology/Principal Findings: Using the case-cohort design in two Finnish population cohorts, we studied the association between common variation in 46 candidate genes and CHD, ischemic stroke, CVD, and CVD-related quantitative risk factors. We analyzed men and women jointly and also conducted genotype-gender interaction analysis. Several allelic variants conferred disease risk for men and women jointly, including rs1801020 in coagulation factor XII (HR = 1.31 (1.08–1.60) for CVD, uncorrected p = 0.006, multiplicative model). Variant rs11673407 in the fucosyltransferase 3 gene was strongly associated with waist/hip ratio (uncorrected p = 0.00005) in joint analysis. In interaction analysis we found statistical evidence of variant-gender interaction conferring risk of CHD and CVD: rs3742264 in the carboxypeptidase B2 gene, p(interaction) = 0.009 for CHD, and rs2774279 in the upstream stimulatory factor 1 gene, p(interaction) = 0.007 for CHD and CVD, showed strong association in women but not in men, while rs2069840 in the interleukin 6 gene, p(interaction) = 0.004 for CVD, showed strong association in men but not in women (uncorrected p-values). Also, two variants in the selenoprotein S gene conferred risk for ischemic stroke in women, p(interaction) = 0.003 and 0.007. Importantly, we identified a larger number of gender-specific effects for women than for men.
Conclusions/Significance: A false discovery rate analysis suggests that we may expect half of the reported findings for combined gender analysis to be true positives, while at least a third of the reported genotype-gender interaction results are true positives. The asymmetry in positive findings between the genders could imply that genetic risk loci for CVD are more readily detectable in women, while for men they are more confounded by environmental/lifestyle risk factors. The possible differences in genetic risk profiles between the genders should be addressed in more detail in genetic studies of CVD, and more focus on female CVD risk is also warranted in genome-wide association studies.
Epidemiologic Perspectives & Innovations | 2007
Sangita Kulathinal; Juha Karvanen; Olli Saarela; Kari Kuulasmaa
When carefully planned and analysed, the case-cohort design is a powerful choice for follow-up studies with multiple event types of interest. While the literature is rich with analysis methods for case-cohort data, little is written about the designing of a case-cohort study. Our experiences in designing, coordinating and analysing the MORGAM case-cohort study are potentially useful for other studies with similar characteristics. The motivation for using the case-cohort design in the MORGAM genetic study is discussed and issues relevant to its planning and analysis are studied. We propose solutions for appending the earlier case-cohort selection after an extension of the follow-up period and for achieving maximum overlap between earlier designs and the case-cohort design. Approaches for statistical analysis are studied in a simulation example based on the MORGAM data.
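As a rough illustration of the case-cohort idea described above, the sketch below selects every case plus a random subcohort and attaches inverse-probability-of-selection weights. It assumes simple Bernoulli subcohort sampling with one common fraction and one common weighting scheme. The function name `case_cohort_sample` and the record fields are hypothetical, and this is not the MORGAM selection procedure.

```python
import random

def case_cohort_sample(cohort, subcohort_fraction, seed=0):
    """Select all cases plus a random subcohort (illustrative sketch only).

    cohort: list of dicts with keys 'id' and 'is_case' (hypothetical layout).
    subcohort_fraction: probability that any cohort member enters the subcohort.
    Returns the selected records with an inverse-probability weight attached.
    """
    rng = random.Random(seed)
    selected = []
    for person in cohort:
        # Subcohort membership is drawn without looking at case status.
        in_subcohort = rng.random() < subcohort_fraction
        if person["is_case"]:
            # Every case is selected, so its inclusion probability is 1.
            weight = 1.0
        elif in_subcohort:
            # Non-cases enter only through the subcohort.
            weight = 1.0 / subcohort_fraction
        else:
            continue  # non-case outside the subcohort: not selected
        selected.append({**person, "in_subcohort": in_subcohort, "weight": weight})
    return selected

# Toy cohort: 1000 people, the first 50 of whom become cases.
cohort = [{"id": i, "is_case": i < 50} for i in range(1000)]
sample = case_cohort_sample(cohort, subcohort_fraction=0.2, seed=1)
print(len(sample), "records selected for covariate measurement")
```

Because the subcohort is drawn independently of the outcome, the same subcohort can serve as the comparison group for several event types, which is the property the abstract highlights.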
Stroke | 2009
Kjell Asplund; Juha Karvanen; Pekka Jousilahti; Matti Niemelä; Grażyna Broda; Giancarlo Cesana; Jean Dallongeville; Pierre Ducimetriere; Alun Evans; Jean Ferrières; Bernadette Haas; Torben Jørgensen; Abdonas Tamosiunas; Diego Vanuzzo; Per-Gunnar Wiklund; John Yarnell; Kari Kuulasmaa; Sangita Kulathinal
Background and Purpose— Within the framework of the MOnica Risk, Genetics, Archiving and Monograph (MORGAM) Project, the variations in impact of classical risk factors of stroke by population, sex, and age were analyzed. Methods— Follow-up data were collected in 43 cohorts in 18 populations in 8 European countries surveyed for cardiovascular risk factors. In 93 695 persons aged 19 to 77 years and free of major cardiovascular disease at baseline, total observation years were 1 234 252 and the number of stroke events analyzed was 3142. Hazard ratios were calculated by Cox regression analyses. Results— Each year of age increased the risk of stroke (fatal and nonfatal together) by 9% (95% CI, 9% to 10%) in men and by 10% (9% to 10%) in women. A 10-mm Hg increase in systolic blood pressure involved a similar increase in risk in men (28%; 24% to 32%) and women (25%; 20% to 29%). Smoking conferred a similar excess risk in women (104%; 78% to 133%) and in men (82%; 66% to 100%). The effect of increasing body mass index was very modest. Higher high-density lipoprotein cholesterol levels decreased the risk of stroke more in women (hazard ratio per mmol/L 0.58; 0.49 to 0.68) than in men (0.80; 0.69 to 0.92). The impact of the individual risk factors differed somewhat between countries/regions with high blood pressure being particularly important in central Europe (Poland and Lithuania). Conclusions— Age, sex, and region-specific estimates of relative risks for stroke conferred by classical risk factors in various regions of Europe are provided. From a public health perspective, an important lesson is that smoking confers a high risk for stroke across Europe.
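The per-unit hazard ratios quoted above can be rescaled to other increments, assuming the log-linear covariate effect of a standard Cox model, under which hazard ratios multiply across units. The sketch below uses the rounded figures from the abstract (9% per year of age in men, 0.58 per mmol/L of HDL cholesterol in women), so the results are only approximate. The helper name `rescale_hazard_ratio` is hypothetical.

```python
def rescale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a change of `units`, assuming a log-linear effect."""
    return hr_per_unit ** units

# Ten years of age at 9% per year (men): roughly a 2.4-fold hazard.
hr_age_10y = rescale_hazard_ratio(1.09, 10)

# Half a mmol/L higher HDL cholesterol at HR 0.58 per mmol/L (women).
hr_hdl_half = rescale_hazard_ratio(0.58, 0.5)

print(f"HR per 10 years of age: {hr_age_10y:.2f}")
print(f"HR per 0.5 mmol/L HDL:  {hr_hdl_half:.2f}")
```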
PLOS Genetics | 2005
Kati Komulainen; Mervi Alanne; Kirsi Auro; Riika Kilpikari; Päivi Pajukanta; Janna Saarela; Pekka Ellonen; Kaisa Salminen; Sangita Kulathinal; Kari Kuulasmaa; Kaisa Silander; Veikko Salomaa; Markus Perola; Leena Peltonen
Upstream transcription factor 1 (USF1) is a ubiquitously expressed transcription factor controlling several critical genes in lipid and glucose metabolism. Of some 40 genes regulated by USF1, several are involved in the molecular pathogenesis of cardiovascular disease (CVD). Although the USF1 gene has been shown to have a critical role in the etiology of familial combined hyperlipidemia, which predisposes to early CVD, the gene's potential role as a risk factor for CVD events at the population level has not been established. Here we report the results from a prospective genetic–epidemiological study of the association between the USF1 variants, CVD, and mortality in two large Finnish cohorts. Haplotype-tagging single nucleotide polymorphisms exposing all common allelic variants of USF1 were genotyped in a prospective case-cohort design with two distinct cohorts followed up during 1992–2001 and 1997–2003. The total number of follow-up years was 112,435 in 14,140 individuals, of which 2,225 were selected for genotyping based on the case-cohort study strategy. After adjustment for conventional risk factors, we observed an association of USF1 with CVD and mortality among females. In combined analysis of the two cohorts, female carriers of a USF1 risk haplotype had a 2-fold risk of a CVD event (hazard ratio [HR] 2.02; 95% confidence interval [CI] 1.16–3.53; p = 0.01) and an increased risk of all-cause mortality (HR 2.52; 95% CI 1.46–4.35; p = 0.0009). A putative protective haplotype of USF1 was also identified. Our study shows how a gene identified in exceptional families proves to be important also at the population level, implying that allelic variants of USF1 significantly influence the prospective risk of CVD and even all-cause mortality in females.
Statistics in Medicine | 2008
Olli Saarela; Sangita Kulathinal; Elja Arjas; Esa Läärä
Suppose a nested case-control design has been applied for collecting covariate data when studying a specific disease. With possible new outcomes of interest it would be sensible to utilize the previously selected control group instead of (or in addition to) a new control selection, given that the same covariate data were relevant and available, and that their measurements had adequate stability and quality. We formulate this problem in the framework of the competing risks survival model. In this approach covariate information collected for all outcomes can be utilized in the analysis. We not only propose likelihood-based parameter estimation but we also review alternative methods based on weighted partial/pseudolikelihoods. The methods discussed here are closely related to the analysis of a case-cohort design, where the control group is not tied to cases of a specific disease. The different methods are compared in a simulation study.
Lifetime Data Analysis | 2002
Sangita Kulathinal; Dario Gasbarra
In this paper, a class of tests is developed for comparing the cause-specific hazard rates of m competing risks simultaneously in K (≥ 2) groups. The data available for a unit are the failure time of the unit along with the identifier of the risk claiming the failure. In practice, the failure time data are generally right censored. The tests are based on the difference between the weighted averages of the cause-specific hazard rates corresponding to each risk. No assumption regarding the dependence of the competing risks is made. It is shown that the proposed test statistic asymptotically has a chi-squared distribution. The proposed test is shown to be optimal for a specific type of local alternatives. The choice of weight function is also discussed. A simulation study is carried out using a multivariate Gumbel distribution to compare the optimal weight function with a proposed weight function which is to be used in practice. Also, the proposed test is applied to real data on the termination of an intrauterine device.
Computational Statistics & Data Analysis | 2009
Juha Karvanen; Sangita Kulathinal; Dario Gasbarra
In gene-disease association studies, the cost of genotyping makes it economical to use a two-stage design where only a subset of the cohort is genotyped. At the first stage, the follow-up data along with some risk factors or non-genetic covariates are collected for the cohort, and a subset of the cohort is then selected for genotyping at the second stage. Intuitively, the selection of the subset for the second stage could be carried out efficiently if the data collected at the first stage are utilized. The information contained in the conditional probability of the genotype given the first-stage data and the initial estimates of the parameters of interest is maximized for efficient selection of the subset. The proposed selection method is illustrated using logistic regression and Cox's proportional hazards model, and algorithms that can find optimal or nearly optimal designs in discrete design space are presented. Simulation comparisons between D-optimal design, extreme selection and case-cohort design suggest that D-optimal design is the most efficient in terms of variance of estimated parameters, but extreme selection may be a good alternative for practical study design.
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011
Dario Gasbarra; Sangita Kulathinal; Matti Pirinen; Mikko J. Sillanpää
We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes. The method has been implemented using Matlab and the code is available upon request from the authors.
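The link between haplotype frequencies and pooled allele frequencies that the abstract relies on can be sketched as follows. This is only the forward relation (the expected allele frequency at each locus given a known haplotype set), not the authors' Bayesian method or their Matlab code. Coding haplotypes as 0/1 tuples and the function name `expected_pool_allele_frequencies` are assumptions made for illustration.

```python
def expected_pool_allele_frequencies(haplotypes, frequencies):
    """Expected frequency of allele '1' at each locus in a DNA pool.

    haplotypes: list of equal-length tuples of 0/1 alleles (assumed coding).
    frequencies: haplotype frequencies, in the same order, summing to 1.
    """
    n_loci = len(haplotypes[0])
    return [
        # A locus's allele frequency is the total frequency of the
        # haplotypes that carry allele 1 at that locus.
        sum(freq * hap[locus] for hap, freq in zip(haplotypes, frequencies))
        for locus in range(n_loci)
    ]

# Three candidate haplotypes over two loci, e.g. taken from a reference panel.
haplotypes = [(0, 0), (0, 1), (1, 1)]
frequencies = [0.5, 0.3, 0.2]
print(expected_pool_allele_frequencies(haplotypes, frequencies))
```

Estimating haplotype frequencies from pooled data amounts to inverting this relation under sampling noise, which is where a known haplotype set and prior information become useful.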
Statistical Methods in Medical Research | 2015
Juha Mehtälä; Kari Auranen; Sangita Kulathinal
Alternating presence and absence of a medical condition in human subjects is often modelled as an outcome of underlying process dynamics. Longitudinal studies provide important insights into research questions involving such dynamics. This article concerns optimal designs for studies in which the dynamics are modelled as a binary continuous-time Markov process. Either one or both the transition rate parameters in the model are to be estimated with maximum precision from a sequence of observations made at discrete times on a number of subjects. The design questions concern the choice of time interval between observations, the initial state of each subject and the choice between number of subjects versus repeated observations per subject. Sequential designs are considered due to dependence of the designs on the model parameters. The optimal time spacing can be approximated by the reciprocal of the sum of the two rates. The initial distribution of the study subjects should be taken into account when relatively few repeated samples per subject are to be collected. A study with a reasonably large size should be designed in more than one phase because there are then enough observations to be spent in the first phase to revise the time spacing for the subsequent phases.
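For a two-state continuous-time Markov process the transition probabilities have a closed form, which makes the spacing rule of thumb above easy to explore numerically. The sketch below is a minimal illustration. The names `acq_rate` (absent to present) and `clear_rate` (present to absent) are assumed labels for the two transition rates, and the code does not reproduce the article's design optimization.

```python
import math

def transition_matrix(acq_rate, clear_rate, t):
    """Transition probabilities over time t for a two-state Markov process.

    State 0 = condition absent, state 1 = condition present.
    Returns [[p00, p01], [p10, p11]].
    """
    total = acq_rate + clear_rate
    decay = math.exp(-total * t)
    p01 = acq_rate / total * (1.0 - decay)    # absent -> present
    p10 = clear_rate / total * (1.0 - decay)  # present -> absent
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def approximate_spacing(acq_rate, clear_rate):
    """Rule-of-thumb observation interval: reciprocal of the sum of the rates."""
    return 1.0 / (acq_rate + clear_rate)

# Illustrative rates per unit time (made-up values, not from the article).
acq_rate, clear_rate = 0.5, 1.5
spacing = approximate_spacing(acq_rate, clear_rate)
print("suggested spacing:", spacing)
print("transition matrix at that spacing:",
      transition_matrix(acq_rate, clear_rate, spacing))
```

Since the spacing depends on rates that are unknown before the study, a first phase can supply rough estimates that are plugged into `approximate_spacing` for later phases, in line with the sequential designs the abstract describes.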
Journal of Probability and Statistics | 2012
Olli Saarela; Sangita Kulathinal; Juha Karvanen
Under cohort sampling designs, additional covariate data are collected on cases of a specific type and a randomly selected subset of noncases, primarily for the purpose of studying associations with a time-to-event response of interest. With such data available, an interest may arise to reuse them for studying associations between the additional covariate data and a secondary non-time-to-event response variable, usually collected for the whole study cohort at the outset of the study. Following earlier literature, we refer to such a situation as secondary analysis. We outline a general conditional likelihood approach for secondary analysis under cohort sampling designs and discuss the specific situations of case-cohort and nested case-control designs. We also review alternative methods based on full likelihood and inverse probability weighting. We compare the alternative methods for secondary analysis in two simulated settings and apply them in a real-data example.