Darfiana Nur
Flinders University
Publications
Featured research published by Darfiana Nur.
Computational Statistics & Data Analysis | 2009
Darfiana Nur; David Allingham; Judith Rousseau; Kerrie Mengersen; Ross McVinish
The sensitivity to the specification of the prior in a hidden Markov model describing homogeneous segments of DNA sequences is considered. An intron from the chimpanzee α-fetoprotein gene, which plays an important role in embryonic development in mammals, is analysed. Three main aims are considered: (i) to assess the sensitivity to prior specification in Bayesian hidden Markov models for DNA sequence segmentation; (ii) to examine the impact of replacing the standard Dirichlet prior with a mixture Dirichlet prior; and (iii) to propose and illustrate a more comprehensive approach to sensitivity analysis, using importance sampling. The results show that (i) the posterior estimates obtained under a Bayesian hidden Markov model are indeed sensitive to the specification of the prior distributions; (ii) compared with the standard Dirichlet prior, the mixture Dirichlet prior is more flexible, less sensitive to the choice of hyperparameters and less constraining in the analysis, thus improving posterior estimates; and (iii) importance sampling is computationally feasible, fast and effective in allowing a richer sensitivity analysis.
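To illustrate aim (iii), here is a minimal R sketch of prior-sensitivity reweighting, assuming posterior draws of a probability vector are already available from MCMC under a baseline Dirichlet prior; the function name and arguments are hypothetical, and a mixture Dirichlet prior would simply replace the single ddirichlet density in the numerator with a weighted sum of Dirichlet densities.

```r
library(MCMCpack)  # provides ddirichlet()

prior_sensitivity <- function(draws, alpha0, alpha1) {
  # draws: n x k matrix of posterior samples of a probability vector,
  #        generated by MCMC under the Dirichlet(alpha0) prior
  # alpha1: hyperparameters of the alternative prior under study
  # The likelihood cancels, so the weights depend only on the two priors.
  w <- apply(draws, 1, function(th)
    ddirichlet(th, alpha1) / ddirichlet(th, alpha0))
  w <- w / sum(w)              # self-normalised importance weights
  colSums(draws * w)           # posterior mean under the alternative prior
}
```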
Electronic Journal of Statistics | 2010
Dorota Gajda; Chantal Guihenneuc-Jouyaux; Judith Rousseau; Kerrie Mengersen; Darfiana Nur
Abstract: The Importance Sampling (IS) method is used as an alternative approach to MCMC in repeated Bayesian estimations. In the particular context of numerous data sets, MCMC algorithms have to be called several times, which may become computationally expensive. Since Importance Sampling requires a sample from a posterior distribution, our idea is to use MCMC to generate only a certain number of Markov chains and reuse them in the subsequent IS estimations. For each Importance Sampling procedure, the suitable chain is selected by one of three criteria we present here. The first and second criteria are based on the L1 norm of the difference between two posterior distributions and their Kullback-Leibler divergence, respectively. The third criterion results from minimizing the variance of the IS estimate. A supplementary automatic selection procedure is also proposed to choose the posteriors for which Markov chains will be generated and to avoid an arbitrary choice of importance functions. The featured methods are illustrated in simulation studies on three types of Poisson model: a simple Poisson model, a Poisson regression model, and a Poisson regression model with extra-Poisson variability. Different parameter settings are considered.
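A hedged R sketch of the third selection criterion: among stored chains, pick the one whose importance weights for the new posterior have the smallest variance, expressed here via the equivalent effective-sample-size criterion. All names, and the way the generating posteriors are passed in, are assumptions for illustration.

```r
select_chain <- function(chains, log_post_new, log_post_gen) {
  # chains: list of draw matrices, one per stored Markov chain
  # log_post_new(theta): unnormalised log posterior of the new data set
  # log_post_gen: list of log-posterior functions under which each
  #               chain was originally generated
  ess <- mapply(function(ch, lp_gen) {
    lw <- apply(ch, 1, log_post_new) - apply(ch, 1, lp_gen)
    w  <- exp(lw - max(lw))
    w  <- w / sum(w)
    1 / sum(w^2)               # effective sample size of the IS weights
  }, chains, log_post_gen)
  which.max(ess)               # low weight variance = high ESS
}
```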
International Journal of Environmental Research and Public Health | 2017
Supriya Mathew; Deepika Mathur; Anne B. Chang; Elizabeth McDonald; Gurmeet Singh; Darfiana Nur; Rolf Gerritsen
Preterm birth (birth before 37 completed weeks of gestation) is one of the leading causes of death among children under 5 years of age. Several recent studies have examined the association between extreme temperature and preterm births, but there have been almost no such studies in arid Australia. In this paper, we explore the potential association between preterm births and exposure to extreme temperatures during the last 3 weeks of pregnancy in a Central Australian town. An immediate effect of temperature exposure is observed, with an increased relative risk of 1%–2% when the maximum temperature exceeded the 90th percentile of the summer season maximum temperature data. Delayed effects are also observed closer to 3 weeks before delivery, when the relative risks tend to increase exponentially. Immediate risks of preterm birth are also observed for cold temperature exposures (0 to –6 °C), with an increased relative risk of up to 10%. In the future, Central Australia will face more hot days and fewer cold days due to climate change, and hence the risks posed by extreme heat are of particular relevance to the community and health practitioners.
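The paper's exact model is not reproduced here, but the following is a minimal R sketch of the kind of count regression that yields relative risks of this form, with a hypothetical data frame df holding daily counts, maximum temperatures and month.

```r
# df: hypothetical daily data with columns count (preterm births),
# tmax (daily maximum temperature) and month
df$heat <- as.numeric(df$tmax > quantile(df$tmax, 0.90))
fit <- glm(count ~ heat + factor(month), family = quasipoisson, data = df)
exp(coef(fit)["heat"])   # relative risk on extreme-heat days
```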
Computational Statistics & Data Analysis | 2001
Darfiana Nur; Rodney C. Wolff; Kerrie Mengersen
For the purpose of testing for stationarity in a time series, a phase randomisation procedure is reviewed, modified and applied to a wide range of time-series models, including linear stationary, linear non-stationary, non-linear stationary and non-linear non-stationary processes. Surrogate series are simulated using the Standard and Rescaling methods. For all processes, the higher-order central moments of the original series are preserved in the surrogate series under the Rescaling method, whereas under the Standard approach only the even central moments are preserved. The density of higher-order cumulant estimates obtained under the Rescaling method exhibits unimodality when the process is stationary and multimodality otherwise. A primary motivation is to develop a suite of diagnostic tests for assessing the convergence of Markov chain Monte Carlo (MCMC) algorithms, and applications of the method as an MCMC convergence diagnostic are discussed.
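A minimal R sketch of the Standard phase-randomisation step for a real-valued series: the Fourier amplitudes (and hence the spectrum and even moments) are kept while the phases are randomised. The Rescaling variant would additionally map the surrogate back onto the sorted values of the original series.

```r
phase_surrogate <- function(x) {
  n    <- length(x)
  z    <- fft(x)
  half <- 2:ceiling(n / 2)                  # positive frequencies
  phi  <- runif(length(half), 0, 2 * pi)    # random phases
  z[half]         <- Mod(z[half]) * exp(1i * phi)
  z[n + 2 - half] <- Conj(z[half])          # conjugate symmetry keeps it real
  Re(fft(z, inverse = TRUE) / n)
}
```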
Communications in Statistics-theory and Methods | 2017
Pham Thi Thu Huong; Darfiana Nur; Alan Branford
ABSTRACT Joint models for longitudinal data and time-to-event data have recently received considerable attention in clinical and epidemiologic studies. Our interest is in modeling the relationship between event-time outcomes and internal time-dependent covariates. In practice, the longitudinal responses often show nonlinear, fluctuating curves. The main aim of this paper is therefore to use penalized splines with a truncated polynomial basis to parameterize the nonlinear longitudinal process. The linear mixed-effects model is then applied to the subject-specific curves and to control the smoothing. The association between the dropout process and the longitudinal outcomes is modeled through a proportional hazards model. Two types of baseline risk function are considered, namely a Gompertz distribution and a piecewise constant model. The resulting models are referred to as penalized spline joint models, an extension of the standard joint models. The expectation conditional maximization (ECM) algorithm is applied to estimate the parameters of the proposed models. To validate the proposed algorithm, extensive simulation studies are presented, followed by a case study. In summary, the penalized spline joint models provide a new approach to joint modeling that improves on the existing standard joint models.
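The truncated polynomial (power) basis at the core of this parameterization is straightforward to construct directly; a sketch in R follows, with knot placement at quantiles as one common, assumed choice rather than the paper's stated one. In the mixed-model formulation, the coefficients of the truncated terms are treated as random effects, which is what controls the smoothing.

```r
trunc_poly_basis <- function(t, p = 2, kappa) {
  poly  <- outer(t, 0:p, `^`)                          # 1, t, ..., t^p
  trunc <- outer(t, kappa, function(t, k) pmax(t - k, 0)^p)
  cbind(poly, trunc)                                   # design matrix
}

# e.g. degree-2 basis with 5 interior knots at quantiles of the times
times <- seq(0, 10, length.out = 100)
kappa <- quantile(times, seq(0.1, 0.9, length.out = 5))
B     <- trunc_poly_basis(times, p = 2, kappa = kappa)
```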
Journal of Statistical Computation and Simulation | 2018
Pham Thi Thu Huong; Darfiana Nur; Hoa Pham; Alan Branford
ABSTRACT Joint models for longitudinal and time-to-event data have been applied in many different fields of statistics and clinical studies. However, the main difficulty these models face is computational: the requirement for numerical integration becomes severe as the dimension of the random effects increases. In this paper, a modified two-stage approach is proposed to estimate the parameters in joint models. In the first stage, linear mixed-effects models and best linear unbiased predictors are used to estimate the parameters of the longitudinal submodel. In the second stage, an approximation of the full joint log-likelihood is constructed using the estimated values of these parameters from the longitudinal submodel, and the survival parameters are estimated by maximizing this approximation. Simulation studies show that the approach performs well, especially when the dimension of the random effects increases. Finally, we apply the approach to AIDS data.
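A sketch of the first stage in R, using the nlme package; the data frame and variable names are hypothetical, and the second stage is only outlined in comments since its likelihood depends on the chosen survival submodel.

```r
library(nlme)

# long_data: hypothetical long-format data with columns y, time, id
fit1  <- lme(y ~ time, random = ~ time | id, data = long_data)
blups <- coef(fit1)   # subject-specific intercepts and slopes (BLUPs)

# Stage 2 (outline): plug each subject's predicted trajectory into the
# proportional-hazards log-likelihood as a time-dependent covariate and
# maximise the resulting approximate joint log-likelihood over the
# survival parameters, e.g. with optim().
```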
Journal of Statistical Computation and Simulation | 2017
James A. Totterdell; Darfiana Nur; Kerrie Mengersen
ABSTRACT Segmentation models aim to partition compositionally heterogeneous domains into homogeneous segments which may be reflective of biological function. Due to the latent nature of the segments, a natural approach to segmentation that has gained favour recently uses Bayesian hidden Markov models (HMMs). Concomitantly, over the last few decades the free R programming language has become a dominant tool for computational statistics, visualization and data science. This paper therefore aims to fully exploit R to fit a Bayesian HMM for DNA segmentation. The joint posterior distribution of the parameters of the model is derived, followed by the algorithms that can be used for estimation. Functions implementing these algorithms (Gibbs sampling, data augmentation and label switching) are then fully implemented in R. The methodology is assessed through extensive simulation studies and then applied to analyse the Simian vacuolating virus 40 (SV40) genome. It is concluded that: (1) the algorithms and functions in R can correctly estimate sequence segmentation if the HMM structure is assumed; (2) the performance of the model improves with sequence length; (3) R is reasonably fast for short to medium sequence lengths and numbers of segments; and (4) the segmentation of SV40 appears to correspond with the two major transcripts, early and late, that regulate the expression of SV40 genes.
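One ingredient of such a Gibbs sampler, sketched in R: given the current hidden state path (itself drawn in the data-augmentation step, omitted here), the rows of the transition matrix have independent Dirichlet full conditionals because the Dirichlet prior is conjugate. Function and argument names are illustrative only.

```r
library(MCMCpack)  # provides rdirichlet()

sample_transitions <- function(states, k, alpha = 1) {
  # states: current hidden state path (integers 1..k) along the sequence
  counts <- matrix(0, k, k)
  for (t in 2:length(states))
    counts[states[t - 1], states[t]] <- counts[states[t - 1], states[t]] + 1
  # row j ~ Dirichlet(alpha + transition counts out of state j)
  t(apply(counts, 1, function(n) rdirichlet(1, alpha + n)))
}
```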
Communications in Statistics - Simulation and Computation | 2017
Glen Livingston; Darfiana Nur
ABSTRACT The main aim of this paper is to perform a sensitivity analysis to the specification of prior distributions in a Bayesian analysis of smooth transition autoregressive (STAR) models. To achieve this aim, the joint posterior distribution of the model order, coefficient and implicit parameters in the logistic STAR model is first presented. The conditional posterior distributions are then derived, followed by the design of a posterior simulator using a combination of Metropolis-Hastings, Gibbs sampler, RJMCMC and Multiple-Try Metropolis algorithms. Simulation studies and a case study on the prior sensitivity for the implicit parameters are detailed at the end.
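For concreteness, a sketch of the logistic STAR ingredients in R: the smooth transition function G, whose smoothness gamma and location c are the implicit parameters whose priors are examined, and a simulated two-regime LSTAR(1) series. Parameter values are illustrative.

```r
G <- function(z, gamma, c) 1 / (1 + exp(-gamma * (z - c)))

simulate_lstar1 <- function(n, phi1, phi2, gamma, c, sd = 1) {
  y <- numeric(n)
  for (t in 2:n) {
    g    <- G(y[t - 1], gamma, c)     # regime weight in (0, 1)
    y[t] <- (1 - g) * phi1 * y[t - 1] + g * phi2 * y[t - 1] + rnorm(1, 0, sd)
  }
  y
}

y <- simulate_lstar1(500, phi1 = 0.8, phi2 = -0.5, gamma = 5, c = 0)
```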
Journal of Computational and Graphical Statistics | 2013
Ross McVinish; Kerrie Mengersen; Darfiana Nur; Judith Rousseau; Chantal Guihenneuc-Jouyaux
Since its introduction in the early 1990s, the idea of using importance sampling (IS) with Markov chain Monte Carlo (MCMC) has found many applications. This article examines problems associated with its application to repeated evaluation of related posterior distributions with a particular focus on Bayesian model validation. We demonstrate that, in certain applications, the curse of dimensionality can be reduced by a simple modification of IS. In addition to providing new theoretical insight into the behavior of the IS approximation in a wide class of models, our result facilitates the implementation of computationally intensive Bayesian model checks. We illustrate the simplicity, computational savings, and potential inferential advantages of the proposed approach through two substantive case studies, notably computation of Bayesian p-values for linear regression models and simulation-based model checking. Supplementary materials including the Appendix and the R code for Section 3.1.2 are available online.
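A sketch of the general reweighting idea in R (not the authors' exact modification): draws from the full posterior are reweighted by 1/f(y_i | theta) to mimic the posterior that omits case i, so a single MCMC run serves every leave-one-out check. Here lik and pred_cdf are hypothetical user-supplied functions, and the instability of such raw weights in high dimensions is precisely the issue the article's modification targets.

```r
loo_pvalue <- function(draws, y, i, lik, pred_cdf) {
  # lik(yi, theta): likelihood contribution of observation yi
  # pred_cdf(yi, theta): predictive CDF P(Y <= yi | theta)
  w <- 1 / apply(draws, 1, function(th) lik(y[i], th))
  w <- w / sum(w)                                  # self-normalise
  sum(w * apply(draws, 1, function(th) pred_cdf(y[i], th)))
}
```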
Journal of Statistical Theory and Practice | 2008
Darfiana Nur; Gopalan Nair; N.D. Yatawara
Verifiable conditions are given for the existence of efficient estimation in Smooth Threshold Autoregressive models of order 1. The paper establishes local asymptotic normality in the semi-parametric setting, which is then used to discuss adaptive and efficient estimates of the models. It is found that the adaptivity condition is satisfied if the error densities are symmetric. Simulation results are presented comparing the conditional least-squares estimate with the adaptive and efficient estimates for the models.
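For reference, a sketch of the conditional least-squares benchmark for a logistic STAR(1) model in R; the starting values and optimiser are arbitrary choices for illustration.

```r
G <- function(z, gamma, c) 1 / (1 + exp(-gamma * (z - c)))

cls_lstar1 <- function(y) {
  rss <- function(par) {                 # par = (phi1, phi2, gamma, c)
    lag1 <- y[-length(y)]
    g    <- G(lag1, par[3], par[4])
    yhat <- (1 - g) * par[1] * lag1 + g * par[2] * lag1
    sum((y[-1] - yhat)^2)                # conditional sum of squares
  }
  optim(c(0.5, -0.5, 1, 0), rss)$par
}
```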