Henrik Nyman
Åbo Akademi University
Publication
Featured research published by Henrik Nyman.
Twin Research and Human Genetics | 2013
Ada Johansson; Patrick Jern; Pekka Santtila; Bettina von der Pahlen; Elias Eriksson; Lars Westberg; Henrik Nyman; Johan Pensar; Jukka Corander; N. Kenneth Sandnabba
The Genetics of Sexuality and Aggression (GSA) project was launched at Åbo Akademi University in Turku, Finland in 2005 and has so far undertaken two major population-based data collections involving twins and siblings of twins. To date, it consists of about 14,000 individuals (including 1,147 informative monozygotic twin pairs, 1,042 informative same-sex dizygotic twin pairs, 741 informative opposite-sex dizygotic twin pairs). Participants have been recruited through the Central Population Registry of Finland and were 18-49 years of age at the time of the data collections. Saliva samples for DNA genotyping (n = 4,278) and testosterone analyses (n = 1,168) were collected in 2006. The primary focus of the data collections has been on sexuality (both sexual functioning and sexual behavior) and aggressive behavior. This paper provides an overview of the data collections as well as an outline of the phenotypes and biological data assembled within the project. A detailed overview of publications can be found at the project's Web site: http://www.cebg.fi/.
Data Mining and Knowledge Discovery | 2015
Johan Pensar; Henrik Nyman; Timo Koski; Jukka Corander
We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used for illustrating the useful properties of LDAG models for both real and synthetic data sets.
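As a rough illustration of why a Dirichlet prior allows the marginal likelihood to be calculated analytically, the Python sketch below evaluates the standard closed-form Dirichlet-multinomial log marginal likelihood for the counts observed under a single parent configuration of one node. This is the textbook formula only, not the LDAG-based factorization from the paper; the function name and the example counts are hypothetical.

```python
from math import exp, lgamma


def log_marginal_likelihood(counts, alpha):
    """Log marginal likelihood of an observed sequence of categorical outcomes.

    counts: number of times each outcome was observed (one parent configuration)
    alpha:  Dirichlet hyperparameters, one per outcome (all > 0)
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a = sum(alpha)
    # Closed form: Gamma(a) / Gamma(a + n) * prod_k Gamma(alpha_k + n_k) / Gamma(alpha_k)
    result = lgamma(a) - lgamma(a + n)
    for n_k, alpha_k in zip(counts, alpha):
        result += lgamma(alpha_k + n_k) - lgamma(alpha_k)
    return result


# Hypothetical example: a binary node observed once in each state, uniform prior.
value = log_marginal_likelihood([1, 1], [1.0, 1.0])
print(exp(value))  # close to 1/6, i.e. (1/2) * (1/3) for the ordered sequence
```

In a full network score, terms of this kind would be summed over nodes and over parent configurations; merging configurations that share a distribution pools their counts before the formula is applied.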
Hawaii International Conference on System Sciences | 2014
Henrik Nyman; Peter Sarlin
A lot of attention in supply chain management has been devoted to understanding customer requirements. What are customer priorities in terms of price and service level, and how can companies go about fulfilling these requirements in an optimal way? New manufacturing technology in the form of 3D printing is about to change some of the underlying assumptions for different supply chain set-ups. This paper explores opportunities and barriers of 3D printing technology, specifically in a supply chain context. We are proposing a set of principles that can act to bridge existing research on different supply chain strategies and 3D printing. With these principles, researchers and practitioners alike can better understand the opportunities and limitations of 3D printing in a supply chain management context.
Bayesian Analysis | 2014
Henrik Nyman; Johan Pensar; Timo Koski; Jukka Corander
Theory of graphical models has matured over more than three decades to provide the backbone for several classes of models that are used in a myriad of applications such as genetic mapping of diseases, credit risk evaluation, reliability and computer security. Despite their generic applicability and wide adoption, the constraints imposed by undirected graphical models and Bayesian networks have also been recognized to be unnecessarily stringent under certain circumstances. This observation has led to the proposal of several generalizations that aim at more relaxed constraints by which the models can impose local or context-specific dependence structures. Here we consider an additional class of such models, termed stratified graphical models. We develop a method for Bayesian learning of these models by deriving an analytical expression for the marginal likelihood of data under a specific subclass of decomposable stratified models. A non-reversible Markov chain Monte Carlo approach is further used to identify models that are highly supported by the posterior distribution over the model space. Our method is illustrated and compared with ordinary graphical models through application to several real and synthetic datasets.
International Journal of Approximate Reasoning | 2016
Johan Pensar; Henrik Nyman; Jarno Lintusaari; Jukka Corander
Bayesian networks are one of the most widely used tools for modeling multivariate systems. It has been demonstrated that more expressive models, which can capture additional structure in each conditional probability table (CPT), may enjoy improved predictive performance over traditional Bayesian networks despite having fewer parameters. Here we investigate this phenomenon for models of various degrees of expressiveness on both extensive synthetic and real data. To characterize the regularities within CPTs in terms of independence relations, we introduce the notion of partial conditional independence (PCI) as a generalization of the well-known concept of context-specific independence (CSI). To model the structure of the CPTs, we use different graph-based representations which are convenient from a learning perspective. In addition to the previously studied decision trees and graphs, we introduce the concept of PCI-trees as a natural extension of the CSI-based trees. To identify plausible models we use the Bayesian score in combination with a greedy search algorithm. A comparison against ordinary Bayesian networks shows that models with local structures in general enjoy parametric sparsity and improved out-of-sample predictive performance; however, it is often necessary to regulate the model fit with an appropriate model structure prior to avoid overfitting in the learning process. The tree structures, in particular, lead to high quality models and suggest considerable potential for further exploration.
Highlights: We study the effect of including local structures in learning of Bayesian networks. We introduce partial conditional independence to characterize the restrictions. The local structures are modeled using various graph-based representations. The models are learned using a Bayesian score and a greedy search algorithm. In general, local structures improve the predictive accuracy of the learned models.
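The idea of context-specific independence in a CPT can be made concrete with a small Python sketch. The table below is hypothetical: a binary node X with binary parents A and B, where X does not depend on B in the context A = 1. The helper names are illustrative and are not taken from the paper.

```python
# Hypothetical CPT: P(X = 1 | A, B) for binary parents A and B.
cpt = {
    (0, 0): 0.2,
    (0, 1): 0.7,
    (1, 0): 0.4,  # same value for B = 0 and B = 1 when A = 1,
    (1, 1): 0.4,  # so X is independent of B in the context A = 1
}


def csi_contexts(table):
    """Return the values a of A for which X is independent of B given A = a."""
    contexts = []
    for a in sorted({a_val for a_val, _ in table}):
        probs = {p for (a_val, _), p in table.items() if a_val == a}
        if len(probs) == 1:
            contexts.append(a)
    return contexts


def free_parameters(table):
    """Count parameters after merging identical rows within each context of A."""
    total = 0
    for a in sorted({a_val for a_val, _ in table}):
        total += len({p for (a_val, _), p in table.items() if a_val == a})
    return total


print(csi_contexts(cpt))     # [1]
print(len(cpt))              # 4 parameters in the unrestricted CPT
print(free_parameters(cpt))  # 3 parameters once the A = 1 rows are merged
```

The saving of one parameter here is the kind of parametric sparsity the abstract refers to; with more parents or more states per variable the reduction can be considerably larger.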
Computational Statistics | 2016
Henrik Nyman; Johan Pensar; Timo Koski; Jukka Corander
Log-linear models are popular workhorses for analyzing contingency tables. A log-linear parameterization of an interaction model can be more expressive than a direct parameterization based on probabilities, leading to a powerful way of defining restrictions derived from marginal, conditional and context-specific independence. However, parameter estimation is often simpler under a direct parameterization, provided that the model enjoys certain decomposability properties. Here we introduce a cyclical projection algorithm for obtaining maximum likelihood estimates of log-linear parameters under an arbitrary context-specific graphical log-linear model, which need not satisfy criteria of decomposability. We illustrate that lifting the restriction of decomposability makes the models more expressive, such that additional context-specific independencies embedded in real data can be identified. It is also shown how a context-specific graphical model can correspond to a non-hierarchical log-linear parameterization with a concise interpretation. This observation can pave the way to further development of non-hierarchical log-linear models, which have been largely neglected due to their believed lack of interpretability.
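To give a feel for what a cyclical projection looks like, the Python sketch below implements classical iterative proportional fitting (IPF) for a two-way table: the fitted table is rescaled to match the row margins, then the column margins, and the cycle is repeated. This is the well-known IPF procedure for ordinary log-linear models, offered only as an analogy; it is not the context-specific algorithm introduced in the paper, and the function name, the fixed number of cycles, and the example margins are assumptions made for illustration.

```python
def ipf(seed, row_targets, col_targets, n_cycles=50):
    """Iterative proportional fitting for a two-way table with positive cells.

    seed:        starting table as a list of rows
    row_targets: desired row sums
    col_targets: desired column sums (same grand total as row_targets)
    """
    fitted = [row[:] for row in seed]
    for _ in range(n_cycles):
        # Projection 1: rescale every row to match its target sum.
        for i, row in enumerate(fitted):
            row_sum = sum(row)
            fitted[i] = [v * row_targets[i] / row_sum for v in row]
        # Projection 2: rescale every column to match its target sum.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in fitted)
            for row in fitted:
                row[j] *= target / col_sum
    return fitted


# Hypothetical margins; a uniform seed yields the independence fit.
table = ipf([[1.0, 1.0], [1.0, 1.0]], [30.0, 70.0], [40.0, 60.0])
for row in table:
    print([round(v, 6) for v in row])
```

With a uniform seed the fit is the product of the margins divided by the total, so a single cycle already suffices here; a non-uniform seed keeps its odds ratio and typically needs several cycles before both sets of margins agree.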
Advanced Data Analysis and Classification | 2016
Henrik Nyman; Jie Xiong; Johan Pensar; Jukka Corander
An inductive probabilistic classification rule must generally obey the principles of Bayesian predictive inference, such that all observed and unobserved stochastic quantities are jointly modeled and the parameter uncertainty is fully acknowledged through the posterior predictive distribution. Several such rules have been recently considered and their asymptotic behavior has been characterized under the assumption that the observed features or variables used for building a classifier are conditionally independent given a simultaneous labeling of both the training samples and those from an unknown origin. Here we extend the theoretical results to predictive classifiers acknowledging feature dependencies either through graphical models or sparser alternatives defined as stratified graphical models. We show through experimentation with both synthetic and real data that the predictive classifiers encoding dependencies have the potential to substantially improve classification accuracy compared with both standard discriminative classifiers and the predictive classifiers based on solely conditionally independent features. In most of our experiments stratified graphical models show an advantage over ordinary graphical models.
Metallurgical and Materials Transactions B-process Metallurgy and Materials Processing Science | 2012
Henrik Nyman; Tarja Talonen; Antti Roine; Mikko Hupa; Jukka Corander
In chemistry and engineering, thermodynamic databases are widely used to obtain the basic properties of pure substances or mixtures. Large and reliable databases are the basis of all thermodynamic modeling of complex chemical processes or systems. However, the effort needed in the establishment, maintenance, and management of a database increases exponentially along with the size and scope of the database. Therefore, we developed a statistical modeling approach to assist an expert in the evaluation and management process, which can pinpoint various types of erroneous records in a database. We have applied this method to investigate the enthalpy, entropy, and heat capacity characteristics in a large commercial database for approximately 25,000 chemical species. Our highly successful results show that a statistical approach is a valuable tool (1) for the management of such databases and (2) to create enthalpy, entropy and heat capacity estimates for such species in which thermochemical data are not available.
Statistics and Computing | 2017
Tomi Janhunen; Martin Gebser; Jussi Rintanen; Henrik Nyman; Johan Pensar; Jukka Corander
Statistical model learning problems are traditionally solved using either heuristic greedy optimization or stochastic simulation, such as Markov chain Monte Carlo or simulated annealing. Recently, there has been an increasing interest in the use of combinatorial search methods, including those based on computational logic. Some of these methods are particularly attractive since they can also be successful in proving the global optimality of solutions, in contrast to stochastic algorithms that only guarantee optimality at the limit. Here we improve and generalize a recently introduced constraint-based method for learning undirected graphical models. The new method combines perfect elimination orderings with various strategies for solution pruning and offers a dramatic improvement both in terms of time and memory complexity. We also show that the method is capable of efficiently handling a more general class of models, called stratified/labeled graphical models, which have an astronomically larger model space.
Global Policy | 2015
Peter Sarlin; Henrik Nyman
The 2007–2008 financial crisis has paved the way for the use of macroprudential policies in supervising the financial system as a whole. This paper views macroprudential oversight in Europe as a process, a sequence of activities with the ultimate aim of safeguarding financial stability. To conceptualize a process in this context, we introduce the notion of a public collaborative process (PCP). PCPs involve multiple organizations with a common objective, where a number of dispersed organizations cooperate under various unstructured forms and take a collaborative approach to reaching the final goal. We argue that PCPs can and should essentially be managed using the tools and practices common for business processes. To this end, we conduct an assessment of process readiness for macroprudential oversight in Europe. Based upon interviews with key European policymakers and supervisors, we provide an analysis model to assess the maturity of five process enablers for macroprudential oversight. With the results of our analysis, we give clear recommendations on the areas that need further attention when macroprudential oversight is being developed, in addition to providing a general purpose framework for monitoring the impact of improvement efforts.