Featured Research

Data Analysis Statistics And Probability

A unified and automated approach to attractor reconstruction

We present a fully automated method for optimal state-space reconstruction from univariate and multivariate time series. The proposed methodology generalizes the time-delay embedding procedure by unifying two promising ideas in a symbiotic fashion. Using non-uniform delays allows the successful reconstruction of systems exhibiting multiple time scales. In contrast to established methods, the minimization of an appropriate cost function determines the embedding dimension without a threshold parameter. Moreover, the method can detect stochastic time series and thus handles noise-contaminated input without parameter adjustment. The advantages of the proposed method are demonstrated on paradigmatic models and on experimental data from chaotic chemical oscillators.
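The time-delay embedding that the method generalizes can be sketched in a few lines of Python; the non-uniform delay values below are illustrative placeholders, not values selected by the paper's cost function.

```python
import numpy as np

def delay_embed(x, delays):
    """Reconstruct a state space from a scalar series x using a set of
    (possibly non-uniform) integer delays, as in time-delay embedding."""
    max_d = max(delays)
    return np.column_stack([x[max_d - d : len(x) - d] for d in delays])

# Example: embed a sine wave with non-uniform delays (illustrative values).
t = np.linspace(0, 20 * np.pi, 2000)
x = np.sin(t)
X = delay_embed(x, delays=[0, 3, 11])  # shape (1989, 3)
```

Each row of `X` is one reconstructed state vector; choosing the delays and the number of columns automatically is exactly what the proposed cost-function minimization addresses.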

ABCNet: An attention-based method for particle tagging

In high-energy physics, graph-based implementations have the advantage of treating input data sets in a form similar to how they are collected by collider experiments. Building on this concept, we propose ABCNet, a graph neural network enhanced by attention mechanisms. To exemplify the advantages and flexibility of treating collider data as a point cloud, two physically motivated problems are investigated: quark-gluon discrimination and pileup reduction. The former is an event-by-event classification, while the latter requires a classification score for each reconstructed particle. For both tasks, ABCNet shows improved performance compared with other available algorithms.
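As a much-simplified illustration of attention over a particle point cloud (a toy stand-in, not ABCNet's actual architecture), one can pool a variable-length set of particles into a fixed-size event representation with attention weights; the projection vector `w` below is a hypothetical stand-in for trained parameters.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_pool(particles, w):
    """Pool a variable-length particle point cloud into one event vector:
    each particle gets an attention weight, and the event representation is
    the weighted sum of particle features."""
    scores = softmax(particles @ w)  # one weight per particle
    return scores @ particles        # weighted sum over particles

rng = np.random.default_rng(0)
event = rng.normal(size=(17, 4))    # 17 particles, 4 features each
w = rng.normal(size=4)              # hypothetical learned projection
pooled = attention_pool(event, w)   # fixed-size vector, shape (4,)
```

The same mechanism also yields a per-particle score (here, `scores`), which is the kind of output the pileup-reduction task needs.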

AMORPH: A statistical program for characterizing amorphous materials by X-ray diffraction

AMORPH utilizes a new Bayesian statistical approach to interpreting X-ray diffraction results from samples with both crystalline and amorphous components. AMORPH fits X-ray diffraction patterns with a mixture of narrow and wide components, simultaneously inferring all model parameters and quantifying their uncertainties. The program models background patterns that were previously fitted manually, providing reproducible results and significantly reducing inter- and intra-user biases. This approach quantifies both the amorphous and crystalline materials and characterizes the amorphous component, including properties such as its centre of mass, width, skewness, and non-Gaussianity. Results demonstrate the applicability of this program for calculating the amorphous content of volcanic materials and for independently modeling their properties in compositionally variable materials.
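The narrow-plus-wide decomposition can be illustrated with a toy forward model (hypothetical peak parameters, not AMORPH's Bayesian inference): the amorphous content then follows as the wide component's share of the integrated intensity.

```python
import numpy as np

def gaussian(x, amp, mu, sigma):
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Hypothetical pattern: one sharp crystalline peak plus one broad amorphous hump.
two_theta = np.linspace(10.0, 60.0, 1000)
dx = two_theta[1] - two_theta[0]
crystalline = gaussian(two_theta, amp=5.0, mu=28.0, sigma=0.2)
amorphous = gaussian(two_theta, amp=1.0, mu=25.0, sigma=6.0)
pattern = crystalline + amorphous  # what the diffractometer would record

# Amorphous content as the wide component's share of integrated intensity.
frac_amorphous = amorphous.sum() * dx / (pattern.sum() * dx)
```

AMORPH infers the component parameters (and their uncertainties) from the measured `pattern`; here they are simply assumed, so the fraction is computed directly.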

Accounting for model errors in iterative ensemble smoothers

In the strong-constraint formulation of the history-matching problem, we assume that all model errors relate to a selection of uncertain model input parameters. One does not account for additional model errors that could result from, e.g., excluded uncertain parameters, neglected physics in the model formulation, the use of an approximate model forcing, or discretization errors from numerical approximations. If parameters with significant uncertainties are unaccounted for, there is a risk of an unphysical update of some uncertain parameters that compensates for errors in the omitted ones. This paper gives the theoretical foundation for introducing model errors in ensemble methods for history matching. In particular, we explain procedures for practically including model errors in iterative ensemble smoothers such as ESMDA and IES, and we demonstrate the impact of adding (or neglecting) model errors in the parameter-estimation problem.
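One simple way to account for additive model error in an ensemble-smoother update, sketched below under the assumption of Gaussian model error with a known covariance `C_q`, is to fold `C_q` into the observation-error covariance. This is a minimal illustration of the idea, not the ESMDA or IES schemes themselves.

```python
import numpy as np

rng = np.random.default_rng(1)

def smoother_update(M, D, d_obs, C_e, C_q):
    """One ensemble-smoother update in which additive model error is
    accounted for by inflating the data-mismatch covariance: C_e + C_q."""
    Ne = M.shape[1]
    C_tot = C_e + C_q
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_md = dM @ dD.T / (Ne - 1)   # cross-covariance parameters/data
    C_dd = dD @ dD.T / (Ne - 1)   # predicted-data covariance
    # Perturb observations with the total (observation + model) error.
    E = rng.multivariate_normal(np.zeros(len(d_obs)), C_tot, size=Ne).T
    K = C_md @ np.linalg.inv(C_dd + C_tot)
    return M + K @ (d_obs[:, None] + E - D)

# Toy linear problem: d = G m, with an assumed model-error covariance.
G = np.array([[1.0, 0.5], [0.0, 2.0]])
M = rng.normal(size=(2, 200))          # prior parameter ensemble
D = G @ M                              # predicted data ensemble
d_obs = np.array([1.0, 2.0])
C_e = 0.01 * np.eye(2)                 # observation-error covariance
C_q = 0.05 * np.eye(2)                 # assumed model-error covariance
M_post = smoother_update(M, D, d_obs, C_e, C_q)
```

Neglecting model error here would simply mean passing `C_q = 0`, making the update overconfident in the forward model.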

Accuracy and precision of the estimation of the number of missing levels in chaotic spectra using long-range correlations

We study the accuracy and precision of estimating the fraction of observed levels φ in quantum chaotic spectra through long-range correlations. We focus on the main statistics for which theoretical formulas for the fraction of missing levels have been derived: the Δ3 statistic of Dyson and Mehta and the power spectrum of the δn statistic. We use Monte Carlo simulations of spectra from the diagonalization of Gaussian Orthogonal Ensemble matrices, with a definite number of levels randomly removed, to fit the formulas and calculate the distribution of the estimators for different spectrum sizes and values of φ. A proper averaging of the power spectrum of the δn statistic must be performed to avoid systematic errors in the estimation. Once this averaging is done, the estimation of the fraction of observed levels is quite accurate for both methods, even for the lowest dimension we consider, d = 100. However, the precision is generally better when estimating with the power spectrum of δn than with the Δ3 statistic, and this difference grows with dimension. Our results show that a careful analysis of the fitted value, in view of the ensemble distribution of the estimates, is mandatory for understanding its actual significance and for giving a realistic error interval.
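The δn statistic and its power spectrum can be computed as follows; for brevity the toy spectrum uses exponential (Poisson) spacings rather than GOE eigenvalues, and the fit of the theoretical missing-level formulas for φ is not shown.

```python
import numpy as np

def delta_n(levels):
    """delta_n statistic of an unfolded spectrum: running deviation of the
    level sequence from a perfectly rigid (equally spaced) one."""
    s = np.diff(levels)
    n = np.arange(1, len(s) + 1)
    return np.cumsum(s) - n * s.mean()

def power_spectrum(d):
    """Power spectrum of the delta_n series; in practice it must be averaged
    over an ensemble of spectra before fitting the missing-level formulas."""
    return np.abs(np.fft.rfft(d)) ** 2 / len(d)

rng = np.random.default_rng(2)
levels = np.cumsum(rng.exponential(1.0, size=1000))  # toy unfolded spectrum
phi = 0.8
observed = levels[rng.random(levels.size) < phi]     # ~20% of levels missing
P = power_spectrum(delta_n(observed))
```

A single realization of `P` fluctuates strongly; the abstract's point is that the estimator for φ is only reliable after a proper ensemble average of such spectra.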

Accurate reconstruction of EBSD datasets by a multimodal data approach using an evolutionary algorithm

A new method has been developed for the correction of distortions and/or enhanced phase differentiation in Electron Backscatter Diffraction (EBSD) data. Using a multimodal data approach, the method uses segmented images of the phase of interest (laths, precipitates, voids, inclusions) from backscattered- or secondary-electron images of the same area as the EBSD map. The proposed approach then searches for the best transformation to correct their relative distortions and recombines the data into a new EBSD file. Speckles of the features of interest are first segmented in both the EBSD and image data modes. The speckle extracted from the EBSD data is then meshed, and the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is used to distort the mesh until the speckles superimpose. The quality of the match is quantified via a score linked to the number of overlapping pixels in the speckles. The locations of the points of the distorted mesh are compared with their initial positions to create pairs of matching points, which are used to calculate the polynomial function that best describes the distortion. This function is then applied to undistort the EBSD data, and the phase information is inferred from the segmented speckle. Fast and versatile, the method requires no human annotation and can be applied to large datasets and wide areas. Moreover, it makes very few assumptions about the shape of the distortion function. It can be used to compensate distortions alone or combined with phase differentiation. The accuracy of the method is of the order of the pixel size. Application examples in multiphase materials with feature sizes down to 1 μm are presented, including the Ti-6Al-4V titanium alloy and the Rene 65 and additively manufactured Inconel 718 nickel-base superalloys.
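The matching score that drives the optimization can be illustrated as follows. The paper quantifies match quality via overlapping speckle pixels; the intersection-over-union below is one plausible form of such a score, not necessarily the exact definition used.

```python
import numpy as np

def overlap_score(speckle_a, speckle_b):
    """Matching score between two binary speckle images: fraction of 'on'
    pixels that overlap (intersection over union). CMA-ES would distort the
    mesh to maximize this score."""
    inter = np.logical_and(speckle_a, speckle_b).sum()
    union = np.logical_or(speckle_a, speckle_b).sum()
    return inter / union if union else 1.0

a = np.array([[1, 1], [0, 0]], dtype=bool)  # toy EBSD-side speckle
b = np.array([[1, 0], [0, 0]], dtype=bool)  # toy image-side speckle
score = overlap_score(a, b)  # one shared pixel out of two in the union
```

In the actual method, each CMA-ES candidate is a distorted mesh; the speckle is resampled through it and scored against the reference image before the best polynomial distortion is fitted from the matched mesh points.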

Active Importance Sampling for Variational Objectives Dominated by Rare Events: Consequences for Optimization and Generalization

Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, the function-approximation techniques that have predominated in scientific computing do not scale well with dimensionality. As a result, many high-dimensional sampling and approximation problems once thought intractable are being revisited through the lens of machine learning. While the promise of unparalleled accuracy may suggest a renaissance for applications that require parameterizing representations of complex systems, in many applications gathering sufficient data to develop such a representation remains a significant challenge. Here we introduce an approach that combines rare-event sampling techniques with neural network optimization to optimize objective functions that are dominated by rare events. We show that importance sampling reduces the asymptotic variance of the solution to a learning problem, suggesting benefits for generalization. We study our algorithm in the context of learning dynamical transition pathways between two states of a system, a problem with applications in statistical physics and implications for machine learning theory. Our numerical experiments demonstrate that we can successfully learn even under the compounding difficulties of high dimensionality and rare data.
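The variance-reduction mechanism at the heart of the approach can be seen in a classic one-dimensional example (unrelated to the transition-pathway application itself): estimating a Gaussian tail probability by drawing from a proposal shifted into the rare region and reweighting by the likelihood ratio.

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(x):
    """Standard normal pdf."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

a, N = 4.0, 100_000  # estimate P(X > 4) for X ~ N(0, 1), about 3.17e-5

# Naive Monte Carlo: almost no samples reach the rare region, so the
# estimator has huge relative variance.
x = rng.normal(size=N)
naive = (x > a).mean()

# Importance sampling: draw from a proposal centred on the rare region
# and reweight each sample by the likelihood ratio p(y)/q(y).
y = rng.normal(loc=a, size=N)
w = phi(y) / phi(y - a)
is_est = ((y > a) * w).mean()
```

With the same budget, the reweighted estimator concentrates tightly around the true tail probability, which is the asymptotic-variance reduction the abstract exploits for rare-event-dominated objectives.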

Adaptive Decision Making via Entropy Minimization

An agent choosing between various actions tends to take the one with the lowest cost. But this choice is arguably too rigid (not adaptive) to be useful in complex situations, e.g., where the exploration-exploitation trade-off is relevant in creative task solving, or when stated preferences differ from revealed ones. Here we study an agent who is willing to sacrifice a fixed amount of expected utility for adaptation. How can (or ought) such an agent choose an optimal (in a technical sense) mixed action? We explore the consequences of making this choice via entropy minimization, which we argue is a specific example of risk aversion. This recovers the ϵ-greedy probabilities known from reinforcement learning. We show that entropy minimization leads to rudimentary forms of intelligent behavior: (i) the agent assigns a non-negligible probability to costly events; (ii) when confronted with two actions of comparable cost, it chooses the cheaper one (the lesser of two evils) with sizable probability; and (iii) the agent is subject to effects similar to cognitive dissonance and frustration. None of these features is shown by entropy maximization.
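A small numerical illustration (not the paper's derivation): minimize entropy over mixed actions subject to a fixed expected-cost sacrifice Δ above the minimum. Because entropy is concave, the minimizer lies at a vertex of the constraint polytope, i.e. a two-action mixture, which can be found by enumeration.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def min_entropy_policy(costs, delta):
    """Mixed action minimizing entropy subject to an expected cost exactly
    `delta` above the minimum. Enumerates two-action mixtures, since a
    concave function attains its minimum at a vertex of the polytope."""
    costs = np.asarray(costs, dtype=float)
    target = costs.min() + delta
    best = None
    n = len(costs)
    for j in range(n):
        for k in range(n):
            if costs[j] <= target <= costs[k] and costs[j] < costs[k]:
                q = (target - costs[j]) / (costs[k] - costs[j])
                p = np.zeros(n)
                p[j], p[k] = 1.0 - q, q
                if best is None or entropy(p) < entropy(best):
                    best = p
    return best

p = min_entropy_policy([1.0, 2.0, 5.0], delta=0.5)
# Most mass stays on the cheapest action, with a small probability on the
# most costly one -- the epsilon-greedy-like structure noted in the text.
```

With these (hypothetical) costs, the agent indeed assigns non-negligible probability to the costly action, matching feature (i) of the abstract.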

Adaptive covariance inflation in the ensemble Kalman filter by Gaussian scale mixtures

This paper studies multiplicative inflation: the complementary scaling of the state covariance in the ensemble Kalman filter (EnKF). Firstly, error sources in the EnKF are catalogued and discussed in relation to inflation; nonlinearity is given particular attention as a source of sampling error. In response, the "finite-size" refinement known as the EnKF-N is re-derived via a Gaussian scale mixture, again demonstrating how it yields adaptive inflation. Existing methods for adaptive inflation estimation are reviewed, and several insights are gained from a comparative analysis. One such adaptive inflation method is selected to complement the EnKF-N to make a hybrid that is suitable for contexts where model error is present and imperfectly parameterized. Benchmarks are obtained from experiments with the two-scale Lorenz model and its slow-scale truncation. The proposed hybrid EnKF-N method of adaptive inflation is found to yield systematic accuracy improvements in comparison with the existing methods, albeit to a moderate degree.
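The basic multiplicative-inflation operation that these adaptive schemes tune can be written in two lines; estimating the inflation factor `lam` adaptively (as the EnKF-N does) is the hard part and is not shown here.

```python
import numpy as np

def inflate(ensemble, lam):
    """Multiplicative inflation: scale ensemble anomalies about the mean by
    lam, leaving the mean unchanged and scaling the covariance by lam**2."""
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + lam * (ensemble - mean)

rng = np.random.default_rng(4)
E = rng.normal(size=(3, 50))   # 3 state variables, 50 ensemble members
E_infl = inflate(E, lam=1.1)   # same mean, covariance scaled by 1.21
```

Applied before (or after) the analysis step, this counteracts the systematic covariance underestimation catalogued in the paper as sampling and nonlinearity error.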

Advances of Machine Learning in Molecular Modeling and Simulation

In this review, we highlight recent developments in the application of machine learning for molecular modeling and simulation. After giving a brief overview of the foundations, components, and workflow of a typical supervised learning approach for chemical problems, we showcase areas and state-of-the-art examples of their deployment. In this context, we discuss how machine learning relates to, supports, and augments more traditional physics-based approaches in computational research. We conclude by outlining challenges and future research directions that need to be addressed in order to make machine learning a mainstream chemical engineering tool.
