Featured Research

Data Analysis Statistics And Probability

Interlaboratory consensus building challenge

This manuscript concerns an interlaboratory comparison involving eleven metrology institutes. It comprises four tasks: (i) deriving a consensus value from these results; (ii) evaluating the associated standard uncertainty; (iii) producing a coverage interval believed, with 95% confidence, to include the true value of which the consensus value is an estimate; and (iv) suggesting how the measurement result from NIST may be compared with the consensus value.
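
As a concrete illustration of tasks (i)-(iii), here is a minimal sketch of one common consensus-building approach, a DerSimonian-Laird random-effects fit; the challenge itself leaves the choice of method open, and the eleven values and uncertainties below are synthetic placeholders, not the study's data.

```python
import numpy as np

# Synthetic, illustrative lab results (NOT the data from the challenge):
# x[i] is lab i's measured value, u[i] its reported standard uncertainty.
x = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.4, 9.7, 10.0, 10.1, 9.95])
u = np.array([0.15, 0.20, 0.10, 0.25, 0.18, 0.12, 0.22, 0.30, 0.16, 0.14, 0.19])

# DerSimonian-Laird estimate of the between-lab variance tau^2.
w0 = 1.0 / u**2                         # fixed-effect weights
xbar0 = np.sum(w0 * x) / np.sum(w0)     # plain weighted mean
Q = np.sum(w0 * (x - xbar0)**2)         # Cochran's Q statistic
k = len(x)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w0) - np.sum(w0**2) / np.sum(w0)))

# Tasks (i) and (ii): random-effects consensus value and standard uncertainty.
w = 1.0 / (u**2 + tau2)
consensus = np.sum(w * x) / np.sum(w)
u_c = np.sqrt(1.0 / np.sum(w))

# Task (iii): approximate 95% coverage interval (normal approximation).
print(f"consensus = {consensus:.3f} +/- {u_c:.3f}, "
      f"95% interval [{consensus - 1.96 * u_c:.3f}, {consensus + 1.96 * u_c:.3f}]")

# Task (iv): a simple degree-of-equivalence check for one lab against the
# consensus; the minus sign accounts for that lab's contribution to it.
d = x[0] - consensus
u_d = np.sqrt(u[0]**2 + tau2 - u_c**2)
print(f"|d|/u(d) = {abs(d) / u_d:.2f}  (values above ~2 suggest a discrepancy)")
```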

Data Analysis Statistics And Probability

Interpretable Conservation Law Estimation by Deriving the Symmetries of Dynamics from Trained Deep Neural Networks

Understanding complex systems through reduced models is one of the central aims of scientific activity. Although physics has advanced greatly through the insights of physicists, it is sometimes challenging to build a reduced model of such complex systems on the basis of insight alone. We propose a novel framework that can infer the hidden conservation laws of a complex system from deep neural networks (DNNs) trained on physical data of the system. The purpose of the proposed framework is not to analyze physical data with deep learning, but to extract interpretable physical information from trained DNNs. Using Noether's theorem and an efficient sampling method, the proposed framework infers conservation laws by extracting the symmetries of the dynamics from trained DNNs. The framework is developed by deriving the relationship between the manifold structure of a time-series dataset and the necessary conditions of Noether's theorem. Its feasibility has been verified in some primitive cases for which the conservation law is well known. We also apply the framework to conservation-law estimation for a more practical case, a large-scale collective-motion system in a metastable state, and obtain a result consistent with that of a previous study.
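
The symmetry-extraction machinery is beyond a short sketch, but the final verification step is easy to illustrate: roll out a trained time-stepper and check that a candidate conserved quantity implied by a symmetry stays constant. In the sketch below, an analytic oscillator step is a stand-in for the trained DNN; this is an illustration of the idea, not the paper's method.

```python
import numpy as np

# Stand-in for a DNN time-stepper trained on physical data: an analytic
# symplectic-Euler step for a unit-mass harmonic oscillator, state s = (q, p).
def step(s, dt=1e-3):
    q, p = s
    p = p - dt * q          # dp/dt = -q
    q = q + dt * p          # dq/dt = p
    return np.array([q, p])

# Candidate conserved quantity tied to a symmetry of the dynamics: rotations
# in the (q, p) plane correspond, via Noether, to conservation of the energy.
def Q(s):
    return 0.5 * (s[0]**2 + s[1]**2)

# Sample initial states, roll the model out, and test whether Q stays constant.
rng = np.random.default_rng(0)
drifts = []
for _ in range(100):
    s = rng.normal(size=2)
    q0 = Q(s)
    for _ in range(5000):
        s = step(s)
    drifts.append(abs(Q(s) - q0) / q0)
print(f"median relative drift of Q along rollouts: {np.median(drifts):.1e}")
```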

Data Analysis Statistics And Probability

Iterative Bayesian Monte Carlo for nuclear data evaluation

In this work, we explore the use of an iterative Bayesian Monte Carlo (IBM) procedure for nuclear data evaluation within a TALYS Evaluated Nuclear Data Library (TENDL) framework. To identify the model and parameter combinations that reproduce selected experimental data, different physical models implemented within the TALYS code were sampled and varied simultaneously to produce random input files with unique model combinations. All the models considered were assumed to be equally probable a priori. The parameters of these models were then varied simultaneously using the TALYS code system to produce a set of random ENDF files, which were processed into x-y tables for comparison with selected experimental data from the EXFOR database within a Bayesian framework. To improve the fit to experimental data, we iteratively update our 'best' file (the file that maximises the likelihood function) by re-sampling model parameters around this file. The proposed method has been applied to the evaluation of p+Cd-111 and Co-59 in the 1 to 100 MeV incident-energy region. Finally, the adjusted files were compared with experimental data from the EXFOR database as well as with evaluations from the TENDL-2017 and JENDL-4.0/HE nuclear data libraries.
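
A minimal sketch of the IBM idea, with a toy exponential curve standing in for a TALYS calculation (running TALYS and processing ENDF files is an external pipeline in the real method): sample parameter vectors, score them against 'experimental' data via a Gaussian likelihood, keep the maximiser as the 'best file', and re-sample more narrowly around it.

```python
import numpy as np

rng = np.random.default_rng(1)
energies = np.linspace(1.0, 100.0, 20)    # incident energies (MeV)

# Toy stand-in for "run TALYS with parameters theta and tabulate a cross
# section at the experimental energies"; purely illustrative.
def model(theta, E):
    a, b = theta
    return a * np.exp(-E / abs(b))        # abs() guards unphysical samples

theta_true = np.array([120.0, 40.0])
y_exp = model(theta_true, energies) * rng.normal(1.0, 0.05, size=energies.size)
sigma = 0.05 * y_exp                      # 'experimental' uncertainties

def log_likelihood(theta):
    r = (model(theta, energies) - y_exp) / sigma
    return -0.5 * np.sum(r**2)

# IBM loop: sample widely, keep the likelihood-maximising 'best file', then
# re-sample parameters in a shrinking neighbourhood around it.
best, width = np.array([100.0, 30.0]), np.array([50.0, 20.0])
for it in range(5):
    thetas = best + width * rng.normal(size=(500, 2))
    ll = np.array([log_likelihood(t) for t in thetas])
    best = thetas[np.argmax(ll)]
    width *= 0.5
    print(f"iteration {it}: best theta = {best.round(2)}, logL = {ll.max():.1f}")
```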

Data Analysis Statistics And Probability

Iterative procedure for network inference

When a network is reconstructed from data, two types of error can occur: false positives and false negatives about the presence or absence of links. In this paper, the vertex degree distribution of the true underlying network is reconstructed analytically using an iterative procedure. The procedure is based on the inferred network and on estimates of the probabilities α and β of type I and type II errors, respectively. It consists of choosing successive values of α for the network-reconstruction steps. For the first step, the standard value α = 0.05 can be chosen, for example; the result gives a first estimate of the network topology of interest. For the second step, the value of α is adjusted according to the findings of the first. This procedure is iterated, ultimately leading to a reconstruction of the vertex degree distribution tailored to the previously unknown network topology.
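
The paper's iterative α-adjustment aside, the underlying error model is easy to sketch: a node of true degree k keeps each true link with probability 1 − β and gains spurious links with probability α, so the observed degree distribution is a known mixture of the true one, which can be inverted. The sketch below does a one-shot non-negative least-squares inversion on a simulated network; it illustrates the error model, not the authors' procedure.

```python
import numpy as np
from scipy.stats import binom
from scipy.optimize import nnls

rng = np.random.default_rng(2)
N, p = 200, 0.05
alpha, beta = 0.05, 0.3          # type I (false link) and type II (missed link)

# Ground-truth Erdos-Renyi network and a noisy "inferred" version of it.
A = np.triu((rng.random((N, N)) < p).astype(int), 1)
A = A + A.T
flip = np.where(A == 1, rng.random((N, N)) < beta, rng.random((N, N)) < alpha)
O = np.triu(np.where(flip, 1 - A, A), 1)
O = O + O.T

obs_dist = np.bincount(O.sum(axis=0), minlength=N)[:N] / N

# Forward model: a node of true degree k keeps Binomial(k, 1-beta) true links
# and gains Binomial(N-1-k, alpha) spurious ones; observed degree is the sum.
M = np.zeros((N, N))
for k in range(N):
    keep = binom.pmf(np.arange(N), k, 1 - beta)
    gain = binom.pmf(np.arange(N), N - 1 - k, alpha)
    M[:, k] = np.convolve(keep, gain)[:N]

# One-shot inversion for the true degree distribution.
est, _ = nnls(M, obs_dist)
est /= est.sum()
print("true mean degree     :", A.sum(axis=0).mean())
print("observed mean degree :", O.sum(axis=0).mean())
print("estimated mean degree:", (np.arange(N) * est).sum())
```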

Data Analysis Statistics And Probability

Iterative subtraction method for Feature Ranking

Training features used to analyse physical processes are often highly correlated, and determining which ones are most important for the classification is a non-trivial task. For the use case of a search for a top-quark pair produced in association with a Higgs boson decaying to bottom quarks at the LHC, we compare feature-ranking methods for a classification BDT. Ranking methods such as the BDT Selection Frequency, commonly used in High Energy Physics, and the Permutational Performance are compared with the computationally expensive Iterative Addition and Iterative Removal procedures, the last of which was found to be the most performant.
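
A minimal sketch of the Iterative Removal idea on synthetic data, with a gradient-boosted classifier standing in for the analysis BDT (the dataset and hyperparameters are invented for illustration): repeatedly retrain without each remaining feature and permanently drop the one whose absence costs the least, so that the removal order becomes the ranking. The O(n²) retraining for n features is why the procedure is computationally expensive.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic, partially redundant features as a stand-in for the ttH(bb) inputs.
X, y = make_classification(n_samples=2000, n_features=8, n_informative=4,
                           n_redundant=2, random_state=0)

# Iterative Removal: drop, at each round, the feature whose removal hurts the
# cross-validated score the least; the removal order is the inverse ranking.
ranking, remaining = [], list(range(X.shape[1]))
while len(remaining) > 1:
    scores = {}
    for f in remaining:
        cols = [c for c in remaining if c != f]
        clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
        scores[f] = cross_val_score(clf, X[:, cols], y, cv=3).mean()
    drop = max(scores, key=scores.get)   # least harmful feature to remove
    remaining.remove(drop)
    ranking.append(drop)
ranking.append(remaining[0])
print("features, least to most important:", ranking)
```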

Data Analysis Statistics And Probability

KLT Picker: Particle Picking Using Data-Driven Optimal Templates

Particle picking is currently a critical step in the cryo-EM single-particle reconstruction pipeline. Despite extensive work on this problem, it remains challenging for many data sets, especially for low-SNR micrographs. We present the KLT (Karhunen-Loève Transform) picker, which is fully automatic and requires as input only the approximate particle size. In particular, it does not require any manual picking. Our method is designed especially to handle low-SNR micrographs. It is based on learning a set of optimal templates through multivariate statistical analysis via the Karhunen-Loève Transform. We evaluate the KLT picker on publicly available data sets and present high-quality results with minimal manual effort.
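
A rough sketch of the template-learning step only, on a toy synthetic micrograph: the templates are the leading eigenvectors of the empirical patch covariance, the classical Karhunen-Loève/PCA construction. The actual KLT picker builds on this with, among other things, a noise model and the radial symmetry of particles; nothing below is the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic "micrograph": noise plus a few faint disc-shaped particles.
n, r = 256, 8
img = rng.normal(0.0, 1.0, (n, n))
yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
disc = (xx**2 + yy**2 <= r**2).astype(float)
for _ in range(20):
    cy, cx = rng.integers(r, n - r, size=2)
    img[cy - r:cy + r + 1, cx - r:cx + r + 1] += 0.5 * disc

# Learn KL templates: eigenvectors of the empirical patch covariance.
s = 2 * r + 1
patches = np.array([img[i:i + s, j:j + s].ravel()
                    for i in range(0, n - s, 4) for j in range(0, n - s, 4)])
patches -= patches.mean(axis=0)
cov = patches.T @ patches / len(patches)
eigvals, eigvecs = np.linalg.eigh(cov)
templates = eigvecs[:, ::-1][:, :5]       # top-5 KL templates

# Score each patch by its energy in the template subspace; peaks = candidates.
scores = ((patches @ templates)**2).sum(axis=1)
print("top candidate patch indices:", np.argsort(scores)[-5:])
```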

Data Analysis Statistics And Probability

KLTS: A rigorous method to compute the confidence intervals for the Three-Cornered Hat and for Groslambert Covariance

The three-cornered hat and Groslambert covariance methods are widely used to estimate the stability of each individual clock in a set of three, but no existing method gives reliable confidence intervals for large integration times. We propose a new method, KLTS (Karhunen-Loève Transform using Sufficient statistics), which uses these estimators to take into account the statistics of all the measurements between the pairs of clocks in a Bayesian way. The resulting cumulative distribution function (CDF) yields confidence intervals for each clock's Allan variance (AVAR). This CDF also provides a stability estimator which is always positive. Checked by massive Monte Carlo simulations, KLTS proves to be perfectly reliable even for one degree of freedom. An example of an experimental measurement is given.
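
For context, the classic (non-Bayesian) three-cornered hat point estimator, whose missing confidence intervals and occasional negative variance estimates are exactly what KLTS addresses, follows from assuming independent clocks, so that each pairwise AVAR is the sum of two individual ones: σ²_A = ½(σ²_AB + σ²_CA − σ²_BC), and cyclically. A one-liner per clock, with illustrative numbers:

```python
# Pairwise Allan variances at one integration time tau (illustrative values).
avar_AB, avar_BC, avar_CA = 2.3e-24, 3.1e-24, 2.7e-24

# Classic three-cornered hat: solve the three sum equations for each clock.
avar_A = 0.5 * (avar_AB + avar_CA - avar_BC)
avar_B = 0.5 * (avar_AB + avar_BC - avar_CA)
avar_C = 0.5 * (avar_BC + avar_CA - avar_AB)
print(avar_A, avar_B, avar_C)   # can go negative at large tau; KLTS cannot
```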

Data Analysis Statistics And Probability

Latent Representations of Dynamical Systems: When Two is Better Than One

A popular approach for predicting the future of dynamical systems involves mapping them into a lower-dimensional "latent space" where prediction is easier. We show that the information-theoretically optimal approach uses different mappings for the present and the future, in contrast to state-of-the-art machine-learning approaches where both mappings are the same. We illustrate this dichotomy by predicting the time evolution of coupled harmonic oscillators with dissipation and thermal noise, showing how the optimal two-mapping method significantly outperforms principal component analysis and all other approaches that use a single latent representation, and we discuss the intuitive reason why two representations are better than one. We conjecture that a single latent representation is optimal only for time-reversible processes, not for, e.g., text, speech, music, or out-of-equilibrium physical systems.
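
In a linear setting, the one-mapping versus two-mapping contrast can be sketched with PCA versus reduced-rank regression, which is free to choose different projections for present and future; this is an illustrative analogy, not the paper's information-theoretic construction. On a simulated noisy coupled-oscillator trajectory:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)

# Two coupled, damped, noisy harmonic oscillators (4-dim state), simulated
# with the exact one-step transition matrix of the linearised dynamics.
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-2.0, -0.2, 0.5, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.5, 0.0, -1.5, -0.2]])
dt, n = 0.05, 20000
F = expm(dt * A)
x = np.zeros((n, 4)); x[0] = rng.normal(size=4)
for t in range(n - 1):
    x[t + 1] = F @ x[t] + 0.05 * np.sqrt(dt) * rng.normal(size=4)

k, d = 20, 2                              # prediction horizon, latent dimension
X, Y = x[:-k] - x[:-k].mean(0), x[k:] - x[k:].mean(0)

# One mapping: a single PCA latent space, then a linear predictor from it.
W = np.linalg.svd(X, full_matrices=False)[2][:d].T     # shared PCA encoder
Z = X @ W
Y_pca = Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]

# Two mappings: reduced-rank regression picks its rank-d bottleneck to suit
# the present-to-future map, not the present alone.
B = np.linalg.lstsq(X, Y, rcond=None)[0]
V = np.linalg.svd(X @ B, full_matrices=False)[2][:d].T
Y_rrr = X @ (B @ V) @ V.T

print("one-mapping (PCA) MSE:", np.mean((Y - Y_pca)**2))
print("two-mapping (RRR) MSE:", np.mean((Y - Y_rrr)**2))
```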

Data Analysis Statistics And Probability

Learning Physics from Data: a Thermodynamic Interpretation

Experimental databases are typically very large and high-dimensional. Learning from them requires recognizing important features (a pattern), often present at scales different from that of the recorded data. Following the experience collected in statistical mechanics and thermodynamics, the process of recognizing the pattern (the learning process) can be seen as a dissipative, entropy-driven time evolution from a detailed level of description to a less detailed one. This is how thermodynamics enters machine learning. On the other hand, reversible (typically Hamiltonian) evolution is propagation within the levels of description, which must also be recognized. This is how Poisson geometry enters machine learning. Learning to handle free-surface liquids and damped rigid-body rotation serves as an illustration.
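
A minimal sketch of this reversible-plus-dissipative structure on one of the paper's illustrations, the damped rigid body (the coefficients are illustrative, and the split shown is the standard one, not necessarily the authors' learned model): the Poisson part conserves energy and the Casimir |m|², while the dissipative part destroys energy but keeps |m|² fixed.

```python
import numpy as np

# Damped rigid-body rotation: the reversible (Poisson) part m' = m x omega
# conserves both the energy and the Casimir |m|^2; the dissipative part
# tau * m x (m x omega) keeps |m|^2 fixed while destroying energy, so the
# motion relaxes to lower energy within a level set of |m|^2.
I = np.array([1.0, 2.0, 3.0])        # principal moments of inertia
tau = 0.5                            # illustrative dissipation strength

def rhs(m):
    omega = m / I                    # angular velocity
    return np.cross(m, omega) + tau * np.cross(m, np.cross(m, omega))

m, dt = np.array([0.2, 1.0, 0.1]), 1e-2
for step in range(20001):
    if step % 5000 == 0:
        E = 0.5 * np.sum(m**2 / I)
        print(f"t={step * dt:6.1f}  energy={E:.5f}  |m|^2={np.sum(m**2):.5f}")
    k1 = rhs(m); k2 = rhs(m + 0.5 * dt * k1)           # classical RK4 step
    k3 = rhs(m + 0.5 * dt * k2); k4 = rhs(m + dt * k3)
    m = m + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
```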

Data Analysis Statistics And Probability

Learning from power system data stream: phasor-detective approach

Assuming access to a synchronized stream of Phasor Measurement Unit (PMU) data over a significant portion of a power-system interconnect, say one controlled by an Independent System Operator (ISO), what can you extract about the past, current, and future state of the system? We have focused on answering this practical question pragmatically, empowered with nothing but standard tools of data analysis such as PCA, filtering, and cross-correlation analysis. Quite surprisingly, we have found that even during quiet "no significant events" periods this standard set of statistical tools allows the "phasor-detective" to extract important hidden anomalies from the data, such as problematic control loops at loads and wind farms, and mildly malfunctioning assets such as transformers and generators. We also discuss and sketch future challenges a mature phasor-detective could tackle by adding machine-learning and physics-modeling sophistication to the basic approach.
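
A toy sketch of this detective workflow on synthetic PMU-like data (the bus layout, rates, and injected anomaly are invented for illustration): strip the dominant common mode with PCA, flag buses with excess residual energy, then use cross-correlation to judge whether an anomaly is local or shared.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic stand-in for a PMU frequency stream: 10 buses, 30 samples/s,
# 5 minutes, with a common system mode plus a hidden oscillatory
# control-loop anomaly at one bus.
fs, T, n_pmu = 30, 300, 10
t = np.arange(fs * T) / fs
common = 0.01 * np.sin(2 * np.pi * 0.3 * t)          # inter-area-like mode
X = common[None, :] + 0.002 * rng.normal(size=(n_pmu, t.size))
X[7] += 0.004 * np.sin(2 * np.pi * 1.7 * t)          # anomaly at bus 7

# PCA: remove the dominant common mode, then look for buses with unusual
# residual energy.
Xc = X - X.mean(axis=1, keepdims=True)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
resid = Xc - np.outer(U[:, 0] * S[0], Vt[0])         # strip the first PC
energy = (resid**2).mean(axis=1)
print("residual energy per bus (relative):", np.round(energy / energy.min(), 1))
print("most anomalous bus:", int(np.argmax(energy)))

# Cross-correlation of the anomalous bus against the others helps decide
# whether the anomaly is local or propagating through the system.
a = resid[7] / np.linalg.norm(resid[7])
corr = resid @ a / np.linalg.norm(resid, axis=1)
print("correlation with bus 7 residual:", np.round(corr, 2))
```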

