Featured Research

Data Analysis, Statistics and Probability

Alternative to the application of PDG scale factors

The Particle Data Group recommends a set of procedures to be applied when discrepant data are to be combined. We introduce an alternative method based on a more general and solid statistical framework, providing a robust way to include possible unknown systematic effects interfering with experimental measurements or their theoretical interpretation. The limit of large data sets and practical cases of interest are discussed in detail.
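For context, the baseline that this abstract refers to can be sketched as follows. This is a minimal illustration of the commonly described PDG-style scale factor, not the alternative method the paper introduces: it takes an inverse-variance weighted mean, computes chi-squared, and inflates the error by S = sqrt(chi2 / (N - 1)) when S exceeds 1. The function name `pdg_style_average` is hypothetical, and the sketch assumes uncorrelated Gaussian errors and omits refinements such as excluding very imprecise measurements when computing S.

```python
import math


def pdg_style_average(values, errors):
    """Weighted mean with a PDG-style scale factor (simplified sketch).

    Assumes independent measurements with Gaussian errors.
    Returns (mean, scaled_error, scale_factor).
    """
    # Inverse-variance weights
    weights = [1.0 / e ** 2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / wsum
    err = math.sqrt(1.0 / wsum)

    # Chi-squared of the measurements about the weighted mean
    chi2 = sum(((x - mean) / e) ** 2 for x, e in zip(values, errors))
    ndf = len(values) - 1

    # Scale factor S; the error is only ever inflated, never reduced
    scale = math.sqrt(chi2 / ndf) if ndf > 0 else 1.0
    scale = max(scale, 1.0)
    return mean, err * scale, scale
```

For two discrepant measurements, 9.0 ± 1.0 and 11.0 ± 1.0, this sketch gives a mean of 10.0 and inflates the combined error by a factor of sqrt(2).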

An Accurate Data Cleaning Procedure for Electron Cyclotron Emission Imaging on EAST Tokamak Based on Methodology of Machine Learning

A new data cleaning procedure for electron cyclotron emission imaging (ECEI) on the EAST tokamak is constructed. Machine learning techniques, including support vector machines (SVM) and decision trees, are applied to identify saturated, zero, and weak signals in ECEI raw data, which both reduces the effort researchers spend on data analysis and improves the accuracy of data preprocessing. To enhance the reliability of the procedure, proper training sets are sampled from the large volume of raw data from ECEI experiments on the EAST tokamak. The window size of the temporal signal, the kernel function, and other model parameters are obtained after model training. The resulting recognition rates for saturated, zero, and weak signals in raw data are 99.4%, 99.86%, and 99.9%, respectively, which demonstrates the accuracy of this procedure.

An algorithm for the automatic deglitching of x-ray absorption spectroscopy data

Analysis of x-ray absorption spectroscopy (XAS) data often involves the removal of artifacts or glitches from the acquired signal, a process commonly known as deglitching. Glitches result either from specific orientations of monochromator crystals or from scattering by crystallites in the sample itself. Since the precise energy or wavelength location and the intensity of glitches in a spectrum cannot always be predicted, deglitching is often performed on a per-spectrum basis by the analyst. Some routines have been proposed, but they are prone to arbitrary selection of spectral artifacts and are often inadequate for processing large data sets. Here we present a statistically robust algorithm, implemented as a Python program, for the automatic detection and removal of glitches that can be applied to a large number of spectra. It uses a Savitzky-Golay filter to smooth spectra and the generalized extreme Studentized deviate test to identify outliers. We achieve robust, repeatable, and selective removal of glitches using this algorithm.

An efficient approach to global sensitivity analysis and parameter estimation for line gratings

Scatterometry is a fast, indirect and nondestructive optical method for quality control in the production of lithography masks. Geometry parameters of line gratings are obtained from diffracted light intensities by solving an inverse problem. To comply with the upcoming need for improved accuracy and precision and thus for the reduction of uncertainties, typically computationally expensive forward models have been used. In this paper we use Bayesian inversion to estimate parameters from scatterometry measurements of a silicon line grating and determine the associated uncertainties. Since the direct application of Bayesian inference using Markov-Chain Monte Carlo methods to a physics-based partial differential equation (PDE) model is not feasible due to high computational costs, we use an approximation of the PDE forward model based on a polynomial chaos expansion. The expansion provides not only a surrogate for the PDE forward model, but also Sobol indices for a global sensitivity analysis. Finally, we compare our results for the global sensitivity analysis with the uncertainties of estimated parameters.

An encryption-decryption framework for validating single-particle imaging

We propose an encryption-decryption framework for validating diffraction intensity volumes reconstructed using single-particle imaging (SPI) with x-ray free-electron lasers (XFELs) when the ground truth volume is absent. This framework exploits each reconstructed volume's ability to decipher latent variables (e.g. orientations) of unseen sentinel diffraction patterns. Using this framework, we quantify novel measures of orientation disconcurrence, inconsistency, and disagreement between the decryptions by two independently reconstructed volumes. We also study how these measures can be used to define data sufficiency and its relation to spatial resolution, and the practical consequences of focusing XFEL pulses to smaller foci. This framework overcomes critical ambiguities in using Fourier Shell Correlation (FSC) as a validation measure for SPI. Finally, we show how this encryption-decryption framework naturally leads to an information-theoretic reformulation of the resolving power of XFEL-SPI, which we hope will lead to principled frameworks for experiment and instrument design.

An improved method for the estimation of the Gumbel distribution parameters

The usual estimation methods for the parameters of extreme value distributions employ only a few values, wasting a lot of information. More precisely, in the case of the Gumbel distribution, only the block maxima are used. In this work, we propose a method that exploits all the available information in order to increase the accuracy of the estimates. This can be achieved by taking advantage of the existing relationship between the parameters of the baseline distribution, which generates data from the full sample space, and those of the limit Gumbel distribution. In this way, an informative prior distribution can be obtained. Different statistical tests are used to compare the behaviour of our method with the standard one, showing that the proposed method performs well when only very limited data are available. The empirical effectiveness of the approach is demonstrated through a simulation study and a case study. The reduction in credible interval width and the improvement in parameter location show that results obtained with the improved prior adapt to very limited data better than those of the standard method.

An iterative method to estimate the combinatorial background

The reconstruction of broad resonances is important for understanding the dynamics of heavy ion collisions. However, the large combinatorial background makes this objective very challenging. In this work, an innovative iterative method that identifies signal and background contributions without input models for normalization constants is presented. The technique is successfully validated on a simulated thermal cocktail of resonances. This demonstrates that the iterative procedure is a powerful tool to reconstruct multi-differentially inclusive resonant signals in high multiplicity events as produced in heavy ion collisions.

An open-source, end-to-end workflow for multidimensional photoemission spectroscopy

Characterization of the electronic band structure of solid state materials is routinely performed using photoemission spectroscopy. Recent advancements in short-wavelength light sources and electron detectors give rise to multidimensional photoemission spectroscopy, allowing parallel measurements of the electron spectral function simultaneously in energy, two momentum components and additional physical parameters with single-event detection capability. Efficient processing of the photoelectron event streams at a rate of up to tens of megabytes per second will enable rapid band mapping for materials characterization. We describe an open-source workflow that allows user interaction with billion-count single-electron events in photoemission band mapping experiments, compatible with beamlines at 3rd and 4th generation light sources and table-top laser-based setups. The workflow offers an end-to-end recipe from distributed operations on single-event data to structured formats for downstream scientific tasks and storage to materials science database integration. Both the workflow and processed data can be archived for reuse, providing the infrastructure for documenting the provenance and lineage of photoemission data for future high-throughput experiments.

Analysis of a bistable climate toy model with physics-based machine learning methods

We propose a comprehensive framework able to address predictability of both the first and the second kind for high-dimensional chaotic models. For this purpose, we analyse the properties of a newly introduced multistable climate toy model constructed by coupling the Lorenz '96 model with a zero-dimensional energy balance model. First, the attractors of the system are identified with Monte Carlo Basin Bifurcation Analysis. Additionally, we are able to detect the Melancholia state separating the two attractors. Then, Neural Ordinary Differential Equations are applied in order to predict the future state of the system in both of the identified attractors.

Analytical approach to network inference: Investigating degree distribution

When a network is reconstructed, two types of errors can occur: false positive and false negative errors about the presence or absence of links. In this paper, the influence of these two errors on the vertex degree distribution is analysed analytically. Moreover, an analytic formula for the density of the biased vertex degree distribution is found. For the inverse problem, we find a reliable procedure to reconstruct analytically the density of the vertex degree distribution of any network from the inferred network and from estimates of the false positive and false negative errors obtained from, e.g., simulation studies.
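To make the forward direction of this bias concrete, the following is a minimal sketch under a simple assumed error model, which is not necessarily the formula derived in the paper: each true link is missed independently with probability `fn_rate`, and each absent link is reported independently with probability `fp_rate`. Under that assumption, a vertex of true degree k in a network of n vertices has an observed degree equal to the sum of a Binomial(k, 1 - fn_rate) and a Binomial(n - 1 - k, fp_rate) variable. The function names are hypothetical.

```python
from math import comb


def binom_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a list of length n + 1."""
    return [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]


def observed_degree_pmf(true_pmf, n_vertices, fp_rate, fn_rate):
    """Degree distribution seen after independent link errors (assumed model).

    true_pmf[k] is the probability that a vertex has true degree k;
    its length must not exceed n_vertices.
    """
    m = n_vertices - 1  # number of potential neighbours of a vertex
    observed = [0.0] * (m + 1)
    for k, pk in enumerate(true_pmf):
        if pk == 0.0:
            continue
        kept = binom_pmf(k, 1.0 - fn_rate)    # true links that survive
        spurious = binom_pmf(m - k, fp_rate)  # absent links falsely reported
        # Convolve the two independent contributions
        for a, pa in enumerate(kept):
            for b, pb in enumerate(spurious):
                observed[a + b] += pk * pa * pb
    return observed
```

In this model the mean observed degree of a vertex with true degree k is k(1 - fn_rate) + (n - 1 - k) fp_rate, which shows how even a small false positive rate can dominate in sparse networks.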
