Featured Research

Data Analysis, Statistics and Probability

Maximum-likelihood parameter estimation in terahertz time-domain spectroscopy

We present a maximum-likelihood method for parameter estimation in terahertz time-domain spectroscopy. We derive the likelihood function for a parameterized frequency response function, given a pair of time-domain waveforms with known time-dependent noise amplitudes. The method yields parameter estimates that are superior to those of other commonly used methods, and it provides a reliable measure of the goodness of fit. We also develop a simple noise model parameterized by three dominant sources, and derive the likelihood function for their amplitudes in terms of a set of repeated waveform measurements. We demonstrate the method with applications to material characterization.
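To illustrate the idea, here is a minimal sketch of maximum-likelihood estimation for a parameterized frequency response, assuming Gaussian noise with known time-dependent amplitude. The two-parameter response H(f) = a·exp(-2πi f τ) (an amplitude and a delay) is a made-up stand-in for the paper's more elaborate response and noise models, and the brute-force grid search stands in for a proper optimizer.

```python
import numpy as np

def log_likelihood(theta, x, y, sigma_y, dt):
    """Gaussian log-likelihood of sample waveform y given reference x,
    a parameterized frequency response H(f; theta), and a (possibly
    time-dependent) noise amplitude sigma_y."""
    a, tau = theta
    f = np.fft.rfftfreq(len(x), d=dt)
    H = a * np.exp(-2j * np.pi * f * tau)          # hypothetical response model
    y_model = np.fft.irfft(H * np.fft.rfft(x), n=len(x))
    r = (y - y_model) / sigma_y                    # noise-weighted residual
    return -0.5 * np.sum(r ** 2 + np.log(2 * np.pi * sigma_y ** 2))

# Synthetic demonstration: delayed, attenuated Gaussian pulse plus noise.
rng = np.random.default_rng(1)
dt = 0.05
t = np.arange(256) * dt
x = np.exp(-0.5 * ((t - 3.0) / 0.2) ** 2)          # reference waveform
sigma_y = np.full(t.shape, 1e-3)                   # known noise amplitude
f = np.fft.rfftfreq(len(x), d=dt)
a_true, tau_true = 0.5, 1.0
y = np.fft.irfft(a_true * np.exp(-2j * np.pi * f * tau_true) * np.fft.rfft(x),
                 n=len(x)) + sigma_y * rng.standard_normal(t.shape)

# Maximum-likelihood estimate by brute-force grid search.
a_grid = np.linspace(0.3, 0.7, 41)
tau_grid = np.linspace(0.5, 1.5, 41)
ll = [[log_likelihood((a, tau), x, y, sigma_y, dt) for tau in tau_grid]
      for a in a_grid]
i, j = np.unravel_index(np.argmax(ll), (len(a_grid), len(tau_grid)))
a_hat, tau_hat = a_grid[i], tau_grid[j]
```

The minimized residual sum of squares at the maximum also serves as a chi-square goodness-of-fit statistic, which is the "reliable measure of the goodness of fit" the abstract refers to.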


Measures of spike train synchrony and directionality

Measures of spike train synchrony have become important tools in both experimental and theoretical neuroscience. Three time-resolved measures, the ISI-distance, the SPIKE-distance, and SPIKE-synchronization, have already been successfully applied in many different contexts. These measures are time-scale independent, since they consider all time scales as equally important. However, in real data one is typically less interested in the smallest time scales, and a more adaptive approach is needed. Therefore, in the first part of this chapter we describe recently introduced generalizations of the three measures that gradually disregard differences on smaller time scales. Besides similarity, another very relevant property of spike trains is the temporal order of spikes. In the second part of this chapter we address this property and describe a very recently proposed algorithm that quantifies the directionality within a set of spike trains. This multivariate approach sorts multiple spike trains from leader to follower and quantifies the consistency of the propagation patterns. Finally, all measures described in this chapter are freely available for download.
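As a rough illustration of the time-resolved idea, the following is a simplified, sampled sketch of the ISI-distance: at each sample time it compares the current interspike intervals of the two trains. It is a toy stand-in, not the adaptive generalization described above; reference implementations of all three measures are available in packages such as PySpike.

```python
import numpy as np

def isi_profile(spikes1, spikes2, t):
    """Sampled ISI-distance profile: at each time in t, compare the
    interspike intervals the two trains are currently in."""
    def current_isi(spikes, t):
        spikes = np.asarray(spikes, dtype=float)
        idx = np.searchsorted(spikes, t, side="right")
        idx = np.clip(idx, 1, len(spikes) - 1)
        return spikes[idx] - spikes[idx - 1]
    isi1 = current_isi(spikes1, t)
    isi2 = current_isi(spikes2, t)
    # Normalized absolute difference: 0 (identical) up to 1.
    return np.abs(isi1 - isi2) / np.maximum(isi1, isi2)

def isi_distance(spikes1, spikes2, t):
    """Time average of the dissimilarity profile."""
    return float(np.mean(isi_profile(spikes1, spikes2, t)))

# Toy example: a regular 1 Hz train versus a regular 2 Hz train.
t = np.linspace(0.1, 8.9, 500)
train_a = np.arange(0.0, 10.0, 1.0)
train_b = np.arange(0.0, 10.0, 0.5)
d_same = isi_distance(train_a, train_a, t)   # identical trains
d_diff = isi_distance(train_a, train_b, t)   # ISIs of 1.0 vs 0.5
```

For the two regular trains, the profile is |1.0 - 0.5| / 1.0 = 0.5 everywhere, so the distance is 0.5; identical trains give 0.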


Measuring topological descriptors of complex networks under uncertainty

Revealing the structural features of a complex system from the observed collective dynamics is a fundamental problem in network science. In order to compute the various topological descriptors commonly used to characterize the structure of a complex system (e.g., the degree or the clustering coefficient), it is usually necessary to completely reconstruct the network of relations between the subsystems. Several methods are available to detect the existence of interactions between the nodes of a network: by observing some physical quantities through time, the structural relationships are inferred using various discriminating statistics (e.g., correlations, mutual information). In this setting, the uncertainty about the existence of the edges is reflected in the uncertainty about the topological descriptors. In this study, we propose a novel methodological framework to evaluate this uncertainty, replacing the topological descriptors, even at the level of a single node, with appropriate probability distributions, thus bypassing the reconstruction phase. Our theoretical framework agrees with numerical experiments performed on a large set of synthetic and real-world networks. Our results provide a grounded framework for the analysis and interpretation of widely used topological descriptors, such as degree centrality, clustering, and clusters, in scenarios where the existence of network connectivity is statistically inferred or where the probabilities of existence π_ij of the edges are known. To this end, we also provide a simple and mathematically grounded procedure to transform the discriminating statistics into the probabilities π_ij.
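To see how a descriptor becomes a distribution, consider a single node's degree when each edge i-j exists independently with probability π_ij: the degree is then a sum of independent Bernoulli variables, i.e., Poisson-binomial. A minimal sketch (the π_ij values below are made up, and the abstract's full framework covers more descriptors than the degree):

```python
import numpy as np

def degree_pmf(p_row):
    """Exact probability mass function of a node's degree when edge i-j
    exists independently with probability p_row[j] (Poisson-binomial,
    built by convolving one Bernoulli pmf per potential edge)."""
    pmf = np.array([1.0])
    for p in p_row:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

# Hypothetical edge-existence probabilities for one node's 4 potential edges.
p_row = np.array([0.9, 0.5, 0.5, 0.1])
pmf = degree_pmf(p_row)                    # P(degree = 0), ..., P(degree = 4)
mean_degree = p_row.sum()                  # expectation of the distribution
var_degree = (p_row * (1.0 - p_row)).sum() # variance of the distribution
```

The mean and variance follow directly from the Bernoulli sum, and the convolution gives the full distribution that replaces the single point estimate of the degree.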


Method of fractal diversity in data science problems

A signal-to-noise ratio (SNR) parameter is introduced for distinguishing the Gaussian function (the distribution of random variables in the absence of cross-correlation) from other distributions, which makes it possible to describe collective states with strong cross-correlation in the data. The SNR in one-dimensional space is defined, and a calculation algorithm based on the fractal variety of the Cantor dust in a closed loop is given. The algorithm is invariant under linear transformations of the initial data set, possesses renormalization-group invariance, and quantifies the intensity of cross-correlation (the collective effect) in the data. The description of the collective state is universal and does not depend on the nature of the data correlations, just as the distribution of random variables is universal in the absence of correlations. The method is applicable to large sets of non-Gaussian or anomalous data obtained in information technology. Confirming Koshland's hypothesis, application of the method to the intensity data of digital X-ray diffraction spectra, with calculation of the collective effect, makes it possible to identify a conformer exhibiting biological activity.


Methods to Quantify Dislocation Behavior with Dark-field X-ray Microscopy Timescans of Single-Crystal Aluminum

Crystal defects play a large role in how materials respond to their surroundings, yet there are many uncertainties in how extended defects form, move, and interact deep beneath a material's surface. A newly developed imaging diagnostic, dark-field X-ray microscopy (DFXM), can now visualize the behavior of line defects, known as dislocations, in materials under varying conditions. DFXM visualizes dislocations by imaging the very subtle long-range distortions they induce in the material's crystal lattice, which produce a characteristic adjoined pair of bright and dark regions. Full analysis of how these dislocations evolve can be used to refine material models; however, it requires quantitative characterization of the statistics of their shape, position, and motion. In this paper, we present a semi-automated approach to effectively isolate, track, and quantify the behavior of dislocations as composite objects. This analysis drives the statistical characterization of the defects, including, for example, dislocation velocity and orientation in the crystal, and is demonstrated on DFXM images measuring the evolution of defects at 98% of the melting temperature of single-crystal aluminum, collected at the European Synchrotron Radiation Facility.


Mode hunting through active information

We propose a new method to find modes based on active information. We develop an algorithm that, when applied to the whole space, will say whether any modes are present and, if so, where they are; the algorithm reduces the dimensionality without resorting to principal components and, more importantly, at the population level it will not detect modes when none are present.


Model selection in the average of inconsistent data: an analysis of the measured Planck-constant values

Fitting a constant to a set of measured values is a long-debated problem when the data do not conform to the hypothesis of a known sampling variance. Given the data, fitting requires finding which measurand value is the most trustworthy. Bayesian inference is reviewed here as a way to assign probabilities to the possible measurand values. Different hypotheses about the data variance are tested by Bayesian model comparison. Finally, model selection is exemplified by deriving an estimate of the Planck constant.
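The model-comparison step can be sketched numerically. Below, model A takes the stated uncertainties at face value, while model B rescales all of them by a common unknown factor (one simple hypothesis for inconsistent data; the paper compares several). Evidences are computed by brute-force integration over flat priors on uniform grids; the example data are made up.

```python
import numpy as np

def log_evidence_fixed(x, sigma, mu_grid):
    """Model A: stated sigmas are correct. Log evidence with a flat
    prior for the measurand mu over a uniform grid."""
    ll = -0.5 * np.sum(((x[None, :] - mu_grid[:, None]) / sigma[None, :]) ** 2
                       + np.log(2 * np.pi * sigma[None, :] ** 2), axis=1)
    m = ll.max()
    dmu = mu_grid[1] - mu_grid[0]
    width = mu_grid[-1] - mu_grid[0]
    return m + np.log(np.exp(ll - m).sum() * dmu / width)

def log_evidence_rescaled(x, sigma, mu_grid, s_grid):
    """Model B: all stated sigmas share an unknown rescaling factor s,
    with a flat prior for s over a uniform grid."""
    logs = np.array([log_evidence_fixed(x, s * sigma, mu_grid) for s in s_grid])
    m = logs.max()
    ds = s_grid[1] - s_grid[0]
    width = s_grid[-1] - s_grid[0]
    return m + np.log(np.exp(logs - m).sum() * ds / width)

# Made-up data: one set consistent with its stated sigmas, one not.
sigma = np.ones(8)
consistent = np.array([-1., 1., -1., 1., -1., 1., -1., 1.])
inconsistent = 4.0 * consistent          # scatter far beyond stated sigmas
mu_grid = np.linspace(-10.0, 10.0, 2001)
s_grid = np.linspace(0.5, 5.0, 181)
logZ_A_cons = log_evidence_fixed(consistent, sigma, mu_grid)
logZ_B_cons = log_evidence_rescaled(consistent, sigma, mu_grid, s_grid)
logZ_A_inc = log_evidence_fixed(inconsistent, sigma, mu_grid)
logZ_B_inc = log_evidence_rescaled(inconsistent, sigma, mu_grid, s_grid)
```

For the consistent set, the extra parameter buys nothing and the Occam factor favors model A; for the inconsistent set, the likelihood gain from rescaling overwhelms that penalty and model B is selected.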


Modeling Aerial Gamma-Ray Backgrounds using Non-negative Matrix Factorization

Airborne gamma-ray surveys are useful for many applications, ranging from geology and mining to public health and nuclear security. In all these contexts, the ability to decompose a measured spectrum into a linear combination of background source terms can provide useful insights into the data and lead to improvements over techniques that use spectral energy windows. Multiple methods for the linear decomposition of spectra exist but are subject to various drawbacks, such as allowing negative photon fluxes or requiring detailed Monte Carlo modeling. We propose using Non-negative Matrix Factorization (NMF) as a data-driven approach to spectral decomposition. Using aerial surveys that include flights over water, we demonstrate that the mathematical approach of NMF finds physically relevant structure in aerial gamma-ray background, namely that measured spectra can be expressed as the sum of nearby terrestrial emission, distant terrestrial emission, and radon and cosmic emission. These NMF background components are compared to the background components obtained using Noise-Adjusted Singular Value Decomposition (NASVD), which contain negative photon fluxes and thus do not represent emission spectra in as straightforward a way. Finally, we comment on potential areas of research that are enabled by NMF decompositions, such as new approaches to spectral anomaly detection and data fusion.
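The core decomposition can be sketched in plain NumPy using the classic Lee-Seung multiplicative updates (Frobenius-norm form), which preserve non-negativity by construction; the synthetic "spectra" below are random stand-ins for the measured aerial survey data, and the paper's actual preprocessing and component count are more involved.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0):
    """Factor a nonnegative matrix V (one spectrum per row) as V ≈ W @ H
    with W, H >= 0, via Lee-Seung multiplicative updates (Frobenius norm).
    W holds per-measurement weights; rows of H are component spectra."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + 1e-3     # positive init keeps updates nonneg
    H = rng.random((k, m)) + 1e-3
    eps = 1e-12                       # guard against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic check: mixtures of two nonnegative "emission components".
rng = np.random.default_rng(42)
H0 = rng.random((2, 64))              # two ground-truth component spectra
W0 = rng.random((200, 2))             # per-measurement mixing weights
V = W0 @ H0
W, H = nmf(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Because every entry of W and H stays non-negative, each row of H can be read directly as a photon-flux spectrum, which is exactly the interpretability advantage over NASVD components noted above.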


Modeling Smooth Backgrounds and Generic Localized Signals with Gaussian Processes

We describe a procedure for constructing a model of a smooth data spectrum using Gaussian processes rather than the historical parametric description. This approach considers a fuller space of possible functions, is robust at increasing luminosity, and allows us to incorporate our understanding of the underlying physics. We demonstrate the application of this approach to modeling the background to searches for dijet resonances at the Large Hadron Collider and describe how the approach can be used in the search for generic localized signals.
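A bare-bones sketch of Gaussian-process regression with a radial-basis-function kernel, in plain NumPy. The hyperparameters and the steeply falling toy "spectrum" are assumptions for illustration; a real analysis would fit the hyperparameters and choose a kernel informed by the underlying physics, as the abstract describes.

```python
import numpy as np

def gp_predict(x_train, y_train, x_test, length=1.0, amp=1.0, noise=1e-2):
    """GP posterior mean and variance with an RBF kernel
    k(a, b) = amp^2 * exp(-(a - b)^2 / (2 * length^2))."""
    def kern(a, b):
        return amp ** 2 * np.exp(
            -0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = kern(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
    Ks = kern(x_test, x_train)
    alpha = np.linalg.solve(K, y_train)            # K^-1 y
    mean = Ks @ alpha                              # posterior mean
    var = amp ** 2 - np.einsum(
        "ij,ji->i", Ks, np.linalg.solve(K, Ks.T))  # posterior variance
    return mean, var

# Toy smooth, steeply falling background as a stand-in for a dijet spectrum.
x = np.linspace(0.0, 3.0, 30)
y = np.exp(-x)
x_new = np.linspace(0.0, 3.0, 100)
mean, var = gp_predict(x, y, x_new)
```

A localized signal can then be searched for as a significant excess of the data above the GP posterior band, without committing to a specific parametric background shape.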


Modeling correlated bursts by the bursty-get-burstier mechanism

Temporal correlations of time series or event sequences in natural and social phenomena have been characterized by power-law decaying autocorrelation functions with decay exponent γ. Such temporal correlations can be understood in terms of power-law distributed interevent times with exponent α, and/or correlations between interevent times. The latter, often called correlated bursts, have recently been studied by measuring power-law distributed bursty trains with exponent β. A scaling relation between α and γ has been established for uncorrelated interevent times, while little is known about the effects of correlated interevent times on temporal correlations. In order to study these effects, we devise the bursty-get-burstier model for correlated bursts, by which one can tune the degree of correlation between interevent times while keeping the same interevent time distribution. We numerically find that sufficiently strong correlations between interevent times can violate the scaling relation between α and γ that holds in the uncorrelated case. A non-trivial dependence of γ on β is also found for some range of α. The implications of our results are discussed in terms of the hierarchical organization of bursty trains at various timescales.
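For concreteness, a bursty train at timescale Δt is a maximal run of consecutive events whose interevent times do not exceed Δt; the exponent β above characterizes the power-law tail of the distribution of train sizes. A minimal sketch of extracting train sizes (the event times below are made up):

```python
import numpy as np

def bursty_train_sizes(times, dt):
    """Split an event sequence into bursty trains: maximal runs of
    consecutive events whose interevent times are <= dt. Returns the
    size (number of events) of each train, in temporal order."""
    gaps = np.diff(np.sort(np.asarray(times, dtype=float)))
    sizes, run = [], 1
    for gap in gaps:
        if gap <= dt:
            run += 1          # event joins the current train
        else:
            sizes.append(run) # gap too large: train ends here
            run = 1
    sizes.append(run)
    return np.array(sizes)

# Toy sequence: a 3-event burst, a 2-event burst, then an isolated event.
sizes = bursty_train_sizes([0.0, 0.1, 0.2, 1.0, 1.05, 3.0], dt=0.5)
```

Repeating this for a range of Δt values and histogramming the sizes yields the train-size distributions whose tail exponent β is measured in the studies cited above.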

