Featured Research

Applications

A Probabilistic Approach to Identifying Run Scoring Advantage in the Order of Playing Cricket

In cricket, the result of the coin toss is assumed to be one of the determinants of the match outcome. The team that wins the toss often decides to bat first to make the best use of superior pitch conditions and set a big target for the opponent. The opponent may then fail to show its natural batting performance in the second innings due to a number of factors, including deteriorated pitch conditions and the pressure of chasing a high target score. The advantage of batting first has been highlighted in the literature and in expert opinion; however, the effect of batting and bowling order on match outcome has not been investigated well enough to recommend a solution to any potential bias. This study proposes a probability theory-based model to study venue-specific scoring and chasing characteristics of teams under different match outcomes. A total of 1117 one-day international matches held at ten popular venues are analyzed, showing a substantially high scoring advantage and likelihood when the winning team bats in the first innings. Results suggest that the same 'bat-first' winning team would be very unlikely to score or chase such a high score if it were to bat in the second innings. Therefore, the coin toss decision may favor one team over the other. A Bayesian model is proposed to revise the target score for each venue such that the winning and scoring likelihood is equal regardless of the toss decision. The data and source code have been shared publicly for future research on creating competitive match outcomes by eliminating the advantage of batting order in run scoring.

A Probabilistic Model for Predicting Shot Success in Football

Football forecasting models traditionally rate teams on past match results, that is, on the number of goals scored. Goals, however, involve a high element of chance, and thus past results often do not reflect the performances of the teams. In recent years, it has become increasingly clear that accounting for other match events such as shots at goal can provide a better indication of the relative strengths of two teams than the number of goals scored. Forecast models based on this information have been shown to outperform those based purely on match results. A notable weakness, however, is that this approach does not take into account differences in the probability of shot success among teams. A team that is more likely to score from a shot will need fewer shots to win a match, on average. In this paper, we propose a simple parametric model to predict the probability of a team scoring, given that it has taken a shot at goal. We show that the resulting forecasts outperform a model assuming an equal probability of shot success among all teams. We then show that the model can be combined with predictions of the number of shots achieved by each team, and can increase the skill of forecasts of both the match outcome and of whether the total number of goals in a match will exceed 2.5. We assess the performance of the forecasts alongside two betting strategies and find mixed evidence for improved performance.
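
The idea of combining a predicted shot count with a team-specific probability of shot success can be illustrated with a minimal sketch. It is not the parametric model from the paper: it assumes, purely for illustration, that each team's goals follow an independent Poisson distribution with mean equal to expected shots times per-shot scoring probability. The function names and the example numbers are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k events under a Poisson distribution with mean mu."""
    return exp(-mu) * mu ** k / factorial(k)


def match_probabilities(shots_home, p_home, shots_away, p_away, max_goals=15):
    """Outcome and over-2.5-goals probabilities from shots and shot success.

    Assumption (not from the paper): goals are independent Poisson counts
    with mean = expected shots * probability of scoring from a shot.
    """
    mu_home = shots_home * p_home
    mu_away = shots_away * p_away
    probs = {"home": 0.0, "draw": 0.0, "away": 0.0, "over_2_5": 0.0}
    # Sum over a truncated grid of scorelines; the tail beyond max_goals is negligible
    # for realistic goal rates.
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, mu_home) * poisson_pmf(a, mu_away)
            if h > a:
                probs["home"] += p
            elif h == a:
                probs["draw"] += p
            else:
                probs["away"] += p
            if h + a >= 3:
                probs["over_2_5"] += p
    return probs


# Two teams with the same shot count but different shot success rates.
print(match_probabilities(12, 0.15, 12, 0.10))
```

With equal shot counts, the team with the higher per-shot probability is favoured, which is the effect a model with a single shared shot success probability cannot capture.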

A Recommendation and Risk Classification System for Connecting Rough Sleepers to Essential Outreach Services

Rough sleeping is a chronic problem faced by some of the most disadvantaged people in modern society. This paper describes work carried out in partnership with Homeless Link, a UK-based charity, to develop a data-driven approach for assessing the quality of incoming alerts from members of the public, which are aimed at connecting people sleeping rough on the streets with outreach service providers. Alerts are prioritised based on the predicted likelihood of successfully connecting with the rough sleeper, helping to address capacity limitations and to process all received alerts quickly, effectively, and equitably. Initial evaluation concludes that our approach increases the rate at which rough sleepers are found following a referral by at least 15% based on labelled data, implying a greater overall increase when the alerts with unknown outcomes are considered, and suggesting the benefit of a trial over a longer period to assess the models in practice. The discussion and modelling process are carried out with careful consideration of ethics, transparency and explainability, due to the sensitive nature of the data in this context and the vulnerability of the people affected.

A Registration-free approach for Statistical Process Control of 3D scanned objects via FEM

Recent work in on-line Statistical Process Control (SPC) of manufactured 3-dimensional (3-D) objects has been based on estimating the spectrum of the Laplace-Beltrami (LB) operator, a differential operator that encodes the geometrical features of a manifold and is widely used in Machine Learning (i.e., Manifold Learning). The resulting spectra are an intrinsic geometrical feature of each part and can thus be compared between parts without the part-to-part registration (or "part localization") pre-processing or the equal-size meshes required by previous approaches for SPC of 3-D parts. The recent spectral SPC methods, however, are limited to monitoring surface data from objects whose scanned meshes have no boundaries, holes, or missing portions. In this paper, we extend spectral methods by first considering a more accurate and general estimator of the LB spectrum, obtained by applying Finite Element Methods (FEM) to the solution of Helmholtz's equation with boundaries. We show that the new spectral FEM approach retains the advantages of not requiring part localization/registration or equal-size datasets scanned from each part, while providing more accurate spectrum estimates, which results in faster detection of out-of-control conditions than earlier methods. It can be applied to both mesh and volumetric (solid) scans and, furthermore, to partial scans that result in open meshes (surface or volumetric) with boundaries, increasing the practical applicability of the methods. The present work brings SPC methods closer to contemporary research in Computer Graphics and Manifold Learning. MATLAB code that reproduces the examples of this paper is provided in the supplementary materials.

A Relationship Between SIR Model and Generalized Logistic Distribution with Applications to SARS and COVID-19

This paper shows that, under certain conditions, the generalized logistic distribution model can be derived from the well-known compartment model consisting of susceptible, infected and recovered compartments, abbreviated as the SIR model. In the SIR model, there are uncertainties in predicting the final size of the infected population and the infectious parameter. However, by utilizing the information obtained from the generalized logistic distribution model, we can perform the SIR numerical computation more stably and more accurately. Applications of this combined method to severe acute respiratory syndrome (SARS) and Coronavirus disease 2019 (COVID-19) are also introduced.
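
For readers unfamiliar with the SIR model, the following is a minimal numerical sketch of the standard equations dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I, integrated with a simple forward Euler scheme. It does not reproduce the paper's derivation or its combined method; the function name, the parameter values and the step size are assumptions chosen only for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the standard SIR equations with forward Euler steps.

    beta  : transmission rate per day (illustrative value supplied by the caller)
    gamma : recovery rate per day
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r                      # total population, constant over time
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _day in range(days):
        for _step in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical outbreak: 1000 people, one initial case, beta/gamma = 3.
path = simulate_sir(beta=0.3, gamma=0.1, s0=999, i0=1, r0=0, days=200)
cumulative_cases = [i + r for (_s, i, r) in path]
print(round(cumulative_cases[-1]))
```

The cumulative case count `I + R` traces an S-shaped curve, which is the kind of curve a generalized logistic model is used to describe.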

A Rolling Optimized Nonlinear Grey Bernoulli Model RONGBM(1,1) and application in predicting total COVID-19 infected cases

The Nonlinear Grey Bernoulli Model NGBM(1, 1) is a recently developed grey model with applications in various fields, mainly due to its accuracy in handling small time-series datasets with nonlinear variations. In this paper, to further improve the accuracy of this model, a novel model is proposed, namely the Rolling Optimized Nonlinear Grey Bernoulli Model RONGBM(1, 1). This model combines the rolling mechanism with the simultaneous optimization of all model parameters (exponent, background value and initial condition). The accuracy of the new model is demonstrated by forecasting Vietnam's GDP from 2013 to 2018, before it is applied to predict the total number of COVID-19 infected cases globally by day.
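
A minimal sketch of a basic NGBM(1, 1) fit with a rolling one-step forecast is given below, assuming the textbook form x0(k) + a*z1(k) = b*z1(k)^n, where z1 is the mean of consecutive accumulated values. It is not the RONGBM(1, 1) of the paper: the exponent `n` is supplied by the caller, the background value uses a fixed weight of 0.5, and the initial condition is the first observation, whereas the paper optimizes all three. The function names and the example series are hypothetical.

```python
import numpy as np


def ngbm_fit_predict(x0, n, horizon=0):
    """Fit a basic NGBM(1,1) with fixed exponent n (n != 1) and return fitted
    values for the input series followed by `horizon` forecast values."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                      # accumulated generating series
    z = 0.5 * (x1[1:] + x1[:-1])            # background values, fixed weight 0.5
    # Least squares for x0(k) = -a * z(k) + b * z(k)**n
    design = np.column_stack([-z, z ** n])
    (a, b), *_ = np.linalg.lstsq(design, x0[1:], rcond=None)
    k = np.arange(len(x0) + horizon)
    # Time response of the Bernoulli equation; a negative base gives NaN
    # for fractional exponents, which a caller should check for.
    base = (x0[0] ** (1 - n) - b / a) * np.exp(-a * (1 - n) * k) + b / a
    x1_hat = base ** (1 / (1 - n))
    # Undo the accumulation to return to the original scale.
    return np.concatenate([[x0[0]], np.diff(x1_hat)])


def rolling_forecast(x0, n, window, steps):
    """Refit on the latest `window` points and forecast one step at a time,
    appending each forecast to the history (one common rolling scheme)."""
    history = [float(v) for v in x0]
    forecasts = []
    for _ in range(steps):
        next_value = ngbm_fit_predict(history[-window:], n, horizon=1)[-1]
        forecasts.append(float(next_value))
        history.append(float(next_value))
    return forecasts


# Hypothetical series growing 10% per step; n = 0 reduces to the classic GM(1,1).
series = [2.0 * 1.1 ** k for k in range(8)]
print(rolling_forecast(series, n=0.0, window=5, steps=2))
```

In practice the exponent would be chosen by minimizing an in-sample error measure over candidate values of `n`, which is the step that the optimized variants automate.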

A Study on the Association between Maternal Childhood Trauma Exposure and Placental-fetal Stress Physiology during Pregnancy

It has been found that the effect of childhood trauma (CT) exposure may pass on to the next generation. Scientists have hypothesized that the association between CT exposure and placental-fetal stress physiology is the mechanism, and a study was conducted to examine this hypothesis. To examine the association between CT exposure and placental corticotrophin-releasing hormone (pCRH), a linear mixed effect model and a hierarchical Bayesian linear model were constructed. For Bayesian inference, conditionally conjugate priors were specified and a Gibbs sampler was used to draw MCMC samples. A piecewise linear mixed effect model was fitted to account for the dramatic change in pCRH at around week 20 of pregnancy. Pearson residual, QQ, ACF and trace plots were used to assess model adequacy. The likelihood ratio test and DIC were used for model selection. The results show a clear association between CT exposure and pCRH during pregnancy, and the effect of CT exposure on pCRH varies dramatically over gestational age (GA). Women with one childhood trauma would experience 11.9% higher pCRH towards the end of pregnancy than those without childhood trauma. The rate of increase of pCRH after week 20 is almost four-fold larger than that before week 20. Frequentist and Bayesian inference produce similar results. The findings support the hypothesis that CT exposure affects pCRH over GA, and that the effect changes dramatically at around week 20 of pregnancy.

A Study on the Possible Effects of the Implementation of the Nordic Model in India on Crime Rates and Sexually Transmitted Diseases

Prostitution is one of the root causes of sex trafficking and the transmission of sexual diseases. The rules and regulations that the Indian government uses to regulate it fall under the umbrella of the abolitionism model. Neo-abolitionism (also known as the Nordic model) is a new legislative model that has been introduced by the Nordic countries to regulate prostitution. The purpose of this research paper is to examine the possible effects of applying the Nordic model on crime rates and the spread of sexually transmitted diseases in India. Further, we also aim to study the effects of the implementation of Neo-abolitionism in Sweden.

A Time To Event Framework For Multi-touch Attribution

Multi-touch attribution (MTA) estimates the relative contributions of the multiple ads a user may see prior to any observed conversions. Increasingly, advertisers also want to base budget and bidding decisions on these attributions, spending more on ads that drive more conversions. We describe two requirements for an MTA system to be suitable for this application: First, it must be able to handle continuously updated and incomplete data. Second, it must be sufficiently flexible to capture that an ad's effect will change over time. We describe an MTA system, consisting of a model for user conversion behavior and a credit assignment algorithm, that satisfies these requirements. Our model for user conversion behavior treats conversions as occurrences in an inhomogeneous Poisson process, while our attribution algorithm is based on iteratively removing the last ad in the path.
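
One possible reading of the two components can be sketched as follows. This is not the system from the paper: it assumes, purely for illustration, a conversion intensity made of a constant baseline plus one exponentially decaying term per ad, and it assigns credit by removing the last ad in the path, recording the drop in intensity at the conversion time, and repeating on the shortened path. All names, the additive form and the `baseline` and `decay` values are hypothetical.

```python
from math import exp


def intensity(t, ads, baseline=0.01, decay=0.5):
    """Conversion intensity at time t for an inhomogeneous Poisson process.

    Assumed form: baseline plus, for each ad shown at or before t,
    effect * exp(-decay * (t - ad_time)). Each ad is (name, time, effect).
    """
    return baseline + sum(
        effect * exp(-decay * (t - ad_time))
        for _name, ad_time, effect in ads
        if ad_time <= t
    )


def attribute(t_conversion, ads, baseline=0.01, decay=0.5):
    """Assign fractional credit by iteratively removing the last ad in the path."""
    path = sorted(ads, key=lambda ad: ad[1])      # order ads by exposure time
    full = intensity(t_conversion, path, baseline, decay)
    current = full
    credits = {}
    while path:
        name, _time, _effect = path.pop()         # remove the most recent ad
        without_last = intensity(t_conversion, path, baseline, decay)
        # Credit is the share of the full intensity lost by removing this ad.
        credits[name] = (current - without_last) / full
        current = without_last
    credits["baseline"] = current / full          # conversions that need no ad
    return credits


# Hypothetical path: a display ad at t=0, a search ad at t=3, conversion at t=4.
print(attribute(4.0, [("display", 0.0, 0.2), ("search", 3.0, 0.2)]))
```

Under this assumed intensity the more recent ad receives more credit because its effect has decayed less by the time of the conversion, which illustrates the requirement that an ad's effect changes over time.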

A Two-Stage Bayesian Semiparametric Model for Novelty Detection with Robust Prior Information

Novelty detection methods aim at partitioning the test units into already observed and previously unseen patterns. However, two significant issues arise: there may be considerable interest in identifying specific structures within the novelty, and contamination in the known classes could completely blur the actual separation between manifest and new groups. Motivated by these problems, we propose a two-stage Bayesian semiparametric novelty detector, building upon prior information robustly extracted from a set of complete learning units. We devise a general-purpose multivariate methodology that we also extend to handle functional data objects. We provide insights on the model behavior by investigating the theoretical properties of the associated semiparametric prior. From the computational point of view, we propose a suitable ξ-sequence to construct an independent slice-efficient sampler that takes into account the difference between manifest and novelty components. We showcase our model performance through an extensive simulation study and applications on both multivariate and functional datasets, in which diverse and distinctive unknown patterns are discovered.
