Publication


Featured research published by J. R. M. Hosking.


Water Resources Research | 1993

Some statistics useful in regional frequency analysis

J. R. M. Hosking; James R. Wallis

Regional frequency analysis uses data from a number of measuring sites. A “region” is a group of sites each of which is assumed to have data drawn from the same frequency distribution. The analysis involves the assignment of sites to regions, testing whether the proposed regions are indeed homogeneous, and choice of suitable distributions to fit to each region's data. This paper describes three statistics useful in regional frequency analysis: a discordancy measure, for identifying unusual sites in a region; a heterogeneity measure, for assessing whether a proposed region is homogeneous; and a goodness-of-fit measure, for assessing whether a candidate distribution provides an adequate fit to the data. Tests based on the statistics provide objective backing for the decisions involved in regional frequency analysis. The statistics are based on the L moments [Hosking, 1990] of the at-site data.
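The abstract does not spell out how the at-site L moments are computed. As a minimal sketch, assuming the usual "unbiased" estimators built from probability weighted moments (the function name `sample_l_moments` is hypothetical, not from the paper), the first two sample L moments and the L-skewness and L-kurtosis ratios can be obtained as follows:

```python
from math import comb


def sample_l_moments(data):
    """Return (l1, l2, t3, t4): the first two sample L moments and the
    L-skewness and L-kurtosis ratios, using the unbiased estimators."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")
    # b_r = (1/n) * sum_j C(j-1, r) / C(n-1, r) * x_(j), for j = 1..n.
    # With a 0-based index j, C(j-1, r) becomes comb(j, r).
    b = [
        sum(comb(j, r) * x[j] for j in range(n)) / (n * comb(n - 1, r))
        for r in range(4)
    ]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    if l2 == 0:
        raise ValueError("all observations are equal; ratios are undefined")
    return l1, l2, l3 / l2, l4 / l2
```

The discordancy, heterogeneity, and goodness-of-fit statistics themselves are not reproduced here; this sketch only covers the at-site quantities they would take as input.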


Journal of Econometrics | 1996

Asymptotic distributions of the sample mean, autocovariances, and autocorrelations of long-memory time series

J. R. M. Hosking

We derive the asymptotic distributions of the sample mean, autocovariances, and autocorrelations for a time series whose autocovariance function $\gamma_k$ has the power-law decay $\gamma_k \sim \lambda k^{-\alpha}$, $\lambda > 0$, $0 < \alpha < 1$, as $k \to \infty$. The results differ in important respects from the corresponding results for short-memory processes, whose autocovariance functions are absolutely summable. For long-memory processes the variances of the sample mean, and of the sample autocovariances and autocorrelations for $0 < \alpha \le 1/2$, are not of asymptotic order $n^{-1}$. When $0 < \alpha < 1/2$ the asymptotic distributions of the sample autocovariances and autocorrelations are not Normal.
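A heuristic calculation, offered here as an illustration and not taken from the paper, shows why the variance of the sample mean cannot be of order $n^{-1}$ under this decay. It assumes a stationary series with autocovariances $\gamma_k \sim \lambda k^{-\alpha}$, approximates the sum by an integral, and ignores the $\gamma_0$ term, which is of smaller order:

```latex
\operatorname{Var}(\bar{x}_n)
  = \frac{1}{n} \sum_{|k|<n} \Bigl(1 - \frac{|k|}{n}\Bigr) \gamma_k
  \approx \frac{2\lambda}{n} \int_0^n \Bigl(1 - \frac{x}{n}\Bigr) x^{-\alpha}\,dx
  = \frac{2\lambda\, n^{-\alpha}}{(1-\alpha)(2-\alpha)} .
```

Since $0 < \alpha < 1$, the variance decays like $n^{-\alpha}$, which is slower than the $n^{-1}$ rate obtained when the autocovariances are absolutely summable.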


Hydrological Sciences Journal-Journal des Sciences Hydrologiques | 1985

An appraisal of the regional flood frequency procedure in the UK Flood Studies Report

J. R. M. Hosking; James R. Wallis; Eric F. Wood

The algorithm for estimating the regional flood frequency hazard contained in the 1975 Natural Environment Research Council Flood Studies Report (FSR) can occasionally lead to upper quantile estimates that appear unrealistic when compared with engineering judgement. Tests with the FSR algorithm were made for several sets of observed flood sequences and a great variety of synthetic data in a Monte Carlo simulation study. Similar tests were conducted with many other regional and at-site flood frequency estimation procedures including a regional generalized extreme value distribution (GEV) procedure and a regional Wakeby distribution (WAK) procedure, both of which used biased probability weighted moments (PWM) in their formulation. For the Monte Carlo simulations, for which the true quantiles to be estimated were known, it was found that the FSR algorithm yielded quantile estimates that were always more variable, often by a factor of as much as 4 or 5, than those obtained by either the GEV/PWM or WA...


IBM Journal of Research and Development | 1994

The four-parameter kappa distribution

J. R. M. Hosking

Many common probability distributions, including some that have attracted recent interest for flood-frequency analysis, may be regarded as special cases of a four-parameter distribution that generalizes the three-parameter kappa distribution of P.W. Mielke. This four-parameter kappa distribution can be fitted to experimental data or used as a source of artificial data in simulation studies. This paper describes some of the properties of the four-parameter kappa distribution, and gives an example in which it is applied to modeling the distribution of annual maximum precipitation data.
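The abstract does not state the distribution's formula. The sketch below assumes the commonly cited parametrization with location `xi`, scale `alpha`, and shape parameters `k` and `h`, for which the quantile function is x(F) = xi + (alpha / k) * (1 - ((1 - F^h) / h)^k); the function name `kappa4_quantile` is hypothetical. Under that assumption, it illustrates how the distribution can serve as a source of artificial data and how familiar distributions arise as special cases:

```python
import math


def kappa4_quantile(F, xi, alpha, k, h):
    """Quantile x(F) of the four-parameter kappa distribution under the
    assumed parametrization, with the limits k -> 0 and h -> 0 handled
    explicitly."""
    if not 0.0 < F < 1.0:
        raise ValueError("F must lie strictly between 0 and 1")
    # (1 - F**h) / h tends to -log(F) as h -> 0.
    u = -math.log(F) if h == 0 else (1.0 - F**h) / h
    # (1 - u**k) / k tends to -log(u) as k -> 0.
    y = -math.log(u) if k == 0 else (1.0 - u**k) / k
    return xi + alpha * y
```

With this parametrization, `h = 1` gives the generalized Pareto distribution, `h = 0` the generalized extreme-value distribution, and `h = -1` the generalized logistic distribution. Passing uniform random numbers in (0, 1) as `F` would yield simulated kappa-distributed values.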


Data Mining and Knowledge Discovery | 1999

Partitioning Nominal Attributes in Decision Trees

Don Coppersmith; Se June Hong; J. R. M. Hosking

To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is often forced to search over all possible L-ary partitions for the one that yields the minimum impurity measure. For binary trees (L = 2) when there are just two classes a short-cut search is possible that is linear in n, the number of distinct values of the attribute. For the general case in which the number of classes, k, may be greater than two, Burshtein et al. have shown that the optimal partition satisfies a condition that involves the existence of 2L hyperplanes in the class probability space. We derive a property of the optimal partition for concave impurity measures (including in particular the Gini and entropy impurity measures) in terms of the existence of L vectors in the dual of the class probability space, which implies the earlier condition. Unfortunately, these insights still do not offer a practical search method when n and k are large, even for binary trees. We therefore present a new heuristic search algorithm to find a good partition. It is based on ordering the attribute's values according to their principal component scores in the class probability space, and is linear in n. We demonstrate the effectiveness of the new method through Monte Carlo simulation experiments and compare its performance against other heuristic methods.
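The two-class short cut mentioned above can be sketched as follows. This is an illustration of the well-known ordering trick for binary splits with two classes, not the paper's principal-component heuristic: the attribute values are sorted by their proportion of class 1, and only the n - 1 splits that respect this ordering are evaluated. The function names and the input layout (a dictionary of per-value class counts) are assumptions made for the example, and every attribute value is assumed to have at least one example.

```python
def gini(c0, c1):
    """Gini impurity of a node holding c0 and c1 examples of the two classes."""
    total = c0 + c1
    if total == 0:
        return 0.0
    p = c1 / total
    return 2.0 * p * (1.0 - p)


def best_binary_split(counts):
    """counts maps each attribute value to (n_class0, n_class1).
    Returns (left_values, right_values, weighted_impurity)."""
    if len(counts) < 2:
        raise ValueError("need at least two attribute values")
    # Order the values by their class-1 proportion.
    ordered = sorted(counts, key=lambda v: counts[v][1] / sum(counts[v]))
    tot0 = sum(c[0] for c in counts.values())
    tot1 = sum(c[1] for c in counts.values())
    n_total = tot0 + tot1
    best_imp, best_cut = None, None
    left0 = left1 = 0
    # Scan the n - 1 cut points, updating the class counts incrementally.
    for i, v in enumerate(ordered[:-1]):
        left0 += counts[v][0]
        left1 += counts[v][1]
        right0, right1 = tot0 - left0, tot1 - left1
        imp = ((left0 + left1) * gini(left0, left1)
               + (right0 + right1) * gini(right0, right1)) / n_total
        if best_imp is None or imp < best_imp:
            best_imp, best_cut = imp, i + 1
    return ordered[:best_cut], ordered[best_cut:], best_imp
```

After the initial sort, the scan itself touches each attribute value once, which is what makes the search linear in n instead of exponential.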


Knowledge Discovery and Data Mining | 2009

Spatial-temporal causal modeling for climate change attribution

Aurelie C. Lozano; Hongfei Li; Alexandru Niculescu-Mizil; Yan Liu; Claudia Perlich; J. R. M. Hosking; Naoki Abe

Attribution of climate change to causal factors has been based predominantly on simulations using physical climate models, which have inherent limitations in describing such a complex and chaotic system. We propose an alternative, data-centric approach that relies on actual measurements of climate observations and human and natural forcing factors. Specifically, we develop a novel method to infer causality from spatial-temporal data, as well as a procedure to incorporate extreme value modeling into our method in order to address the attribution of extreme climate events, such as heatwaves. Our experimental results on a real-world dataset indicate that changes in temperature are not solely accounted for by solar radiance, but attributed more significantly to CO2 and other greenhouse gases. Combined with extreme value modeling, we also show that there has been a significant increase in the intensity of extreme temperatures, and that such changes in extreme temperature are also attributable to greenhouse gases. These preliminary results suggest that our approach can offer a useful alternative to the simulation-based approach to climate modeling and attribution, and provide valuable insights from a fresh perspective.


Journal of Climate | 1993

Regional Precipitation Quantile Values for the Continental United States Computed from L-Moments

Nathaniel B. Guttman; J. R. M. Hosking; James R. Wallis

Precipitation quantile values have been computed for 9 probabilities, 8 durations, 12 starting months, and 111 regions across the United States. L-moment methodology has been used for the calculations. Discussed are the rationale for selecting the Pearson type III (gamma) and Wakeby distributions, and the confidence that can be placed in the quantile values. Results show that distribution functions become more asymmetrical as the duration decreases, indicating that the median may be a better measure of central tendency than the mean. Portraying the quantile values as a percentage of the median value leads to smooth spatial fields. Computation of quantile values was the first known large-scale application of L-moment methodology. In spite of the complexity of the techniques and the extensive use of personnel and computer resources, the results justify the procedures in terms of preparing easy-to-use probability statements that reflect underlying physical processes.


Water Resources Research | 1995

A Comparison of Unbiased and Plotting‐Position Estimators of L Moments

J. R. M. Hosking; James R. Wallis

Plotting-position estimators of L moments and L moment ratios have several disadvantages compared with the “unbiased” estimators. For general use, the “unbiased” estimators should be preferred. Plotting-position estimators may still be useful for estimating extreme upper tail quantiles in regional frequency analysis.
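The abstract does not define the plotting-position estimators. As a minimal sketch, assuming the common form in which the probability weighted moments are estimated with plotting positions p_j = (j + gamma) / (n + delta) and the frequently used choice gamma = -0.35, delta = 0 (the function name and defaults are illustrative assumptions), the first three L moments would be computed as:

```python
def plotting_position_l_moments(data, gamma=-0.35, delta=0.0):
    """Return (l1, l2, l3) estimated from plotting positions
    p_j = (j + gamma) / (n + delta), j = 1..n."""
    if not delta > gamma > -1.0:
        raise ValueError("need delta > gamma > -1 so that 0 < p_j < 1")
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    p = [(j + gamma) / (n + delta) for j in range(1, n + 1)]
    # Plotting-position estimate of the r-th probability weighted moment.
    b = [sum(pj**r * xj for pj, xj in zip(p, x)) / n for r in range(3)]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    return l1, l2, l3
```

For the data 1, 2, 3, 4, 5 this gives l2 = 0.98, whereas the "unbiased" estimator gives exactly 1, a small illustration of how the two families of estimators can differ on the same sample.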


Future Generation Computer Systems | 1997

A statistical perspective on data mining

J. R. M. Hosking; Edwin P. D. Pednault; Madhu Sudan

Data mining can be regarded as a collection of methods for drawing inferences from data. The aims of data mining, and some of its methods, overlap with those of classical statistics. However, there are some philosophical and methodological differences. We examine these differences, and we describe three approaches to machine learning that have developed largely independently: classical statistics, Vapnik's statistical learning theory, and computational learning theory. Comparing these approaches, we conclude that statisticians and data miners can profit by studying each other's methods and using a judiciously chosen combination of them.


Intelligent Data Analysis | 1997

Decomposition of Heterogeneous Classification Problems

Chidanand Apte; Se June Hong; J. R. M. Hosking; Jorge Lepre; Edwin P. D. Pednault; Barry K. Rosen

In some classification problems the feature space is heterogeneous in that the best features on which to base the classification are different in different parts of the feature space. In some other problems the classes can be divided into subsets such that distinguishing one subset of classes from another and classifying examples within the subsets require very different decision rules, involving different sets of features. In such heterogeneous problems, many modeling techniques including decision trees, rules, and neural networks evaluate the performance of alternative decision rules by averaging over the entire problem space, and are prone to generating a model that is suboptimal in any of the regions or subproblems. Better overall models can be obtained by splitting the problem appropriately and modeling each subproblem separately. This paper presents a new measure to determine the degree of dissimilarity between the decision surfaces of two given problems, and suggests a way to search for a strategic splitting of the feature space that identifies regions with different characteristics. We illustrate the concept using a multiplexor problem, and apply the method to a DNA classification problem.
