Publication


Featured research published by Adrian Brown.


BMC Medical Informatics and Decision Making | 2014

Technical challenges of providing record linkage services for research

James H. Boyd; Sean M. Randall; Anna Ferrante; Jacqueline K. Bauer; Adrian Brown; James B. Semmens

Background: Record linkage techniques are widely used to enable health researchers to gain event based longitudinal information for entire populations. The task of record linkage is increasingly being undertaken by specialised linkage units (SLUs). In addition to the complexity of undertaking probabilistic record linkage, these units face additional technical challenges in providing record linkage ‘as a service’ for research. The extent of this functionality, and approaches to solving these issues, has had little focus in the record linkage literature. Few, if any, of the record linkage packages or systems currently used by SLUs include the full range of functions required. Methods: This paper identifies and discusses some of the functions that are required or undertaken by SLUs in the provision of record linkage services. These include managing routine, on-going linkage; storing and handling changing data; handling different linkage scenarios; accommodating ever increasing datasets. Automated linkage processes are one way of ensuring consistency of results and scalability of service. Results: Alternative solutions to some of these challenges are presented. By maintaining a full history of links, and storing pairwise information, many of the challenges around handling ‘open’ records, and providing automated managed extractions are solved. A number of these solutions were implemented as part of the development of the National Linkage System (NLS) by the Centre for Data Linkage (part of the Population Health Research Network) in Australia. Conclusions: The demand for, and complexity of, linkage services is growing. This presents as a challenge to SLUs as they seek to service the varying needs of dozens of research projects annually. Linkage units need to be both flexible and scalable to meet this demand. It is hoped the solutions presented here can help mitigate these difficulties.


Health Information Management Journal | 2016

Limited privacy protection and poor sensitivity: Is it time to move on from the statistical linkage key-581?

Sean M. Randall; Anna Ferrante; James H. Boyd; Adrian Brown; James B. Semmens

Background: The statistical linkage key (SLK-581) is a common tool for record linkage in Australia, due to its ability to provide some privacy protection. However, newer privacy-preserving approaches may provide greater privacy protection, while allowing high-quality linkage. Objective: To evaluate the standard SLK-581, encrypted SLK-581 and a newer privacy-preserving approach using Bloom filters, in terms of both privacy and linkage quality. Method: Linkage quality was compared by conducting linkages on Australian health datasets using these three techniques and examining results. Privacy was compared qualitatively in relation to a series of scenarios where privacy breaches may occur. Results: The Bloom filter technique offered greater privacy protection and linkage quality compared to the SLK-based method commonly used in Australia. Conclusion: The adoption of new privacy-preserving methods would allow both greater confidence in research results, while significantly improving privacy protection.


BMC Medical Informatics and Decision Making | 2017

Evaluating privacy-preserving record linkage using cryptographic long-term keys and multibit trees on large medical datasets

Adrian Brown; Christian Borgs; Sean M. Randall; Rainer Schnell

Background: Integrating medical data using databases from different sources by record linkage is a powerful technique increasingly used in medical research. Under many jurisdictions, unique personal identifiers needed for linking the records are unavailable. Since sensitive attributes, such as names, have to be used instead, privacy regulations usually demand encrypting these identifiers. The corresponding set of techniques for privacy-preserving record linkage (PPRL) has received widespread attention. One recent method is based on Bloom filters. Due to superior resilience against cryptographic attacks, composite Bloom filters (cryptographic long-term keys, CLKs) are considered best practice for privacy in PPRL. Real-world performance of these techniques using large-scale data is unknown up to now. Methods: Using a large subset of Australian hospital admission data, we tested the performance of an innovative PPRL technique (CLKs using multibit trees) against a gold-standard derived from clear-text probabilistic record linkage. Linkage time and linkage quality (recall, precision and F-measure) were evaluated. Results: Clear text probabilistic linkage resulted in marginally higher precision and recall than CLKs. PPRL required more computing time but 5 million records could still be de-duplicated within one day. However, the PPRL approach required fine tuning of parameters. Conclusions: We argue that increased privacy of PPRL comes with the price of small losses in precision and recall and a large increase in computational burden and setup time. These costs seem to be acceptable in most applied settings, but they have to be considered in the decision to apply PPRL. Further research on the optimal automatic choice of parameters is needed.


Australian and New Zealand Journal of Public Health | 2017

Understanding the origins of record linkage errors and how they affect research outcomes

James H. Boyd; Anna Ferrante; Katie Irvine; Michael D. Smith; Elizabeth Moore; Adrian Brown; Sean M. Randall

Major investment in record linkage infrastructure in Australia and internationally reflects the strategic value of high-quality linked datasets. Dedicated record linkage units with secure environments and specialised linkage personnel have been established to support academic research, policy development and service design by government [1]. The challenge for units creating linked data is to maximise linkage quality using a variety of matching and management techniques. However, it is also important that researchers understand processes around both data collection and linkage to ensure that they are aware of strengths and limitations of the data and methods used to bring together records. In this way, research study design can be optimised and potential for misinterpretation is reduced [2].


Frontiers in Public Health | 2017

Ensuring privacy when integrating patient-based datasets: New methods and developments in record linkage

Adrian Brown; Anna Ferrante; Sean M. Randall; James H. Boyd; James B. Semmens

In an era where the volume of structured and unstructured digital data has exploded, there has been an enormous growth in the creation of data about individuals that can be used for understanding and treating disease. Joining these records together at an individual level provides a complete picture of a patient’s interaction with health services and allows better assessment of patient outcomes and effectiveness of treatment and services. Record linkage techniques provide an efficient and cost-effective method to bring individual records together as patient profiles. These linkage procedures bring their own challenges, especially relating to the protection of privacy. The development and implementation of record linkage systems that do not require the release of personal information can reduce the risks associated with record linkage and overcome legal barriers to data sharing. Current conceptual and experimental privacy-preserving record linkage (PPRL) models show promise in addressing data integration challenges. Enhancing and operationalizing PPRL protocols can help address the dilemma faced by some custodians between using data to improve quality of life and dealing with the ethical, legal, and administrative issues associated with protecting an individual’s privacy. These methods can reduce the risk to privacy, as they do not require personally identifying information to be shared. PPRL methods can improve the delivery of record linkage services to the health and broader research community.


BMC Health Services Research | 2018

Sociodemographic differences in linkage error: an examination of four large-scale datasets

Sean M. Randall; Adrian Brown; James H. Boyd; Rainer Schnell; Christian Borgs; Anna Ferrante

Background: Record linkage is an important tool for epidemiologists and health planners. Record linkage studies will generally contain some level of residual record linkage error, where individual records are either incorrectly marked as belonging to the same individual, or incorrectly marked as belonging to separate individuals. A key question is whether errors in linkage quality are distributed evenly throughout the population, or whether certain subgroups will exhibit higher rates of error. Previous investigations of this issue have typically compared linked and un-linked records, which can conflate bias caused by record linkage error, with bias caused by missing records (data capture errors). Methods: Four large administrative datasets were individually de-duplicated, with results compared to an available ‘gold-standard’ benchmark, allowing us to avoid methodological issues with comparing linked and un-linked records. Results were compared by gender, age, geographic remoteness (major cities, regional or remote) and socioeconomic status. Results: Results varied between datasets, and by sociodemographic characteristic. The most consistent findings were worse linkage quality for younger individuals (seen in all four datasets) and worse linkage quality for those living in remote areas (seen in three of four datasets). The linkage quality within sociodemographic categories varied between datasets, with the associations with linkage error reversed across different datasets due to quirks of the specific data collection mechanisms and data sharing practices. Conclusions: These results suggest caution should be taken both when linking younger individuals and those in remote areas, and when analysing linked data from these subgroups. Further research is required to determine the ramifications of worse linkage quality in these subpopulations on research outcomes.


BMC Medical Research Methodology | 2017

Estimating parameters for probabilistic linkage of privacy-preserved datasets

Adrian Brown; Sean M. Randall; Anna Ferrante; James B. Semmens; James H. Boyd

Background: Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. Methods: Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20% error. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. Results: Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher than the F-measure using calculated probabilities. Further, the threshold estimation yielded results for F-measure that were only slightly below the highest possible for those probabilities. Conclusions: The method appears highly accurate across a spectrum of datasets with varying degrees of error. As there are few alternatives for parameter estimation, the approach is a major step towards providing a complete operational approach for probabilistic linkage of privacy-preserved datasets.


First International Workshop on Population Informatics for Big Data | 2015

Grouping Methods for Ongoing Record Linkage

Sean M. Randall; James H. Boyd; Anna Ferrante; Adrian Brown; James B. Semmens

The grouping of record-pairs to determine which records belong to the same individual is an important part of the record linkage process. While a merge grouping approach is commonly used, other methods may be more appropriate when linking to a repository of previously linked data. In this paper, we applied a number of grouping strategies to three large scale hospital datasets (comprising around 27 million records), each with a known truth set. These datasets were linked against a created ‘repository’ whose quality was varied. Experimental results show that alternate grouping methods can yield very large benefits in linkage quality, especially when the quality of the underlying repository is high. Best link methods can remove between 25-90% of matching errors, depending on the characteristics of the underlying datasets.


BMC Health Services Research | 2015

Accuracy and completeness of patient pathways – the benefits of national data linkage in Australia

James H. Boyd; Sean M. Randall; Anna Ferrante; Jacqueline K. Bauer; Kevin McInneny; Adrian Brown; Katrina Spilsbury; Margo Gillies; James B. Semmens


International Journal for Population Data Science | 2017

High quality linkage using Multibit Trees for privacy-preserving blocking

Adrian Brown; Christian Borgs; Sean M. Randall; Rainer Schnell

Collaboration


Dive into Adrian Brown's collaborations.

Top Co-Authors

Rainer Schnell

University of Duisburg-Essen

Karey Iron

College of Physicians and Surgeons of Ontario
