Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Hossein Estiri is active.

Publication


Featured research published by Hossein Estiri.


eGEMs (Generating Evidence & Methods to improve patient outcomes) | 2016

A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data.

Michael Kahn; Tiffany J. Callahan; Juliana Barnard; Alan Bauck; Jeff Brown; Bruce N. Davidson; Hossein Estiri; Carsten Goerg; Erin Holve; Steven G. Johnson; Siaw-Teng Liaw; Marianne Hamilton-Lopez; Daniella Meeker; Toan C. Ong; Patrick B. Ryan; Ning Shang; Nicole Gray Weiskopf; Chunhua Weng; Meredith Nahm Zozus; Lisa M. Schilling

Objective: Harmonized data quality (DQ) assessment terms, methods, and reporting practices can establish a common understanding of the strengths and limitations of electronic health record (EHR) data for operational analytics, quality improvement, and research. Existing published DQ terms were harmonized to a comprehensive unified terminology with definitions and examples and organized into a conceptual framework to support a common approach to defining whether EHR data is ‘fit’ for specific uses. Materials and Methods: DQ publications, informatics and analytics experts, managers of established DQ programs, and operational manuals from several mature EHR-based research networks were reviewed to identify potential DQ terms and categories. Two face-to-face stakeholder meetings were used to vet an initial set of DQ terms and definitions that were grouped into an overall conceptual framework. Feedback received from data producers and users was used to construct a draft set of harmonized DQ terms and categories. Multiple rounds of iterative refinement resulted in a set of terms and organizing framework consisting of DQ categories, subcategories, terms, definitions, and examples. The harmonized terminology and logical framework’s inclusiveness was evaluated against ten published DQ terminologies. Results: Existing DQ terms were harmonized and organized into a framework by defining three DQ categories: (1) Conformance (2) Completeness and (3) Plausibility and two DQ assessment contexts: (1) Verification and (2) Validation. Conformance and Plausibility categories were further divided into subcategories. Each category and subcategory was defined with respect to whether the data may be verified with organizational data, or validated against an accepted gold standard, depending on proposed context and uses. The coverage of the harmonized DQ terminology was validated by successfully aligning to multiple published DQ terminologies. Discussion: Existing DQ concepts, community input, and expert review informed the development of a distinct set of terms, organized into categories and subcategories. The resulting DQ terms successfully encompassed a wide range of disparate DQ terminologies. Operational definitions were developed to provide guidance for implementing DQ assessment procedures. The resulting structure is an inclusive DQ framework for standardizing DQ assessment and reporting. While our analysis focused on the DQ issues often found in EHR data, the new terminology may be applicable to a wide range of electronic health data such as administrative, research, and patient-reported data. Conclusion: A consistent, common DQ terminology, organized into a logical framework, is an initial step in enabling data owners and users, patients, and policy makers to evaluate and communicate data quality findings in a well-defined manner with a shared vocabulary. Future work will leverage the framework and terminology to develop reusable data quality assessment and reporting methods.
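
As a rough illustration only (not part of the published framework or its tooling), the sketch below encodes the DQ categories and assessment contexts named in the abstract as Python structures that a hypothetical DQ checker could use to tag its results; every identifier is invented.

```python
# Minimal sketch (not the paper's tooling): the harmonized DQ categories and
# assessment contexts from the abstract, encoded so a hypothetical checker could
# tag each check it runs with the shared vocabulary.
from enum import Enum
from dataclasses import dataclass

class DQCategory(Enum):
    CONFORMANCE = "Conformance"    # further divided into subcategories in the framework
    COMPLETENESS = "Completeness"
    PLAUSIBILITY = "Plausibility"  # further divided into subcategories in the framework

class DQContext(Enum):
    VERIFICATION = "Verification"  # checked against organizational/internal expectations
    VALIDATION = "Validation"      # checked against an accepted external gold standard

@dataclass
class DQCheckResult:
    """One data quality check result, tagged with the harmonized terminology."""
    name: str
    category: DQCategory
    context: DQContext
    passed: bool

# Example: a hypothetical completeness check reported in the shared vocabulary.
result = DQCheckResult(
    name="non-null birth_date",
    category=DQCategory.COMPLETENESS,
    context=DQContext.VERIFICATION,
    passed=True,
)
print(f"{result.name}: {result.category.value} / {result.context.value} -> {result.passed}")
```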


Urban Geography | 2015

“Phasic” metropolitan settlers: a phase-based model for the distribution of households in US metropolitan regions

Hossein Estiri; Andy Krause; Mehdi P. Heris

In this article, we develop a model for explaining spatial patterns in the distribution of households across metropolitan regions in the United States. First, we use housing consumption and residential mobility theories to construct a hypothetical probability distribution function for the consumption of housing services across three phases of household life span. We then hypothesize a second probability distribution function for the offering of housing services based on the distance from city center(s) at the metropolitan scale. Intersecting the two hypothetical probability functions, we develop a phase-based model for the distribution of households in US metropolitan regions. We argue that phase one households (young adults) are more likely to reside in central city locations, whereas phase two and three households are more likely to select suburban locations, due to their respective housing consumption behaviors. We provide empirical validation of our theoretical model with the data from the 2010 US Census for 35 large metropolitan regions.
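
The sketch below is a toy numerical reading of this idea, not the authors' specification: it assumes a particular (normal-shaped) consumption curve per phase and a linear offering-by-distance curve purely to show how intersecting the two distributions yields a phase-specific density over distance from the city centre.

```python
# Toy sketch of the phase-based idea (illustrative only; the functional forms are
# assumptions, not the authors' model). Each phase has a hypothetical distribution
# over the quantity of housing services it consumes, and the quantity of housing
# services offered is assumed to grow with distance from the city centre; a change
# of variables then gives each phase's density over distance.
import numpy as np

distance = np.linspace(0.1, 30.0, 300)      # km from city centre (hypothetical range)
services = 1.0 + 0.8 * distance             # assumed offering curve s(d), increasing with d
ds_dd = np.gradient(services, distance)     # |s'(d)| for the change of variables

def phase_density(mean_services, sd):
    """Normal-shaped consumption pdf over services, mapped onto distance."""
    f = np.exp(-0.5 * ((services - mean_services) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    g = f * np.abs(ds_dd)
    return g / (g.sum() * (distance[1] - distance[0]))  # normalise over distance

phases = {
    "phase 1 (young adults)": (4.0, 2.0),   # lower consumption -> central locations
    "phase 2 (family phase)": (14.0, 4.0),
    "phase 3 (later life)":   (12.0, 5.0),
}

for name, (mu, sd) in phases.items():
    g = phase_density(mu, sd)
    print(f"{name}: most likely distance ~ {distance[np.argmax(g)]:.1f} km")
```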


Housing Policy Debate | 2016

Household Energy Consumption and Housing Choice in the U.S. Residential Sector

Hossein Estiri

Energy use in residential buildings accounted for 21% of U.S. CO2 emissions in 2013. Efforts to reduce energy use in the residential sector have been overly focused on improving the energy efficiency of buildings. This article incorporates the housing policy debate into energy policy, hoping to provide new opportunities for planners to participate in residential energy policy. Using data from the latest Residential Energy Consumption Survey, structural equation modeling has been applied to isolate the direct and indirect effects of household and housing characteristics on residential energy use. Results show that more than 80% of a household's indirect effect on energy consumption happens through building characteristics, which is characterized as the housing choice effect on energy consumption. Planners can participate in residential energy management efforts by influencing housing needs and priorities of communities towards more sustainable, compact housing units.
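
As a hedged illustration of the direct/indirect decomposition that path and structural equation models provide, the toy calculation below uses made-up path coefficients (not estimates from the article) to show how an indirect effect routed through housing characteristics is the product of two paths.

```python
# Minimal sketch of the direct/indirect effect decomposition used in path/structural
# equation models. The coefficients are made up for illustration and are NOT
# estimates from the article.
a = 0.60   # hypothetical path: household characteristic -> housing characteristic
b = 0.50   # hypothetical path: housing characteristic -> energy use
c = 0.07   # hypothetical direct path: household characteristic -> energy use

indirect = a * b                 # effect routed through housing choice
total = c + indirect
share_through_housing = indirect / total

print(f"indirect effect = {indirect:.2f}, total effect = {total:.2f}")
print(f"share of effect through housing characteristics = {share_through_housing:.0%}")
# With these toy numbers, ~81% of the effect flows through the building,
# mirroring the kind of '>80% indirect' result the article reports.
```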


eGEMs (Generating Evidence & Methods to improve patient outcomes) | 2016

Extracting Electronic Health Record Data in a Practice-Based Research Network: Lessons Learned from Collaborations with Translational Researchers

Allison M. Cole; Kari A. Stephens; Gina A. Keppel; Hossein Estiri; Laura Mae Baldwin

Context: The widespread adoption of electronic health records (EHRs) offers significant opportunities to conduct research with clinical data from patients outside traditional academic research settings. Because EHRs are designed primarily for clinical care and billing, significant challenges are inherent in the use of EHR data for clinical and translational research. Efficient processes are needed for translational researchers to overcome these challenges. The Data QUEST Coordinating Center (DQCC), which oversees Data Query Extraction Standardization Translation (Data QUEST), a primary-care EHR data-sharing infrastructure, created processes that guide EHR data extraction for clinical and translational research across these diverse practices. We describe these processes and their application in a case example. Case Description: The DQCC process for developing EHR data extractions not only supports researchers' access to EHR data, but supports this access for the purpose of answering scientific questions. This process requires complex coordination across multiple domains, including the following: (1) understanding the context of EHR data; (2) creating and maintaining a governance structure to support exchange of EHR data; and (3) defining data parameters that are used to extract data from the EHR. We use the Northwest-Alaska Pharmacogenomics Research Network (NWA-PGRN) as a case example that focuses on pharmacogenomic discovery and clinical applications to describe the DQCC process. The NWA-PGRN collaborates with Data QUEST to explore ways to leverage primary-care EHR data to support pharmacogenomics research. Findings: Preliminary analysis of the case example shows that initial decisions about how researchers define the study population can influence study outcomes. Major Themes and Conclusions: The experience of the DQCC demonstrates that coordinating centers provide expertise in helping researchers understand the context of EHR data, create and maintain governance structures, and guide the definition of parameters for data extractions. This expertise is critical to supporting research with EHR data. Replication of these strategies through coordinating centers may lead to more efficient translational research. Investigators must also consider the impact of initial decisions in defining study groups that may potentially affect outcomes. Acknowledgements: We acknowledge the Northwest Alaska Pharmacogenomics Research Network group for supporting the infrastructure and data collection, and Imara West for her assistance in data cleaning and analysis. This project was funded by the National Institute of General Medical Science (U01 GM092676) and the National Center for Advancing Translational Sciences of the National Institutes of Health (UL1TR000423). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
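
One of the coordination domains above is defining the data parameters used to extract data from the EHR; the snippet below sketches what such an extraction-parameter specification might look like. The structure, field names, and values are invented for illustration and are not Data QUEST's actual format or governance artifacts.

```python
# Hypothetical sketch of a study's EHR extraction-parameter specification. The
# structure and field names are invented; they are not Data QUEST's data model.
extraction_spec = {
    "study": "NWA-PGRN pharmacogenomics case example",
    "population": {
        "encounter_after": "2010-01-01",      # inclusion window (hypothetical)
        "age_range": [18, 89],
        "sites": ["site_A", "site_B"],        # participating practices (placeholders)
    },
    "data_elements": [
        {"domain": "demographics", "fields": ["birth_year", "sex", "race"]},
        {"domain": "medications", "fields": ["rxnorm_code", "start_date"]},
        {"domain": "labs", "fields": ["loinc_code", "value", "unit", "result_date"]},
    ],
    "governance": {"approval_id": "placeholder", "deidentified": True},
}

def validate_spec(spec):
    """Basic sanity checks a coordinating center might run before extraction."""
    assert spec["population"]["sites"], "at least one site must be named"
    assert spec["data_elements"], "at least one data element must be requested"
    return True

print(validate_spec(extraction_spec))
```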


Urban Studies | 2018

A Cohort Location Model of household sorting in US metropolitan regions

Hossein Estiri; Andy Krause

In this paper we propose a household sorting model for the 50 largest US metropolitan regions and evaluate the model using 2010 Census data. To approximate residential locations for household cohorts, we specify a Cohort Location Model (CLM) built upon two principal assumptions about housing consumption and metropolitan development/land use patterns. According to our model, the expected distance from the household's residential location to the city centre(s) increases with the age of the householder (as a proxy for changes in housing career over the life span). The CLM provides a flexible housing-based explanation for household sorting patterns in US metropolitan regions. Results from our analysis of US metropolitan regions show that households headed by individuals under the age of 35 are the most common cohort in centrally located areas. We also found that households headed by individuals over 35 are most prevalent in peripheral locations, but their sorting was not statistically different across space.
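
A minimal sketch of the CLM's central claim, assuming a linear form and made-up coefficients (not the authors' estimates): expected distance to the city centre increases monotonically with householder age.

```python
# Toy illustration of the CLM's central claim: expected distance from the residential
# location to the city centre increases with the age of the householder. The linear
# form and coefficients are assumptions, not estimates from the paper.
def expected_distance_km(householder_age, base_km=4.0, km_per_year=0.15):
    """Hypothetical E[distance | age] that increases monotonically with age."""
    return base_km + km_per_year * max(householder_age - 20, 0)

for age in (25, 35, 50, 70):
    print(f"householder age {age}: expected distance ~ {expected_distance_km(age):.1f} km")
```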


Journal of the American Medical Informatics Association | 2018

Exploring completeness in clinical data research networks with DQe-c

Hossein Estiri; Kari A. Stephens; Jeffrey G. Klann; Shawn N. Murphy

Objective: To provide an open source, interoperable, and scalable data quality assessment tool for evaluation and visualization of completeness and conformance in electronic health record (EHR) data repositories. Materials and Methods: This article describes the tool's design and architecture and gives an overview of its outputs using a sample dataset of 200,000 randomly selected patient records with an encounter since January 1, 2010, extracted from the Research Patient Data Registry (RPDR) at Partners HealthCare. All the code and instructions to run the tool and interpret its results are provided in the Supplementary Appendix. Results: DQe-c produces a web-based report that summarizes data completeness and conformance in a given EHR data repository through descriptive graphics and tables. Results from running the tool on the sample RPDR data are organized into 4 sections: load and test details, completeness test, data model conformance test, and test of missingness in key clinical indicators. Discussion: Open science, interoperability across major clinical informatics platforms, and scalability to large databases are key design considerations for DQe-c. Iterative implementation of the tool across different institutions directed us to improve the scalability and interoperability of the tool and find ways to facilitate local setup. Conclusion: EHR data quality assessment has been hampered by implementation of ad hoc processes. The architecture and implementation of DQe-c offer valuable insights for developing reproducible and scalable data science tools to assess, manage, and process data in clinical data repositories.
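
DQe-c itself ships with its own code and instructions (see the article's Supplementary Appendix); the snippet below is only a generic Python sketch of a column-level completeness test of the kind such a report summarizes, using pandas and invented column names.

```python
# Generic sketch of a column-level completeness test of the kind a DQe-c-style
# report summarizes. This is NOT DQe-c's code; column names and data are invented.
import pandas as pd

patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "birth_date": ["1980-02-01", None, "1975-07-19", None],
    "sex": ["F", "M", None, "F"],
})

# Fraction of non-missing values per column.
completeness = (1 - patients.isna().mean()).rename("fraction_complete")
print(completeness)

# A report could then flag columns below a chosen threshold, e.g. 0.9:
print(completeness[completeness < 0.9])
```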


Regional Studies | 2016

Differences in Residential Energy Use between US City and Suburban Households

Hossein Estiri

Estiri H. Differences in residential energy use between US city and suburban households, Regional Studies. This paper applies path analysis to household-level data from the US residential sector to study differences in energy consumption between self-identified city and suburban households. Results show that, on average, suburban households consume more energy in residential buildings than their city-dweller counterparts. This variation in energy consumption is due to differences in: (1) characteristics of the household and the housing unit, independently, and (2) interactions between the household and housing characteristics in the city and suburban households. Findings of this study provide new insights into how regional policies can be implemented differently in suburbs and cities to reduce energy consumption.


International Journal of Medical Informatics | 2016

Implementing partnership-driven clinical federated electronic health record data sharing networks

Kari A. Stephens; Nick Anderson; Ching Ping Lin; Hossein Estiri

OBJECTIVE: Building federated data sharing architectures requires supporting a range of data owners, effective and validated semantic alignment between data resources, and consistent focus on end-users. Establishing these resources requires development methodologies that support internal validation of data extraction and translation processes, sustaining meaningful partnerships, and delivering clear and measurable system utility. We describe findings from two federated data sharing case examples that detail critical factors, shared outcomes, and production environment results. METHODS: Two federated data sharing pilot architectures developed to support network-based research associated with the University of Washington's Institute of Translational Health Sciences provided the basis for the findings. A spiral model for implementation and evaluation was used to structure iterations of development and support knowledge sharing between the two network development teams, which cross-collaborated to support and manage common stages. RESULTS: We found that using a spiral model of software development and multiple cycles of iteration was effective in achieving early network design goals. Both networks required time- and resource-intensive efforts to establish a trusted environment to create the data sharing architectures. Both networks were challenged by the need for adaptive use cases to define and test utility. CONCLUSION: An iterative, cyclical model of development provided a process for developing trust with data partners and refining the design, and supported measurable success in the development of new federated data sharing architectures.
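
A highly simplified sketch of the federated pattern these case examples describe: each partner site answers the same aggregate query against its local data and returns only a count to the requester. The site registry and query functions below are hypothetical placeholders, not the pilot architectures themselves.

```python
# Highly simplified sketch of a federated query: the same aggregate question is sent
# to each partner site, which computes the answer against its local EHR extract and
# returns only a count. Site names and query functions here are hypothetical.
from typing import Callable, Dict

def site_a_count(query: str) -> int:
    return 120   # placeholder for a count computed against site A's local data

def site_b_count(query: str) -> int:
    return 85    # placeholder for a count computed against site B's local data

SITES: Dict[str, Callable[[str], int]] = {"site_A": site_a_count, "site_B": site_b_count}

def federated_count(query: str) -> Dict[str, int]:
    """Fan the query out to every registered site and collect per-site counts."""
    return {site: run(query) for site, run in SITES.items()}

results = federated_count("patients with an encounter since 2010-01-01")
print(results, "total =", sum(results.values()))
```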


Big Data Research | 2018

kluster: An Efficient Scalable Procedure for Approximating the Number of Clusters in Unsupervised Learning

Hossein Estiri; Behzad Abounia Omran; Shawn N. Murphy

The majority of the clinical observation data stored in large-scale Electronic Health Record (EHR) research data networks are unlabeled. Unsupervised clustering can provide invaluable tools for studying patient sub-groups in these data. Many of the popular unsupervised clustering algorithms depend on identifying the number of clusters. Multiple statistical methods are available to approximate the number of clusters in a dataset, but the available methods are computationally inefficient when applied to large amounts of data. Scalable analytical procedures are needed to extract knowledge from large clinical datasets. Using simulated, clinical, and public data, we developed and tested the kluster procedure for approximating the number of clusters in a large clinical dataset. The kluster procedure iteratively applies four statistical cluster number approximation methods to small subsets of data drawn randomly with replacement and recommends the most frequent and the mean number of clusters resulting from the iterations as the potential optimum number of clusters. Our results showed that kluster's most-frequent estimate, produced by iteratively applying a model-based clustering strategy using the Bayesian Information Criterion (BIC) to samples of 200–500 data points over 100 iterations, offers a reliable and scalable solution for approximating the number of clusters in unsupervised clustering. We provide the kluster procedure as an R package.
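
kluster is published as an R package; the following is a rough Python analogue of the procedure as described above, using scikit-learn's GaussianMixture and its BIC score as a stand-in for the paper's model-based clustering step.

```python
# Rough Python analogue of the described procedure (the published tool is an R
# package): repeatedly approximate the number of clusters on small random subsamples
# using BIC-based model-based clustering, then report the most frequent and the mean
# estimate. GaussianMixture is a stand-in for the paper's model-based step.
import numpy as np
from collections import Counter
from sklearn.mixture import GaussianMixture

def kluster_like(X, iterations=100, sample_size=300, k_max=10, seed=0):
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(iterations):
        idx = rng.choice(len(X), size=sample_size, replace=True)   # subsample with replacement
        sample = X[idx]
        bics = [GaussianMixture(n_components=k, random_state=0).fit(sample).bic(sample)
                for k in range(1, k_max + 1)]
        estimates.append(int(np.argmin(bics)) + 1)                  # k with the lowest BIC
    most_frequent = Counter(estimates).most_common(1)[0][0]
    return most_frequent, float(np.mean(estimates))

# Toy usage on synthetic data with 3 well-separated clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(500, 2)) for c in (0.0, 3.0, 6.0)])
print(kluster_like(X, iterations=20))   # fewer iterations here just to keep the demo fast
```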


eGEMs (Generating Evidence & Methods to improve patient outcomes) | 2017

DQe-v: A Database-Agnostic Framework for Exploring Variability in Electronic Health Record Data Across Time and Site Location

Hossein Estiri; Kari A. Stephens

Data variability is a commonly observed phenomenon in Electronic Health Record (EHR) data networks. A common question asked in scientific investigations of EHR data is whether cross-site and cross-time variability reflects an underlying data quality error at one or more contributing sites or actual differences driven by idiosyncrasies of the healthcare settings. Although research analysts and data scientists commonly use statistical methods to detect and account for variability in analytic datasets, self-service tools to facilitate exploring cross-organizational variability in EHR data warehouses are lacking and could benefit from meaningful data visualizations. DQe-v, an interactive, database-agnostic tool for visually exploring variability in EHR data, provides such a solution. DQe-v is built on an open source platform, the R statistical software, with annotated scripts and a readme document that make it fully reproducible. To illustrate and describe the functionality of DQe-v, we walk through its readme document, which includes a complete guide to installation, running the program, and interpreting the outputs. We also provide annotated R scripts and an example dataset as supplemental materials. DQe-v offers a self-service tool to visually explore data variability within EHR datasets irrespective of the data model. GitHub and CIELO offer hosting and distribution of the tool and can facilitate collaboration across any interested community of users as we target improving usability, efficiency, and interoperability.
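
DQe-v itself is R-based and distributed with its own scripts; the snippet below is only a generic Python sketch, with invented records, of the kind of cross-site, cross-month variability summary such a tool visualizes.

```python
# Generic sketch (not DQe-v itself, which is R-based) of a cross-site, cross-month
# variability summary of the kind such a tool visualizes. The records are invented;
# site A's March value drops sharply to mimic a possible data quality issue.
import pandas as pd

records = pd.DataFrame({
    "site":  ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "month": ["2016-01", "2016-02", "2016-03"] * 3,
    "pct_records_with_bmi": [62.0, 61.5, 12.0,   # site A
                             63.0, 64.0, 62.5,   # site B
                             60.0, 61.0, 61.5],  # site C
})

pivot = records.pivot(index="month", columns="site", values="pct_records_with_bmi")
print(pivot)

# Flag site-months far from the cross-site median for that month (an arbitrary
# 20-percentage-point threshold), a crude stand-in for the visual inspection the
# tool supports.
deviation = pivot.sub(pivot.median(axis=1), axis=0).abs()
print(deviation > 20)
```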

Collaboration


Dive into Hossein Estiri's collaborations.

Top Co-Authors

Andy Krause

University of Melbourne


Hyunggu Jung

University of Washington


Mehdi P. Heris

University of Colorado Denver
