Publications


Featured research published by Ian Fore.


Nature Genetics | 2017

Finding useful data across multiple biomedical data repositories using DataMed

Lucila Ohno-Machado; Susanna-Assunta Sansone; George Alter; Ian Fore; Jeffrey S. Grethe; Hua Xu; Alejandra Gonzalez-Beltran; Philippe Rocca-Serra; Anupama E. Gururaj; Elizabeth A. Bell; Ergin Soysal; Nansu Zong; Hyeoneui Kim

The value of broadening searches for data across multiple repositories has been identified by the biomedical research community. As part of the US National Institutes of Health (NIH) Big Data to Knowledge initiative, we work with an international community of researchers, service providers and knowledge experts to develop and test a data index and search engine, which are based on metadata extracted from various data sets in a range of repositories. DataMed is designed to be, for data, what PubMed has been for the scientific literature. DataMed supports the findability and accessibility of data sets. These characteristics, along with interoperability and reusability, compose the four FAIR principles to facilitate knowledge discovery in today’s big data–intensive science landscape.


Scientific Data | 2017

DATS, the data tag suite to enable discoverability of datasets

Susanna-Assunta Sansone; Alejandra Gonzalez-Beltran; Philippe Rocca-Serra; George Alter; Jeffrey S. Grethe; Hua Xu; Ian Fore; Jared Lyle; Anupama E. Gururaj; Xiaoling Chen; Hyeoneui Kim; Nansu Zong; Yueling Li; Ruiling Liu; I. Burak Ozyurt; Lucila Ohno-Machado

Today’s science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the National Institutes of Health (NIH)’s Big Data to Knowledge (BD2K) initiative, we have designed and implemented the DAta Tag Suite (DATS) model to support the DataMed data discovery index. DataMed’s goal is to be for data what PubMed has been for the scientific literature. Akin to the Journal Article Tag Suite (JATS) used in PubMed, the DATS model enables submission of metadata on datasets to DataMed. DATS has a core set of elements, which are generic and applicable to any type of dataset, and an extended set that can accommodate more specialized data types. DATS is a platform-independent model also available as an annotated serialization in schema.org, which in turn is widely used by major search engines like Google, Microsoft, Yahoo and Yandex.


Journal of the American Medical Informatics Association | 2012

Life sciences domain analysis model

Robert R. Freimuth; Elaine T. Freund; Lisa Schick; Mukesh K. Sharma; Grace A. Stafford; Baris E. Suzek; Joyce Hernandez; Jason Hipp; Jenny M. Kelley; Konrad Rokicki; Sue Pan; Andrew J. Buckler; Todd H. Stokes; Anna T. Fernandez; Ian Fore; Kenneth H. Buetow; Juli Klemm

Objective: Meaningful exchange of information is a fundamental challenge in collaborative biomedical research. To help address this, the authors developed the Life Sciences Domain Analysis Model (LS DAM), an information model that provides a framework for communication among domain experts and technical teams developing information systems to support biomedical research. The LS DAM is harmonized with the Biomedical Research Integrated Domain Group (BRIDG) model of protocol-driven clinical research. Together, these models can facilitate data exchange for translational research. Materials and methods: The content of the LS DAM was driven by analysis of life sciences and translational research scenarios and the concepts in the model are derived from existing information models, reference models and data exchange formats. The model is represented in the Unified Modeling Language and uses ISO 21090 data types. Results: The LS DAM v2.2.1 is comprised of 130 classes and covers several core areas including Experiment, Molecular Biology, Molecular Databases and Specimen. Nearly half of these classes originate from the BRIDG model, emphasizing the semantic harmonization between these models. Validation of the LS DAM against independently derived information models, research scenarios and reference databases supports its general applicability to represent life sciences research. Discussion: The LS DAM provides unambiguous definitions for concepts required to describe life sciences research. The processes established to achieve consensus among domain experts will be applied in future iterations and may be broadly applicable to other standardization efforts. Conclusions: The LS DAM provides common semantics for life sciences research. Through harmonization with BRIDG, it promotes interoperability in translational science.


bioRxiv | 2016

DataMed: Finding useful data across multiple biomedical data repositories

Lucila Ohno-Machado; Susanna-Assunta Sansone; George Alter; Ian Fore; Jeffrey S. Grethe; Hua Xu; Alejandra Gonzalez-Beltran; Philippe Rocca-Serra; Ergin Soysal; Nansu Zong; Hyeoneui Kim

The value of broadening searches for data across multiple repositories has been identified by the biomedical research community. As part of the NIH Big Data to Knowledge initiative, we work with an international community of researchers, service providers and knowledge experts to develop and test a data index and search engine, which are based on metadata extracted from various datasets in a range of repositories. DataMed is designed to be, for data, what PubMed has been for the scientific literature. DataMed supports Findability and Accessibility of datasets. These characteristics - along with Interoperability and Reusability - compose the four FAIR principles to facilitate knowledge discovery in today’s big data-intensive science landscape.


Archive | 2010

The caBIG® Life Sciences Distribution

Juli Klemm; Anand Basu; Ian Fore; Aris Floratos; George Komatsoulis

caBIG® is a virtual network of organizations developing and adopting interoperable databases and analytical tools to facilitate translational cancer research (von Eschenbach and Buetow 2007). It is an open-source, open-access program, and all the tools and resources are freely available to the research community. The National Cancer Institute is developing resources to assist enterprise-wide adoption of the caBIG® tools. To this end, we have bundled mature software tools together to facilitate easy adoption and installation. The Life Sciences Distribution (LSD) is comprised of tools to support the continuum of translational research: caArray, for the management and annotation of microarray data; caTissue, to support the collection, annotation, and distribution of biospecimens; the Clinical Trials Object Data System, for the sharing of clinical trials information; the National Biomedical Imaging Archive, for annotation, storage, and sharing of in vivo images; cancer Genome Wide Association Studies, for publishing and mining data from GWAS studies; and geWorkbench, supporting the integrated analysis and annotation of expression and sequence data. All the LSD tools are connected to caGrid (Saltz et al. 2006), which makes it possible for the databases at multiple institutions to be interconnected to support data sharing and integration.


Journal of the American Medical Informatics Association | 2018

DataMed – an open source discovery index for finding biomedical datasets

Xiaoling Chen; Anupama E. Gururaj; Burak Ozyurt; Ruiling Liu; Ergin Soysal; Trevor Cohen; Firat Tiryaki; Yueling Li; Nansu Zong; Min Jiang; Deevakar Rogith; Mandana Salimi; Hyeoneui Kim; Philippe Rocca-Serra; Alejandra Gonzalez-Beltran; Claudiu Farcas; Todd R. Johnson; Ron Margolis; George Alter; Susanna-Assunta Sansone; Ian Fore; Lucila Ohno-Machado; Jeffrey S. Grethe; Hua Xu

Objective: Finding relevant datasets is important for promoting data reuse in the biomedical domain, but it is challenging given the volume and complexity of biomedical data. Here we describe the development of an open source biomedical data discovery system called DataMed, with the goal of promoting the building of additional data indexes in the biomedical domain. Materials and Methods: DataMed, which can efficiently index and search diverse types of biomedical datasets across repositories, is developed through the National Institutes of Health–funded biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium. It consists of 2 main components: (1) a data ingestion pipeline that collects and transforms original metadata information to a unified metadata model, called DatA Tag Suite (DATS), and (2) a search engine that finds relevant datasets based on user-entered queries. In addition to describing its architecture and techniques, we evaluated individual components within DataMed, including the accuracy of the ingestion pipeline, the prevalence of the DATS model across repositories, and the overall performance of the dataset retrieval engine. Results and Conclusion: Our manual review shows that the ingestion pipeline could achieve an accuracy of 90% and core elements of DATS had varied frequency across repositories. On a manually curated benchmark dataset, the DataMed search engine achieved an inferred average precision of 0.2033 and a precision at 10 (P@10, the fraction of relevant results among the top 10 search results) of 0.6022, by implementing advanced natural language processing and terminology services. Currently, we have made the DataMed system publicly available as an open source package for the biomedical community.


Cancer Research | 2010

Abstract 99: The caBIG® life sciences distribution: Integrative tools to support translational research

Mervi Heiskanen; Juli Klemm; Rakesh Nagarajan; Mukesh K. Sharma; Chandrakant Talele; Stephen Goldstein; Chris Piepenbring; Shine Jacob; Eric Tavela; John P. Marple; Ian Fore; Timothy J. Andrews; Ngoc T. Nguyen; Aris Floratos; Anand Basu; Leslie Derr

Proceedings: AACR 101st Annual Meeting 2010; Apr 17-21, 2010; Washington, DC

The cancer Biomedical Informatics Grid® (caBIG®) is a collaborative network designed to accelerate the translation of discoveries from research to clinical care. This extensible informatics platform integrates diverse data types and supports interoperable software tools in the areas of clinical sciences, biospecimen management, imaging, and basic research. The Life Sciences Distribution (LSD) product bundle brings together a range of tools that support translational research, all connected via grid network technology. Tools included in the bundle support management and annotation of microarray data (caArray), biospecimens (caTissue Suite), imaging data (NBIA), clinical data (CTODS) and genome-wide association studies (caGWAS). LSD also includes integrative tools that allow scientists to search data across different data repositories connected to the grid, and analyze and integrate these data across different data types. geWorkbench supports integrative analysis and visualization of expression data, sequences, pathways and protein structures. cancer Bench-to-Bedside (caB2B) allows end-users to search biospecimens, images and array data across the grid and makes it possible to execute queries such as: “Are there any gene expression microarray data available from patients with Stage III lung cancer and are there corresponding in vivo images available for the affected patients?” Such a query would potentially span information federated across several instances of caArray, caTissue, and NBIA around the world. caIntegrator2 allows researchers to set up study-specific custom web portals that bring together heterogeneous clinical, microarray and in vivo imaging data. This tool provides a graphical user interface to allow a study author to “point” to data of interest in systems on the grid and to then bring that data (or pointers to it, in the case of images) into the data warehouse. Once this information is in the caIntegrator2 environment, end-user scientists can then run advanced queries, perform correlative outcomes analysis using Kaplan-Meier survival plots, and access analysis and visualization tools on and off the grid. All tools in the LSD are open source and community members are encouraged to participate and contribute. More information on the LSD suite of products, including installation packages, user and installation guides, and links to exemplar installations can be found at https://cabig.nci.nih.gov/adopt/LSD/

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 101st Annual Meeting of the American Association for Cancer Research; 2010 Apr 17-21; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2010;70(8 Suppl):Abstract nr 99.


Archive | 2016

Metadata Mapping in bioCADDIE: Challenging Cases

Nansu Zong; Diana Guijiarro; Sze Nga Wong; Shao Ling Soh; Muhammad Khan; Hyeoneui Kim; Jeffrey S. Grethe; Burak Ozyurt; Hua Xu; Xiaoling Chen; Ruiling Liu; Anupama E. Gururaj; Ergin Soysal; Yueling Li; Claudiu Farcas; Alejandra Gonzalez-Beltran; Philippe Rocca-Serra; Ian Fore; Ronald Margolis; George Alter; Susanna-Assunta Sansone; Lucila Ohno-Machado


AMIA | 2016

Development of DataMed, a Data Discovery Index Prototype by bioCADDIE: Laying the Groundwork for Biomedical Data Discovery.

Hua Xu; Jeffrey S. Grethe; Xiaoling Chen; Ruiling Liu; Ergin Soysal; Anupama E. Gururaj; Yueling Li; Ibrahim Burak Ozyurt; Hyeoneui Kim; Trevor Cohen; Todd R. Johnson; Mandana Salimi; Saeid Pournejati; Min Jiang; Claudiu Farcas; Alejandra González Beltrán; Philippe Rocca-Serra; Muhammad F. Amith; Cui Tao; Ian Fore; Ronald Margolis; George Alter; Susanna-Assunta Sansone; Lucila Ohno-Machado


AMIA | 2016

A Scalable Dataset Indexing Infrastructure for the bioCADDIE Data Discovery System.

Jeffrey S. Grethe; Ibrahim Burak Ozyurt; Hua Xu; Xiaoling Chen; Ruiling Liu; Ergin Soysal; Anupama E. Gururaj; Hyeoneui Kim; Trevor Cohen; Todd R. Johnson; Mandana Salimi; Saeid Pournejati; Min Jiang; Claudiu Farcas; Alejandra González Beltrán; Philippe Rocca-Serra; Muhammad F. Amith; Cui Tao; Ian Fore; Ronald Margolis; George Alter; Susanna-Assunta Sansone; Lucila Ohno-Machado

Collaboration


Dive into Ian Fore's collaborations.

Top Co-Authors

Hua Xu

University of Texas Health Science Center at Houston

Hyeoneui Kim

University of California

Anupama E. Gururaj

University of Texas Health Science Center at Houston

Ergin Soysal

University of Texas Health Science Center at Houston
