Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Arzucan Özgür is active.

Publication


Featured researches published by Arzucan Özgür.


intelligent systems in molecular biology | 2008

Identifying gene-disease associations using centrality on a literature mined gene-interaction network

Arzucan Özgür; Thuy Vu; Güneş Erkan; Dragomir R. Radev

Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network. Results: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study. Availability: A web-based system for browsing the disease-specific gene-interaction networks is available at: http://gin.ncibi.org Contact: [email protected]


Nucleic Acids Research | 2009

Michigan molecular interactions r2: from interacting proteins to pathways.

V. Glenn Tarcea; Terry E. Weymouth; Alexander S. Ade; Aaron V. Bookvich; Jing Gao; Vasudeva Mahavisno; Zach Wright; Adriane Chapman; Magesh Jayapandian; Arzucan Özgür; Yuanyuan Tian; James D. Cavalcoli; Barbara Mirel; Jignesh M. Patel; Dragomir R. Radev; Brian D. Athey; David J. States; H. V. Jagadish

Molecular interaction data exists in a number of repositories, each with its own data format, molecule identifier and information coverage. Michigan molecular interactions (MiMI) assists scientists searching through this profusion of molecular interaction data. The original release of MiMI gathered data from well-known protein interaction databases, and deep merged this information while keeping track of provenance. Based on the feedback received from users, MiMI has been completely redesigned. This article describes the resulting MiMI Release 2 (MiMIr2). New functionality includes extension from proteins to genes and to pathways; identification of highlighted sentences in source publications; seamless two-way linkage with Cytoscape; query facilities based on MeSH/GO terms and other concepts; approximate graph matching to find relevant pathways; support for querying in bulk; and a user focus-group driven interface design. MiMI is part of the NIHs; National Center for Integrative Biomedical Informatics (NCIBI) and is publicly available at: http://mimi.ncibi.org.


international symposium on computer and information sciences | 2005

Text categorization with class-based and corpus-based keyword selection

Arzucan Özgür; Levent Özgür; Tunga Güngör

In this paper, we examine the use of keywords in text categorization with SVM. In contrast to the usual belief, we reveal that using keywords instead of all words yields better performance both in terms of accuracy and time. Unlike the previous studies that focus on keyword selection metrics, we compare the two approaches for keyword selection. In corpus-based approach, a single set of keywords is selected for all classes. In class-based approach, a distinct set of keywords is selected for each class. We perform the experiments with the standard Reuters-21578 dataset, with both boolean and tf-idf weighting. Our results show that although tf-idf weighting performs better, boolean weighting can be used where time and space resources are limited. Corpus-based approach with 2000 keywords performs the best. However, for small number of keywords, class-based approach outperforms the corpus-based approach with the same number of keywords.


Bioinformatics | 2013

PHISTO: pathogen–host interaction search tool

Saliha Durmuş Tekir; Tunahan Çakır; Emre Ardıç; Ali Semih Sayılırbaş; Gökhan Konuk; Mithat Konuk; Hasret Sarıyer; Azat Uğurlu; İlknur Karadeniz; Arzucan Özgür; Fatih Erdogan Sevilgen; Kutlu O. Ulgen

SUMMARY Knowledge of pathogen-host protein interactions is required to better understand infection mechanisms. The pathogen-host interaction search tool (PHISTO) is a web-accessible platform that provides relevant information about pathogen-host interactions (PHIs). It enables access to the most up-to-date PHI data for all pathogen types for which experimentally verified protein interactions with human are available. The platform also offers integrated tools for visualization of PHI networks, graph-theoretical analysis of targeted human proteins, BLAST search and text mining for detecting missing experimental methods. PHISTO will facilitate PHI studies that provide potential therapeutic targets for infectious diseases. AVAILABILITY http://www.phisto.org. CONTACT [email protected] SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.


Journal of Biomedical Semantics | 2011

Mining of vaccine-associated IFN-γ gene interaction networks using the Vaccine Ontology.

Arzucan Özgür; Zuoshuang Xiang; Dragomir R. Radev; Yongqun He

BackgroundInterferon-gamma (IFN-γ) is vital in vaccine-induced immune defense against bacterial and viral infections and tumor. Our recent study demonstrated the power of a literature-based discovery method in extraction and comparison of the IFN-γ and vaccine-mediated gene interaction networks. The Vaccine Ontology (VO) contains a hierarchy of vaccine names. It is hypothesized that the application of VO will enhance the prediction of IFN-γ and vaccine-mediated gene interaction network.ResultsIn this study, 186 specific vaccine names listed in the Vaccine Ontology (VO) and their semantic relations were used for possible improved retrieval of the IFN-γ and vaccine associated gene interactions. The application of VO allows discovery of 38 more genes and 60 more interactions. Comparison of different layers of IFN-γ networks and the example BCG vaccine-induced subnetwork led to generation of new hypotheses. By analyzing all discovered genes using centrality metrics, 32 genes were ranked high in the VO-based IFN-γ vaccine network using four centrality scores. Furthermore, 28 specific vaccines were found to be associated with these top 32 genes. These specific vaccine-gene associations were further used to generate a network of vaccine-vaccine associations. The BCG and LVS vaccines are found to be the most central vaccines in the vaccine-vaccine association network.ConclusionOur results demonstrate that the combined usages of biomedical ontologies and centrality-based literature mining are able to significantly facilitate discovery of gene interaction networks and gene-concept associations.AvailabilityVO is available at: http://www.violinet.org/vaccineontology; and the SVM edit kernel for gene interaction extraction is available at: http://www.violinet.org/ifngvonet/int_ext_svm.zip


empirical methods in natural language processing | 2009

Detecting Speculations and their Scopes in Scientific Text

Arzucan Özgür; Dragomir R. Radev

Distinguishing speculative statements from factual ones is important for most biomedical text mining applications. We introduce an approach which is based on solving two sub-problems to identify speculative sentence fragments. The first sub-problem is identifying the speculation keywords in the sentences and the second one is resolving their linguistic scopes. We formulate the first sub-problem as a supervised classification task, where we classify the potential keywords as real speculation keywords or not by using a diverse set of linguistic features that represent the contexts of the keywords. After detecting the actual speculation keywords, we use the syntactic structures of the sentences to determine their scopes.


Frontiers in Microbiology | 2015

A review on computational systems biology of pathogen-host interactions.

Saliha Durmuş; Tunahan Çakır; Arzucan Özgür; Reinhard Guthke

Pathogens manipulate the cellular mechanisms of host organisms via pathogen–host interactions (PHIs) in order to take advantage of the capabilities of host cells, leading to infections. The crucial role of these interspecies molecular interactions in initiating and sustaining infections necessitates a thorough understanding of the corresponding mechanisms. Unlike the traditional approach of considering the host or pathogen separately, a systems-level approach, considering the PHI system as a whole is indispensable to elucidate the mechanisms of infection. Following the technological advances in the post-genomic era, PHI data have been produced in large-scale within the last decade. Systems biology-based methods for the inference and analysis of PHI regulatory, metabolic, and protein–protein networks to shed light on infection mechanisms are gaining increasing demand thanks to the availability of omics data. The knowledge derived from the PHIs may largely contribute to the identification of new and more efficient therapeutics to prevent or cure infections. There are recent efforts for the detailed documentation of these experimentally verified PHI data through Web-based databases. Despite these advances in data archiving, there are still large amounts of PHI data in the biomedical literature yet to be discovered, and novel text mining methods are in development to unearth such hidden data. Here, we review a collection of recent studies on computational systems biology of PHIs with a special focus on the methods for the inference and analysis of PHI networks, covering also the Web-based databases and text-mining efforts to unravel the data hidden in the literature.


BioMed Research International | 2010

Literature-based discovery of IFN-gamma and vaccine-mediated gene interaction networks.

Arzucan Özgür; Zuoshuang Xiang; Dragomir R. Radev; Yongqun He

Interferon-gamma (IFN-γ) regulates various immune responses that are often critical for vaccine-induced protection. In order to annotate the IFN-γ-related gene interaction network from a large amount of IFN-γ research reported in the literature, a literature-based discovery approach was applied with a combination of natural language processing (NLP) and network centrality analysis. The interaction network of human IFN-γ (Gene symbol: IFNG) and its vaccine-specific subnetwork were automatically extracted using abstracts from all articles in PubMed. Four network centrality metrics were further calculated to rank the genes in the constructed networks. The resulting generic IFNG network contains 1060 genes and 26313 interactions among these genes. The vaccine-specific subnetwork contains 102 genes and 154 interactions. Fifty six genes such as TNF, NFKB1, IL2, IL6, and MAPK8 were ranked among the top 25 by at least one of the centrality methods in one or both networks. Gene enrichment analysis indicated that these genes were classified in various immune mechanisms such as response to extracellular stimulus, lymphocyte activation, and regulation of apoptosis. Literature evidence was manually curated for the IFN-γ relatedness of 56 genes and vaccine development relatedness for 52 genes. This study also generated many new hypotheses worth further experimental studies.


international conference on machine learning and applications | 2014

Improving Named Entity Recognition for Morphologically Rich Languages Using Word Embeddings

Hakan Demir; Arzucan Özgür

In this paper, we addressed the Named Entity Recognition (NER) problem for morphologically rich languages by employing a semi-supervised learning approach based on neural networks. We adopted a fast unsupervised method for learning continuous vector representations of words, and used these representations along with language independent features to develop a NER system. We evaluated our system for the highly inflectional Turkish and Czech languages. We improved the state-of-the-art F-score obtained for Turkish without using gazetteers by 2.26% and for Czech by 1.53%. Unlike the previous state-of-the-art systems developed for these languages, our system does not make use of any language dependent features. Therefore, we believe it can easily be applied to other morphologically rich languages.


Journal of Biomedical Semantics | 2012

Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining

Junguk Hur; Arzucan Özgür; Zuoshuang Xiang; Yongqun He

BackgroundFever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses.ResultsOver 170,000 fever-related articles from PubMed abstracts and titles were retrieved and analysed at the sentence level using natural language processing techniques to identify genes and vaccines (including 186 Vaccine Ontology terms) as well as their interactions. This resulted in a generic fever network consisting of 403 genes and 577 gene interactions. A vaccine-specific fever sub-network consisting of 29 genes and 28 gene interactions was extracted from articles that are related to both fever and vaccines. In addition, gene-vaccine interactions were identified. Vaccines (including 4 specific vaccine names) were found to directly interact with 26 genes. Gene set enrichment analysis was performed using the genes in the generated interaction networks. Moreover, the genes in these networks were prioritized using network centrality metrics. Making scientific discoveries and generating new hypotheses were possible by using network centrality and gene set enrichment analyses. For example, our study found that the genes in the generic fever network were more enriched in cell death and responses to wounding, and the vaccine sub-network had more gene enrichment in leukocyte activation and phosphorylation regulation. The most central genes in the vaccine-specific fever network are predicted to be highly relevant to vaccine-induced fever, whereas genes that are central only in the generic fever network are likely to be highly relevant to generic fever responses. Interestingly, no Toll-like receptors (TLRs) were found in the gene-vaccine interaction network. Since multiple TLRs were found in the generic fever network, it is reasonable to hypothesize that vaccine-TLR interactions may play an important role in inducing fever response, which deserves a further investigation.ConclusionsThis study demonstrated that ontology-based literature mining is a powerful method for analyzing gene interaction networks and generating new scientific hypotheses.

Collaboration


Dive into the Arzucan Özgür's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar

Yongqun He

University of Michigan

View shared research outputs
Top Co-Authors

Avatar

Junguk Hur

University of North Dakota

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge