Farshad Fotouhi
Wayne State University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Farshad Fotouhi.
Genome Biology | 2007
Jodi R Parrish; Jingkai Yu; Guozhen Liu; Julie A Hines; Jason E. Chan; Bernie A Mangiola; Huamei Zhang; Svetlana Pacifico; Farshad Fotouhi; Victor J. DiRita; Trey Ideker; Phillip C. Andrews; Russell L. Finley
BackgroundData from large-scale protein interaction screens for humans and model eukaryotes have been invaluable for developing systems-level models of biological processes. Despite this value, only a limited amount of interaction data is available for prokaryotes. Here we report the systematic identification of protein interactions for the bacterium Campylobacter jejuni, a food-borne pathogen and a major cause of gastroenteritis worldwide.ResultsUsing high-throughput yeast two-hybrid screens we detected and reproduced 11,687 interactions. The resulting interaction map includes 80% of the predicted C. jejuni NCTC11168 proteins and places a large number of poorly characterized proteins into networks that provide initial clues about their functions. We used the map to identify a number of conserved subnetworks by comparison to protein networks from Escherichia coli and Saccharomyces cerevisiae. We also demonstrate the value of the interactome data for mapping biological pathways by identifying the C. jejuni chemotaxis pathway. Finally, the interaction map also includes a large subnetwork of putative essential genes that may be used to identify potential new antimicrobial drug targets for C. jejuni and related organisms.ConclusionThe C. jejuni protein interaction map is one of the most comprehensive yet determined for a free-living organism and nearly doubles the binary interactions available for the prokaryotic kingdom. This high level of coverage facilitates pathway mapping and function prediction for a large number of C. jejuni proteins as well as orthologous proteins from other organisms. The broad coverage also facilitates cross-species comparisons for the identification of evolutionarily conserved subnetworks of protein interactions.
BMC Bioinformatics | 2004
Yi Lu; Shiyong Lu; Farshad Fotouhi; Youping Deng; Susan J. Brown
BackgroundIn recent years, clustering algorithms have been effectively applied in molecular biology for gene expression data analysis. With the help of clustering algorithms such as K-means, hierarchical clustering, SOM, etc, genes are partitioned into groups based on the similarity between their expression profiles. In this way, functionally related genes are identified. As the amount of laboratory data in molecular biology grows exponentially each year due to advanced technologies such as Microarray, new efficient and effective methods for clustering must be developed to process this growing amount of biological data.ResultsIn this paper, we propose a new clustering algorithm, Incremental Genetic K-means Algorithm (IGKA). IGKA is an extension to our previously proposed clustering algorithm, the Fast Genetic K-means Algorithm (FGKA). IGKA outperforms FGKA when the mutation probability is small. The main idea of IGKA is to calculate the objective value Total Within-Cluster Variation (TWCV) and to cluster centroids incrementally whenever the mutation probability is small. IGKA inherits the salient feature of FGKA of always converging to the global optimum. C program is freely available at http://database.cs.wayne.edu/proj/FGKA/index.htm.ConclusionsOur experiments indicate that, while the IGKA algorithm has a convergence pattern similar to FGKA, it has a better time performance when the mutation probability decreases to some point. Finally, we used IGKA to cluster a yeast dataset and found that it increased the enrichment of genes of similar function within the cluster.
acm symposium on applied computing | 2004
Yi Lu; Shiyong Lu; Farshad Fotouhi; Youping Deng; Susan J. Brown
In this paper, we propose a new clustering algorithm called Fast Genetic K-means Algorithm (FGKA). FGKA is inspired by the Genetic K-means Algorithm (GKA) proposed by Krishna and Murty in 1999 but features several improvements over GKA. Our experiments indicate that, while K-means algorithm might converge to a local optimum, both FGKA and GKA always converge to the global optimum eventually but FGKA runs much faster than GKA.
IEEE Transactions on Services Computing | 2009
Cui Lin; Shiyong Lu; Xubo Fei; Artem Chebotko; Darshan Pai; Zhaoqiang Lai; Farshad Fotouhi; Jing Hua
Scientific workflows have recently emerged as a new paradigm for scientists to formalize and structure complex and distributed scientific processes to enable and accelerate many scientific discoveries. In contrast to business workflows, which are typically control flow oriented, scientific workflows tend to be dataflow oriented, introducing a new set of requirements for system development. These requirements demand a new architectural design for scientific workflow management systems (SWFMSs). Although several SWFMSs have been developed that provide much experience for future research and development, a study from an architectural perspective is still missing. The main contributions of this paper are: 1) based on a comprehensive survey of the literature and identification of key requirements for SWFMSs, we propose the first reference architecture for SWFMSs; 2) according to the reference architecture, we further propose a service-oriented architecture for View (a VIsual sciEntific Workflow management system); 3) we implemented View to validate the feasibility of the proposed architectures; and 4) we present a View-based scientific workflow application system (SWFAS), called FiberFlow, to showcase the application of our View system.
data and knowledge engineering | 2009
Artem Chebotko; Shiyong Lu; Farshad Fotouhi
Most existing RDF stores, which serve as metadata repositories on the Semantic Web, use an RDBMS as a backend to manage RDF data. This motivates us to study the problem of translating SPARQL queries into equivalent SQL queries, which further can be optimized and evaluated by the relational query engine and their results can be returned as SPARQL query solutions. The main contributions of our research are: (i) We formalize a relational algebra based semantics of SPARQL, which bridges the gap between SPARQL and SQL query languages, and prove that our semantics is equivalent to the mapping-based semantics of SPARQL; (ii) Based on this semantics, we propose the first provably semantics preserving SPARQL-to-SQL translation for SPARQL triple patterns, basic graph patterns, optional graph patterns, alternative graph patterns, and value constraints; (iii) Our translation algorithm is generic and can be directly applied to existing RDBMS-based RDF stores; and (iv) We outline a number of simplifications for the SPARQL-to-SQL translation to generate simpler and more efficient SQL queries and extend our defined semantics and translation to support the bag semantics of a SPARQL query solution. The experimental study showed that our proposed generic translation can serve as a good alternative to existing schema dependent translations in terms of efficient query evaluation and/or ensured query result correctness.
ieee international conference on services computing | 2008
Cui Lin; Shiyong Lu; Zhaoqiang Lai; Artem Chebotko; Xubo Fei; Jing Hua; Farshad Fotouhi
Scientific workflows have recently emerged as a new paradigm for scientists to formalize and structure complex and distributed scientific processes to enable and accelerate many scientific discoveries. In contrast to business workflows, which are typically control flow oriented, scientific workflows tend to be dataflow oriented, introducing a new set of requirements for system development. These requirements demand a new architectural design for scientific workflow management systems (SWFMSs). Although several SWFMSs have been developed that provide much experience for future research and development, a study from an architectural perspective is still missing. The main contributions of this paper are: (i) based on a comprehensive survey of the literature and identification of key requirements for SWFMSs, we propose the first reference architecture for SWFMSs, (ii) in compliance with the reference architecture, we further propose a service-oriented architecture for VIEW (a VIsual sciEntificWorkflow management system), (iii) we implement VIEW to validate the feasibility of the proposed architectures, and (iv) we present two case studies to showcase the applications of our VIEW system.
acm multimedia | 2005
Changbo Yang; Ming Dong; Farshad Fotouhi
In an annotated image database, keywords are usually associated with images instead of individual regions, which poses a major challenge for any region based image annotation algorithm. In this paper, we propose to learn the correspondence between image regions and keywords through Multiple-Instance Learning (MIL). After a representative image region has been learned for a given keyword, we consider image annotation as a problem of image classification, in which each keyword is treated as a distinct class label. The classification problem is then addressed using the Bayesian framework. The proposed image annotation method is evaluated on an image database with 5,000 images.
Information Systems | 2007
Mustafa Atay; Artem Chebotko; Dapeng Liu; Shiyong Lu; Farshad Fotouhi
Storing and querying XML documents using a RDBMS is a challenging problem since one needs to resolve the conflict between the hierarchical, ordered nature of the XML data model and the flat, unordered nature of the relational data model. This conflict can be resolved by the following XML-to-Relational mappings: schema mapping, data mapping and query mapping. In this paper, we propose: (i) a lossless schema mapping algorithm to generate a database schema from a DTD, which makes several improvements over existing algorithms, (ii) two linear data mapping algorithms based on DOM and SAX, respectively, to map ordered XML data to relational data. To our best knowledge, there is no published linear schema-based data mapping algorithm for mapping ordered XML data to relational data. Experimental results are presented to show that our algorithms are efficient and scalable.
international conference on data mining | 2006
Manjeet Rege; Ming Dong; Farshad Fotouhi
In this paper, we present a novel graph theoretic approach to the problem of document-word co-clustering. In our approach, documents and words are modeled as the two vertices of a bipartite graph. We then propose isoperimetric co-clustering algorithm (ICA) - a new method for partitioning the document-word bipartite graph. ICA requires a simple solution to a sparse system of linear equations instead of the eigenvalue or SVD problem in the popular spectral co-clustering approach. Our extensive experiments performed on publicly available datasets demonstrate the advantages of ICA over spectral approach in terms of the quality, efficiency and stability in partitioning the document-word bipartite graph.
data and knowledge engineering | 2010
Artem Chebotko; Shiyong Lu; Xubo Fei; Farshad Fotouhi
Provenance metadata has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. The provenance management problem concerns the efficiency and effectiveness of the modeling, recording, representation, integration, storage, and querying of provenance metadata. Our approach to provenance management seamlessly integrates the interoperability, extensibility, and inference advantages of Semantic Web technologies with the storage and querying power of an RDBMS to meet the emerging requirements of scientific workflow provenance management. In this paper, we elaborate on the design of a relational RDF store, called RDFProv, which is optimized for scientific workflow provenance querying and management. Specifically, we propose: i) two schema mapping algorithms to map an OWL provenance ontology to a relational database schema that is optimized for common provenance queries; ii) three efficient data mapping algorithms to map provenance RDF metadata to relational data according to the generated relational database schema, and iii) a schema-independent SPARQL-to-SQL translation algorithm that is optimized on-the-fly by using the type information of an instance available from the input provenance ontology and the statistics of the sizes of the tables in the database. Experimental results are presented to show that our algorithms are efficient and scalable. The comparison with two popular relational RDF stores, Jena and Sesame, and two commercial native RDF stores, AllegroGraph and BigOWLIM, showed that our optimizations result in improved performance and scalability for provenance metadata management. Finally, our case study for provenance management in a real-life biological simulation workflow showed the production quality and capability of the RDFProv system. Although presented in the context of scientific workflow provenance management, many of our proposed techniques apply to general RDF data management as well.