Zafer Aydin | Researchain

Publication

Featured research published by Zafer Aydin.

BMC Bioinformatics | 2006

Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

Zafer Aydin; Yucel Altunbasak; Mark Borodovsky

Background: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence algorithms assume that information about other (homologous) proteins is not available, while algorithms of the second type assume that such information is available and use it intensively. Single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of secondary structure prediction from a single sequence is not as high as when additional evolutionary information is present.

Results: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize in different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those of BSPSS and PSIPRED, tested under the single-sequence condition.

Conclusions: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database, it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable to current similarity search methods.
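
The three-state-per-residue accuracy mentioned in this abstract (commonly written Q3) is the fraction of residues whose predicted state matches the observed one. The following is a minimal sketch, assuming both sequences are given as equal-length strings over the letters H (helix), E (strand), and L (loop). The function name `q3_accuracy` and the example strings are illustrative and do not come from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of positions where predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical example: 10 residues, 8 of them predicted correctly.
observed = "HHHHLLEEEL"
predicted = "HHHLLLEEEE"
print(q3_accuracy(predicted, observed))  # 0.8
```

Per-state measures can be obtained the same way by restricting the count to positions with a given observed state.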

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011

Bayesian Models and Algorithms for Protein β-Sheet Prediction

Zafer Aydin; Yucel Altunbasak; Hakan Erdogan

Prediction of the 3D structure greatly benefits from the information related to secondary structure, solvent accessibility, and nonlocal contacts that stabilize a protein's structure. We address the problem of β-sheet prediction defined as the prediction of β-strand pairings, interaction types (parallel or antiparallel), and β-residue interactions (or contact maps). We introduce a Bayesian approach for proteins with six or fewer β-strands in which we model the conformational features in a probabilistic framework by combining the amino acid pairing potentials with a priori knowledge of β-strand organizations. To select the optimum β-sheet architecture, we significantly reduce the search space by heuristics that enforce the amino acid pairs with strong interaction potentials. In addition, we find the optimum pairwise alignment between β-strands using dynamic programming in which we allow any number of gaps in an alignment to model β-bulges more effectively. For proteins with more than six β-strands, we first compute β-strand pairings using the BetaPro method. Then, we compute gapped alignments of the paired β-strands and choose the interaction types and β-residue pairings with maximum alignment scores. We performed a 10-fold cross-validation experiment on the BetaSheet916 set and obtained significant improvements in the prediction accuracy.
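
The gapped strand alignment and the choice of interaction type described in this abstract can be illustrated with a small dynamic program. This is only a sketch under stated assumptions: it uses a Needleman-Wunsch style global alignment with a linear gap penalty, models the antiparallel case by reversing one strand, and scores residue pairs with a made-up function `toy_pair_score`. The paper's actual pairing potentials, gap model, and Bayesian scoring are not reproduced here.

```python
from typing import Callable, Tuple


def align_strands(a: str, b: str,
                  pair_score: Callable[[str, str], float],
                  gap: float = -1.0) -> float:
    """Best global alignment score of strands a and b, allowing any number of gaps."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),  # pair two residues
                score[i - 1][j] + gap,                                 # gap in b
                score[i][j - 1] + gap,                                 # gap in a
            )
    return score[n][m]


def best_interaction(a: str, b: str,
                     pair_score: Callable[[str, str], float],
                     gap: float = -1.0) -> Tuple[str, float]:
    """Pick the interaction type with the higher alignment score."""
    parallel = align_strands(a, b, pair_score, gap)
    antiparallel = align_strands(a, b[::-1], pair_score, gap)
    if parallel >= antiparallel:
        return ("parallel", parallel)
    return ("antiparallel", antiparallel)


def toy_pair_score(x: str, y: str) -> float:
    """Hypothetical potential: reward hydrophobic-hydrophobic pairs, penalize the rest."""
    hydrophobic = set("VILFMWY")
    return 2.0 if x in hydrophobic and y in hydrophobic else -0.5


print(align_strands("VKV", "VV", toy_pair_score))        # 3.0
print(best_interaction("VKVK", "KVKV", toy_pair_score))  # ('antiparallel', 3.0)
```

In the first call the best alignment leaves K unpaired, which is the kind of gap that would stand in for a β-bulge in this simplified picture.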

BMC Bioinformatics | 2011

Learning sparse models for a dynamic Bayesian network classifier of protein secondary structure

Zafer Aydin; Ajit Paul Singh; Jeff A. Bilmes; William Stafford Noble

Background: Protein secondary structure prediction provides insight into protein function and is a valuable preliminary step for predicting the 3D structure of a protein. Dynamic Bayesian networks (DBNs) and support vector machines (SVMs) have been shown to provide state-of-the-art performance in secondary structure prediction. As the size of the protein database grows, it becomes feasible to use a richer model in an effort to capture subtle correlations among the amino acids and the predicted labels. In this context, it is beneficial to derive sparse models that discourage over-fitting and provide biological insight.

Results: In this paper, we first show that we are able to obtain accurate secondary structure predictions. Our per-residue accuracy on a well-established and difficult benchmark (CB513) is 80.3%, which is comparable to the state-of-the-art evaluated on this dataset. We then introduce an algorithm for sparsifying the parameters of a DBN. Using this algorithm, we can automatically remove up to 70-95% of the parameters of a DBN while maintaining the same level of predictive accuracy on the SD576 set. At 90% sparsity, we are able to compute predictions three times faster than a fully dense model evaluated on the SD576 set. We also demonstrate, using simulated data, that the algorithm is able to recover true sparse structures with high accuracy, and using real data, that the sparse model identifies known correlation structure (local and non-local) related to different classes of secondary structure elements.

Conclusions: We present a secondary structure prediction method that employs dynamic Bayesian networks and support vector machines. We also introduce an algorithm for sparsifying the parameters of the dynamic Bayesian network. The sparsification approach yields a significant speed-up in generating predictions, and we demonstrate that the amino acid correlations identified by the algorithm correspond to several known features of protein secondary structure. Datasets and source code used in this study are available at http://noble.gs.washington.edu/proj/pssp.

BMC Bioinformatics | 2010

Using machine learning to speed up manual image annotation: application to a 3D imaging protocol for measuring single cell gene expression in the developing C. elegans embryo.

Zafer Aydin; John I. Murray; Robert H. Waterston; William Stafford Noble

Background: Image analysis is an essential component in many biological experiments that study gene expression, cell cycle progression, and protein localization. A protocol for tracking the expression of individual C. elegans genes was developed that collects image samples of a developing embryo by 3-D time-lapse microscopy. In this protocol, a program called StarryNite performs the automatic recognition of fluorescently labeled cells and traces their lineage. However, due to the amount of noise present in the data and the challenges introduced by the increasing number of cells in later stages of development, this program is not error-free. In the current version, the error correction (i.e., editing) is performed manually using a graphical interface tool named AceTree, which was developed specifically for this task. For a single experiment, this manual annotation task takes several hours.

Results: In this paper, we reduce the time required to correct errors made by StarryNite. We target one of the most frequent error types (movements annotated as divisions) and train a support vector machine (SVM) classifier to decide whether a division call made by StarryNite is correct or not. We show, via cross-validation experiments on several benchmark data sets, that the SVM successfully identifies this type of error. A new version of StarryNite that includes the trained SVM classifier is available at http://starrynite.sourceforge.net.

Conclusions: We demonstrate the utility of a machine learning approach to error annotation for StarryNite. In the process, we also provide some general methodologies for developing and validating a classifier with respect to a given pattern recognition task.

IEEE Signal Processing Magazine | 2006

A signal processing application in genomic research: protein secondary structure prediction

Zafer Aydin; Yucel Altunbasak

The digital nature of genomic information makes it suitable for the application of signal processing techniques to better analyze and understand the characteristics of DNA, proteins, and their interaction. Prediction of genes, protein structure, and protein function relies heavily on pattern recognition techniques, in which hidden Markov models, neural networks, and support vector machines play a central role. Signal processing offers a variety of methods from pattern recognition and network analysis for the diagnosis and therapy of genetic diseases. In this paper, we focus on protein secondary structure prediction and discuss the problems in the single-sequence setting.

IEEE Transactions on Signal Processing | 2007

Bayesian Protein Secondary Structure Prediction With Near-Optimal Segmentations

Zafer Aydin; Yucel Altunbasak; Hakan Erdogan

Secondary structure prediction is an invaluable tool in determining the 3-D structure and function of proteins. Typically, protein secondary structure prediction methods suffer from low accuracy in beta-strand predictions, where nonlocal interactions play a significant role. There is a considerable need to model such long-range interactions that contribute to the stabilization of a protein molecule. In this paper, we introduce an alternative decoding technique for the hidden semi-Markov model (HSMM) originally employed in the BSPSS algorithm and further developed in the IPSSP algorithm. The proposed method is based on the N-best paradigm, where a set of most likely segmentations is computed. To generate suboptimal segmentations (i.e., alternative prediction sequences), we developed two N-best search algorithms. The first one is an A* stack decoder algorithm that extends paths (or hypotheses) by one symbol at each iteration. The second algorithm locally keeps the end positions of the highest scoring K previous segments and performs backtracking. Both algorithms employ the hidden semi-Markov model described in Aydin et al. [5], and use Viterbi scoring to compute the N-best list. The availability of near-optimal segmentations and the utilization of the Viterbi scoring enable the sequences to be rescored using more complex dependency models that characterize nonlocal interactions in beta-sheets. After the score update, one can either keep the segmentations to be employed in 3-D structure prediction or predict the secondary structure by applying a weighted voting procedure to a set of top scoring M ≥ 1 segmentations. The accuracy measures of the N-best method when used to predict the secondary structure are shown to be comparable to or better than those of the classical Viterbi decoder (MAP estimator), tested under the single-sequence condition.
When no rescoring is applied, the stack decoder algorithm with sufficiently large M improves the overall sensitivity measure (Q3) of the Viterbi algorithm by 1.1%. At the same M value, the N-best Viterbi algorithm improves the Q3 measure by 0.25% as well as the sensitivity measures specific for each secondary structure type (Qobs alpha, Qobs beta, Qobs L). When the sequences are rescored using the posterior probability distribution computed by the posterior decoding algorithm (MPM estimator), N-best Viterbi improves the Q3 measure of the Viterbi algorithm by 2.6%. The rescored N-best list approach also enables us to generate suboptimal segmentations that are valid sequences (i.e., realizable from the hidden semi-Markov model). Although the N-best algorithms and the score update procedure brought significant improvements over the Viterbi algorithm, they were not able to outperform the posterior decoding algorithm in the single-sequence condition. Further improvements in the prediction accuracy should be possible with the incorporation of sophisticated models for nonlocal interactions and other physical constraints that stabilize the overall structure of a protein.
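
The weighted voting step over the top scoring M segmentations can be pictured as a per-residue vote. The sketch below assumes each segmentation is a string over H/E/L of the same length and that each carries a non-negative weight, for example a normalized score. The abstract does not state how the weights are derived, so the function name `weighted_vote`, the weights, and the alphabetical tie-break are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_vote(segmentations: Sequence[str], weights: Sequence[float]) -> str:
    """Combine M candidate labelings into one by weighted per-position voting."""
    if not segmentations or len(segmentations) != len(weights):
        raise ValueError("need one weight per segmentation and at least one segmentation")
    length = len(segmentations[0])
    if any(len(seg) != length for seg in segmentations):
        raise ValueError("all segmentations must have the same length")

    consensus: List[str] = []
    for pos in range(length):
        tally: Dict[str, float] = defaultdict(float)
        for seg, weight in zip(segmentations, weights):
            tally[seg[pos]] += weight
        # Ties are broken alphabetically so the result is deterministic (assumed rule).
        consensus.append(max(sorted(tally), key=lambda state: tally[state]))
    return "".join(consensus)


# Hypothetical top-3 list with made-up weights.
candidates = ["HHHLLE", "HHLLLE", "HHHHLE"]
weights = [0.5, 0.3, 0.2]
print(weighted_vote(candidates, weights))  # HHHLLE
```

With M = 1 the vote simply returns the single best segmentation, which corresponds to ordinary Viterbi decoding.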

Multimedia Tools and Applications | 2015

BAUM-2: a multilingual audio-visual affective face database

Cigdem Eroglu Erdem; Çiğdem Turan; Zafer Aydin

Access to audio-visual databases that contain enough variety and are richly annotated is essential to assess the performance of algorithms in affective computing applications, which require emotion recognition from face and/or speech data. Most databases available today have been recorded under tightly controlled environments, are mostly acted, and do not contain speech data. We first present a semi-automatic method that can extract audio-visual facial video clips from movies and TV programs in any language. The method is based on automatic detection and tracking of faces in a movie until the face is occluded or a scene cut occurs. We also created a video-based database, named BAUM-2, which consists of annotated audio-visual facial clips in several languages. The collected clips simulate real-world conditions by containing various head poses, illumination conditions, accessories, temporary occlusions, and subjects with a wide range of ages. The proposed semi-automatic affective clip extraction method can easily be used to extend the database to contain clips in other languages. We also created an image-based facial expression database from the peak frames of the video clips, which is named BAUM-2i. Baseline image and video-based facial expression recognition results using state-of-the-art features and classifiers indicate that facial expression recognition under tough and close-to-natural conditions is quite challenging.

2015 3rd International Istanbul Smart Grid Congress and Fair (ICSG) | 2015

Short term electricity load forecasting: A case study of electric utility market in Turkey

Muhammed Yasin Ishik; Tolga Goze; İhsan Özcan; Vehbi Cagri Gungor; Zafer Aydin

With the recent developments in the energy sector, the pricing of electricity is now governed by the spot market, where a variety of market mechanisms are effective. After the new legislation of market liberalization in Turkey, competition based on hourly prices has received growing interest in the energy market, which has required generators and electric utility companies to add new dimensions to their scope of operation: short-term load and price forecasting. The field offers several opportunities, though it is not free from challenges. The dynamic behavior of the market price has caused the electric load to become variable and non-stationary. Furthermore, the number of nodes at which the load must be predicted is no longer constant and can no longer be estimated by experts alone. In this competitive scenario, statistical forecasting methods that can automatically and accurately process thousands of data samples are essential. The purpose of this study is to demonstrate the importance of short-term load forecasting, to show how it has received growing interest in Turkey, and to propose an artificial neural network that can forecast the short-term electricity load. Through detailed performance evaluations, we demonstrate that our forecasting method is capable of predicting the hourly load accurately.

IEEE Conference on Antenna Measurements & Applications | 2014

Design of a tri band 5-fingers shaped microstrip patch antenna with an adjustable resistor

Ashrf Aoad; Zafer Aydin; Erdal Korkmaz

This paper presents a tri-band 5-finger-shaped microstrip patch antenna, which initially resonates in two bands, at 3.2 GHz and 5.2 GHz, for VSWR < 2. The antenna is modified by adding an adjustable resistor between the conductor and the reflecting plane, giving a third resonant frequency of 2.4 GHz. A decrease in the return loss at 2.4 GHz is observed by modifying the value of the resistance. Impedance bandwidth and the resonant frequencies are examined with respect to the variability of the parameters of the antenna and the position of the adjustable resistor. The size of the antenna has been reduced by 57.9% in length and 14.06% in width. The proposed antenna can be used for 4G, WLAN, and Wi-MAX. The antenna is designed and optimized using the commercial CST software.
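
The VSWR criterion in this abstract can be related to return loss through the standard reflection-coefficient formulas: |Γ| = (VSWR - 1) / (VSWR + 1) and RL = -20 log10 |Γ|. The short sketch below applies them, assuming return loss is reported as a positive number of decibels (some authors use the opposite sign, and the abstract does not say which convention it follows). The function name is illustrative.

```python
import math


def vswr_to_return_loss_db(vswr: float) -> float:
    """Convert a voltage standing wave ratio to return loss in dB (positive convention)."""
    if vswr <= 1.0:
        # VSWR = 1 is a perfect match, for which return loss is unbounded.
        raise ValueError("VSWR must be greater than 1")
    gamma = (vswr - 1.0) / (vswr + 1.0)  # magnitude of the reflection coefficient
    return -20.0 * math.log10(gamma)


print(round(vswr_to_return_loss_db(2.0), 2))  # 9.54
```

Under this convention, a VSWR below 2 corresponds to a return loss above roughly 9.5 dB.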

International Conference of the IEEE Engineering in Medicine and Biology Society | 2007

Training Set Reduction Methods for Protein Secondary Structure Prediction in Single-Sequence Condition

Zafer Aydin; Yucel Altunbasak; Isa Kemal Pakatci; Hakan Erdogan

Orphan proteins are characterized by the lack of significant sequence similarity to database proteins. To infer the functional properties of the orphans, more elaborate techniques that utilize structural information are required. In this regard, protein structure prediction gains considerable importance. Secondary structure prediction algorithms designed for orphan proteins (also known as single-sequence algorithms) cannot utilize multiple alignments or alignment profiles, which are derived from similar proteins. This is a limiting factor for the prediction accuracy. One way to improve the performance of a single-sequence algorithm is to perform re-training. In this approach, first, the models used by the algorithm are trained by a representative set of proteins and a secondary structure prediction is computed. Then, using a distance measure, the original training set is refined by removing proteins that are dissimilar to the given protein. This step is followed by the re-estimation of the model parameters and the prediction of the secondary structure. In this paper, we compare training set reduction methods that are used to re-train the hidden semi-Markov models employed by the IPSSP algorithm [1]. We found that the composition-based reduction method has the highest performance compared to the alignment-based and the Chou-Fasman-based reduction methods. In addition, threshold-based reduction performed better than the reduction technique that selects the first 80% of the dataset proteins.
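
The threshold-based, composition-based reduction described in this abstract can be sketched as follows. The abstract does not specify the distance measure, so this example assumes an L1 distance between amino acid composition vectors. The names `composition`, `composition_distance`, and `reduce_training_set`, as well as the threshold value, are hypothetical.

```python
from collections import Counter
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq: str) -> List[float]:
    """Relative frequency of each of the 20 standard amino acids in seq."""
    counts = Counter(seq)
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_distance(seq_a: str, seq_b: str) -> float:
    """L1 distance between two composition vectors (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(composition(seq_a), composition(seq_b)))


def reduce_training_set(target: str, training: List[str], threshold: float) -> List[str]:
    """Keep only training proteins whose composition is within threshold of the target."""
    return [seq for seq in training if composition_distance(target, seq) <= threshold]


# Toy sequences chosen so the distances are easy to check by hand.
target = "AAAA"
training = ["AAAC", "CCCC", "AACC"]
print(reduce_training_set(target, training, threshold=0.6))  # ['AAAC']
```

The model parameters would then be re-estimated on the reduced set before the final prediction. The alternative mentioned in the abstract, keeping a fixed first 80% of the dataset, would replace the threshold test with a sort by distance and a cut at that fraction.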
