Publication


Featured research published by Neil Arvin Bretaña.


Nucleic Acids Research | 2013

dbPTM 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications

Cheng Tsung Lu; Kai Yao Huang; Min Gang Su; Tzong-Yi Lee; Neil Arvin Bretaña; Wen Chi Chang; Yi-Ju Chen; Yu-Ju Chen; Hsien-Da Huang

Protein modification is an extremely important post-translational regulation that adjusts the physical and chemical properties, conformation, stability and activity of a protein; thus altering protein function. Due to the high throughput of mass spectrometry (MS)-based methods in identifying site-specific post-translational modifications (PTMs), dbPTM (http://dbPTM.mbc.nctu.edu.tw/) is updated to integrate experimental PTMs obtained from public resources as well as manually curated MS/MS peptides associated with PTMs from research articles. Version 3.0 of dbPTM aims to be an informative resource for investigating the substrate specificity of PTM sites and functional association of PTMs between substrates and their interacting proteins. In order to investigate the substrate specificity for modification sites, a newly developed statistical method has been applied to identify the significant substrate motifs for each type of PTMs containing sufficient experimental data. According to the data statistics in dbPTM, >60% of PTM sites are located in the functional domains of proteins. It is known that most PTMs can create binding sites for specific protein-interaction domains that work together for cellular function. Thus, this update integrates protein–protein interaction and domain–domain interaction to determine the functional association of PTM sites located in protein-interacting domains. Additionally, the information of structural topologies on transmembrane (TM) proteins is integrated in dbPTM in order to delineate the structural correlation between the reported PTM sites and TM topologies. To facilitate the investigation of PTMs on TM proteins, the PTM substrate sites and the structural topology are graphically represented. Also, literature information related to PTMs, orthologous conservations and substrate motifs of PTMs are also provided in the resource. Finally, this version features an improved web interface to facilitate convenient access to the resource.


BMC Bioinformatics | 2011

PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity

Tzong-Yi Lee; Neil Arvin Bretaña; Cheng-Tsung Lu

Background: Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. Due to the difficulty of performing high-throughput mass spectrometry-based experiments, there is a desire to predict phosphorylation sites using computational methods. However, previous studies of in silico prediction of plant phosphorylation sites do not consider kinase-specific phosphorylation data. Thus, we are motivated to propose a new method that investigates different substrate specificities in plant phosphorylation sites. Results: Experimentally verified phosphorylation data were extracted from TAIR9, a protein database containing 3006 phosphorylation data from the plant species Arabidopsis thaliana. In an attempt to investigate the various substrate motifs in plant phosphorylation, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. Profile hidden Markov model (HMM) is then applied to learn a predictive model for each subgroup. Cross-validation evaluation on the MDD-clustered HMMs yields an average accuracy of 82.4% for serine, 78.6% for threonine, and 89.0% for tyrosine models. Moreover, independent test results using Arabidopsis thaliana phosphorylation data from UniProtKB/Swiss-Prot show that the proposed models are able to correctly predict 81.4% of phosphoserine, 77.1% of phosphothreonine, and 83.7% of phosphotyrosine sites. Interestingly, several MDD-clustered subgroups are observed to have amino acid conservation similar to the substrate motifs of well-known kinases from Phospho.ELM, a database containing kinase-specific phosphorylation data from multiple organisms. Conclusions: This work presents a novel method for identifying plant phosphorylation sites with various substrate motifs. Based on cross-validation and independent testing, results show that the MDD-clustered models outperform models trained without using MDD. The proposed method has been implemented as a web-based plant phosphorylation prediction tool, PlantPhos (http://csb.cse.yzu.edu.tw/PlantPhos/). Additionally, two case studies have been demonstrated to further evaluate the effectiveness of PlantPhos.


Intelligent Systems in Molecular Biology | 2011

Exploiting maximal dependence decomposition to identify conserved motifs from a group of aligned signal sequences

Tzong-Yi Lee; Zong-Qing Lin; Sheng-Jen Hsieh; Neil Arvin Bretaña; Cheng-Tsung Lu

Bioinformatics research often requires conservation analyses of a group of sequences associated with a specific biological function (e.g. transcription factor binding sites, micro RNA target sites or protein post-translational modification sites). Due to the difficulty in exploring conserved motifs in large-scale sequence data involving various signals, a new method, MDDLogo, is developed. MDDLogo applies maximal dependence decomposition (MDD) to cluster a group of aligned signal sequences into subgroups containing statistically significant motifs. In order to extract motifs that contain a conserved biochemical property of amino acids in protein sequences, the set of 20 amino acids is further categorized according to their physicochemical properties, e.g. hydrophobicity, charge or molecular size. MDDLogo has been demonstrated to accurately identify the kinase-specific substrate motifs in 1221 human phosphorylation sites associated with seven well-known kinase families from Phospho.ELM. Moreover, in a set of plant phosphorylation data lacking kinase information, MDDLogo has been applied to help investigate the substrate motifs of potential kinases and to improve the identification of plant phosphorylation sites with various substrate specificities. In this study, MDDLogo is comparable with another well-known motif discovery tool, Motif-X.
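The clustering idea behind MDD can be illustrated with a single split step. The sketch below is a simplified assumption, not the MDDLogo implementation: it scores each alignment column by the summed Pearson chi-square statistic against every other column, picks the column with the highest score, and splits the sequences on whether they carry that column's most frequent residue. The function names and the toy sequences are hypothetical, and the grouping of amino acids by physicochemical property described above is omitted.

```python
from collections import Counter
from itertools import product


def chi_square(seqs, i, j):
    """Pearson chi-square statistic for dependence between columns i and j."""
    n = len(seqs)
    count_i = Counter(s[i] for s in seqs)
    count_j = Counter(s[j] for s in seqs)
    count_ij = Counter((s[i], s[j]) for s in seqs)
    stat = 0.0
    for a, b in product(count_i, count_j):
        expected = count_i[a] * count_j[b] / n  # > 0: both residues were observed
        observed = count_ij[(a, b)]
        stat += (observed - expected) ** 2 / expected
    return stat


def mdd_split(seqs):
    """One MDD-style split of equal-length aligned sequences (simplified)."""
    length = len(seqs[0])
    # Score each column by its total dependence on all other columns.
    scores = [
        sum(chi_square(seqs, i, j) for j in range(length) if j != i)
        for i in range(length)
    ]
    best = max(range(length), key=scores.__getitem__)
    # Split on the most frequent residue at the most dependent column.
    consensus = Counter(s[best] for s in seqs).most_common(1)[0][0]
    with_consensus = [s for s in seqs if s[best] == consensus]
    without_consensus = [s for s in seqs if s[best] != consensus]
    return best, consensus, with_consensus, without_consensus


if __name__ == "__main__":
    toy = ["AKS", "AKS", "ARS", "AKS", "GRT", "GKT", "GRT"]  # hypothetical data
    print(mdd_split(toy))
```

A full decomposition would apply the split recursively to each subgroup until no column shows significant dependence or a subgroup becomes too small; those stopping rules are not shown here.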


Addiction | 2014

A prospective study of hepatitis C incidence in Australian prisoners

Fabio Luciani; Neil Arvin Bretaña; Suzy Teutsch; Janaki Amin; Libby Topp; Gregory J. Dore; Lisa Maher; Kate Dolan; Andrew Lloyd

AIMS To document the relationships between injecting drug use, imprisonment and hepatitis C virus (HCV) infection. DESIGN Prospective cohort study. SETTING Multiple prisons in New South Wales, Australia. PARTICIPANTS HCV seronegative prisoners with a life-time history of injecting drug use (IDU) were enrolled and followed prospectively (n = 210) by interview and HCV antibody and RNA testing 6-12-monthly for up to 4 years when in prison. MEASUREMENTS HCV incidence was calculated using the person-years method. Cox regression was used to identify predictors of incident infection using time-dependent covariates. RESULTS Almost half the cohort reported IDU during follow-up (103 subjects; 49.1%) and 65 (31%) also reported sharing of the injecting apparatus. There were 38 HCV incident cases in 269.94 person-years (py) of follow-up with an estimated incidence of 14.08 per 100 py [confidence interval (CI) = 9.96-19.32]. Incident infection was associated independently with Indigenous background, injecting daily or more and injecting heroin. Three subjects were RNA-positive and antibody-negative at the incident time-point, indicating early infection, which provided a second incidence estimate of 9.4%. Analysis of continuously incarcerated subjects (n = 114) followed over 126.73 py, identified 13 new HCV infections (10.26 per 100 py, CI = 5.46-17.54), one of which was an early infection case. Bleach-cleansing of injecting equipment and opioid substitution treatment were not associated with a significant reduction in incidence. CONCLUSIONS In New South Wales, Australia, imprisonment is associated with high rates of hepatitis C virus transmission. More effective harm reduction interventions are needed to control HCV in prison settings.
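The person-years incidence figures reported above can be reproduced with simple arithmetic. The following is a minimal sketch; the function name is illustrative and not taken from the study, and the confidence intervals are not recomputed.

```python
def incidence_per_100_py(cases, person_years):
    """Incidence rate expressed per 100 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100.0 * cases / person_years


if __name__ == "__main__":
    # Whole cohort: 38 incident cases over 269.94 person-years.
    print(round(incidence_per_100_py(38, 269.94), 2))  # 14.08
    # Continuously incarcerated subjects: 13 cases over 126.73 person-years.
    print(round(incidence_per_100_py(13, 126.73), 2))  # 10.26
```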


PLOS ONE | 2012

Identifying protein phosphorylation sites with kinase substrate specificity on human viruses.

Neil Arvin Bretaña; Cheng-Tsung Lu; Chiu-Yun Chiang; Min-Gang Su; Kai-Yao Huang; Tzong-Yi Lee; Shun-Long Weng

Viruses infect humans and progress inside the body leading to various diseases and complications. The phosphorylation of viral proteins catalyzed by host kinases plays crucial regulatory roles in enhancing replication and inhibition of normal host-cell functions. Due to its biological importance, there is a desire to identify the protein phosphorylation sites on human viruses. However, the use of mass spectrometry-based experiments is proven to be expensive and labor-intensive. Furthermore, previous studies which have identified phosphorylation sites in human viruses do not include the investigation of the responsible kinases. Thus, we are motivated to propose a new method to identify protein phosphorylation sites with its kinase substrate specificity on human viruses. The experimentally verified phosphorylation data were extracted from virPTM – a database containing 301 experimentally verified phosphorylation data on 104 human kinase-phosphorylated virus proteins. In an attempt to investigate kinase substrate specificities in viral protein phosphorylation sites, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. The experimental human phosphorylation sites are collected from Phospho.ELM, grouped according to its kinase annotation, and compared with the virus MDD clusters. This investigation identifies human kinases such as CK2, PKB, CDK, and MAPK as potential kinases for catalyzing virus protein substrates as confirmed by published literature. Profile hidden Markov model is then applied to learn a predictive model for each subgroup. A five-fold cross validation evaluation on the MDD-clustered HMMs yields an average accuracy of 84.93% for Serine, and 78.05% for Threonine. 
Furthermore, an independent testing data collected from UniProtKB and Phospho.ELM is used to make a comparison of predictive performance on three popular kinase-specific phosphorylation site prediction tools. In the independent testing, the high sensitivity and specificity of the proposed method demonstrate the predictive effectiveness of the identified substrate motifs and the importance of investigating potential kinases for viral protein phosphorylation sites.


Journal of Computer-aided Molecular Design | 2011

Carboxylator: incorporating solvent-accessible surface area for identifying protein carboxylation sites

Cheng-Tsung Lu; Shu-An Chen; Neil Arvin Bretaña; Tzu-Hsiu Cheng; Tzong-Yi Lee

In proteins, glutamate (Glu) residues are transformed into γ-carboxyglutamate (Gla) residues in a process called carboxylation. The process of protein carboxylation catalyzed by γ-glutamyl carboxylase is deemed to be important due to its involvement in biological processes such as blood clotting cascade and bone growth. There is an increasing interest within the scientific community to identify protein carboxylation sites. However, experimental identification of carboxylation sites via mass spectrometry-based methods is observed to be expensive, time-consuming, and labor-intensive. Thus, we were motivated to design a computational method for identifying protein carboxylation sites. This work aims to investigate the protein carboxylation by considering the composition of amino acids that surround modification sites. With the implication of a modified residue prefers to be accessible on the surface of a protein, the solvent-accessible surface area (ASA) around carboxylation sites is also investigated. Radial basis function network is then employed to build a predictive model using various features for identifying carboxylation sites. Based on a five-fold cross-validation evaluation, a predictive model trained using the combined features of amino acid sequence (AA20D), amino acid composition, and ASA, yields the highest accuracy at 0.874. Furthermore, an independent test done involving data not included in the cross-validation process indicates that in silico identification is a feasible means of preliminary analysis. Additionally, the predictive method presented in this work is implemented as Carboxylator (http://csb.cse.yzu.edu.tw/Carboxylator/), a web-based tool for identifying carboxylated proteins with modification sites in order to help users in investigating γ-glutamyl carboxylation.
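Five-fold cross-validation, used above and in several other abstracts on this page, partitions the data into five folds and holds each fold out once for testing. The sketch below shows only the index bookkeeping under assumed details: the shuffling, the seed, and the function name are illustrative choices, not the authors' procedure.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(10, k=5):
        print(len(train), len(test))
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracy gives an estimate in which every sample has been predicted once by a model that did not see it during training.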


Emerging Infectious Diseases | 2015

Transmission of Hepatitis C Virus among Prisoners, Australia, 2005-2012.

Neil Arvin Bretaña; Lies Boelen; Rowena A. Bull; Suzy Teutsch; Peter A. White; Andrew Lloyd; Fabio Luciani

Ongoing transmission is associated with drug injection.


BMC Bioinformatics | 2015

Characterization and identification of ubiquitin conjugation sites with E3 ligase recognition specificities

Van Nui Nguyen; Kai Yao Huang; Chien Hsun Huang; Tzu Hao Chang; Neil Arvin Bretaña; K. Robert Lai; Julia Tzu Ya Weng; Tzong-Yi Lee

Background: In eukaryotes, ubiquitin conjugation is an important mechanism underlying proteasome-mediated degradation of proteins, and as such, plays an essential role in the regulation of many cellular processes. In the ubiquitin-proteasome pathway, E3 ligases play important roles by recognizing a specific protein substrate and catalyzing the attachment of ubiquitin to a lysine (K) residue. As more and more experimental data on ubiquitin conjugation sites become available, it becomes possible to develop prediction models that can be scaled to big data. However, no existing work focuses on the investigation of ubiquitinated substrate specificities. Herein, we present an approach that exploits an iteratively statistical method to identify ubiquitin conjugation sites with substrate site specificities. Results: In this investigation, a total of 6259 experimentally validated ubiquitinated proteins were obtained from dbPTM. After filtering out homologous fragments with 40% sequence identity, the training data set contained 2658 ubiquitination sites (positive data) and 5532 non-ubiquitinated sites (negative data). Due to the difficulty in characterizing the substrate site specificities of E3 ligases by conventional sequence logo analysis, a recursively statistical method has been applied to obtain significant conserved motifs. The profile hidden Markov model (profile HMM) was adopted to construct the predictive models learned from the identified substrate motifs. A five-fold cross validation was then used to evaluate the predictive model, achieving sensitivity, specificity, and accuracy of 73.07%, 65.46%, and 67.93%, respectively. Additionally, an independent testing set, completely blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (76.13%) and outperform other ubiquitination site prediction tools. Conclusion: A case study demonstrated the effectiveness of the characterized substrate motifs for identifying ubiquitination sites. The proposed method presents a practical means of preliminary analysis and greatly diminishes the total number of potential targets required for further experimental confirmation. This method may help unravel their mechanisms and roles in E3 recognition and ubiquitin-mediated protein degradation.
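Sensitivity, specificity, and accuracy are linked through the class sizes: accuracy is the class-size-weighted average of the other two. The sketch below checks this relationship on the figures quoted above, assuming the cross-validation metrics were computed over the full training set of 2658 positive and 5532 negative sites. That is an assumption of this example, and the function name is illustrative.

```python
def overall_accuracy(sensitivity, specificity, n_pos, n_neg):
    """Accuracy implied by sensitivity and specificity for given class sizes."""
    true_positives = sensitivity * n_pos
    true_negatives = specificity * n_neg
    return (true_positives + true_negatives) / (n_pos + n_neg)


if __name__ == "__main__":
    acc = overall_accuracy(0.7307, 0.6546, n_pos=2658, n_neg=5532)
    print(f"{acc:.2%}")  # 67.93%
```

Because negatives outnumber positives roughly two to one, the accuracy sits closer to the specificity than to the sensitivity.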


BMC Bioinformatics | 2013

ViralPhos: incorporating a recursively statistical method to predict phosphorylation sites on virus proteins

Kai-Yao Huang; Cheng-Tsung Lu; Neil Arvin Bretaña; Tzong-Yi Lee; Tzu-Hao Chang

Background: The phosphorylation of virus proteins by host kinases is linked to viral replication. This leads to an inhibition of normal host-cell functions. Further elucidation of phosphorylation in virus proteins is required in order to aid in drug design and treatment. However, only a few studies have investigated substrate motifs in identifying virus phosphorylation sites. Additionally, existing bioinformatics tools do not consider potential host kinases that may initiate the phosphorylation of a virus protein. Results: A total of 329 experimentally verified phosphorylation fragments on 111 virus proteins were collected from virPTM. These were clustered into subgroups of significantly conserved motifs using a recursively statistical method. Two-layered Support Vector Machines (SVMs) were then applied to train a predictive model for the identified substrate motifs. The SVM models were evaluated using a five-fold cross validation, which yields an average accuracy of 0.86 for serine and 0.81 for threonine. Furthermore, the proposed method is shown to perform on par with three other phosphorylation site prediction tools: PPSP, KinasePhos 2.0 and GPS 2.1. Conclusion: In this study, we propose a computational method, ViralPhos, which aims to investigate virus substrate site motifs and identify potential phosphorylation sites on virus proteins. We identified informative substrate motifs that matched with several well-studied kinase groups as potential catalytic kinases for virus protein substrates. The identified substrate motifs were further exploited to identify potential virus phosphorylation sites. The proposed method is shown to be capable of predicting virus phosphorylation sites and has been implemented as a web server (http://csb.cse.yzu.edu.tw/ViralPhos/).


PLOS ONE | 2011

Incorporating Evolutionary Information and Functional Domains for Identifying RNA Splicing Factors in Humans

Justin Bo Kai Hsu; Neil Arvin Bretaña; Tzong-Yi Lee; Hsien-Da Huang

Regulation of pre-mRNA splicing is achieved through the interaction of RNA sequence elements and a variety of RNA-splicing related proteins (splicing factors). The splicing machinery in humans is not yet fully elucidated, partly because splicing factors in humans have not been exhaustively identified. Furthermore, experimental methods for splicing factor identification are time-consuming and lab-intensive. Although many computational methods have been proposed for the identification of RNA-binding proteins, there exists no development that focuses on the identification of RNA-splicing related proteins so far. Therefore, we are motivated to design a method that focuses on the identification of human splicing factors using experimentally verified splicing factors. The investigation of amino acid composition reveals that there are remarkable differences between splicing factors and non-splicing proteins. A support vector machine (SVM) is utilized to construct a predictive model, and the five-fold cross-validation evaluation indicates that the SVM model trained with amino acid composition could provide a promising accuracy (80.22%). Another basic feature, amino acid dipeptide composition, is also examined to yield a similar predictive performance to amino acid composition. In addition, this work presents that the incorporation of evolutionary information and domain information could improve the predictive performance. The constructed models have been demonstrated to effectively classify (73.65% accuracy) an independent data set of human splicing factors. The result of independent testing indicates that in silico identification could be a feasible means of conducting preliminary analyses of splicing factors and significantly reducing the number of potential targets that require further in vivo or in vitro confirmation.
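Amino acid composition, the basic feature mentioned above, is the fraction of each of the 20 standard residues in a protein sequence. The sketch below shows one way such a 20-dimensional feature could be computed. The function name, the handling of non-standard characters, and the example sequence are assumptions of this example rather than details from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a sequence."""
    # Ignore characters outside the 20 standard residues (e.g. X or gaps).
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return {aa: residues.count(aa) / total for aa in AMINO_ACIDS}


if __name__ == "__main__":
    features = amino_acid_composition("MSRSRSRSPGR")  # hypothetical sequence
    print([round(features[aa], 3) for aa in AMINO_ACIDS])
```

The resulting vector could then be passed to any classifier. Dipeptide composition extends the same idea to the 400 ordered residue pairs.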

Collaboration


Dive into Neil Arvin Bretaña's collaborations.

Top Co-Authors

Andrew Lloyd
University of New South Wales

Fabio Luciani
University of New South Wales

Suzy Teutsch
University of New South Wales