Jhih-Wei Jian
Academia Sinica
Publication
Featured research published by Jhih-Wei Jian.
Proceedings of the National Academy of Sciences of the United States of America | 2014
Hung-Pin Peng; Kuo Hao Lee; Jhih-Wei Jian; An-Suei Yang
Significance: Natural antibodies perform their biological function by recognizing all sorts of foreign proteins: seemingly unlimited structural and sequence diversities in antigens can be recognized by a limited repertoire of antibodies, for which the sequence and structure are relatively homogeneous. We found that the energetically critical epitope portions are largely composed of backbone atoms, side-chain carbons, and hydrogen bond donors/acceptors. These key components are ubiquitous on protein surfaces and can be recognized by the enriched aromatic side chains and, to a lesser extent, short-chain hydrophilic residues on the antibody paratopes; antibodies, with relatively limited sequence and structural diversities in the antigen binding sites, can recognize unlimited protein antigens through recognizing the common physicochemical features on all protein surfaces.
Natural antibodies are frequently elicited to recognize diverse protein surfaces, where the sequence features of the epitopes are frequently indistinguishable from those of nonepitope protein surfaces. It is not clearly understood how the paratopes are able to recognize sequence-wise featureless epitopes and how a natural antibody repertoire with limited variants can recognize seemingly unlimited protein antigens foreign to the host immune system. In this work, computational methods were used to predict the functional paratopes with the 3D antibody variable domain structure as input. The predicted functional paratopes were reasonably validated by the hot spot residues known from experimental alanine scanning measurements. The functional paratope (hot spot) predictions on a set of 111 antibody–antigen complex structures indicate that aromatic, mostly tyrosyl, side chains constitute the major part of the predicted functional paratopes, with short-chain hydrophilic residues forming the minor portion of the predicted functional paratopes. These aromatic side chains interact mostly with the epitope main chain atoms and side-chain carbons. The functional paratopes are surrounded by favorable polar atomistic contacts in the structural paratope–epitope interfaces; more than 80% of these polar contacts are electrostatically favorable and about 40% of these polar contacts form direct hydrogen bonds across the interfaces. These results indicate that a limited repertoire of antibodies bearing paratopes with diverse structural contours enriched with aromatic side chains among short-chain hydrophilic residues can recognize all sorts of protein surfaces, because the determinants for antibody recognition are common physicochemical features ubiquitously distributed over all protein surfaces.
PLOS ONE | 2012
Chung-Ming Yu; Hung-Pin Peng; Ing-Chien Chen; Yu-Ching Lee; Jun-Bo Chen; Keng-Chang Tsai; Ching-Tai Chen; Jeng-Yih Chang; Ei-Wen Yang; Po-Chiang Hsu; Jhih-Wei Jian; Hung-Ju Hsu; Hung-Ju Chang; Wen-Lian Hsu; Kai-Fa Huang; Alex Che Ma; An-Suei Yang
Protein-protein interactions are critical determinants in biological systems. Engineered proteins binding to specific areas on protein surfaces could lead to therapeutics or diagnostics for treating diseases in humans. But designing epitope-specific protein-protein interactions with computational atomistic interaction free energy remains a difficult challenge. Here we show that, with the antibody-VEGF (vascular endothelial growth factor) interaction as a model system, the experimentally observed amino acid preferences in the antibody-antigen interface can be rationalized with 3-dimensional distributions of interacting atoms derived from the database of protein structures. Machine learning models established on the rationalization can be generalized to design amino acid preferences in antibody-antigen interfaces, for which the experimental validations are tractable with current high throughput synthetic antibody display technologies. Leave-one-out cross validation on the benchmark system yielded the accuracy, precision, recall (sensitivity) and specificity of the overall binary predictions to be 0.69, 0.45, 0.63, and 0.71 respectively, and the overall Matthews correlation coefficient of the 20 amino acid types in the 24 interface CDR positions was 0.312. The structure-based computational antibody design methodology was further tested with other antibodies binding to VEGF. The results indicate that the methodology could provide alternatives to the current antibody technologies based on animal immune systems in engineering therapeutic and diagnostic antibodies against predetermined antigen epitopes.
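The metrics quoted in this abstract are related to one another, and a short calculation can show how. The sketch below is not the authors' code; it assumes, as a simplification not stated in the abstract, that all five numbers come from one pooled confusion matrix. Under that assumption, precision, recall, and specificity fix the fraction of positive cases, from which accuracy and the Matthews correlation coefficient follow. The function names are hypothetical.

```python
from math import sqrt


def implied_prevalence(precision, recall, specificity):
    """Fraction of positive cases implied by the three reported rates.

    Solves precision = recall*p / (recall*p + (1 - specificity)*(1 - p)) for p.
    """
    false_positive_rate = 1.0 - specificity
    numerator = precision * false_positive_rate
    denominator = recall * (1.0 - precision) + precision * false_positive_rate
    return numerator / denominator


def accuracy_and_mcc(precision, recall, specificity):
    """Accuracy and MCC of the single confusion matrix implied by the rates."""
    p = implied_prevalence(precision, recall, specificity)
    # Confusion-matrix cells expressed as fractions of all predictions.
    tp = recall * p
    fn = (1.0 - recall) * p
    tn = specificity * (1.0 - p)
    fp = (1.0 - specificity) * (1.0 - p)
    accuracy = tp + tn
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, mcc


# Precision, recall, and specificity as reported in the abstract.
accuracy, mcc = accuracy_and_mcc(precision=0.45, recall=0.63, specificity=0.71)
print(f"implied accuracy = {accuracy:.3f}, implied MCC = {mcc:.3f}")
```

With the reported inputs, the implied accuracy and MCC land close to the reported 0.69 and 0.312, which suggests the published figures are mutually consistent under the pooling assumption.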
PLOS ONE | 2012
Ching-Tai Chen; Hung-Pin Peng; Jhih-Wei Jian; Keng-Chang Tsai; Jeng-Yih Chang; Ei-Wen Yang; Jun-Bo Chen; Shinn-Ying Ho; Wen-Lian Hsu; An-Suei Yang
Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, and specificity were 0.753, 0.519, 0.677, and 0.779, respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with the physicochemical complementarity features based on the non-covalent interaction data derived from protein interiors.
PLOS ONE | 2012
Keng-Chang Tsai; Jhih-Wei Jian; Ei-Wen Yang; Po-Chiang Hsu; Hung-Pin Peng; Ching Tai Chen; Jun-Bo Chen; Jeng-Yih Chang; Wen-Lian Hsu; An-Suei Yang
Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. 
The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date.
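The 10-fold cross validation mentioned in this abstract can be illustrated with a generic index partition. This is a minimal sketch only: the abstract does not say how the 497 proteins were assigned to folds, so the random shuffle, the seed, and the function name `k_fold_indices` are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.

    Every item appears in exactly one test fold; fold sizes differ by at most 1.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # assumed: random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 497 proteins split into 10 folds, matching the benchmark size in the abstract.
splits = list(k_fold_indices(n_items=497, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} training, {len(test)} held out")
```

Each protein is held out exactly once, so every residue-level prediction used in the benchmark comes from a model that never saw that protein during training.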
Structure | 2014
Hung-Ju Chang; Jhih-Wei Jian; Hung-Ju Hsu; Yu-Ching Lee; Hong-Sen Chen; Jhong-Jhe You; Shin-Chen Hou; Chih-Yun Shao; Yen-Ju Chen; Kuo Ping Chiu; Hung-Pin Peng; Kuo Hao Lee; An-Suei Yang
Protein loops are frequently considered as critical determinants in protein structure and function. Recent advances in high-throughput methods for DNA sequencing and thermal stability measurement have enabled effective exploration of sequence-structure-function relationships in local protein regions. Using these data-intensive technologies, we investigated the sequence-structure-function relationships of six complementarity-determining regions (CDRs) and ten non-CDR loops in the variable domains of a model vascular endothelial growth factor (VEGF)-binding single-chain antibody variable fragment (scFv) whose sequence had been optimized via a consensus-sequence approach. The results show that only a handful of residues involving long-range tertiary interactions distant from the antigen-binding site are strongly coupled with antigen binding. This implies that the loops are passive regions in protein folding; the essential sequences of these regions are dictated by conserved tertiary interactions and the consensus local loop-sequence features contribute little to protein stability and function.
Structure | 2014
Hung-Ju Hsu; Kuo Hao Lee; Jhih-Wei Jian; Hung-Ju Chang; Chung-Ming Yu; Yu-Ching Lee; Ing-Chien Chen; Hung-Pin Peng; Chih Yuan Wu; Yu-Feng Huang; Chih-Yun Shao; Kuo Ping Chiu; An-Suei Yang
Protein structural stability and biological functionality are dictated by the formation of intradomain cores and interdomain interfaces, but the intricate sequence-structure-function interrelationships in the packing of protein cores and interfaces remain difficult to elucidate due to the intractability of enumerating all packing possibilities and assessing the consequences of all the variations. In this work, groups of β strand residues of model antibody variable domains were randomized with saturated mutagenesis and the functional variants were selected for high-throughput sequencing and high-throughput thermal stability measurements. The results show that the sequence preferences of the intradomain hydrophobic core residues are strikingly flexible among hydrophobic residues, implying that these residues are coupled indirectly with antigen binding through energetic stabilization of the protein structures. By contrast, the interdomain interface residues are directly coupled with antigen binding. The interdomain interface should be treated as an integral part of the antigen-binding site.
Scientific Reports | 2015
Chao-Ping Tung; Ing-Chien Chen; Chung-Ming Yu; Hung-Pin Peng; Jhih-Wei Jian; Shiou-Hwa Ma; Yu-Ching Lee; Jia-Tsrong Jan; An-Suei Yang
Broadly neutralizing antibodies developed from the IGHV1–69 germline gene are known to bind to the stem region of hemagglutinin in diverse influenza viruses but the sequence determinants for the antigen recognition, including neutralization potency and binding affinity, are not clearly understood. Such understanding could inform designs of synthetic antibody libraries targeting the stem epitope on hemagglutinin, leading to artificially designed antibodies that are functionally advantageous over antibodies from natural antibody repertoires. In this work, the sequence space of the complementarity determining regions of a broadly neutralizing antibody (F10) targeting the stem epitope on the hemagglutinin of a strain of H1N1 influenza virus was systematically explored; the elucidated antibody-hemagglutinin recognition principles were used to design a phage-displayed antibody library, which was then used to discover neutralizing antibodies against another strain of H1N1 virus. More than 1000 functional antibody candidates were selected from the antibody library and were shown to neutralize the corresponding strain of influenza virus with up to 7-fold higher potency than the parent F10 antibody. The antibody library could be used to discover functionally effective antibodies against other H1N1 influenza viruses, supporting the notion that target-specific antibody libraries can be designed and constructed with systematic sequence-function information.
Scientific Reports | 2015
Hong-Sen Chen; Shin-Chen Hou; Jhih-Wei Jian; King-Siang Goh; San-Tai Shen; Yu-Ching Lee; Jhong-Jhe You; Hung-Pin Peng; Wen-Chih Kuo; Shui-Tsung Chen; Ming-Chi Peng; Andrew H.-J. Wang; Chung-Ming Yu; Ing-Chien Chen; Chao-Ping Tung; Tzu-Han Chen; Kuo Ping Chiu; Che Ma; Chih Yuan Wu; Sheng-Wei Lin; An-Suei Yang
Humoral immunity against diverse pathogens is rapidly elicited from natural antibody repertoires of limited complexity. But the organizing principles underlying the antibody repertoires that facilitate this immunity are not well-understood. We used HER2 as a model immunogen and reverse-engineered murine antibody response through constructing an artificial antibody library encoded with rudimentary sequence and structural characteristics learned from high throughput sequencing of antibody variable domains. Antibodies selected in vitro from the phage-displayed synthetic antibody library bound to the model immunogen with high affinity and specificities, which reproduced the specificities of natural antibody responses. We conclude that natural antibody structural repertoires are shaped to allow functional antibodies to be encoded efficiently, within the complexity limit of an individual antibody repertoire, to bind to diverse protein antigens with high specificity and affinity. Phage-displayed synthetic antibody libraries, in conjunction with high-throughput sequencing, can thus be designed to replicate natural antibody responses and to generate novel antibodies against diverse antigens.
PLOS ONE | 2016
Jhih-Wei Jian; Pavadai Elumalai; Thejkiran Pitti; Chih Yuan Wu; Keng-Chang Tsai; Jeng-Yih Chang; Hung-Pin Peng; An-Suei Yang
Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites.
BMC Cancer | 2018
Yong Alison Wang; Jhih-Wei Jian; Chen-Fang Hung; Hung-Pin Peng; Chi-Fan Yang; Hung-Chun Skye Cheng; An-Suei Yang
Background: It is unclear whether germline breast cancer susceptibility gene mutations affect breast cancer related outcomes. We wanted to evaluate mutation patterns in 20 breast cancer susceptibility genes and correlate the mutations with clinical characteristics to determine the effects of these germline mutations on breast cancer prognosis.
Methods: The study cohort included 480 ethnic Chinese individuals in Taiwan with at least one of the six clinical risk factors for hereditary breast cancer: family history of breast or ovarian cancer, young age of onset for breast cancer, bilateral breast cancer, triple negative breast cancer, both breast and ovarian cancer, and male breast cancer. PCR-enriched amplicon-sequencing on a next generation sequencing platform was used to determine the germline DNA sequences of all exons and exon-flanking regions of the 20 genes. Protein-truncating variants were identified as pathogenic.
Results: We detected a 13.5% carrier rate of pathogenic germline mutations, with BRCA2 being the most prevalent and the non-BRCA genes accounting for 38.5% of the mutation carriers. BRCA mutation carriers were more likely to be diagnosed with breast cancer with lymph node involvement (66.7% vs 42.6%; P = 0.011), and had significantly worse breast cancer specific outcomes. The 5-year disease-free survival was 73.3% for BRCA mutation carriers and 91.1% for non-carriers (hazard ratio for recurrence or death 2.42, 95% CI 1.29–4.53; P = 0.013). After adjusting for clinical prognostic factors, BRCA mutation remained an independent poor prognostic factor for cancer recurrence or death (adjusted hazard ratio 3.04, 95% CI 1.40–6.58; P = 0.005). Non-BRCA gene mutation carriers did not exhibit any significant difference in cancer characteristics or outcomes compared to those without detected mutations. Among the risk factors for hereditary breast cancer, the odds of detecting a germline mutation increased significantly with having bilateral breast cancer (adjusted odds ratio 3.27, 95% CI 1.64–6.51; P = 0.0008) or having more than one risk factor (odds ratio 2.07, 95% CI 1.22–3.51; P = 0.007).
Conclusions: Without prior knowledge of the mutation status, BRCA mutation carriers had more advanced breast cancer on initial diagnosis and worse cancer-related outcomes. The optimal approach to breast cancer treatment for BRCA mutation carriers warrants further investigation.
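The hazard and odds ratios in this abstract come with 95% confidence intervals, and a quick calculation shows how the two relate. The sketch below assumes, which the abstract does not state, that each interval is symmetric on the log scale (as Wald-type intervals from Cox or logistic regression are). Under that assumption the point estimate is the geometric mean of the bounds, and the interval width gives the standard error of the log ratio. The function name and the constant `Z_95` are introduced here for illustration only.

```python
from math import log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def implied_estimate_and_se(lower, upper):
    """Point estimate and log-scale standard error implied by a 95% CI.

    Assumes the interval is exp(log(estimate) +/- Z_95 * se_log).
    """
    estimate = sqrt(lower * upper)  # geometric mean of the bounds
    se_log = (log(upper) - log(lower)) / (2.0 * Z_95)
    return estimate, se_log


# (reported ratio, lower bound, upper bound), copied from the abstract.
REPORTED = {
    "hazard ratio, unadjusted": (2.42, 1.29, 4.53),
    "hazard ratio, adjusted": (3.04, 1.40, 6.58),
    "odds ratio, bilateral breast cancer": (3.27, 1.64, 6.51),
    "odds ratio, more than one risk factor": (2.07, 1.22, 3.51),
}

for label, (ratio, lower, upper) in REPORTED.items():
    estimate, se_log = implied_estimate_and_se(lower, upper)
    print(
        f"{label}: reported {ratio:.2f}, "
        f"implied {estimate:.2f}, SE(log) {se_log:.3f}"
    )
```

For all four intervals the implied estimate agrees with the reported ratio to within rounding, so the log-scale symmetry assumption fits the published numbers. The sketch does not attempt to reproduce the reported P values, which may come from a different test.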