Yu-Da Lin
National Kaohsiung University of Applied Sciences
Publication
Featured research published by Yu-Da Lin.
PLOS ONE | 2012
Li-Yeh Chuang; Yu-Da Lin; Hsueh-Wei Chang; Cheng-Hong Yang
Background: Possible single nucleotide polymorphism (SNP) interactions in breast cancer are usually not investigated in genome-wide association studies. Previously, we proposed a particle swarm optimization (PSO) method to compute these kinds of SNP interactions. However, this PSO is not guaranteed to find the best result in every run, especially when high-dimensional data are investigated for SNP–SNP interactions. Methodology/Principal Findings: In this study, we propose the IPSO algorithm to improve the reliability of PSO for the identification of the best protective SNP barcodes (SNP combinations and genotypes with maximum difference between cases and controls) associated with breast cancer. SNP barcodes containing different numbers of SNPs were computed. The top five SNP barcode results are retained for computing the next SNP barcode, with a one-SNP increase at each processing step. The performance of the proposed IPSO algorithm is evaluated on simulated data for 23 SNPs of six steroid hormone metabolism and signalling-related genes. Among the 23 SNPs, 13 displayed significant odds ratio (OR) values (1.268 to 0.848; p<0.05) for breast cancer. With the IPSO algorithm, the joint effect in terms of SNP barcodes with two to seven SNPs shows significantly decreasing OR values (0.84 to 0.57; p<0.05 to 0.001). With the PSO algorithm, barcodes with two to four SNPs show significantly decreasing OR values (0.84 to 0.77; p<0.05 to 0.001). Based on the results of 20 simulations, the medians of the maximum differences for each SNP barcode generated by IPSO are higher than those generated by PSO. The interquartile ranges of the boxplot, as well as the upper and lower hinges for each n-SNP barcode (n = 3∼10), are narrower in IPSO than in PSO, suggesting that IPSO is highly reliable for SNP barcode identification. Conclusions/Significance: Overall, the proposed IPSO algorithm provides robust, exact identification of the best protective SNP barcodes for breast cancer.
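The abstract above defines a SNP barcode as a SNP combination with genotypes whose occurrence differs maximally between cases and controls. A minimal sketch of how such a difference might be scored is given here. The function names, the dictionary representation of a sample, and the choice to compare frequencies (rather than raw counts) are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List

# Hypothetical representation: one sample maps SNP id -> genotype label.
Genotypes = Dict[str, str]


def matches(sample: Genotypes, barcode: Genotypes) -> bool:
    """Return True when the sample carries every genotype listed in the barcode."""
    return all(sample.get(snp) == genotype for snp, genotype in barcode.items())


def frequency_difference(
    barcode: Genotypes, cases: List[Genotypes], controls: List[Genotypes]
) -> float:
    """Control frequency minus case frequency of the barcode.

    Assumption: a positive value means the barcode is more common in
    controls, i.e. it points in the protective direction.
    """
    case_freq = sum(matches(s, barcode) for s in cases) / len(cases)
    control_freq = sum(matches(s, barcode) for s in controls) / len(controls)
    return control_freq - case_freq
```

A search method such as PSO would then look for the barcode that maximizes this score for a given number of SNPs.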
Kaohsiung Journal of Medical Sciences | 2012
Cheng-Hong Yang; Li-Yeh Chuang; Yu-Huei Cheng; Yu-Da Lin; Chun-Lin Wang; Cheng-Hao Wen; Hsueh-Wei Chang
Cancers often involve the synergistic effects of gene–gene interactions, but identifying these interactions remains challenging. Here, we present an odds ratio‐based genetic algorithm (OR‐GA) that is able to solve the problems associated with the simultaneous analysis of multiple independent single nucleotide polymorphisms (SNPs) that are associated with oral cancer. The interactions among four SNPs (rs1799782, rs2040639, rs861539, and rs2075685), belonging to four genes (XRCC1, XRCC2, XRCC3, and XRCC4, respectively), were tested in this study. The GA decomposes the SNP sets into different SNP combinations with their corresponding genotypes (called SNP barcodes). The GA can effectively identify a specific SNP barcode that has an optimized fitness value and uses this to calculate the difference between the case and control groups. The SNP barcodes with a low fitness value are naturally removed from the population. Using two to four SNPs, the best SNP barcodes with maximum differences in occurrence between the case and control groups were generated by the GA. Subsequently, the OR provides a quantitative measure of the multiple SNP synergies between the oral cancer and control groups by calculating the risk related to the best SNP barcodes and others. When these were compared to their corresponding non‐SNP barcodes, the estimated ORs for oral cancer were found to be greater than 1 [approx. 1.72–2.23; confidence intervals (CIs): 0.94–5.30, p < 0.03–0.07] for various specific SNP barcodes with two to four SNPs. In conclusion, the proposed OR‐GA method successfully generates SNP barcodes, which allow oral cancer risk to be evaluated, and in the process identifies possible SNP–SNP interactions.
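The odds ratio and confidence interval reported above compare subjects who carry a SNP barcode with those who do not. The sketch below shows the standard 2×2 odds ratio with a log-scale (Woolf) confidence interval. The function name and argument layout are hypothetical, and the abstract does not say which interval method the authors used.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    a: int, b: int, c: int, d: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: cases with the barcode      b: cases without the barcode
    c: controls with the barcode   d: controls without the barcode
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimate")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

An interval whose lower bound falls below 1, as in the 0.94–5.30 range quoted above, indicates that the elevated risk is not significant at that confidence level for at least some barcodes.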
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013
Cheng-Hong Yang; Yu-Da Lin; Li-Yeh Chuang; Hsueh-Wei Chang
Genetic association analysis is a challenging task in the identification and characterization of genes that increase susceptibility to common complex multifactorial diseases. To fully execute genetic studies of complex diseases, modern geneticists face the challenge of detecting interactions between loci. A genetic algorithm (GA) is developed to detect the association of genotype frequencies of cancer cases and noncancer cases based on statistical analysis. An improved genetic algorithm (IGA) is proposed to improve the reliability of the GA method for high-dimensional SNP-SNP interactions. The strategy passes the top five results to the random population process, where they guide the GA toward a significant search course. The IGA increases the likelihood of quickly detecting the maximum ratio difference between cancer cases and noncancer cases. The joint effect of 23 SNP combinations of six steroid hormone metabolism and signaling-related genes involved in breast carcinogenesis pathways was systematically evaluated, with the IGA successfully detecting significant ratio differences between breast cancer cases and noncancer cases. The possible breast cancer risks were subsequently analyzed by odds-ratio (OR) and risk-ratio analysis. The estimated OR of the best SNP barcode is significantly higher than 1 (between 1.15 and 7.01) for specific combinations of two to 13 SNPs. The results indicate that the IGA provides higher ratio difference values than the GA between breast cancer cases and noncancer cases over 3-SNP to 13-SNP interactions. A more specific SNP-SNP interaction profile for the risk of breast cancer is also provided.
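The abstract above says that the top five results are passed to the random population process. One possible reading, sketched below, is that each retained k-SNP combination is extended by one randomly chosen SNP and the rest of the population is filled with random combinations. This is an interpretation, not the authors' published procedure; the function names are hypothetical, and genotypes are omitted so that a combination is just a tuple of SNP indices.

```python
import random
from typing import List, Sequence, Tuple

Combo = Tuple[int, ...]  # hypothetical: sorted SNP indices, genotypes omitted


def random_combo(n_snps: int, size: int, rng: random.Random) -> Combo:
    """Draw `size` distinct SNP indices uniformly at random."""
    return tuple(sorted(rng.sample(range(n_snps), size)))


def seeded_population(
    top_results: Sequence[Combo],
    n_snps: int,
    pop_size: int,
    rng: random.Random,
    keep: int = 5,
) -> List[Combo]:
    """Initial population for the (k+1)-SNP search, seeded from k-SNP results."""
    if not top_results:
        raise ValueError("at least one previous result is required")
    size = len(top_results[0]) + 1
    if size > n_snps:
        raise ValueError("cannot extend a combination beyond the number of SNPs")
    population: List[Combo] = []
    for combo in top_results[:keep]:
        # Extend each retained combination by one SNP it does not contain yet.
        unused = [snp for snp in range(n_snps) if snp not in combo]
        population.append(tuple(sorted(tuple(combo) + (rng.choice(unused),))))
    while len(population) < pop_size:
        population.append(random_combo(n_snps, size, rng))
    return population[:pop_size]
```

Seeding in this way keeps part of the population near combinations that already scored well, while the random remainder preserves diversity.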
Molecular Biology Reports | 2013
Shyh-Jong Wu; Li-Yeh Chuang; Yu-Da Lin; Wen-Hsien Ho; Fu-Tien Chiang; Cheng-Hong Yang; Hsueh-Wei Chang
Most individually non-significant single nucleotide polymorphisms (SNPs) go undiscovered in hypertension association studies. Their possible SNP–SNP interactions are usually ignored, which leads to missing heritability. In the present study, we propose a particle swarm optimization (PSO) algorithm to analyze the SNP–SNP interactions associated with hypertension. A genotype dataset of eight SNPs of renin-angiotensin system genes for 130 non-hypertension and 313 hypertension subjects was included. Without considering SNP–SNP interaction, most individual SNPs showed no significant difference between the hypertension and non-hypertension groups. For SNP–SNP interaction, PSO can select the SNP combinations involving different numbers of SNPs, namely the best SNP barcodes, that show the maximum frequency difference between the non-hypertension and hypertension groups. After computation, the best PSO-generated SNP barcodes were dominant in the non-hypertension group in terms of the frequency differences between the non-hypertension and hypertension groups. The OR values of the best SNP barcodes involving 2–8 SNPs were 0.705–0.334, suggesting that these SNP barcodes were protective against hypertension. In conclusion, this study demonstrated that non-significant SNPs may produce a joint effect in association studies. Our proposed PSO algorithm is effective in identifying the best protective SNP barcodes against hypertension.
OMICS: A Journal of Integrative Biology | 2015
Cheng-Hong Yang; Yu-Da Lin; Ching Yui Yen; Li Yeh Chuang; Hsueh-Wei Chang
Oral cancer is the sixth most common cancer worldwide with a high mortality rate. Biomarkers that anticipate susceptibility, prognosis, or response to treatments are much needed. Oral cancer is a polygenic disease involving complex interactions among genetic and environmental factors, which require multifaceted analyses. Here, we examined in a dataset of 103 oral cancer cases and 98 controls from Taiwan the association between oral cancer risk and the DNA repair genes X-ray repair cross-complementing group (XRCCs) 1-4, and the environmental factors of smoking, alcohol drinking, and betel quid (BQ) chewing. We employed logistic regression, multifactor dimensionality reduction (MDR), and hierarchical interaction graphs for analyzing gene-gene (G×G) and gene-environment (G×E) interactions. We identified a significantly elevated risk of the XRCC2 rs2040639 heterozygous variant among smokers [adjusted odds ratio (OR) 3.7, 95% confidence interval (CI)=1.1-12.1] and alcohol drinkers [adjusted OR=5.7, 95% CI=1.4-23.2]. The best two-factor based G×G interaction of oral cancer included the XRCC1 rs1799782 and XRCC2 rs2040639 [OR=3.13, 95% CI=1.66-6.13]. For the G×E interaction, the estimated OR of oral cancer for two (drinking-BQ chewing), three (XRCC1-XRCC2-BQ chewing), four (XRCC1-XRCC2-age-BQ chewing), and five factors (XRCC1-XRCC2-age-drinking-BQ chewing) were 32.9 [95% CI=14.1-76.9], 31.0 [95% CI=14.0-64.7], 49.8 [95% CI=21.0-117.7] and 82.9 [95% CI=31.0-221.5], respectively. Taken together, the genotypes of XRCC1 rs1799782 and XRCC2 rs2040639 DNA repair genes appear to be significantly associated with oral cancer. These were enhanced by exposure to certain environmental factors. The observations presented here warrant further research in larger study samples to examine their relevance for routine clinical care in oncology.
PLOS ONE | 2013
Cheng-Hong Yang; Yu-Da Lin; Li-Yeh Chuang; Jin-Bor Chen; Hsueh-Wei Chang
Background: Determining the complex relationship between diseases, polymorphisms in human genes and environmental factors is challenging. Multifactor dimensionality reduction (MDR) has proven capable of effectively detecting statistical patterns of epistasis. However, MDR is weak at accurately assigning multi-locus genotypes to either high-risk or low-risk groups, and generally does not provide accurate error rates when the case and control data sets are imbalanced. Consequently, results for classification error rates and odds ratios (OR) may provide surprising values in that the true positive (TP) value is often small. Methodology/Principal Findings: To address this problem, we introduce a classifier function based on the ratio between the percentage of cases in case data and the percentage of controls in control data to improve MDR (MDR-ER), so that multi-locus genotypes are classified correctly into high-risk and low-risk groups. In this study, a real data set with an imbalanced ratio of cases to controls (1∶4) was obtained from the mitochondrial D-loop of chronic dialysis patients in order to test MDR-ER. The TP and true negative (TN) values were collected from all tests to analyze to what degree MDR-ER performed better than MDR. Conclusions/Significance: Results showed that MDR-ER can be successfully used to detect the complex associations in imbalanced data sets.
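The classifier function described above replaces the raw case-to-control count ratio with a ratio of percentages. A minimal sketch of the difference follows. The function names, the threshold of 1 for the classic rule, and the handling of ties and empty cells are assumptions for illustration only.

```python
def mdr_label(cases_in_cell: int, controls_in_cell: int, threshold: float = 1.0) -> str:
    """Classic MDR-style rule (assumed): compare raw counts within one genotype cell."""
    if controls_in_cell == 0:
        return "high" if cases_in_cell > 0 else "low"
    return "high" if cases_in_cell / controls_in_cell >= threshold else "low"


def mdr_er_label(
    cases_in_cell: int,
    controls_in_cell: int,
    total_cases: int,
    total_controls: int,
) -> str:
    """Percentage-based rule in the spirit of MDR-ER.

    A cell is labelled high-risk when the share of all cases that fall in
    the cell exceeds the share of all controls that fall in it.
    Assumption: ties (including empty cells) are labelled low-risk.
    """
    case_pct = cases_in_cell / total_cases
    control_pct = controls_in_cell / total_controls
    return "high" if case_pct > control_pct else "low"
```

With a 1∶4 data set of 100 cases and 400 controls, a cell holding 30 cases and 60 controls is labelled low-risk by the raw-count rule (30/60 < 1) but high-risk by the percentage rule (30% of cases against 15% of controls), which illustrates why the raw rule can yield few true positives on imbalanced data.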
Mitochondrion | 2013
Jin-Bor Chen; Li-Yeh Chuang; Yu-Da Lin; Chia-Wei Liou; Tsu-Kung Lin; Wen-Chin Lee; Ben-Chung Cheng; Hsueh-Wei Chang; Cheng-Hong Yang
A chronic dialysis association study involving individual single nucleotide polymorphisms (SNPs) in the mitochondrial displacement loop (D-loop) has previously been reported. However, possible SNP-SNP interactions for SNPs in the D-loop that could be associated with a reduced risk for chronic dialysis were not investigated. The purpose of this study was to propose an effective algorithm to identify protective SNP-SNP interactions in the D-loop from chronic dialysis patients. We introduce ISGA, which uses an initialization strategy for genetic algorithms (GA) to improve the computational analysis of protective SNP-SNP interactions. ISGA generates genotype patterns with combined SNPs (SNP barcodes) for chronic dialysis. Using our previously reported 77 SNPs in the D-loop, the algorithm-generated protective SNP barcodes for chronic dialysis were evaluated. ISGA provides the SNP barcodes with the maximum frequency differences of occurrence between the cases and controls. The identified SNP barcodes with the lowest odds ratio (OR) values were regarded as the best preventive SNP barcodes against chronic dialysis. The best ISGA-generated SNP barcodes (two to nine SNPs) are more closely associated with the prevention of chronic dialysis when more SNPs are chosen (OR=0.64 to 0.32; 95% confidence interval=0.882 to 0.198). The cumulative effects of SNP-SNP interactions were more dominant in ISGA than in GA without the initialization strategy. We provide a fast identification of chronic dialysis-associated protective SNP barcodes and demonstrate that the SNP-SNP interactions may have a cumulative effect on prediction for chronic dialysis.
Cancer Cell International | 2014
Wei Chiao Chang; Yong Yuan Fang; Hsueh-Wei Chang; Li Yeh Chuang; Yu-Da Lin; Ming Feng Hou; Cheng-Hong Yang
Background: ORAI1 channels play an important role in breast cancer progression and metastasis. Previous studies indicated a strong correlation between breast cancer and individual single nucleotide polymorphisms (SNPs) of the ORAI1 gene. However, the possible SNP-SNP interaction of the ORAI1 gene was not investigated. Results: To develop the complex analyses of SNP-SNP interaction, we propose a genetic algorithm (GA) to detect the model of breast cancer association between five SNPs (rs12320939, rs12313273, rs7135617, rs6486795 and rs712853) of the ORAI1 gene. For individual SNPs, the differences between case and control groups in the five SNPs of the ORAI1 gene were not significant. In contrast, the GA-generated SNP models show that 2-SNP (rs12320939-GT/rs6486795-CT), 3-SNP (rs12320939-GT/rs12313273-TT/rs6486795-TC), and 5-SNP (rs12320939-GG/rs12313273-TC/rs7135617-TT/rs6486795-TT/rs712853-TT) combinations have higher risks for breast cancer in terms of odds ratio analysis (1.357, 1.689, and 13.148, respectively). Conclusion: Taken together, the cumulative effects of SNPs of the ORAI1 gene in the breast cancer association study were well demonstrated in terms of GA-generated SNP models.
Mitochondrial DNA | 2014
Jin-Bor Chen; Li-Yeh Chuang; Yu-Da Lin; Chia-Wei Liou; Tsu-Kung Lin; Wen-Chin Lee; Ben-Chung Cheng; Hsueh-Wei Chang; Cheng-Hong Yang
Background and aims: Single nucleotide polymorphism (SNP) interaction analysis can simultaneously evaluate the complex SNP interactions present in complex diseases. However, it is less commonly applied to evaluate the predisposition to chronic dialysis, and its computational analysis remains challenging. In this study, we aimed to improve the analysis of SNP–SNP interactions within the mitochondrial D-loop in chronic dialysis. Material & method: The SNP–SNP interactions between 77 reported SNPs within the mitochondrial D-loop in a chronic dialysis study were evaluated in terms of SNP barcodes (different SNP combinations with their corresponding genotypes). We propose a genetic algorithm (GA) to generate SNP barcodes. The χ2 values were then calculated from the occurrences of the specific SNP barcodes and their non-specific combinations between cases and controls. Results: Each SNP barcode (2- to 7-SNP) with the highest value in the χ2 test was regarded as the best SNP barcode (11.304 to 23.310; p < 0.001). The best GA-generated SNP barcodes (2- to 7-SNP) were significantly associated with chronic dialysis (odds ratio [OR] = 1.998 to 3.139; p < 0.001). The order of influence for SNPs was the same as the order of their OR values for chronic dialysis in terms of 2- to 7-SNP barcodes. Conclusion: Taken together, we propose an effective algorithm to address the SNP–SNP interactions and demonstrate that many non-significant SNPs within the mitochondrial D-loop may play a role in joint effects on chronic dialysis susceptibility.
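The χ2 values above are computed from how often a specific SNP barcode occurs in cases and in controls. The sketch below shows the standard Pearson χ2 statistic for a 2×2 table, without continuity correction, together with its p-value for one degree of freedom. The function names are hypothetical, and the abstract does not state whether the authors applied a correction.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for a 2x2 table (no continuity correction).

    a: cases with the barcode      b: cases without the barcode
    c: controls with the barcode   d: controls without the barcode
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / denominator


def p_value_df1(chi_square: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))
```

Under this formula, the smallest χ2 value quoted above (11.304) already corresponds to p < 0.001 with one degree of freedom, which agrees with the reported significance level.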
Soft Computing | 2016
Cheng-Hong Yang; Sin-Hua Moi; Yu-Da Lin; Li-Yeh Chuang
Detecting genetic association models between single nucleotide polymorphisms (SNPs) in various disease-related genes can help to understand susceptibility to disease. Statistical tools have been widely used to detect significant genetic association models according to their related statistical values, including the odds ratio (OR), chi-square test (χ2), and p-value. However, the high number of computations entailed in such operations may limit the capacity of such statistical tools to detect high-order genetic associations. In this study, we propose lsGA, a genetic algorithm based on a local search method, to detect significant genetic association models amongst large numbers of SNP combinations. We used two disease models to simulate large data sets, considering the minor allele frequency (MAF), number of SNPs, and number of samples. The three-order epistasis models were evaluated by chi-square test (χ2) to assess significance (p-value < 0.05). The results showed that lsGA provided higher chi-square test values than GA. Simple linear regression indicated that lsGA provides a significant advantage over GA, with the highest β values and a significant p-value.
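The abstract does not describe the local search used inside lsGA. As a generic illustration only, the sketch below shows one common form of local search over SNP combinations: a hill climb that swaps a single SNP at a time and keeps the swap when a caller-supplied score improves. The function name and the scoring callback are hypothetical and are not taken from the paper.

```python
from typing import Callable, Tuple

Combo = Tuple[int, ...]  # hypothetical: sorted SNP indices


def local_search(
    combo: Combo, n_snps: int, score: Callable[[Combo], float]
) -> Tuple[Combo, float]:
    """Hill climb over one-SNP swaps until no swap raises the score."""
    best = tuple(sorted(combo))
    best_score = score(best)
    improved = True
    while improved:
        improved = False
        for pos in range(len(best)):
            for snp in range(n_snps):
                if snp in best:
                    continue
                # Replace the SNP at `pos` with an unused SNP.
                candidate = tuple(sorted(best[:pos] + (snp,) + best[pos + 1:]))
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score = candidate, candidate_score
                    improved = True
                    break
            if improved:
                break
    return best, best_score
```

In a GA setting, a step like this could be applied to promising individuals after crossover and mutation, with a statistic such as χ2 supplied as the score.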