Chih-Hua Tai
National Taipei University
Publication
Featured research published by Chih-Hua Tai.
knowledge discovery and data mining | 2011
Chih-Hua Tai; Philip S. Yu; De-Nian Yang; Ming-Syan Chen
Due to the rich information in graph data, techniques for privacy protection in published social networks are still in their infancy compared to protection in relational databases. In this paper we identify a new type of attack called a friendship attack. In a friendship attack, an adversary utilizes the degrees of two vertices connected by an edge to re-identify related victims in a published social network data set. To protect against such attacks, we introduce the concept of k^2-degree anonymity, which limits the probability of a vertex being re-identified to 1/k. For the k^2-degree anonymization problem, we propose an Integer Programming formulation to find optimal solutions in small-scale networks. We also present an efficient heuristic approach for anonymizing large-scale social networks against friendship attacks. The experimental results demonstrate that the proposed approaches can preserve many of the characteristics of social networks.
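The anonymity condition described above can be illustrated with a small checker. This is a minimal sketch under one stated assumption, not the paper's formal definition or its anonymization algorithm: the adversary is assumed to know the victim's degree and the degree of one friend, so every (own degree, friend degree) pair that occurs in the graph must be shared by at least k vertices. The function name is hypothetical, and the graph is assumed to be simple and undirected.

```python
from collections import defaultdict

def is_k2_degree_anonymous(edges, k):
    """Check an ordered-degree-pair reading of k^2-degree anonymity.

    edges: iterable of (u, v) pairs of a simple undirected graph.
    Assumption (not from the paper): a vertex is protected when at least
    k vertices share each (own degree, friend degree) pair it exhibits.
    """
    edges = list(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # (own degree, friend degree) -> vertices an adversary cannot tell apart
    candidates = defaultdict(set)
    for u, v in edges:
        candidates[(degree[u], degree[v])].add(u)
        candidates[(degree[v], degree[u])].add(v)

    return all(len(vertices) >= k for vertices in candidates.values())

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # every vertex looks the same
star = [(0, 1), (0, 2), (0, 3)]            # the center is unique
print(is_k2_degree_anonymous(cycle, 4))
print(is_k2_degree_anonymous(star, 2))
```

In the star, the center is the only vertex with degree 3 and a friend of degree 1, so an adversary with that knowledge has a single candidate and the check fails for any k greater than 1.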
knowledge discovery and data mining | 2010
Chih-Hua Tai; Philip S. Yu; Ming-Syan Chen
For any outsourcing service, privacy is a major concern. This paper focuses on outsourcing frequent itemset mining and examines how to protect privacy when attackers have precise knowledge of the supports of some items. We propose a new approach, referred to as k-support anonymity, to protect each sensitive item with k-1 other items of similar support. To achieve k-support anonymity, we introduce a pseudo taxonomy tree and have the third party mine the generalized frequent itemsets under the corresponding generalized association rules instead of association rules. The pseudo taxonomy is a construct to facilitate hiding of the original items, where each original item can map to either a leaf node or an internal node in the taxonomy tree. The rationale for this approach is that, with a taxonomy tree, the k nodes that satisfy k-support anonymity may be any k nodes in the taxonomy tree with the appropriate supports. This approach can therefore provide more candidates for k-support anonymity with limited fake items, as only the leaf nodes, not the internal nodes, of the taxonomy tree need to appear in the transactions. By contrast, for association rule mining, the k nodes that satisfy k-support anonymity have to correspond to the leaf nodes in the taxonomy tree, which is far more restrictive. The challenge is thus how to generate the pseudo taxonomy tree to facilitate k-support anonymity and to ensure the conservation of the original frequent itemsets. The experimental results show that our methods of k-support anonymity can achieve very good privacy protection with moderate storage overhead.
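The basic k-support condition can be sketched as a simple check over flat transactions. This is an illustrative sketch only: it ignores the pseudo taxonomy tree entirely, and the function names and the `tolerance` parameter (how close two supports must be to count as "similar") are assumptions introduced here, not part of the paper.

```python
from collections import Counter

def item_supports(transactions):
    """Count, for each item, the number of transactions containing it."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count each item once per transaction
    return counts

def is_k_support_anonymous(transactions, sensitive_items, k, tolerance=0):
    """Check that each sensitive item hides among at least k items
    (itself included) whose supports differ by at most `tolerance`.

    Assumes every sensitive item occurs in at least one transaction.
    """
    supports = item_supports(transactions)
    for item in sensitive_items:
        target = supports[item]
        similar = sum(1 for s in supports.values() if abs(s - target) <= tolerance)
        if similar < k:
            return False
    return True

transactions = [["a", "b"], ["a", "b", "c"], ["c"], ["a"]]
print(is_k_support_anonymous(transactions, ["b"], k=2))   # b and c share a support
print(is_k_support_anonymous(transactions, ["a"], k=2))   # a has a unique support
```

An attacker who knows the exact support of item "a" can single it out here, which is the situation the fake items and taxonomy nodes in the paper are meant to prevent.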
IEEE Transactions on Knowledge and Data Engineering | 2014
Chih-Hua Tai; Peng Jui Tseng; Philip S. Yu; Ming-Syan Chen
Social networks model the social activities between individuals, which change as time goes by. Because such dynamic networks carry useful information, there is a continuous demand for privacy-preserving data sharing with analyzers, collaborators or customers. In this paper, we address the privacy risks of identity disclosures in sequential releases of a dynamic network. To prevent privacy breaches, we propose a novel k^w-structural diversity anonymity, where k is the desired privacy level and w is the time period during which an adversary can monitor a victim to collect attack knowledge. We also present a heuristic algorithm for generating releases satisfying k^w-structural diversity anonymity, so that the adversary cannot utilize this knowledge to re-identify the victim and take advantage of it. The evaluations on both real and synthetic data sets show that the proposed algorithm can retain many of the characteristics of the networks while ensuring privacy protection.
knowledge discovery and data mining | 2007
Chih-Hua Tai; Bi-Ru Dai; Ming-Syan Chen
Spatial clustering has been identified as an important technique in data mining owing to its various applications. In conventional spatial clustering methods, data points are clustered mainly according to their geographic attributes. In real applications, however, the obtained data points consist of not only geographic attributes but also non-geographic ones. In general, geographic attributes indicate the data locations and non-geographic attributes show the characteristics of data points. It is thus infeasible, using conventional spatial clustering methods, to partition the geographic space such that similar data points are grouped together. In this paper, we propose an effective and efficient algorithm, named incremental clustering toward the Bound INformation of Geography and Optimization spaces, abbreviated as BINGO, to solve the problem. The proposed BINGO algorithm combines the information in both geographic and non-geographic attributes by constructing a summary structure, and possesses incremental clustering capability by appropriately adjusting this structure. Furthermore, most parameters in BINGO are determined automatically, so that it can be easily applied to applications without resorting to extra knowledge. Experiments on synthetic data are performed to validate the effectiveness and the efficiency of BINGO.
IEEE Transactions on Knowledge and Data Engineering | 2014
Chih-Hua Tai; Philip S. Yu; De-Nian Yang; Ming-Syan Chen
As an increasing amount of social networking data is published and shared for commercial and research purposes, privacy issues concerning the individuals in social networks have become serious concerns. Vertex identification, which identifies a particular user from a network based on background knowledge such as vertex degree, is one of the most important problems that have been addressed. In reality, however, each individual in a social network is inclined to be associated with not only a vertex identity but also a community identity, which can represent personal information that is sensitive to the public, such as political party affiliation. This paper first addresses the new privacy issue, referred to as community identification, by showing that the community identity of a victim can still be inferred even though the social network is protected by existing anonymity schemes. For this problem, we then propose the concept of structural diversity to provide anonymity for the community identities. The k-Structural Diversity Anonymization (k-SDA) ensures that there are sufficient vertices with the same vertex degree in at least k communities in a social network. We propose an Integer Programming formulation to find optimal solutions to k-SDA and also devise scalable heuristics to solve large-scale instances of k-SDA from different perspectives. The performance studies on real data sets from various perspectives demonstrate the practical utility of the proposed privacy scheme and our anonymization approaches.
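The structural diversity condition can be sketched as a check that every degree value occurring in the graph is spread over at least k communities. This is a simplified reading for illustration, not the paper's formulation or its anonymization heuristics. It assumes each vertex belongs to exactly one community and that the `community` mapping (a hypothetical input) lists every vertex in the graph.

```python
from collections import defaultdict

def is_k_structurally_diverse(edges, community, k):
    """Check that vertices sharing a degree span at least k communities.

    edges: iterable of (u, v) pairs of a simple undirected graph.
    community: dict mapping every vertex to a single community label
               (single membership is an assumption of this sketch).
    """
    degree = {vertex: 0 for vertex in community}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # degree value -> set of communities containing a vertex of that degree
    communities_by_degree = defaultdict(set)
    for vertex, d in degree.items():
        communities_by_degree[d].add(community[vertex])

    return all(len(labels) >= k for labels in communities_by_degree.values())

community = {0: "A", 1: "A", 2: "B"}
print(is_k_structurally_diverse([(0, 1), (0, 2)], community, 2))
```

In this example only vertex 0 has degree 2, and it sits in community "A", so an adversary who knows the victim's degree is 2 learns the community identity and the check fails for k = 2.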
international conference on multimedia and expo | 2008
Chih-Hua Tai; De-Nian Yang; Lung Tsai Lin; Ming-Syan Chen
Applications with geo-tagged photos have drawn considerable attention in recent years. However, most previous works consider the GPS information of each photo individually to improve the metadata of the corresponding photo. In this paper, we leverage GPS data in a series of scenic photos and propose a personalized scenic itinerary recommendation system. Our system provides personalized suggestions for a sequence of spots to visit next when a user takes a photo of the current scenic spot. The recommendation is based on data mining techniques that extract and differentiate the preferences of various users. Our system is designed to provide a new location service for geo-tagged photo management.
IEEE Journal of Biomedical and Health Informatics | 2016
Chih-Hua Tai; Zheng-Han Tan; Yue-Shan Chang
Online posts not only represent the records of people's lives but also reveal their satisfaction with life and relationships as well as potential mental illnesses. Detecting (strong or general) negative as well as (strong or general) positive feelings from online posts can keep us from carelessly missing people's important moments, difficult or great, amid the information overload of daily life, and can lead to a better society. Therefore, in this paper, we build a Feeling Distinguisher (FeD) system based on supervised Latent Dirichlet Allocation (sLDA), Latent Dirichlet Allocation, and SentiWordNet methodologies for detecting a person's intention and intensity of feelings through the analysis of his/her online posts. Experimental results on posts collected from five social network websites demonstrate the effectiveness of FeD. The performance of FeD is about 1.08 to 1.18 times that of SVM and sLDA.
international conference on data mining | 2011
Chih-Hua Tai; Peng Jui Tseng; Philip S. Yu; Ming-Syan Chen
Privacy in social network data publishing is always an important concern. Most prior privacy protection techniques focus on static social networks. However, there are additional privacy disclosures in dynamic social networks due to the sequential publications. In this paper, we first show that the risks of vertex and community re-identification exist in a dynamic social network, even if the release at each time instance is protected by a static anonymity scheme. To prevent vertex and community re-identification in a dynamic social network, we propose a novel dynamic k^w-structural diversity anonymity, where w is the length of time that an adversary can monitor a victim. This scheme extends k-structural diversity anonymity to a dynamic scenario. We also present a heuristic to anonymize the releases of networks to satisfy the proposed privacy scheme. The evaluations show that our approach can retain many of the characteristics of the networks while ensuring privacy protection.
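The risk described above, where each release is safe on its own but the sequence is not, can be illustrated with a small check. This is a rough sketch under assumptions that are not taken from the paper: the adversary is assumed to observe a victim's degree in each of the last w releases, community labels are assumed fixed over time with one community per vertex, and the function names are hypothetical.

```python
from collections import defaultdict

def degrees(edges, vertices):
    """Degree of every vertex in one release (0 if it has no edges)."""
    degree = {vertex: 0 for vertex in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

def is_kw_structurally_diverse(releases, community, k, w):
    """Check that vertices sharing a degree sequence over the last w
    releases span at least k communities.

    releases: list of edge lists, oldest first.
    community: dict mapping every vertex to a fixed community label.
    """
    if w < 1:
        raise ValueError("w must be at least 1")
    window = releases[-w:]
    per_release = [degrees(edges, community) for edges in window]

    # degree sequence over the window -> communities exhibiting it
    communities_by_sequence = defaultdict(set)
    for vertex in community:
        sequence = tuple(degree[vertex] for degree in per_release)
        communities_by_sequence[sequence].add(community[vertex])

    return all(len(labels) >= k for labels in communities_by_sequence.values())

community = {0: "A", 1: "A", 2: "B", 3: "B"}
releases = [[(0, 1)], [(0, 1), (2, 3)]]
print(is_kw_structurally_diverse(releases, community, k=2, w=1))
print(is_kw_structurally_diverse(releases, community, k=2, w=2))
```

With w = 1 every vertex has degree 1 in the latest release, so both communities are indistinguishable. With w = 2 the degree sequences (1, 1) and (0, 1) separate community "A" from community "B", so the check fails.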
international symposium on biometrics and security technologies | 2013
Chih-Hua Tai; Jen Wei Huang; Meng Hao Chung
As the age of big data evolves, outsourcing data mining tasks to multi-cloud environments has become a popular trend. To ensure data privacy when outsourcing mining tasks, the concept of support anonymity was proposed to hide sensitive information about patterns. Existing methods that tackle the privacy issues, however, do not address the related parallel mining techniques. To fill this gap, we take a pseudo-taxonomy based technique, called k-support anonymity, and extend it to multi-cloud environments. This has several advantages. First, outsourcing to multi-cloud environments can meet the large computational resource requirements of big data mining, and also parallelize the mining tasks for better efficiency. Second, the data that we send out to a cloud can be partial. An attacker who obtains the data in one cloud cannot reconstruct the original data, which makes it more difficult to violate the privacy of the outsourced data. Experimental results also demonstrate that our approaches can achieve good protection and better computational efficiency.
systems, man and cybernetics | 2015
Chih-Hua Tai; Zheng-Han Tan; Yung-Sheng Lin; Yue-Shan Chang
Due to the emergence of social platforms, people tend to post their diaries and feelings online to share with others. In this paper, we aim to predict whether a user is getting depressed or not through his or her blog posts on the Internet. For this purpose, we use Latent Dirichlet Allocation (LDA) to find the most frequent words appearing in a user's diaries and use SentiWordNet to calculate the emotion score of the user. Experimental results show that our method is useful for detecting mental disorders on social platforms.
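The two-stage idea, picking out a user's prominent words and then scoring them with a sentiment lexicon, can be sketched as follows. This sketch deliberately swaps in simpler stand-ins: plain word counting replaces LDA, and a tiny hand-made lexicon with hypothetical scores replaces SentiWordNet. Neither the function names nor the scores come from the paper.

```python
from collections import Counter

# Hypothetical polarity scores in [-1, 1]; NOT taken from SentiWordNet.
TOY_LEXICON = {"sad": -0.6, "tired": -0.4, "alone": -0.5, "happy": 0.7, "fun": 0.5}

def top_words(posts, n):
    """Return the n most frequent whitespace-separated words (stand-in for LDA)."""
    counts = Counter(word for post in posts for word in post.lower().split())
    return [word for word, _ in counts.most_common(n)]

def emotion_score(posts, n=5, lexicon=TOY_LEXICON):
    """Average lexicon score of the top words; 0.0 if none are in the lexicon."""
    scored = [lexicon[word] for word in top_words(posts, n) if word in lexicon]
    return sum(scored) / len(scored) if scored else 0.0

posts = ["sad sad tired", "alone sad"]
print(top_words(posts, 1))
print(emotion_score(posts))
```

A negative average suggests predominantly negative wording in the user's most frequent vocabulary. A real system would need tokenization, stop-word removal, and a properly validated lexicon and threshold.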