Wen-Yen Chen
University of California, Santa Barbara
Publications
Featured research published by Wen-Yen Chen.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011
Wen-Yen Chen; Yangqiu Song; Hongjie Bai; Chih-Jen Lin; Edward Y. Chang
Spectral clustering algorithms have been shown to be more effective in finding clusters than some traditional algorithms, such as k-means. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the size of a data set is large. To perform clustering on large data sets, we investigate two representative ways of approximating the dense similarity matrix. We compare one approach by sparsifying the matrix with another by the Nyström method. We then pick the strategy of sparsifying the matrix via retaining nearest neighbors and investigate its parallelization. We parallelize both memory use and computation on distributed computers. Through an empirical study on a document data set of 193,844 instances and a photo data set of 2,121,863, we show that our parallel algorithm can effectively handle large problems.
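The sparsification strategy named in the abstract above (retaining only nearest neighbors in the similarity matrix) can be illustrated with a minimal single-machine sketch. The Gaussian similarity, the function names, the parameters `t` and `sigma`, and the toy points are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' parallel implementation.

```python
import math

def gaussian_similarity(x, y, sigma=1.0):
    """Gaussian (RBF) similarity between two points (assumed similarity function)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sparsify_knn(points, t, sigma=1.0):
    """Return a sparse symmetric similarity matrix as {i: {j: s_ij}}.

    Each point keeps only its t most similar neighbors. An entry (i, j)
    survives if j is among i's neighbors or i is among j's, so the
    result stays symmetric.
    """
    n = len(points)
    sparse = {i: {} for i in range(n)}
    for i in range(n):
        sims = [(gaussian_similarity(points[i], points[j], sigma), j)
                for j in range(n) if j != i]
        # Most similar first; ties broken by the smaller index.
        sims.sort(key=lambda pair: (-pair[0], pair[1]))
        for s, j in sims[:t]:
            sparse[i][j] = s
            sparse[j][i] = s  # symmetrize
    return sparse

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
S = sparsify_knn(points, t=2)
nonzeros = sum(len(row) for row in S.values())
print(nonzeros, "stored entries instead of", len(points) * (len(points) - 1))
```

This sketch still computes every pairwise similarity before discarding most of them, so it only shows the memory saving of the sparse representation, not the distributed computation the paper studies.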
International World Wide Web Conferences | 2009
Wen-Yen Chen; Jon-Chyuan Chu; Junyi Luan; Hongjie Bai; Yi Wang; Edward Y. Chang
Users of social networking services can connect with each other by forming communities for online interaction. Yet as the number of communities hosted by such websites grows over time, users have an even greater need for effective community recommendations in order to meet more users. In this paper, we investigate two algorithms from very different domains and evaluate their effectiveness for personalized community recommendation. First is association rule mining (ARM), which discovers associations between sets of communities that are shared across many users. Second is latent Dirichlet allocation (LDA), which models user-community co-occurrences using latent aspects. In comparing LDA with ARM, we are interested in discovering whether modeling low-rank latent structure is more effective for recommendations than directly mining rules from the observed data. We experiment on an Orkut data set consisting of 492,104 users and 118,002 communities. Our empirical comparisons using the top-k recommendations metric show that LDA performs consistently better than ARM for the community recommendation task when recommending a list of 4 or more communities. However, for recommendation lists of up to 3 communities, ARM is still slightly better. We analyze examples of the latent information learned by LDA to explain this finding. To efficiently handle the large-scale data set, we parallelize LDA on distributed computers and demonstrate our parallel implementation's scalability with varying numbers of machines.
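As a rough illustration of the ARM side of this comparison, the sketch below mines single-antecedent rules of the form "members of a also join b" from user membership sets and ranks unseen communities for a user. It is a simplified assumption, not the paper's method: the function names, the `min_support` threshold, the summed-confidence scoring, and the toy memberships are all hypothetical.

```python
from collections import Counter
from itertools import permutations

def mine_pair_rules(memberships, min_support=2):
    """Mine rules a -> b from {user: set_of_communities}.

    Returns {(a, b): confidence}, where confidence is
    count(users in both a and b) / count(users in a), keeping only
    pairs shared by at least min_support users.
    """
    single = Counter()
    pair = Counter()
    for communities in memberships.values():
        single.update(communities)
        pair.update(permutations(sorted(communities), 2))
    return {(a, b): count / single[a]
            for (a, b), count in pair.items() if count >= min_support}

def recommend(user_communities, rules, k):
    """Rank communities the user has not joined by summed rule confidence."""
    scores = Counter()
    for (a, b), confidence in rules.items():
        if a in user_communities and b not in user_communities:
            scores[b] += confidence
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [community for community, _ in ranked[:k]]

# Hypothetical toy memberships, not taken from the Orkut data set.
memberships = {
    "u1": {"jazz", "blues", "vinyl"},
    "u2": {"jazz", "blues"},
    "u3": {"jazz", "vinyl"},
    "u4": {"hiking", "camping"},
    "u5": {"hiking", "camping", "jazz"},
}
rules = mine_pair_rules(memberships)
top = recommend({"jazz"}, rules, k=2)
print(top)
```

Rules here are read directly off observed co-memberships, which is the contrast the abstract draws with LDA's latent aspects.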
Knowledge Discovery and Data Mining | 2008
Wen-Yen Chen; Dong Zhang; Edward Y. Chang
Rapid growth in the amount of data available on social networking sites has made information retrieval increasingly challenging for users. In this paper, we propose a collaborative filtering method, Combinational Collaborative Filtering (CCF), to perform personalized community recommendations by considering multiple types of co-occurrences in social data at the same time. This filtering method fuses semantic and user information, then applies a hybrid training strategy that combines Gibbs sampling and the Expectation-Maximization algorithm. To handle the large-scale dataset, parallel computing is used to speed up the model training. Through an empirical study on the Orkut dataset, we show CCF to be both effective and scalable.
European Conference on Machine Learning | 2008
Yangqiu Song; Wen-Yen Chen; Hongjie Bai; Chih-Jen Lin; Edward Y. Chang
Spectral clustering algorithms have been shown to be more effective in finding clusters than most traditional algorithms. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the dataset is large. To perform clustering on large datasets, we propose to parallelize both memory use and computation on distributed computers. Through an empirical study on a large document dataset of 193,844 data instances and a large photo dataset of 637,137, we demonstrate that our parallel algorithm can effectively alleviate the scalability problem.
ACM Multimedia | 2006
Benjamin N. Lee; Wen-Yen Chen; Edward Y. Chang
In this work, we present the details of the implementation of Fotofiti (FF), a website that provides automatic semantic annotation of digital photographs, event management, and social network integration. We describe our technique for real-time online semantic annotation using global features from both content and context. Classification experiments using various learning techniques were performed on a real-world data set. Additionally, a scalable landmark recognition system that uses local features is discussed.
ACM Multimedia | 2006
Benjamin N. Lee; Wen-Yen Chen; Edward Y. Chang
In this work, we present Fotofiti (FF), a web-based personal photo organizer with automatic image annotation, event management, and social network integration. We describe our technique for real-time online semantic annotation of user photos. Additionally, a landmark recognition system that uses local features is discussed.
ACM Multimedia | 2006
Wen-Yen Chen; Benjamin N. Lee; Edward Y. Chang
Fotowiki (FW) is a wiki-based map service that integrates visual and textual information with a map. FW divides a geographical area into sub-areas. An individual responsible for providing information about a sub-area enters collected data into a wiki page. FW uploads distributed wiki pages and overlays the information on the map. This demonstration shows FW's architecture and functionalities.
Archive | 2011
Wen-Yen Chen; Zhichen Xu
Neurocomputing | 2012
Jiang Bian; Yi Chang; Yun Fu; Wen-Yen Chen
Scaling up Machine Learning: Parallel and Distributed Approaches | 2011
Wen-Yen Chen; Yangqiu Song; Hongjie Bai; Chih-Jen Lin; Edward Y. Chang