Dazhen Lin
Xiamen University
Publication
Featured research published by Dazhen Lin.
Multimedia Systems | 2016
Donglin Cao; Rongrong Ji; Dazhen Lin; Shaozi Li
Classical public sentiment analysis systems for microblogs rely on text sentiment analysis, so it is difficult to determine the sentiment of short microblog posts that lack clear sentiment words. Fortunately, many microblog posts contain images that also convey users’ sentiment. To fully understand users’ sentiment, we propose a cross-media public sentiment analysis system for microblogs. The main advantage of this system is its unified cross-media public sentiment analysis framework, which fuses text sentiment and image sentiment not only at the level of sentiment results but also at the level of the sentiment ontology. To improve presentation, the system shows sentiment results from a macroscopic view and a microscopic view, which details the sentiment results by region, topic, microblog content and user diffusion. To our knowledge, this is the first unified cross-media public sentiment analysis system.
Multimedia Tools and Applications | 2016
Donglin Cao; Rongrong Ji; Dazhen Lin; Shaozi Li
With a growing number of images being used to express opinions in Microblog, text-based sentiment analysis is not enough to understand the sentiments of users. To obtain the sentiments implied in Microblog images, we propose a Visual Sentiment Topic Model (VSTM), which gathers images in the same Microblog topic to enhance the visual sentiment analysis results. First, we obtain the visual sentiment features by using the Visual Sentiment Ontology (VSO); then, we build a Visual Sentiment Topic Model by using all images in the same topic; finally, we choose better visual sentiment features according to the distribution of visual sentiment features in a topic. The main advantage of our approach is that the discriminative visual sentiment ontology features are selected according to the sentiment topic model. Experimental results show that our approach outperforms the VSO-based model.
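The third step above, choosing features according to their distribution within a topic, can be illustrated with a small sketch. It is not the VSTM itself: a plain per-topic average stands in for the topic model, and the concept names, scores, and function names are hypothetical values chosen only to keep the example self-contained.

```python
from collections import defaultdict


def topic_feature_distribution(image_features):
    """Average each concept's response over all images in one topic."""
    totals = defaultdict(float)
    for features in image_features:
        for concept, score in features.items():
            totals[concept] += score
    return {concept: total / len(image_features) for concept, total in totals.items()}


def select_topic_features(image_features, top_k):
    """Keep the top_k concepts with the highest average response in the topic."""
    distribution = topic_feature_distribution(image_features)
    # Sort by descending average score, breaking ties alphabetically.
    ranked = sorted(distribution, key=lambda c: (-distribution[c], c))
    return ranked[:top_k]


def filter_features(features, selected):
    """Drop every concept of a single image that was not selected for the topic."""
    return {c: s for c, s in features.items() if c in selected}


# Made-up concept responses for three images in one topic (assumption).
TOPIC_IMAGES = [
    {"happy dog": 0.9, "dark street": 0.1, "sunny beach": 0.6},
    {"happy dog": 0.8, "sunny beach": 0.7},
    {"happy dog": 0.7, "dark street": 0.2},
]

selected = select_topic_features(TOPIC_IMAGES, top_k=2)
filtered = filter_features(TOPIC_IMAGES[0], selected)
```

Here a concept that responds weakly across the topic ("dark street") is discarded even for an image in which it appears, which is the intuition behind using topic-level evidence to pick discriminative features.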
IEEE International Conference on Multimedia Big Data | 2015
Rongrong Ji; Donglin Cao; Dazhen Lin
Sentiment analysis is important for understanding social media content and user opinions. With the development of social media applications, an increasing number of people combine text and images to express their opinions. However, text-based sentiment analysis methods cannot process media other than text, which has motivated work on visual sentiment analysis. In this article, we review two multimodal visual sentiment analysis models proposed in our group. The two models exploit the multimodal content from a correlation view and a hypergraph view, respectively. In the Multimodal Correlation Model (MCM), we observe the correlation among different modalities and model them through a probabilistic graphical model. In the Hypergraph Learning Model (HLM), we use a hypergraph to model the independence of each modality. We further discuss the underlying challenges and foresee potential opportunities in this direction.
Soft Computing | 2010
Dazhen Lin; Shaozi Li; Donglin Cao
More and more people use blogs to write down their ideas, opinions and individual thoughts. Mining blogs yields useful information that can support business policy and decision-making, especially in analyzing product popularity. In this study, we mine the features of persons as implicit relations between the content of the bloggers’ posts and the bloggers. These implicit relations, which are also semantic relations, are called the Blogger Role. To mine this semantic relation, WordNet is used to extract the Blogger Role and its features. To get an appropriate Blogger Role, we cluster all bloggers’ posts and use the clustering result to revise the Blogger Role obtained by single-document analysis. To get more relevant retrieval results, we combine this implicit relation with the classical retrieval model in a Blogger Role-based model. The combination is performed by an explicit model and an implicit model. Experimental results on the TREC corpus show that the Blogger Role reveals bloggers’ characters and that mining Blogger Roles is useful in analyzing the popularity of products.
UKCI | 2017
Ben Ma; Dazhen Lin; Donglin Cao
In recent years, various social network applications have emerged to meet users’ demand for social activity. As the biggest Chinese Microblog platform, Sina Weibo not only provides users with a lot of information, but also promotes the spread of rumors, which has generated huge negative social impacts. To quickly detect rumors on Sina Weibo, many research works focus on social attributes in the social network. However, content plays an important role in rumor diffusion, and it has been ignored in many research works. In this paper, we use two different text representations, a bag-of-words model and a neural network language model, to generate text vectors from rumor contents. Furthermore, we compare the performance of the two text representations in rumor detection by using some state-of-the-art classification algorithms. From experiments on 10,000 Sina Weibo posts, we found that the best classification accuracy of the bag-of-words model is over 90%, and the best classification accuracy of the neural network language model is over 60%. This indicates that the words of posts are more useful than semantic context vector representations in rumor detection.
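The bag-of-words representation compared above can be sketched in a few lines. The example below is a minimal illustration, not the authors' pipeline: the toy English posts, the whitespace tokenizer, and the nearest-centroid classifier are all assumptions made to keep the example self-contained (the paper works on Sina Weibo posts and does not name its classifiers in the abstract).

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a post to a sparse word-count vector (hypothetical whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train_centroids(labeled_posts):
    """Sum the bag-of-words vectors of each class into one centroid per label."""
    centroids = {}
    for text, label in labeled_posts:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def classify(text, centroids):
    """Return the label whose centroid is most similar to the post."""
    vector = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Toy, made-up training data; not taken from the paper's corpus.
TRAIN = [
    ("shocking secret cure forward to everyone now", "rumor"),
    ("forward now before this shocking news is deleted", "rumor"),
    ("city council publishes the new bus timetable", "non-rumor"),
    ("university announces the exam timetable for june", "non-rumor"),
]

centroids = train_centroids(TRAIN)
prediction = classify("forward this shocking cure now", centroids)
```

Because the vectors are raw word counts, the classifier sees only which words occur and how often, which is the property the abstract contrasts with semantic context vectors.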
Neurocomputing | 2018
Dazhen Lin; Fan Lin; Yanping Lv; Feipeng Cai; Donglin Cao
To distinguish machines from humans, the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is increasingly used in many web applications. Classical CAPTCHAs based on English letters and digits can be recognized with high accuracy. Because the complexity of Chinese characters greatly increases the difficulty of automatic recognition, an increasing number of Chinese web sites use Chinese Character CAPTCHAs. To recognize Chinese Character CAPTCHAs, we propose a Convolutional Neural Network (CNN) based approach that learns stroke, radical and character features of Chinese characters, and show that our network structure is superior to LeNet-5 in this task. Furthermore, we formulate the relation among accuracy, the number of training samples and the number of iterations, which is used to estimate the performance of our approach. Firstly, this approach greatly improves the recognition accuracy of Chinese Character CAPTCHAs with distortion, rotation and background noise. Our experimental results show that this approach achieves over 95% accuracy for a single Chinese character and 84% accuracy for three types of Chinese Character CAPTCHAs with four Chinese characters. Secondly, our experimental results and theoretical analysis show that recognition accuracy has an exponential relationship with the product of the number of training samples and the number of iterations, provided that the training samples are sufficient and representative. Therefore, we can estimate the training time needed for a given accuracy. Finally, we verify that our approach is superior to the most famous Chinese Optical Character Recognition (OCR) software, Hanvon, in Chinese Character CAPTCHA recognition.
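The abstract describes the accuracy relationship only qualitatively, so the following sketch assumes one plausible form, accuracy = a_max * (1 - exp(-k * n * t)), where n is the number of training samples and t the number of iterations. The functional form, the constants `a_max` and `k`, and the function names are all hypothetical placeholders, not values from the paper; the point is only to show how such a relationship could be inverted to estimate training effort for a target accuracy.

```python
import math


def predicted_accuracy(samples, iterations, a_max=0.95, k=1e-8):
    """Assumed saturating-exponential model: a_max * (1 - exp(-k * n * t))."""
    return a_max * (1.0 - math.exp(-k * samples * iterations))


def iterations_for_accuracy(target, samples, a_max=0.95, k=1e-8):
    """Invert the assumed model to get the iterations needed for a target accuracy."""
    if not 0.0 <= target < a_max:
        raise ValueError("target must lie in [0, a_max)")
    return -math.log(1.0 - target / a_max) / (k * samples)


# Placeholder numbers: how many iterations would 100,000 samples need for 90%?
needed = iterations_for_accuracy(0.90, samples=100_000)
check = predicted_accuracy(100_000, needed)
```

Under this assumed form only the product n * t matters, so doubling the number of samples halves the estimated number of iterations; in practice `a_max` and `k` would have to be fitted to measured training curves.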
UKCI | 2017
Lingxiao Li; Shaozi Li; Donglin Cao; Dazhen Lin
An image is worth a thousand words for sentiment expression, but the semantic gap between low-level pixels and high-level sentiment makes visual sentiment analysis difficult. Our work focuses on two aspects to bridge the gap: (1) high-level abstract feature learning for visual sentiment content; (2) utilizing large-scale unlabeled datasets. We propose a hierarchical structure for the automatic discovery of visual sentiment features, called SentiNet, which employs a ConvNet structure. To deal with the limited amount of labeled data, we leverage sentiment-related signals to pre-annotate unlabeled samples from different source domains. In particular, we propose a hierarchy-stack fine-tuning strategy to train SentiNet. We show how this pipeline can be applied to visual sentiment analysis in social media. Our experiments on real-world data covering half a million unlabeled images and two thousand labeled images show that our method outperforms state-of-the-art visual methods, and demonstrate the importance of large-scale data and hierarchical architecture for visual sentiment analysis.
Mathematical Problems in Engineering | 2017
Dazhen Lin; Donglin Cao; Yanping Lv; Zheng Cai
With the development of social media, an increasing number of people use short videos in social media applications to express their opinions and sentiments. However, sentiment detection in short videos is a very challenging task because of the semantic gap problem and the sequence-based sentiment understanding problem. In this context, we propose a SentiPair Sequence based GIF video sentiment detection approach with two contributions. First, we propose a Synset Forest method that extracts sentiment-related semantic concepts from WordNet to build a robust SentiPair label set. This approach considers the semantic gap between label words and selects a robust label subset that is related to sentiment. Second, we propose a SentiPair Sequence based GIF video sentiment detection approach that learns the semantic sequence to understand the sentiment of GIF videos. Our experimental results on the GSO-2016 (GIF Sentiment Ontology) data show that our approach not only outperforms four state-of-the-art classification methods but also performs better than the state-of-the-art middle-level sentiment ontology features, Adjective Noun Pairs (ANPs).
Web Information Systems Modeling | 2012
Jiansong Yu; Donglin Cao; Shaozi Li; Dazhen Lin
We propose an Internet-search-based automatic image annotation feedback model that combines content-based and web-based image annotation to address two problems: the assumed relevance between an image and its text, and the limited volume of the database. In this model, we extract candidate labels from search results using web-based texts associated with the image, and then verify the final results by using Internet search results for the candidate labels together with content-based features. Experimental results show that this method can annotate a large-scale database with high accuracy, and achieves a 5.2% improvement over web-based automatic image annotation.
IEEE International Conference on Intelligent Systems and Knowledge Engineering | 2008
Dazhen Lin; Donglin Cao; Shaozi Li
Blogs are one of the important components of Web 2.0. Many blog retrieval systems still use the classical retrieval methods of Web page retrieval. In this paper, we present a new retrieval approach based on the relation between the blogger and the query. The advantage of this approach is that it incorporates semantic information, which we call the blogger role, into the retrieval model. The experiments show that this approach achieves better performance than the classical retrieval model in blog retrieval.