Dosam Hwang | Researchain

Publication

Featured research published by Dosam Hwang.

Knowledge-Based Systems | 2015

Exploiting matrix factorization to asymmetric user similarities in recommendation systems

Parivash Pirasteh; Dosam Hwang; Jason J. Jung

Although collaborative filtering is widely applied in recommendation systems, it still suffers from several major limitations, including data sparsity and scalability. Sparse data affects the quality of the user similarity measurement and consequently the quality of the recommender system. In this paper, we propose a novel user similarity measure aimed at providing a valid similarity measurement between users with very few ratings. The contributions of this paper are twofold. First, we suggest an asymmetric user similarity method to distinguish between the impact that a user has on a neighbor and the impact that the user receives from that neighbor. Second, we apply matrix factorization to the user similarity matrix in order to discover the similarities between users who have rated different items. Experimental results show that our method performs better than commonly used approaches, especially under cold-start conditions.

Asian Conference on Intelligent Information and Database Systems | 2014

Item-Based Collaborative Filtering with Attribute Correlation: A Case Study on Movie Recommendation

Parivash Pirasteh; Jason J. Jung; Dosam Hwang

User-based collaborative filtering (CF) is a widely used technique for generating recommendations. A lack of sufficient ratings prevents CF from modeling user preferences effectively and from finding trustworthy similar users. To alleviate these problems, item-based CF was introduced. However, when the number of co-rated items is too small or a new item is added to the system, the results of item-based CF are not reliable either. This paper presents a new method based on movie similarity that focuses on improving recommendation performance when the dataset is sparse. Specifically, we propose a new way to measure the similarity between items by utilizing the genre and director of movies. Experiments show the superiority of the measure under cold-start conditions.

International Journal of Distributed Sensor Networks | 2014

Semantic Information Integration with Linked Data Mashups Approaches

Hanh Huu Hoang; Tai Nguyen-Phuoc Cung; Duy Khanh Truong; Dosam Hwang; Jason J. Jung

The introduction of the semantic web and Linked Data makes it easier to share data on the Internet. The Resource Description Framework (RDF) is the standard for publishing structured data resources on the Internet and for interconnecting them with other data resources. To remedy the data integration issues of traditional web mashups, semantic web technology uses Linked Data based on the RDF data model as the unified data model for combining, aggregating, and transforming data from heterogeneous data resources to build Linked Data mashups. The semantic web community has made tremendous efforts to enable Linked Data mashups, but a systematic survey of the concepts, technologies, applications, and challenges is still lacking. Therefore, in this paper, we investigate in detail semantic mashup research and application approaches in information integration. This paper also presents a Linked Data mashup application as an illustration of the proposed approaches.

Multimedia Tools and Applications | 2017

Exploiting character networks for movie summarization

Quang Dieu Tran; Dosam Hwang; O-Joun Lee; Jai E. Jung

Movie summarization focuses on providing as much information as possible in shorter movie clips while still keeping the content of the original movie, giving the audience a faster way to understand the movie. In this paper, we propose a novel method to summarize a movie based on character network analysis and the appearance of the protagonist and main characters in the movie. Experiments were carried out on two movies, Titanic (1997) and Frozen (2013), to show that our method outperforms conventional approaches in terms of the movie summarization rate.

Mobile Networks and Applications | 2015

Weighted Similarity Schemes for High Scalability in User-Based Collaborative Filtering

Parivash Pirasteh; Dosam Hwang; Jai E. Jung

Similarity-based algorithms, often referred to as memory-based collaborative filtering techniques, are among the most successful methods in recommendation systems. When explicit ratings are available, similarity is usually defined using similarity functions such as the Pearson correlation coefficient, cosine similarity, or mean square difference. These metrics assume that similarity is a symmetric criterion, so two users have equal impact on each other in recommending new items. In this paper, we introduce new weighting schemes that allow us to consider new features in finding similarities between users. First, these weighting schemes transform symmetric similarity into asymmetric similarity by considering the number of ratings given by users on non-common items. Second, they take into account the rating habits of users by measuring the proximity of the number of repetitions of each rating value on commonly rated items. Experiments were conducted on two datasets, and the results were compared with those of other similarity measures. The results show that adding the weighting schemes to traditional similarity measures significantly improves on the results obtained from the traditional measures alone.
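
The idea of turning a symmetric similarity into an asymmetric one can be sketched as follows. This is only an illustration and not the weighting scheme from the paper, whose formulas are not given in the abstract: the weight used here (the fraction of a user's rated items that are shared with the neighbor) and the function names `cosine_on_common` and `asymmetric_similarity` are assumptions made for the example.

```python
import math


def cosine_on_common(ratings_u, ratings_v):
    """Symmetric cosine similarity computed over co-rated items only."""
    common = ratings_u.keys() & ratings_v.keys()
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def asymmetric_similarity(ratings_u, ratings_v):
    """Similarity of u towards v.

    Hypothetical weight (an assumption, not the paper's formula):
    the share of u's rated items that v has also rated. A user with
    many non-common ratings is therefore less tied to the neighbor.
    """
    if not ratings_u:
        return 0.0
    common = ratings_u.keys() & ratings_v.keys()
    weight = len(common) / len(ratings_u)
    return weight * cosine_on_common(ratings_u, ratings_v)


# Ratings are dicts mapping item id -> rating value.
alice = {"a": 5, "b": 3}
bob = {"a": 5, "b": 3, "c": 4, "d": 1}
print(asymmetric_similarity(alice, bob))  # alice towards bob
print(asymmetric_similarity(bob, alice))  # bob towards alice
```

In this toy data all of alice's ratings overlap with bob's, while only half of bob's overlap with alice's, so the two directions receive different scores even though the underlying cosine value is the same.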

Proceedings of the fifth international workshop on Information retrieval with Asian languages | 2000

Korean text summarization using an aggregate similarity

Jae-Hoon Kim; Joon-Hong Kim; Dosam Hwang

In this paper, each document is represented by a weighted graph called a text relationship map. In the graph, each node represents a vector of nouns in a sentence, an undirected link connects two nodes if the two sentences are semantically related, and the weight on the link is the similarity between the pair of sentences. The vector similarity can be computed as the inner product between corresponding vector elements, so the similarity is based on the word overlap between the corresponding sentences. The importance of a node on the map, called an aggregate similarity, is defined as the sum of the weights on the links connecting it to other nodes on the map. In this paper, we present a Korean text summarization system using the aggregate similarity. To evaluate our system, we used two test collections: one collection (PAPER-InCon) consists of 100 papers in the domain of computer science; the other collection (NEWS) is composed of 105 newspaper articles. At a compression rate of 20%, we achieved a recall of 46.6% (PAPER-InCon) and 30.5% (NEWS), and a precision of 76.9% (PAPER-InCon) and 42.3% (NEWS). Experiments show that our system outperforms two commercial systems.
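
The aggregate similarity described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' system: sentences are assumed to be already reduced to lists of nouns (Korean noun extraction is out of scope here), raw noun counts are used as vector elements, a link is taken to exist whenever the inner product is positive, and the function names are hypothetical.

```python
from collections import Counter


def inner_product(u, v):
    """Inner product of two sparse noun-count vectors."""
    return sum(count * v[term] for term, count in u.items())


def aggregate_similarities(sentences):
    """Return one aggregate similarity score per sentence.

    Each sentence is a list of nouns. The score of a node is the sum
    of the weights on its links to all other nodes of the map.
    """
    vectors = [Counter(nouns) for nouns in sentences]
    scores = [0] * len(vectors)
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            weight = inner_product(vectors[i], vectors[j])
            if weight > 0:  # assumption: a link exists if nouns overlap
                scores[i] += weight
                scores[j] += weight
    return scores


def summarize(sentences, compression_rate=0.2):
    """Pick the highest-scoring sentences, returned in document order."""
    scores = aggregate_similarities(sentences)
    k = max(1, round(len(sentences) * compression_rate))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


doc = [
    ["system", "summary"],
    ["system", "graph"],
    ["graph", "node"],
    ["weather"],
]
print(aggregate_similarities(doc))
print(summarize(doc, compression_rate=0.25))
```

The second sentence shares a noun with both the first and the third, so it accumulates the largest aggregate similarity and is the one selected when a single sentence is kept.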

IDC | 2015

Time-Frequency Social Data Analytics for Understanding Social Big Data

Duc T. Nguyen; Dosam Hwang; Jason J. Jung

Social Network Services (SNS) have become the most popular channel through which users can efficiently generate and disseminate a large amount of information (so-called 'social big data') among other users. Discovering meaningful patterns from these SNS (e.g., clustering relevant messages, detecting events, and understanding trends of social communities) is an important but difficult research issue in social big data analytics. In this paper, we present ongoing work on transforming social data from the time domain to the frequency domain in order to detect meaningful events in social big data. This work is expected to significantly reduce the volume (and also the complexity) of the social data and to improve the performance of the data analytics.
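
As a rough illustration of moving a social data stream from the time domain to the frequency domain, the sketch below applies a plain discrete Fourier transform to a series of message counts per time bin. The abstract does not say which transform or features the authors use, so the choice of the DFT, the synthetic input, and the function names are all assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(series):
    """Magnitudes of the DFT coefficients for frequencies 0..n/2."""
    n = len(series)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(series)
        )
        magnitudes.append(abs(coeff))
    return magnitudes


def dominant_period(series):
    """Length (in time bins) of the strongest periodic component.

    The k = 0 term is skipped because it only reflects the mean level.
    """
    magnitudes = dft_magnitudes(series)
    k = max(range(1, len(magnitudes)), key=lambda i: magnitudes[i])
    return len(series) / k


# Synthetic message counts: a baseline of 5 with a cycle every 8 bins.
counts = [5 + 3 * math.cos(2 * math.pi * t / 8) for t in range(32)]
print(dominant_period(counts))
```

A series of 32 counts is summarized here by a handful of frequency magnitudes, which is one way to read the volume reduction the abstract anticipates.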

National Foundation for Science and Technology Development Conference on Information and Computer Science | 2015

Semi-supervised approach based on co-occurrence coefficient for named entity recognition on Twitter

Van Cuong Tran; Dosam Hwang; Jason J. Jung

Data in Social Network Services (SNS) are usually short, contain insufficient information, and are often affected by noise, so popular Named Entity Recognition (NER) methods applied to these data can produce wrong results even if they perform well on well-formed documents. Most NER methods are based on supervised learning techniques, which often require a large training dataset to train a good classifier. Conditional Random Fields (CRF), a statistical modeling method that predicts labels for sequences of input samples, is one example of a supervised learning method. The weak point of these methods is that they only perform well on well-formed sentences. However, proper sentences are not used frequently in SNS; for example, many tweets on Twitter are combinations of independent terms that implicitly belong to the context of a certain discussion topic. In this paper, we propose a method to extract named entities from social data using semi-supervised learning. It is an extension of the CRF method that addresses this challenge by segmenting the data according to its context rather than considering the entire dataset. In the experiments, the method is applied to a dataset collected from Twitter, which includes 8,624 tweets for training, including 1,915 labeled tweets, and 1,690 tweets for testing. Our system produces promising results, with an F-score of approximately 83.9%.

IDC | 2015

Social Tagging Analytics for Processing Unlabeled Resources: A Case Study on Non-geotagged Photos

Tuong Tri Nguyen; Dosam Hwang; Jason J. Jung

Social networking services (SNS) have been an important source of geotagged resources. This paper proposes a Naive Bayes-based framework to predict the locations of non-geotagged resources on SNS. By computing TF-ICF weights (Term Frequency and Inverse Class Frequency) of tags, we discover meaningful associations between the tags and the classes (which refer to sets of locations of the resources). In the experiments, we found that the proposed method achieves around 75% accuracy with respect to the F1 measure.
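
A TF-ICF weight can be sketched by analogy with TF-IDF, treating each location class as one "document" of tags. The abstract does not give the exact formula, so the definition below (relative tag frequency within a class multiplied by the logarithm of the number of classes over the number of classes containing the tag) and the function name `tf_icf` are assumptions, and the Naive Bayes classification step is not shown.

```python
import math
from collections import Counter


def tf_icf(class_tags):
    """Compute TF-ICF weights per class.

    class_tags maps a class label (e.g. a location) to the list of
    tags observed on the resources of that class.
    Returns {class: {tag: weight}}.
    """
    n_classes = len(class_tags)
    counts = {label: Counter(tags) for label, tags in class_tags.items()}

    # Number of classes in which each tag occurs at least once.
    class_freq = Counter()
    for counter in counts.values():
        class_freq.update(counter.keys())

    weights = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        weights[label] = {
            tag: (n / total) * math.log(n_classes / class_freq[tag])
            for tag, n in counter.items()
        }
    return weights


tags_by_location = {
    "paris": ["eiffel", "tower", "travel"],
    "seoul": ["namsan", "tower", "travel"],
}
for location, tag_weights in tf_icf(tags_by_location).items():
    print(location, tag_weights)
```

Tags that occur in every class, such as "tower" and "travel" in this toy data, receive a weight of zero, while tags specific to one location keep a positive weight and are the ones that carry a meaningful association with that class.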

International Conference on Computational Collective Intelligence | 2016

An Improvement of the Two-Stage Consensus-Based Approach for Determining the Knowledge of a Collective

Van Du Nguyen; Ngoc Thanh Nguyen; Dosam Hwang

Generally, the knowledge of a collective, which is considered a representative of the knowledge states in the collective, is determined using a single-stage approach. For big data, however, a collective is often very large, so a multi-stage approach can be used. In this paper, we present an improvement of the two-stage consensus-based approach for determining the knowledge of a large collective. For this purpose, clustering methods are used to divide a large collective into smaller ones. The first stage of consensus choice aims at determining the representatives of these smaller collectives. These representatives are then treated as the knowledge states of a new collective, which is the subject of the second stage of consensus choice. In addition, all the collectives are checked for susceptibility to consensus in both stages of the consensus choice process. Experimental analysis shows that the improved method is useful in minimizing the difference between the single-stage and two-stage consensus choice approaches in determining the knowledge of a large collective.
