Hideyuki Maeda | Researchain

Publication

Featured research published by Hideyuki Maeda.

pacific-asia conference on knowledge discovery and data mining | 2017

Scalable Twitter User Clustering Approach Boosted by Personalized PageRank

Anup Naik; Hideyuki Maeda; Vibhor Kanojia; Sumio Fujita

Twitter has been the focus of analysis in recent years because it poses various interesting and challenging problems, one of them being the clustering of its users based on their interests. For graphs, there are many clustering approaches that look at either the structure or the contents. However, when we consider real-world data such as Twitter data, structural approaches may produce many different user clusters with similar interests. Similarly, content-based clustering approaches on Twitter data produce inferior results because of the limited length of tweets and the large amount of garbled data. Hence, these approaches cannot be used directly for practical applications. In this paper, we cluster Twitter users based on their interests, looking at both the structure of the graph generated from Twitter data and its contents. By combining these approaches, we improve on the results of existing techniques and obtain results suited to practical applications.

Journal of data science | 2017

Scalable Twitter user clustering approach boosted by Personalized PageRank

Anup Naik; Hideyuki Maeda; Vibhor Kanojia; Sumio Fujita

Twitter has been the focus of analysis in regard to various interesting and challenging problems, one of them being clustering of its users based on their interests. There are many clustering approaches for graphs that look at either the structure or the contents of the graph. However, when we consider real-world complex data such as Twitter data, structural approaches may produce many different user clusters with similar interests. Moreover, content-based clustering approaches on Twitter data also produce inferior results because tweets have a limited number of characters and lots of garbled data. Hence, for practical applications, these clustering approaches cannot be directly used on Twitter data. In the study reported in this paper, we clustered Twitter users on the basis of their interests, looking at both the structure of the graph generated from Twitter data and the contents of the Tweets. In short, we clustered Twitter users by using an unsupervised structural approach, merging similar clusters using a content-based approach, expanding the graph and ranking users with Personalized PageRank, and determining the topic to which a cluster belongs in accordance with the hashtag frequency. The results of combining these approaches were better than those of the existing techniques and befit practical applications.
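The ranking step named in the abstract above can be illustrated with a minimal Personalized PageRank sketch. This is not the authors' implementation: the toy follow graph, the function name `personalized_pagerank`, the damping factor 0.85, and the choice of seed user are all assumptions made for illustration. The only difference from ordinary PageRank is that the restart probability is concentrated on the seed users instead of being spread over all nodes.

```python
def personalized_pagerank(graph, seeds, alpha=0.85, iterations=50):
    """Power iteration for Personalized PageRank.

    graph: dict mapping each node to a list of nodes it links to
           (every linked node must also be a key of the dict).
    seeds: set of nodes that receive all of the restart probability.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # With probability (1 - alpha) the walk jumps back to a seed.
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = alpha * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += alpha * rank[n] / len(seeds)
        rank = nxt
    return rank


# Hypothetical follow graph: an edge u -> v means "u follows v".
follows = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": ["a"]}
ranks = personalized_pagerank(follows, seeds={"a"})
```

Users with a high score are those most reachable from the seed, which is one plausible way to rank candidate members when a cluster is expanded; user `d`, whom nobody follows, receives a score of zero here.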

international world wide web conferences | 2017

Euclidean Image Embedding in view of Similarity Ranking in Auction Search by Image

Riku Togashi; Hideyuki Maeda; Vibhor Kanojia; Kousuke Morimoto; Sumio Fujita

We propose a Euclidean embedding image representation that ranks auction item images across a wide range of semantic similarity, in order of their relevance to a given query image, and that is much more effective than the baseline method in terms of a graded relevance measure. Our method uses three-stream deep convolutional Siamese networks to learn a distance metric, and we leverage the search query logs of the item search of the largest auction service in Japan. Unlike previous approaches, we define inter-image relevance on the basis of the user queries in the logs that were used to search for each auction item, which lets us learn an image representation that preserves the features relevant to user intent in real-world e-commerce.
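To make the idea of learning a distance metric with three streams concrete, here is a small sketch of a triplet hinge loss and of ranking by Euclidean distance. The abstract does not state the loss function, so the hinge form, the margin of 1.0, and the two-dimensional toy embeddings are assumptions; in the actual system the embeddings would come from convolutional networks rather than hand-written vectors.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the positive is closer to the anchor
    than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def rank_by_distance(query, items):
    """Return item names ordered from nearest to farthest from the query."""
    return sorted(items, key=lambda name: euclidean(query, items[name]))


# Hypothetical 2-D embeddings of auction item images.
items = {"sneaker": [0.0, 1.0], "boot": [1.0, 2.0], "teapot": [3.0, 4.0]}
ranking = rank_by_distance([0.0, 0.0], items)
```

Because relevance is read off as a distance, a single embedding space can express graded similarity: the farther an item lies from the query, the lower it appears in the ranking.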

international world wide web conferences | 2017

Enhancing Knowledge Graph Embedding with Probabilistic Negative Sampling

Vibhor Kanojia; Hideyuki Maeda; Riku Togashi; Sumio Fujita

Knowledge graph embedding for link prediction projects symbolic entities and relations into a low-dimensional vector space, thereby learning the semantic relations between entities. Among the various embedding models, there is a series of translation-based models such as TransE [1], TransH [2], and TransR [3]. This paper proposes modifications to the TransR model to address the issue of skewed data, which is common in real-world knowledge graphs. The enhancements enable the model to generate corrupted triplets more selectively during negative sampling, which significantly improves the training time and performance of TransR. The proposed approach can also be applied to other translation-based models.
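The abstract does not spell out how the corrupted triplets are chosen, so the sketch below shows a different, well-known skew-aware scheme instead: the Bernoulli sampling trick introduced with TransH, which corrupts the head with probability tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail for a relation. It should be read as an illustration of probabilistic negative sampling in general, not as the method proposed in the paper. The toy triples and all function names are hypothetical.

```python
import random
from collections import defaultdict


def bernoulli_head_prob(triples):
    """For each relation, probability of corrupting the head: tph / (tph + hpt)."""
    tails_of = defaultdict(set)  # (relation, head) -> set of tails
    heads_of = defaultdict(set)  # (relation, tail) -> set of heads
    for h, r, t in triples:
        tails_of[(r, h)].add(t)
        heads_of[(r, t)].add(h)
    probs = {}
    for r in {rel for _, rel, _ in triples}:
        tph_counts = [len(ts) for (rel, _), ts in tails_of.items() if rel == r]
        hpt_counts = [len(hs) for (rel, _), hs in heads_of.items() if rel == r]
        tph = sum(tph_counts) / len(tph_counts)
        hpt = sum(hpt_counts) / len(hpt_counts)
        probs[r] = tph / (tph + hpt)
    return probs


def corrupt(triple, entities, known, probs, rng, max_tries=100):
    """Replace the head or the tail at random, skipping known true triples."""
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < probs[r] else (h, r, e)
        if candidate not in known:
            return candidate
    raise RuntimeError("no negative triple found")


# Hypothetical one-to-many relation: one head, three tails.
triples = [("japan", "has_city", "tokyo"),
           ("japan", "has_city", "osaka"),
           ("japan", "has_city", "kyoto")]
entities = ["japan", "tokyo", "osaka", "kyoto", "france"]
probs = bernoulli_head_prob(triples)
negative = corrupt(triples[0], entities, set(triples), probs, random.Random(0))
```

For a one-to-many relation such as this one, the head is corrupted more often (here with probability 0.75), which lowers the chance of accidentally generating a triple that is actually true.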

international acm sigir conference on research and development in information retrieval | 2017

Find Shoes Like These

Hideyuki Maeda

We present a Euclidean embedding image representation that ranks auction item images across a wide range of semantic similarity, in order of their relevance to a given query image, and that is much more effective than the baseline method in terms of a graded relevance measure. Our method uses three-stream deep convolutional Siamese networks to learn a distance metric, and we leverage the search query logs of the item search of the largest auction service in Japan. Unlike previous approaches, we define inter-image relevance on the basis of the user queries in the logs that were used to search for each auction item, which lets us learn an image representation that preserves the features relevant to user intent in real-world e-commerce.

international acm sigir conference on research and development in information retrieval | 2017

LSTM vs. BM25 for Open-domain QA: A Hands-on Comparison of Effectiveness and Efficiency

Sosuke Kato; Riku Togashi; Hideyuki Maeda; Sumio Fujita; Tetsuya Sakai

Recent advances in neural networks, along with the growth of rich and diverse community question answering (cQA) data, have enabled researchers to construct robust open-domain question answering (QA) systems. It is often claimed that such state-of-the-art QA systems far outperform traditional IR baselines such as BM25. However, most such studies rely on relatively small data sets, e.g., those extracted from the old TREC QA tracks. Given massive training data plus a separate corpus of Q&A pairs as the target knowledge source, how well would such a system really perform? How fast would it respond? In this demonstration, we provide the attendees of SIGIR 2017 an opportunity to experience a live comparison of two open-domain QA systems, one based on a long short-term memory (LSTM) architecture with over 11 million Yahoo! Chiebukuro (i.e., Japanese Yahoo! Answers) questions and over 27.4 million answers for training, and the other based on BM25. Both systems use the same Q&A knowledge source for answer retrieval. Our core demonstration system is a pair of Japanese monolingual QA systems, but we leverage machine translation for letting the SIGIR attendees enter English questions and compare the Japanese responses from the two systems after translating them into English.
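For readers unfamiliar with the baseline side of this comparison, a minimal Okapi BM25 scorer can be sketched as follows. The parameter values k1 = 1.2 and b = 0.75 are common defaults rather than the settings used in the demonstration, the IDF variant with the added 1 inside the logarithm is one of several in use, and the pre-tokenized English documents are hypothetical; the demonstrated systems operate on Japanese Q&A data, which would require a tokenizer.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical tokenized question archive.
docs = [["how", "to", "reset", "password"],
        ["reset", "the", "router"],
        ["best", "ramen", "in", "tokyo"]]
scores = bm25_scores(["reset", "password"], docs)
```

BM25 needs no training and scores a document with a handful of arithmetic operations per query term, which is why response time is a natural axis of comparison against an LSTM-based system.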

international conference on the theory of information retrieval | 2018