Featured Researches

Information Retrieval

A Heterogeneous Information Network based Cross Domain Insurance Recommendation System for Cold Start Users

Internet is changing the world, adapting to the trend of internet sales will bring revenue to traditional insurance companies. Online insurance is still in its early stages of development, where cold start problem (prospective customer) is one of the greatest challenges. In traditional e-commerce field, several cross-domain recommendation (CDR) methods have been studied to infer preferences of cold start users based on their preferences in other domains. However, these CDR methods could not be applied to insurance domain directly due to the domain specific properties. In this paper, we propose a novel framework called a Heterogeneous information network based Cross Domain Insurance Recommendation (HCDIR) system for cold start users. Specifically, we first try to learn more effective user and item latent features in both source and target domains. In source domain, we employ gated recurrent unit (GRU) to module user dynamic interests. In target domain, given the complexity of insurance products and the data sparsity problem, we construct an insurance heterogeneous information network (IHIN) based on data from PingAn Jinguanjia, the IHIN connects users, agents, insurance products and insurance product properties together, giving us richer information. Then we employ three-level (relational, node, and semantic) attention aggregations to get user and insurance product representations. After obtaining latent features of overlapping users, a feature mapping between the two domains is learned by multi-layer perceptron (MLP). We apply HCDIR on Jinguanjia dataset, and show HCDIR significantly outperforms the state-of-the-art solutions.

Read more
Information Retrieval

A Hybrid Adaptive Educational eLearning Project based on Ontologies Matching and Recommendation System

The implementation of teaching interventions in learning needs has received considerable attention, as the provision of the same educational conditions to all students, is pedagogically ineffective. In contrast, more effectively considered the pedagogical strategies that adapt to the real individual skills of the students. An important innovation in this direction is the Adaptive Educational Systems (AES) that support automatic modeling study and adjust the teaching content on educational needs and students' skills. Effective utilization of these educational approaches can be enhanced with Artificial Intelligence (AI) technologies in order to the substantive content of the web acquires structure and the published information is perceived by the search engines. This study proposes a novel Adaptive Educational eLearning System (AEeLS) that has the capacity to gather and analyze data from learning repositories and to adapt these to the educational curriculum according to the student skills and experience. It is a novel hybrid machine learning system that combines a Semi-Supervised Classification method for ontology matching and a Recommendation Mechanism that uses a hybrid method from neighborhood-based collaborative and content-based filtering techniques, in order to provide a personalized educational environment for each student.

Read more
Information Retrieval

A Hybrid BERT and LightGBM based Model for Predicting Emotion GIF Categories on Twitter

The animated Graphical Interchange Format (GIF) images have been widely used on social media as an intuitive way of expression emotion. Given their expressiveness, GIFs offer a more nuanced and precise way to convey emotions. In this paper, we present our solution for the EmotionGIF 2020 challenge, the shared task of SocialNLP 2020. To recommend GIF categories for unlabeled tweets, we regarded this problem as a kind of matching tasks and proposed a learning to rank framework based on Bidirectional Encoder Representations from Transformer (BERT) and LightGBM. Our team won the 4th place with a Mean Average Precision @ 6 (MAP@6) score of 0.5394 on the round 1 leaderboard.

Read more
Information Retrieval

A Knowledge Graph based Approach for Mobile Application Recommendation

With the rapid prevalence of mobile devices and the dramatic proliferation of mobile applications (apps), app recommendation becomes an emergent task that would benefit both app users and stockholders. How to effectively organize and make full use of rich side information of users and apps is a key challenge to address the sparsity issue for traditional approaches. To meet this challenge, we proposed a novel end-to-end Knowledge Graph Convolutional Embedding Propagation Model (KGEP) for app recommendation. Specifically, we first designed a knowledge graph construction method to model the user and app side information, then adopted KG embedding techniques to capture the factual triplet-focused semantics of the side information related to the first-order structure of the KG, and finally proposed a relation-weighted convolutional embedding propagation model to capture the recommendation-focused semantics related to high-order structure of the KG. Extensive experiments conducted on a real-world dataset validate the effectiveness of the proposed approach compared to the state-of-the-art recommendation approaches.

Read more
Information Retrieval

A Knowledge-Enhanced Recommendation Model with Attribute-Level Co-Attention

Deep neural networks (DNNs) have been widely employed in recommender systems including incorporating attention mechanism for performance improvement. However, most of existing attention-based models only apply item-level attention on user side, restricting the further enhancement of recommendation performance. In this paper, we propose a knowledge-enhanced recommendation model ACAM, which incorporates item attributes distilled from knowledge graphs (KGs) as side information, and is built with a co-attention mechanism on attribute-level to achieve performance gains. Specifically, each user and item in ACAM are represented by a set of attribute embeddings at first. Then, user representations and item representations are augmented simultaneously through capturing the correlations between different attributes by a co-attention module. Our extensive experiments over two realistic datasets show that the user representations and item representations augmented by attribute-level co-attention gain ACAM's superiority over the state-of-the-art deep models.

Read more
Information Retrieval

A Modern Non-SQL Approach to Radiology-Centric Search Engine Design with Clinical Validation

Healthcare data is increasing in size at an unprecedented speed with much attention on big data analysis and Artificial Intelligence application for quality assurance, clinical training, severity triaging, and decision support. Radiology is well-suited for innovation given its intrinsically paired linguistic and visual data. Previous attempts to unlock this information goldmine were encumbered by heterogeneity of human language, proprietary search algorithms, and lack of medicine-specific search performance matrices. We present a de novo process of developing a document-based, secure, efficient, and accurate search engine in the context of Radiology. We assess our implementation of the search engine with comparison to pre-existing manually collected clinical databases used previously for clinical research projects in addition to computational performance benchmarks and survey feedback. By leveraging efficient database architecture, search capability, and clinical thinking, radiologists are at the forefront of harnessing the power of healthcare data.

Read more
Information Retrieval

A Multilayer Correlated Topic Model

We proposed a novel multilayer correlated topic model (MCTM) to analyze how the main ideas inherit and vary between a document and its different segments, which helps understand an article's structure. The variational expectation-maximization (EM) algorithm was derived to estimate the posterior and parameters in MCTM. We introduced two potential applications of MCTM, including the paragraph-level document analysis and market basket data analysis. The effectiveness of MCTM in understanding the document structure has been verified by the great predictive performance on held-out documents and intuitive visualization. We also showed that MCTM could successfully capture customers' popular shopping patterns in the market basket analysis.

Read more
Information Retrieval

A New Citation Recommendation Strategy Based on Term Functions in Related Studies Section

Purpose: Researchers frequently encounter the following problems when writing scientific articles: (1) Selecting appropriate citations to support the research idea is challenging. (2) The literature review is not conducted extensively, which leads to working on a research problem that others have well addressed. This study focuses on citation recommendation in the related studies section by applying the term function of a citation context, potentially improving the efficiency of writing a literature review. Design/methodology/approach: We present nine term functions with three newly created and six identified from existing literature. Using these term functions as labels, we annotate 531 research papers in three topics to evaluate our proposed recommendation strategy. BM25 and Word2vec with VSM are implemented as the baseline models for the recommendation. Then the term function information is applied to enhance the performance. Findings: The experiments show that the term function-based methods outperform the baseline methods regarding the recall, precision, and F1-score measurement, demonstrating that term functions are useful in identifying valuable citations. Research limitations: The dataset is insufficient due to the complexity of annotating citation functions for paragraphs in the related studies section. More recent deep learning models should be performed to future validate the proposed approach. Practical implications: The citation recommendation strategy can be helpful for valuable citation discovery, semantic scientific retrieval, and automatic literature review generation. Originality/value: The proposed citation function-based citation recommendation can generate intuitive explanations of the results for users, improving the transparency, persuasiveness, and effectiveness of recommender systems.

Read more
Information Retrieval

A Practical Incremental Method to Train Deep CTR Models

Deep learning models in recommender systems are usually trained in the batch mode, namely iteratively trained on a fixed-size window of training data. Such batch mode training of deep learning models suffers from low training efficiency, which may lead to performance degradation when the model is not produced on time. To tackle this issue, incremental learning is proposed and has received much attention recently. Incremental learning has great potential in recommender systems, as two consecutive window of training data overlap most of the volume. It aims to update the model incrementally with only the newly incoming samples from the timestamp when the model is updated last time, which is much more efficient than the batch mode training. However, most of the incremental learning methods focus on the research area of image recognition where new tasks or classes are learned over time. In this work, we introduce a practical incremental method to train deep CTR models, which consists of three decoupled modules (namely, data, feature and model module). Our method can achieve comparable performance to the conventional batch mode training with much better training efficiency. We conduct extensive experiments on a public benchmark and a private dataset to demonstrate the effectiveness of our proposed method.

Read more
Information Retrieval

A Qualitative Evaluation of Language Models on Automatic Question-Answering for COVID-19

COVID-19 has resulted in an ongoing pandemic and as of 12 June 2020, has caused more than 7.4 million cases and over 418,000 deaths. The highly dynamic and rapidly evolving situation with COVID-19 has made it difficult to access accurate, on-demand information regarding the disease. Online communities, forums, and social media provide potential venues to search for relevant questions and answers, or post questions and seek answers from other members. However, due to the nature of such sites, there are always a limited number of relevant questions and responses to search from, and posted questions are rarely answered immediately. With the advancements in the field of natural language processing, particularly in the domain of language models, it has become possible to design chatbots that can automatically answer consumer questions. However, such models are rarely applied and evaluated in the healthcare domain, to meet the information needs with accurate and up-to-date healthcare data. In this paper, we propose to apply a language model for automatically answering questions related to COVID-19 and qualitatively evaluate the generated responses. We utilized the GPT-2 language model and applied transfer learning to retrain it on the COVID-19 Open Research Dataset (CORD-19) corpus. In order to improve the quality of the generated responses, we applied 4 different approaches, namely tf-idf, BERT, BioBERT, and USE to filter and retain relevant sentences in the responses. In the performance evaluation step, we asked two medical experts to rate the responses. We found that BERT and BioBERT, on average, outperform both tf-idf and USE in relevance-based sentence filtering tasks. Additionally, based on the chatbot, we created a user-friendly interactive web application to be hosted online.

Read more

Ready to get started?

Join us today