Genetic Meta-Structure Search for Recommendation on Heterogeneous Information Network
Zhenyu Han, Fengli Xu, Jinghan Shi, Yu Shang, Haorui Ma, Pan Hui, Yong Li
Zhenyu Han∗, Fengli Xu∗, Jinghan Shi†, Yu Shang∗, Haorui Ma∗, Pan Hui‡, Yong Li∗
∗Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University, Beijing, China, 100084
†Beijing University of Posts and Telecommunications (BUPT)
‡University of Helsinki; The Hong Kong University of Science and Technology
[email protected], [email protected], [email protected], {shang-y17,mhr17}@mails.tsinghua.edu.cn, [email protected], [email protected]
ABSTRACT
In the past decade, the heterogeneous information network (HIN) has become an important methodology for modern recommender systems. To fully leverage its power, manually designed network templates, i.e., meta-structures, are introduced to filter out semantic-aware information. Hand-crafted meta-structures rely on intense expert knowledge, which is both laborious and data-dependent. On the other hand, the number of meta-structures grows exponentially with their size and the number of node types, which prohibits brute-force search. To address these challenges, we propose Genetic Meta-Structure Search (GEMS) to automatically optimize meta-structure designs for recommendation on HINs. Specifically, GEMS adopts a parallel genetic algorithm to search meaningful meta-structures for recommendation, and designs dedicated rules and a meta-structure predictor to efficiently explore the search space. Finally, we propose an attention based multi-view graph convolutional network module to dynamically fuse information from different meta-structures. Extensive experiments on three real-world datasets demonstrate the effectiveness of GEMS, which consistently outperforms all baseline methods in HIN recommendation. Compared with a simplified GEMS that utilizes hand-crafted meta-paths, GEMS achieves over 6% performance gain on most evaluation metrics. More importantly, we conduct an in-depth analysis of the identified meta-structures, which sheds light on HIN based recommender system design.
CCS CONCEPTS
• Computing methodologies → Machine learning algorithms; Parallel algorithms; Discrete space search; • Information systems → Retrieval models and ranking.

KEYWORDS
Recommender System; Heterogeneous Information Network; Automated Machine Learning; Graph Convolutional Network
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CIKM ’20, October 19–23, 2020, Virtual Event, Ireland
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-6859-9/20/10 … $15.00
https://doi.org/10.1145/3340531.3412015
ACM Reference Format:
Zhenyu Han, Fengli Xu, Jinghan Shi, Yu Shang, Haorui Ma, Pan Hui, Yong Li. 2020. Genetic Meta-Structure Search for Recommendation on Heterogeneous Information Network. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM ’20), October 19–23, 2020, Virtual Event, Ireland. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3340531.3412015
Heterogeneous information networks (HINs) [21] have shown great potential for recommendation tasks due to the rich information provided by heterogeneous relations. In a recommendation scenario, users can rate items, write reviews, and add new friends. Items also have different properties, such as brands, categories, and origins. A HIN is able to represent all the relations above in a single graph, which provides much more information for recommendation tasks than traditional user-item interactions.

To effectively extract semantic information, the meta-path [11] was proposed. A meta-path defines a message-passing prototype that selects information according to specific semantics. Recently, the meta-structure [4, 21] was introduced as a generalization of the meta-path, where the chain structure is replaced by a graph structure; meta-paths can thus be regarded as a special case of meta-structures. In the rest of this paper, we use the term meta-structure to cover both meta-paths and meta-structures.

Although meta-structures are effective for mining the semantics in HINs, they are hard to design. First, we have little prior knowledge on meta-structure design. Moreover, designed meta-structures are data-dependent and cannot be transferred to other HINs. Consequently, a lot of human labor is needed to apply meta-structure based recommendation models to HINs. In this work, we leverage the power of automated machine learning to automatically search meaningful meta-structures for recommendation. However, two challenges lie in building an automated HIN recommendation model. First, the number of meta-structures increases exponentially with the number of nodes, which makes it impossible to brute-force search all of them on recommendation tasks. Second, the recommendation model needs to leverage multiple meta-structures for various semantics, which requires an effective method to combine the information learned by the searched meta-structures.
Therefore, the varying combinations of meta-structures bring further difficulties to both the problem search space and the model design. To overcome these challenges, we propose Genetic Meta-Structure Search, i.e., GEMS. Specifically, we design a genetic-algorithm-based model to automatically search meaningful meta-structures for recommendation tasks.

Figure 1: An example of HIN on the Yelp platform.

The genetic algorithm generates new meta-structures for recommendation, but not all of them are informative. To narrow down the search space, we set up several rules to ensure that most of the generated meta-structures are meaningful for recommendation. Besides, we design a meta-structure predictor to pre-evaluate meta-structure performance before evaluation on real recommendation tasks. By filtering out poorly performing meta-structures in advance, we can save computational resources for promising ones. To combine the different semantics of meta-structures, we design an attention-based multi-view graph convolutional network (GCN) module that dynamically fuses the learned embeddings of each meta-structure. Overall, GEMS achieves a performance gain of over 6% compared with a simplified version that leverages hand-crafted meta-paths for recommendation tasks.

The contributions of this work can be summarized as follows:
• To the best of our knowledge, this paper is the first work to explicitly search meta-structures for heterogeneous information networks. We transfer the genetic algorithm from traditional numerical optimization to the meta-structure search problem. Our work saves human labor on complex meta-structure designs, resulting in a universal model for recommendation tasks in heterogeneous information networks.
• We narrow down the problem search space by defining a reasonable meta-structure encoding method and mutation rules, along with a meta-structure predictor that filters promising meta-structures before training on recommendation datasets.
• We propose an attention-based multi-view GCN module to dynamically fuse information guided by different meta-structures for better recommendation.
• We conduct extensive experiments to demonstrate the effectiveness of our proposed models while providing explainable results that may shed light on hand-crafted meta-structure design.
In real-world recommender systems, we often face complex relationships between different entities, as Figure 1 shows. To properly model these relations, the heterogeneous information network is introduced.
Definition 2.1. Heterogeneous Information Network [11]. Given a graph G = (V, E), where V is the node set and E is the edge set, with a node type mapping function φ(v): V → T, ∀v ∈ V, and a relation mapping function ψ(e): E → R, ∀e ∈ E, where T is the node type set and R is the edge type set. If the number of node types |T| > 1 or the number of edge types |R| > 1, the directed graph G = (V, E, T, R) is a heterogeneous information network.

Heterogeneous nodes and edges make it more difficult to filter out useful information in HINs. As a result, researchers propose to mine HINs with hand-crafted semantic-aware network templates such as meta-structures, defined as follows.

Definition 2.2. Meta-Structure [4]. Given a HIN G = (V, E, T, R), a meta-structure M = (V*, E*) is a sub-graph whose nodes v* ∈ T and whose edges e* ∈ R. The meta-structure M links a single source node v*_s and a single sink node v*_t.

As mentioned above, the design of meta-structures is complicated. Based on expert knowledge of recommender systems, we can design some collaborative filtering meta-structures such as user-item-user-item, which depicts the phenomenon that people who have purchased the same items may have similar preferences. However, if we want to model more complex relationships, which are usually data-dependent, we have to put much more effort into meta-structure design, since previous experience is very limited.

To save human labor on tedious meta-structure design, we adopt the automated machine learning paradigm to explicitly search meaningful meta-structures for recommendation. Specifically, we modify the genetic algorithm to explore the designed search space and to choose promising meta-structures for optimization. The genetic algorithm [2] is a search heuristic inspired by the theory of natural evolution. It maintains a population of models, each containing multiple genes that encode instances of the search space; each model is called an individual. During each generation, each individual's genes may mutate and cross over to explore the search space; an environmental evaluator then scores each individual and eliminates those that perform badly. Finally, the remaining individuals reproduce according to their evaluated scores to increase the percentage of promising genes. With these preliminaries, the recommendation problem in HINs can be defined as follows.

Problem 1. Recommendation in HINs.
Given a HIN G with a user purchase record dataset D = {⟨v_u, v_i⟩}, where v_u and v_i stand for a user id and an item id, respectively. For each user, we leverage meta-structures M to filter new interactions as side information and generate a ranked list of items that are of interest to the user.

Effective meta-structure design takes a lot of human effort, which inspires us to leverage the automated machine learning paradigm to search promising meta-structures for recommendation. However, the search space of meta-structures grows exponentially with their size, while the varying combination of node types T in HINs further enlarges it. Another problem emerges during the utilization of meta-structures. Each meta-structure corresponds to a semantic-aware feature extracted from HINs. To fully harness the power of HINs, recommendation models usually adopt multiple meta-structures.

Figure 2: Flow chart of GEMS.

How to determine the importance of these meta-structures and combine them in a single model is another important issue in HIN recommendation. In the GEMS model, we propose a novel genetic framework to automatically design effective meta-structures. Besides, we carefully design a set of mutation rules and evaluation modules to further narrow down the search space. As for the second challenge of leveraging information from multiple meta-structures, we introduce an attention based multi-view GCN module for recommendation.
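As a concrete picture of this generate-evaluate-eliminate-reproduce cycle, the sketch below runs a generic genetic loop on toy bit-string genes. All names (`evolve`, `mutate`, `crossover`) are illustrative, and the fitness is a toy one-max function rather than a recommendation metric; this is not the GEMS code.

```python
import random

def evolve(population, fitness, n_generations=10, mutate_p=0.3, seed=0):
    """A minimal genetic loop: mutate, cross over, evaluate, eliminate
    the worse half, and reproduce survivors to restore population size."""
    rng = random.Random(seed)

    def mutate(ind):
        # with probability mutate_p, flip one random bit of the gene
        ind = list(ind)
        if rng.random() < mutate_p:
            i = rng.randrange(len(ind))
            ind[i] ^= 1
        return ind

    def crossover(a, b):
        # single-point crossover exchanging gene segments between parents
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:]

    for _ in range(n_generations):
        population = [mutate(ind) for ind in population]
        population += [crossover(rng.choice(population), rng.choice(population))
                       for _ in range(len(population))]
        # evaluate, then keep the best half (elimination)
        population.sort(key=fitness, reverse=True)
        survivors = population[: len(population) // 2]
        # reproduction: duplicate survivors back to the original size
        population = [list(ind) for ind in survivors]
    return population

# toy fitness: number of 1-bits in the gene
pop = [[0] * 8 for _ in range(6)]
best = max(evolve(pop, fitness=sum), key=sum)
```

In GEMS the fitness of an individual is its (predicted or real) recommendation performance, and the genes are encoded meta-structures rather than bit strings.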
The genetic algorithm maintains a population as a collection of individuals. Each individual possesses several genes as its unique features. In our HIN recommendation scenario, an individual is a recommendation model with selected meta-structures. To leverage the power of genetic algorithms, we need to encode these meta-structures into genes as the problem search space. Such an encoding method should satisfy the property that every meta-structure can be represented by an encoded gene. We therefore encode each meta-structure by an ordered node type list T_M and an adjacency matrix A_M according to the linkages in the meta-structure.

We first connect meta-structure search with the genetic algorithm by encoding meta-structures into gene representations. As shown in Figure 2, the genes of individuals mutate and then cross over with each other to generate new genes or new combinations of genes; these two operations explore the problem search space. Each individual is then pre-evaluated by the meta-structure predictor of GEMS, which aims to judge the performance of individuals from their gene combinations. Promising individuals with high pre-evaluated scores are evaluated on recommendation tasks by the attention based multi-view GCN module; for individuals with low pre-evaluated scores, we directly use the predicted score as their performance. To select promising meta-structures, the genetic algorithm eliminates individuals by their performance: individuals with high performance are preserved while poorly performing ones are deleted, so promising individuals and the meta-structures therein survive. Further, the remaining individuals reproduce according to their performance to construct a new generation. The reproduction process generates duplicates of promising individuals to maintain the population size.

To understand how these genetic operations work in the HIN recommendation scenario, we divide them into two categories. At the gene level, mutation changes the structure of genes, which results in new meta-structures. According to the gene representation, mutation can be divided into two types: add/delete edges and add/delete nodes. Edge mutation only affects the adjacency matrix, randomly flipping its elements. Node mutation adds/deletes nodes in the gene, which requires adding/deleting elements in both the type list and the adjacency matrix. For example, to delete a specific node, it is removed from the type list T_M, and the corresponding row and column are removed from A_M. To add a new node, we first randomly choose a node type and append it to the end of T_M; then a new row and a new column are appended to A_M, with their elements set randomly. Meanwhile, the crossover operation exchanges genes among individuals to strengthen gene circulation; genes themselves are not modified by crossover. In conclusion, mutation and crossover enable GEMS to explore the whole search space of meta-structures effectively.

At the individual level, the elimination and reproduction operations guide the optimization process toward promising meta-structures. The genetic algorithm obsoletes individuals that perform badly on recommendation tasks, and the corresponding genes are deleted. The remaining individuals generate copies of their genes to form new individuals according to their performance, which makes promising genes prosper in the whole population. In this way, GEMS evolves to leverage better meta-structures for recommendation.

As mentioned before, the search space of meta-structures is huge. In particular, three factors enlarge it. First, the node type list T_M has different combinations due to the heterogeneous nodes in the HIN. Given a HIN G = (V, E, T, R), the number of node types is defined as m := |T|.
For a meta-structure containing n nodes, there exist m^n type lists. Second, the connections between meta-structure nodes are another important factor that enlarges the search space: even if the node type list T_M is fixed, there are still 2^(n×n) adjacency matrices that result in different genes. Third, since the recommendation model leverages multiple meta-structures at the same time, different combinations of meta-structures further extend the search space. If the recommendation model adopts k meta-structures, in the worst case the problem search space has size C(m^n × 2^(n×n), k), i.e., choosing k genes from all candidates.

The huge search space makes it infeasible to exhaustively find optimal meta-structures for recommendation. According to the above analysis, we propose three constraints to avoid meaningless meta-structures during the search process. First, based on domain knowledge of recommender systems, we impose several constraints on meta-structure genes to eliminate infeasible choices. Second, we set up three mutation rules to make sure that new genes produced by the mutation process introduce meaningful information for recommendation. Third, we design a meta-structure predictor to reduce the number of individuals that must be evaluated on recommendation tasks.
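To get a feel for these magnitudes, the worst-case count can be evaluated directly. The sketch below assumes the combinatorial reading "choose k genes out of all m^n × 2^(n×n) candidate genes"; function names are illustrative.

```python
from math import comb

def num_candidate_genes(m, n):
    """m^n ordered type lists times 2^(n*n) possible adjacency matrices."""
    return (m ** n) * (2 ** (n * n))

def search_space_size(m, n, k):
    """Worst-case number of k-gene individuals, read as a binomial:
    choose k meta-structures out of every candidate gene."""
    return comb(num_candidate_genes(m, n), k)

# even a tiny setting explodes: m=3 node types, n=4 nodes, k=2 genes
candidates = num_candidate_genes(3, 4)   # 3^4 * 2^16 candidate genes
pairs = search_space_size(3, 4, 2)
```

Already for this toy setting there are over five million candidate genes, which motivates the pruning rules and the predictor described next.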
Figure 3: An illustrating example of meta-structure encoding in GEMS.
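The gene encoding of Figure 3 and the node/edge mutations described above can be sketched as follows. The `MetaGene` class and its method names are hypothetical illustrations, not identifiers from the released code.

```python
import numpy as np

class MetaGene:
    """A gene: an ordered node-type list T_M plus an adjacency matrix A_M.
    Types ('U', 'B', ...) follow the Yelp example in the text."""

    def __init__(self, types, adj):
        self.types = list(types)      # ordered node type list T_M
        self.adj = np.array(adj)      # adjacency matrix A_M

    def delete_node(self, idx):
        # node mutation: drop the type and its row/column of A_M
        self.types.pop(idx)
        self.adj = np.delete(np.delete(self.adj, idx, axis=0), idx, axis=1)

    def add_node(self, node_type, rng):
        # node mutation: append the type, then a random new row and column
        self.types.append(node_type)
        n = len(self.adj)
        self.adj = np.vstack([self.adj, rng.integers(0, 2, size=(1, n))])
        self.adj = np.hstack([self.adj, rng.integers(0, 2, size=(n + 1, 1))])

    def flip_edge(self, i, j):
        # edge mutation: flip one element of the adjacency matrix
        self.adj[i, j] = 1 - self.adj[i, j]

# Example: the meta-path U-B-U-B encoded as a gene, then mutated
gene = MetaGene(['U', 'B', 'U', 'B'],
                [[0, 1, 0, 0],
                 [0, 0, 1, 0],
                 [0, 0, 0, 1],
                 [0, 0, 0, 0]])
gene.delete_node(2)                              # drop the second 'U'
gene.flip_edge(0, 1)                             # toggle the U-B edge
gene.add_node('I', np.random.default_rng(0))     # append a city node
```

After these three mutations the gene holds types ['U', 'B', 'B', 'I'] and a 4×4 adjacency matrix, mirroring the row/column bookkeeping described in the text.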
Based on expert knowledge of the recommendation task, we can narrow down the encoding space of meta-structures with the following constraints. We make two observations about HIN based recommender systems.
• Relations in a HIN are constrained. Take Yelp as an example: category (A) nodes can only link with business (B) nodes, while linkages with user (U) nodes are not allowed.
• Models leveraging HINs for recommendation usually concern the source and sink nodes of meta-structures rather than the whole path. For example, in [14, 21] meta-structures act as semantic filters, where new interaction matrices under the corresponding semantics are constructed to provide side information for recommendation.
Based on the first observation, during the encoding process we preset forbidden links in the adjacency matrix A_M as −1, considering only the relations possible in the HIN. In this way, we exclude meta-structures that are not practical for the specific HIN. According to the second observation, we focus on the source node and sink node only; we refer to these two nodes as target nodes in the following text, and fix the first two entries of T_M as the target nodes. For example, in the Yelp dataset the target nodes are user (U) and business (B), since we need to recommend businesses for users. In the recommendation scenario, the influence between users and items is mutual, which means the link between the target nodes is bi-directional. We can therefore omit the direction of the meta-structure, resulting in an upper triangular adjacency matrix.

Figure 3 demonstrates a meta-structure with its encoded gene in the Yelp dataset. The source node is user (U), colored green; the sink node is business (B), colored red. With this optimized encoding, most elements of the designed adjacency matrix are −1, which greatly reduces the problem search space. In this illustrative example, the number of remaining non −1 positions is reduced from 25 to 7.
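The masked, upper-triangular encoding can be reproduced with a few lines of NumPy. The `ALLOWED` set below is an illustrative subset of the Yelp relations mentioned in the text, and `gene_template` is a hypothetical helper name.

```python
import numpy as np

# Allowed relations in a Yelp-like HIN (illustrative, unordered pairs)
ALLOWED = {('U', 'U'), ('U', 'B'), ('B', 'A'), ('B', 'I')}

def gene_template(types):
    """Build the adjacency template for a type list T_M:
    forbidden relations and the lower triangle are preset to -1,
    feasible upper-triangle positions start at 0 (searchable)."""
    n = len(types)
    adj = -np.ones((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):     # direction omitted -> upper triangle
            pair = (types[i], types[j])
            if pair in ALLOWED or pair[::-1] in ALLOWED:
                adj[i, j] = 0
    return adj

# Type list from the Figure 3 example: target nodes (U, B) first
tpl = gene_template(['U', 'B', 'U', 'B', 'I'])
free = int((tpl != -1).sum())         # number of searchable positions
```

For this five-node type list the template leaves exactly 7 searchable positions out of 25, matching the reduction quoted above.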
During evolution, new genes are generated by the mutation operation. However, many mutated genes are meaningless for recommendation. We introduce three mutation rules to ensure that most new genes exert influence on recommendation.
• Avoid non-existent links: According to the definition of the meta-structure gene, a mutation never flips elements equal to −1. This rule guarantees the validity of new genes, so that all mutated meta-structures are realizable in the HIN. As shown in Figure 4(a), the mutated graph connects user (U) and city (I), which is not allowed in this scenario.
• Avoid constant information loops: In the recommendation scenario, new interactions are extracted according to meta-structures. If a mutation does not affect the paths between the target nodes, there is little difference in the searched adjacency matrix. Since the following GCN module is built on this adjacency, the recommendation performance gains little. Based on this understanding, our second rule requires that the paths between the target nodes change during mutation. In Figure 4(b), the mutation does not introduce new paths from the green user (U) to the red business (B), so it is considered invalid.
• Cut off side-branches: After mutation there may be side-branches in the meta-structure, e.g., a single node that has only one edge to the other nodes. Such side-branches do not affect the effectiveness of meta-structures, since they do not provide any information. Therefore, after mutation we delete all side-branches to further reduce the number of genes that need to be evaluated in the following module. Figure 4(c) demonstrates such side-branches: since each business (B) has a corresponding category (A) connection, these side-branches are useless for recommendation.
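The second and third rules can be checked mechanically: reachability between the two target nodes (by convention indices 0 and 1 of the type list) and iterative pruning of degree-one non-target nodes. The helper names below are hypothetical.

```python
from collections import deque

def targets_connected(adj, src=0, dst=1):
    """BFS reachability between the two target nodes, treating the
    upper-triangular matrix as an undirected graph (1 = edge)."""
    n = len(adj)
    seen, queue = {src}, deque([src])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if v not in seen and (adj[u][v] == 1 or adj[v][u] == 1):
                seen.add(v)
                queue.append(v)
    return dst in seen

def prune_side_branches(adj):
    """Rule 3: repeatedly drop non-target nodes with at most one edge."""
    alive = set(range(len(adj)))
    changed = True
    while changed:
        changed = False
        for u in sorted(alive):
            if u in (0, 1):            # never drop the target nodes
                continue
            deg = sum(1 for v in alive if v != u and
                      (adj[u][v] == 1 or adj[v][u] == 1))
            if deg <= 1:
                alive.remove(u)
                changed = True
    return sorted(alive)

# A dangling node 2 hangs off node 1 by a single edge (a side-branch)
adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
kept = prune_side_branches(adj)
```

Rule 2 then amounts to comparing target-to-target paths before and after a candidate mutation and rejecting mutations that leave them unchanged.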
Different combinations of meta-structures greatly affect recommendation performance. The meta-structure predictor tries to infer recommendation performance from the meta-structure combination of each individual. Introducing the predictor to filter promising individuals saves a lot of computational resources and allows exploring more meta-structures in limited time. In the worst case, it can still remember the performance of already evaluated meta-structures, which also helps speed up the search process. Here, we build a small GCN network on the meta-structure itself (rather than on the adjacency matrix it corresponds to): each node type has an embedding vector, and the GCN layer is built on the connections defined by the meta-structure. The meta-structure predictor is trained on the real recommendation performance of each individual, with the metrics normalized into the range −1 ∼ 1.

To leverage the different semantic information in a HIN, recommendation models need to adopt multiple meta-structures to extract the corresponding knowledge. In GEMS, we design a multi-view GCN architecture to fuse semantic information guided by meta-structures. Figure 5 demonstrates the architecture of the proposed multi-view GCN on user (U) nodes. For each individual, GEMS builds multiple independent GCN layers from the different meta-structure adjacency matrices. These GCN layers learn embedding vectors with different semantic meanings for HIN recommendation tasks. To fully leverage the combinational effect on semantics, we fuse these embedding vectors in the following attention fusing module.

Figure 4: Examples of mutation rule violations.
Figure 5: Attention based Multi-view GCN.

Finally, the fused embeddings are used to calculate similarities between user-item pairs for recommendation tasks.

During the construction of the meta-structure adjacency, sampling is needed to control the number of linkages in the adjacency matrix, since the neighbor size grows quickly with the depth of the meta-structure. To ensure fairness during sampling, the search process begins from the node type with the largest degree in the HIN. We denote the adjacency matrix corresponding to meta-structure M as N_M, and the neighbor set of node i as N_M{i}.

According to the different adjacency matrices defined by meta-structures, we construct multiple interaction graphs. Upon these interaction graphs, GEMS trains different GCN layers to capture the corresponding semantic information on the HIN. It should be noted that a meta-structure can connect two far-away nodes, which enlarges the receptive field of the GCN. We therefore adopt a single-layer GCN for each meta-structure to reduce the computational overhead.

To fuse the embeddings learned by the different GCNs, we propose an attention based embedding fusion mechanism that dynamically assigns an importance weight to every meta-structure of the individual. Each GCN layer generates an embedding h^★ ∈ R^d for the target nodes, where ★ indexes the GCN layers and d is the embedding size. We concatenate the n embeddings together, then apply a transformation matrix W ∈ R^{d×nd} to map the concatenated embedding back to the size of a single GCN embedding. This embedding serves as the query vector q_i ∈ R^d:

    q_i = f( W ( h_i^1 ⊕ h_i^2 ⊕ ... ⊕ h_i^n ) ),    (1)

where f(·) denotes a nonlinear activation function and ⊕ denotes the concatenation operation. With the query vector q_i, we can assign attention weights to the different meta-structure embeddings. For meta-structure embedding ★ ∈ {1, 2, ..., n}, the attention weight α_i^★ is expressed as

    α_i^★ = softmax( h_i^★ · q_i ).    (2)

The fused embedding is computed as the attention coefficient weighted sum over the embeddings derived from the different meta-structures:

    y_i = Σ_{★=1}^{n} α_i^★ h_i^★.    (3)

Finally, the inner product of fused embeddings is used to evaluate the similarity between users and items. Through the embedding fusing module, the GEMS model combines the different semantics of the meta-structures of each individual.

In GEMS, the attention based multi-view GCN is trained in a supervised learning approach with the purchase records of dataset D = {⟨v_u, v_i⟩}, where v_u stands for a user and v_i stands for an item; the specific meaning of the item depends on the dataset. The recommendation likelihood is calculated as

    z(v_u, v_i) = σ( y_{v_u} · y_{v_i} ),    (4)

where y_{v_u}, y_{v_i} are the fused embeddings of the GCN backend and σ(·) is the sigmoid activation function. For each positive purchase record in dataset D, we sample a pre-defined number of negative items v_n according to the occurrence frequency of items. We adopt a max-margin based ranking loss function to train the GCN backend:

    J_G(v_u, v_i) = E_{v_n} max{ 0, z(v_u, v_n) − z(v_u, v_i) + Δ },    (5)

where Δ denotes the pre-defined margin hyper-parameter. The intuition is to train the model to predict positive samples with a likelihood higher than negative items by the pre-defined margin. Individuals are fully independent after the crossover operation, so meta-structure adjacency search and GCN training can be fully parallelized. This enables us to leverage the high concurrency of modern computing hardware to speed up model convergence.
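The fusion and loss computations of Eqs. (1)-(5) can be sketched in NumPy for a single node; this is a toy re-implementation under assumed shapes (with f = tanh and the expectation replaced by a mean over sampled negatives), not the released training code.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(h_list, W):
    """Eqs. (1)-(3): build the query from the concatenated per-view
    embeddings, score each view, return the attention-weighted sum."""
    h = np.stack(h_list)                 # (n, d): one embedding per view
    q = np.tanh(W @ h.reshape(-1))       # Eq. (1), with f = tanh
    alpha = softmax(h @ q)               # Eq. (2): one weight per view
    return alpha @ h                     # Eq. (3): fused embedding y_i

def margin_loss(y_u, y_pos, y_negs, delta=0.1):
    """Eqs. (4)-(5): sigmoid likelihoods and max-margin ranking loss,
    averaged over the sampled negative items."""
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))
    z_pos = sig(y_u @ y_pos)             # Eq. (4)
    losses = [max(0.0, sig(y_u @ y_n) - z_pos + delta) for y_n in y_negs]
    return float(np.mean(losses))        # Eq. (5): mean over negatives

rng = np.random.default_rng(0)
n, d = 3, 8                              # 3 meta-structures, embedding size 8
W = rng.normal(size=(d, n * d))
y_u = fuse([rng.normal(size=d) for _ in range(n)], W)
loss = margin_loss(y_u, rng.normal(size=d), [rng.normal(size=d) for _ in range(2)])
```

In the full model the per-view embeddings come from the single-layer GCNs and the loss is minimized with ADAM, as described in the experimental setup.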
To better understand the procedure of GEMS, we provide a detailed description in Algorithm 1.

Algorithm 1: Genetic Meta-Structure Search
Require: HIN G = (V, E), dataset D = {⟨v_u, v_i⟩}, node features {x_v, ∀v ∈ V}, epoch number e
  Initialize genePool for the population
  Initialize the meta-structure predictor predic(·)
  for epoch = 1 to e do
    /* Mutate and crossover */
    genePool = Mutate(genePool)
    genePool = Crossover(genePool)
    for each individual do
      Assign individualGenes from genePool
      /* Use the predictor to filter promising individuals */
      if predic(individualGenes) < threshold then
        metric = predic(individualGenes)
        continue
      else
        for gene in individualGenes do
          /* Adjacency search */
          N_gene = AdjSearch(gene)
        end for
        /* Train the GCN and perform the recommendation task */
        metric = GCN(x_v, D, N)
      end if
    end for
    /* Eliminate and reproduce */
    population = Eliminate(population, metrics)
    population = Reproduce(population, metrics)
    Train the meta-structure predictor with (individual, metric) pairs
  end for

We evaluate our proposed solution GEMS as well as competitive baselines on three widely used real-world recommendation datasets with rich heterogeneous information: Yelp, Douban Movie (https://movie.douban.com/), and Amazon (http://jmcauley.ucsd.edu/data/amazon/). Yelp is a platform for users to rate local businesses and share photos and reviews. Douban Movie is a well-known social media network in China where users rate, share, and comment on movies. Amazon is one of the biggest e-commerce platforms with global operations. These three datasets have different densities of records and semantic information, which guarantees the reliability of the experiments. The basic statistics and relations of the datasets are listed in Table 1. Douban Movie is a dense dataset, whose user-movie rating matrix achieves a density of about 0.63%.

Table 1: The basic information of evaluation datasets.
Dataset        Relations (A-B)            #A      #B      #A-B
Yelp           User-Business (U-B)        16239   14284   198397
               User-User (U-U)            16239   16239   158590
               User-Compliment (U-O)      16239   11      76875
               Business-City (B-I)        14284   47      14267
               Business-Category (B-A)    14284   511     40009
Douban Movie   User-Movie (U-M)           13367   12677   1068278
               User-Group (U-G)           13367   2753    570047
               User-User (U-U)            13367   13367   4085
               Movie-Actor (M-A)          12677   6311    33587
               Movie-Director (M-D)       12677   2449    11276
               Movie-Type (M-T)           12677   38      27668
Amazon         User-Item (U-I)            6170    2753    195791
               Item-View (I-V)            2753    3857    5694
               Item-Category (I-C)        2753    22      5508
               Item-Brand (I-B)           2753    334     2753

For example, the Yelp dataset contains three relations for user nodes and two relations for business nodes, which preserves rich semantics for HIN recommendation.

To show the performance gain of our proposed GEMS model, we compare it with eight state-of-the-art baselines on recommendation tasks. Specifically, they include traditional matrix factorization based models (NMF, BMF, Metapath MF, and SVD++), GCN-based models (PinSAGE and GAT), along with recent models that leverage meta-structures in HINs (HAN and FMG). Recently, new HIN approaches inspired by the transformer network, such as GTN [20] and HGT [3], have emerged. These models have no explicitly defined meta-structures; meta-structures are only used to explain the learned importance of different adjacency matrices. From this perspective, they differ from the other models and are not included in the baseline comparison.
• NMF [8]: A matrix factorization model that produces non-negative matrices, representing additive features of users and items.
• BMF [7]: A matrix factorization model that contains a bias term for each user and item to better depict personal characteristics.
• Metapath MF [12]: In Metapath MF, new interaction matrices are generated based on the meta-structures defined in Table 2, and multiple basic MF models are trained accordingly. The learned embeddings are averaged as the fused embedding, which is then processed by a linear transformation. Finally, we perform an inner product on the output embeddings, as in MF, to generate recommendation scores.
• SVD++ [6]: An enhanced singular value decomposition algorithm for recommendation.
• PinSAGE [19]: A GCN model on the homogeneous graph that samples a fixed number of neighbors during aggregation.
• GAT [13]: A widely used attention-based GCN model on the homogeneous graph that dynamically assigns weights to neighbors.
• HAN [14]: A state-of-the-art GCN-based network embedding model for HINs.
We adapt the model to the recommendation task and keep the original meta-structure design, in which a meta-structure links two nodes of the same type.

Table 2: Hand-crafted meta-structure designs from previous papers.
Dataset | Meta-Structures
Yelp | U-B, U-U-B, U-B-U-B, U-B-A-B, U-B-I-B
Douban Movie | U-M, U-M-U-M, U-G-U-M, U-M-A-M, U-M-T-M
Amazon | U-I, U-I-U-I, U-I-V-I, U-I-B-I, U-I-C-I

• FMG [21]: The state-of-the-art model leveraging meta-structures in HIN recommendation. FMG combines matrix factorization with a factorization machine to relieve data sparsity.
• GEMS-fix: Instead of searching all possible meta-structures, we preset hand-crafted meta-structures according to previous papers to examine the performance of our model. The adopted meta-structures are shown in Table 2.
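The meta-path based interaction matrices used by Metapath MF (and by FMG-style models) can be derived by chaining the relation matrices along the path. The sketch below is a minimal illustration with toy scipy sparse matrices; the matrix sizes and the U-B-U-B path are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy user-business relation matrix (hypothetical sizes, not the real
# dataset statistics): R_ub[u, b] = 1 if user u reviewed business b.
rng = np.random.default_rng(0)
R_ub = csr_matrix((rng.random((4, 5)) < 0.4).astype(np.float32))

def metapath_matrix(relations):
    """Chain relation matrices along a meta-path; e.g. U-B-U-B is
    R_ub @ R_ub.T @ R_ub, whose (u, b) entry counts path instances."""
    out = relations[0]
    for rel in relations[1:]:
        out = out @ rel
    return out

# U-B-U-B: reach businesses through users who co-reviewed a business.
ubub = metapath_matrix([R_ub, R_ub.T, R_ub])
print(ubub.shape)  # (4, 5)
```

Each derived matrix can then be factorized by a basic MF model, as the Metapath MF bullet describes.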
For evaluation, we divide each dataset into training, validation and test sets with an 8:1:1 ratio. Since it is inefficient to rank test items against the entire item set, for each training/test record we randomly sample 4/100 negative items based on popularity. We then adopt three commonly used performance metrics: Hit Ratio at Rank K (HR@K), Mean Reciprocal Rank at Rank K (MRR@K) and Normalized Discounted Cumulative Gain at Rank K (NDCG@K). HR@K accounts for whether test items are present in the top-K list, while MRR@K and NDCG@K measure the ranking positions of the positive test items.

For the baseline models, we use the implementations released by the authors and change the loss function into a margin-based ranking loss for the recommendation task. For the Yelp and Amazon datasets, all GCN based models (PinSAGE, GAT, HAN and GEMS) adopt MF pre-trained embeddings as feature inputs. Since the number of interactions in the Douban dataset is huge, we let these models learn their own feature inputs to avoid limiting their representational ability. We fix the embedding dimension at 64 for all evaluated models, and tune the learning rate and regularization parameters to their optima for each model by grid search. Specifically, we adopt the ADAM optimizer with learning rate decay for fine-grained results. For GEMS, we set the population size to 20, i.e., 20 individuals per generation, each containing 5 genes (5 corresponding meta-structures). The mutation probability is set to 0.6 in the first few generations and then decreases to 0.3. For each mutation, we assign equal probability to becoming more complex or simpler, giving the search process more freedom in a random-walk manner. The implementation code of our model is available at https://github.com/0oshowero0/GEMS.
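The three ranking metrics above can be sketched as follows for a single test record. This is an illustrative implementation of the standard definitions (one relevant item per record, so the ideal DCG is 1), not the authors' evaluation code:

```python
import numpy as np

def rank_metrics(scores, pos_idx, k):
    """HR@K, MRR@K and NDCG@K for a single test record.
    `scores` holds model scores for [positive + sampled negatives];
    `pos_idx` is the index of the positive item in `scores`."""
    order = np.argsort(-scores)                        # descending by score
    rank = int(np.where(order == pos_idx)[0][0]) + 1   # 1-based rank of the positive
    hr = 1.0 if rank <= k else 0.0
    mrr = 1.0 / rank if rank <= k else 0.0
    ndcg = 1.0 / np.log2(rank + 1) if rank <= k else 0.0  # single relevant item => IDCG = 1
    return hr, mrr, ndcg

# Positive item (index 0) is ranked 2nd among 5 candidates:
scores = np.array([0.8, 0.9, 0.3, 0.5, 0.1])
print(rank_metrics(scores, pos_idx=0, k=3))  # (1.0, 0.5, 0.6309...)
```

Averaging these quantities over all test records yields the reported HR@K, MRR@K and NDCG@K.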
The main results of the comparison with baselines across the three datasets are reported in Table 3. From the results, we have the following observations.

• On all three datasets, our proposed GEMS model consistently outperforms the baselines on all evaluation metrics. These results demonstrate that GEMS can find more useful meta-structures and leverage heterogeneous information better than the baseline models.
• GEMS with hand-crafted meta-structures does not catch up with state-of-the-art models like HAN and FMG. With meta-structure search, however, GEMS greatly outperforms the best baseline. This confirms the importance of meta-structure design in HIN recommendation.
• Comparing GEMS with GEMS-fix, we observe that GEMS outperforms GEMS-fix by 11.7% and 8.8% on HR@3 and NDCG@10 on average. GEMS-fix adopts commonly used hand-crafted meta-structures for recommendation, yet it greatly falls behind the searched meta-structures of GEMS, demonstrating that hand-crafted meta-structures are not optimal for recommendation.
• For HINs with different relations and densities, GEMS surpasses all state-of-the-art models. In the Douban Movie dataset, the link density of target nodes is over 6 times that of the Yelp dataset, which favors dedicatedly designed models like HAN and FMG. GEMS nevertheless outperforms all baselines, demonstrating its adaptability across datasets.

The above analysis demonstrates the effectiveness of our proposed GEMS model, which consistently achieves better performance in different recommendation scenarios.
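The genetic search settings described earlier (20 individuals per generation, 5 genes each, mutation probability decaying from 0.6 to 0.3, equal chance of growing or shrinking a meta-structure) can be sketched as a toy evolution loop. The gene encoding and fitness function below are placeholder assumptions; in GEMS, fitness comes from actually training the downstream GCN on each candidate:

```python
import random

random.seed(0)

# Hypothetical gene encoding: a gene is just a tuple of relation names.
# A real implementation mutates typed graph structures instead.
RELATIONS = ["U-B", "B-A", "B-I", "U-U", "B-City"]

def random_gene():
    return tuple(random.sample(RELATIONS, k=2))

def mutate(gene, p):
    """With probability p, grow or shrink the gene with equal chance."""
    if random.random() >= p:
        return gene
    if len(gene) > 1 and random.random() < 0.5:
        return gene[:-1]                        # simplify
    return gene + (random.choice(RELATIONS),)   # complexify

def evolve(fitness, generations=10, pop_size=20, genes_per_ind=5):
    pop = [[random_gene() for _ in range(genes_per_ind)] for _ in range(pop_size)]
    for g in range(generations):
        p_mut = 0.6 if g < 3 else 0.3           # decaying mutation probability
        scored = sorted(pop, key=fitness, reverse=True)
        survivors = scored[: pop_size // 2]     # truncation selection
        pop = survivors + [[mutate(gene, p_mut) for gene in ind] for ind in survivors]
    return max(pop, key=fitness)

# Toy fitness: prefer individuals whose genes involve the U-B relation.
best = evolve(lambda ind: sum("U-B" in g for g in ind))
print(best)
```

The decaying mutation probability mirrors the schedule reported in the experimental setup: more exploration early, more exploitation later.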
To further understand the efficiency of GEMS, we analyze the performance-time curve on recommendation tasks. Note that the search process only exists in the first run: once effective meta-structures are found, the search can be omitted by directly setting the optimized meta-structures. Since GEMS only uses a one-layer GCN, its inference time should be shorter than that of traditional multi-layer GCN baselines.

Figure 6 shows how model performance evolves with time on the three datasets. We record the average performance (NDCG@10) of each generation and take the maximum performance before a given timestamp, which captures the performance evolution of the whole population rather than a few outstanding individuals. The initial meta-structure is the direct connection of the target interaction for the Yelp and Douban Movie datasets. For the Amazon dataset, however, initializing with the direct interaction would greatly limit the search due to the limited relations; to break the symmetry, we randomly initialize the meta-structures, which causes the difference at the beginning. For the Yelp dataset, we observe a clear convergence process in both settings, with performance gains of over 2% and 6%, respectively. Considering that each recorded performance is averaged over the whole generation, we conclude that the search process of GEMS is effective.

We also compare GEMS with and without the predictor to examine whether the predictor helps explore more meaningful meta-structures within the same period of time. For the Yelp dataset, GEMS with predictor achieves a

Table 3: Performance comparison with baseline models, where (∗) indicates p<0.01 significance over the best baseline.
Yelp:
Method | HR@3 | MRR@10 | NDCG@10 | MRR@50 | NDCG@50
NMF | 0.1321 | 0.1172 | 0.1692 | 0.1431 | 0.2718
BMF | 0.1299 | 0.1148 | 0.1671 | 0.1366 | 0.2696
Metapath MF | 0.1255 | 0.1124 | 0.1625 | 0.1343 | 0.2670
SVD++ | 0.1463 | 0.1316 | 0.1733 | 0.1521 | 0.2811
PinSAGE | 0.1694 | 0.1456 | 0.2033 | 0.1662 | 0.3005
GAT | 0.1706 | 0.1459 | 0.2020 | 0.1665 | 0.2997
HAN | 0.1674 | 0.1459 | 0.2047 | 0.1675 | 0.3056
FMG | 0.1765 | 0.1486 | 0.2062 | 0.1675 | 0.2957
GEMS-fix | 0.1611 | 0.1405 | 0.1967 | 0.1621 | 0.2981
GEMS∗ | | | | |

Douban Movie:
Method | HR@3 | MRR@10 | NDCG@10 | MRR@50 | NDCG@50
NMF | 0.1359 | 0.1254 | 0.1745 | 0.1430 | 0.2763
BMF | 0.1358 | 0.1211 | 0.1726 | 0.1447 | 0.2844
Metapath MF | 0.1274 | 0.1133 | 0.1634 | 0.1370 | 0.2753
SVD++ | 0.1394 | 0.1241 | 0.1767 | 0.1482 | 0.2845
PinSAGE | 0.1463 | 0.1287 | 0.1808 | 0.1517 | 0.2902
GAT | 0.1345 | 0.1189 | 0.1683 | 0.1418 | 0.2786
HAN | 0.1496 | 0.1322 | 0.1854 | 0.1552 | 0.2945
FMG | 0.1510 | 0.1336 | 0.1800 | 0.1572 | 0.2901
GEMS-fix | 0.1437 | 0.1265 | 0.1779 | 0.1495 | 0.2874
GEMS∗ | | | | |

Amazon:
Method | HR@3 | MRR@10 | NDCG@10 | MRR@50 | NDCG@50
NMF | 0.1249 | 0.1197 | 0.1640 | 0.1408 | 0.2739
BMF | 0.1380 | 0.1223 | 0.1739 | 0.1448 | 0.2810
Metapath MF | 0.1301 | 0.1144 | 0.1639 | 0.1361 | 0.2674
SVD++ | 0.1505 | 0.1279 | 0.1773 | 0.1494 | 0.2876
PinSAGE | 0.1469 | 0.1285 | 0.1801 | 0.1503 | 0.2835
GAT | 0.1510 | 0.1349 | 0.1883 | 0.1566 | 0.2908
HAN | 0.1525 | 0.1348 | 0.1900 | 0.1572 | 0.2953
FMG | 0.1532 | 0.1313 | 0.1877 | 0.1555 | 0.2910
GEMS-fix | 0.1505 | 0.1322 | 0.1861 | 0.1545 | 0.2912
GEMS∗ | | | | |

(a) Yelp dataset (b) Douban Movie dataset (c) Amazon dataset
Figure 6: The time efficiency analysis of GEMS.

relatively higher performance within 500 minutes, while the model without the predictor takes nearly 1500 minutes to reach the same level. The meta-structure predictor thus has great potential to boost the convergence of GEMS. For the Douban Movie dataset, GEMS with the predictor achieves a 3.9% performance gain compared with the initial state. Douban Movie has the most complex relations, which requires the model to explore more meta-structures. GEMS without the predictor never achieves better performance than the model with the predictor during the whole time window, which again indicates a performance gain due to the meta-structure predictor. Similar behavior is observed on the Amazon dataset. In conclusion, the meta-structure predictor helps reduce the required training time as well as improve performance.
To investigate whether the proposed GEMS model can find new meta-structures that are meaningful for recommendation tasks, we examine the most frequent meta-structures in the last 5 generations of GEMS and analyze their physical meaning. As shown in Figure 7, we present the top 10 meta-structures for each dataset; the green and red nodes are the target nodes of each meta-structure. These genes account for 39.…%, ….…%, and ….5% of all 500 meta-structures during the last 5 generations. We label the meta-structures M1 to M10 for short. Some of the searched meta-structures are already well known, while others have never appeared in recommender system designs; we therefore divide them into two groups in Table 4.

In the Yelp dataset, direct interactions and social-related meta-structures are known to benefit recommendation. For example, M2 depicts the direct relationship between users and businesses, and M5 and M7 depict common interests between friends. Besides, the collaborative filtering meta-structure M10 often appears in meta-structure based recommendation models. At the same time, new meta-structures can also facilitate recommendation through their semantics. M6 emphasizes the importance of locality: users tend to visit local businesses in the same city. M3 and M9 depict enhanced social groups, where a business visited by more than one friend is also attractive to the target customer. Moreover, M4 delivers a more complex relationship: a user tends to patronize a local business that her friend has been to. These searched meta-structures show that social relations and locality play an important role in the Yelp dataset.

For movie recommendation in Douban Movie, we find many collaborative filtering meta-structures such as M2, M3, and M6; by the traditional understanding, users tend to appreciate new movies by the same actor or director. We also find social relations similar to Yelp in M5 and M8, where friends tend to share the same taste in movies. New meta-structures like M4 and M10 are two enhanced versions of collaborative filtering; for example, according to M4, users prefer new movies with both the same director and the same type as old ones. Besides, M1 and M7 combine social relations with collaborative filtering, where the collaborative filtering path links the friend of the target user.
(a) Yelp dataset (b) Douban Movie dataset (c) Amazon dataset
Figure 7: Most important meta-structures identified by GEMS.

Table 4: Hand-crafted and newly searched meta-structures.
Dataset | Hand-crafted Meta-Struct. | Newly Searched Meta-Struct.
Yelp | M2, M5, M7, M10 | M1, M3, M4, M6, M8, M9
Douban | M2, M3, M5, M6, M8 | M1, M4, M7, M9, M10
Amazon | M1-M5, M7, M9, M10 | M6, M8
In the Amazon dataset, user-user relations do not exist; users can only connect to item nodes, yielding a traditional recommendation scenario. The resulting meta-structures show that most of them can be regarded as simple modifications of collaborative filtering meta-structures, indicating that GEMS also works well in traditional recommendation scenarios.

From these results and analysis, we confirm the effectiveness of the meta-structures searched by GEMS. Moreover, these meta-structures provide new inspiration for recommender system design. First, only 7 of these 30 meta-structures are plain meta-paths, which confirms the effectiveness of meta-structures for HIN recommendation; newly designed recommendation models should adopt meta-structures to better leverage the complex relations of HINs. Another important insight is that there is no single design paradigm that suits all recommendation scenarios: for Yelp users, social factors are very important, while Douban Movie and Amazon users pay more attention to the item itself. When designing meta-structures, we need to consider the core features of the scenario for better recommendation.

(a) Training result on Amazon dataset (b) Testing result on Amazon dataset
Figure 8: Effectiveness of meta-structure predictor.

Table 5: Spearman correlations of meta-structure predictors.
Dataset | Training set | Test set
Yelp | 0.5499 | 0.3426
Douban Movie | 0.4517 | 0.4946
Amazon | 0.5313 | 0.4160
To further evaluate the effectiveness of the proposed meta-structure predictor, we train meta-structure predictors on the historical performance of each individual and then compare their performance on both the training set and the test set. Figure 8 shows the relation between the true labels and the predictor outputs on the Amazon dataset. Ideally, the predicted results would exactly match the true labels. However, since the predictor evaluates performance at the individual level, it cannot simply memorize the performance of meta-structures, which makes inference more difficult. Note that the absolute output is not the key factor in evaluating the predictor: since the meta-structure predictor is designed to select promising individuals, we only require the relative ordering to be correct, so that promising individuals still surpass worse ones and are sent to be evaluated on real recommendation tasks. To evaluate the ranking performance, we further examine the Spearman rank correlations between the predictor outputs and the true labels, shown in Table 5. The ranking performance of the meta-structure predictor is promising for unseen data on the test set. Compared with other automated predictors like [9], our meta-structure predictor achieves reasonable performance on model selection.
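Since only the relative ordering of predictor outputs matters, Spearman's rank correlation is the natural measure. A minimal sketch on made-up values (not numbers from the paper):

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation via the classic formula
    (assumes no ties, which holds for the toy values below)."""
    rx = np.argsort(np.argsort(x))   # 0-based ranks of x
    ry = np.argsort(np.argsort(y))   # 0-based ranks of y
    d = rx - ry
    n = len(x)
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))

# Hypothetical (true performance, predictor output) pairs for a batch of
# individuals; illustrative values only.
true_perf = np.array([0.18, 0.21, 0.16, 0.19, 0.23, 0.17])
predicted = np.array([0.40, 0.55, 0.35, 0.42, 0.60, 0.44])

rho = spearman_rho(true_perf, predicted)
print(round(rho, 4))  # 0.8286
```

A high rho means the predictor ranks individuals correctly even if its absolute outputs are off, which is exactly what the selection step needs.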
We investigate the parameter sensitivity of the GEMS model with respect to the regularization parameter and the embedding dimension; experiments are conducted on the Yelp dataset. From Figure 9, we observe that performance varies with the regularization parameter (lambda) and the embedding dimension, but NDCG@10 consistently remains above 0.16. Specifically, the optimal regularization parameter for GEMS is 0.05. In addition, NDCG@10 consistently increases as the embedding size grows.

(a) Regularization
(b) Embedding size

Figure 9: Parameter sensitivity on Yelp dataset.
Graph convolutional networks:
The idea of GCNs is to design a spatially invariant aggregation function that generates node embeddings by aggregating features from local neighbourhoods [5]. In recommender systems, user-item connections naturally form a network structure, which suits GCN based models. PinSage [19] constructs a GCN on the item graph to learn item-to-item similarity, and DiffNet [15] performs a social diffusion process on the user-user graph with a GCN. However, using a homogeneous graph greatly limits the expressiveness of GCN models; as a result, heterogeneous information networks are becoming the mainstream setting for GCNs.
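As a concrete illustration of such a spatially invariant aggregation function, a toy mean-aggregation GCN layer can be written in a few lines. This is a generic sketch, not the implementation of any of the cited models:

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One propagation step: average neighbor features (with a self-loop),
    then apply a shared linear transform and a ReLU nonlinearity."""
    adj_hat = adj + np.eye(adj.shape[0])             # add self-loops
    deg_inv = 1.0 / adj_hat.sum(axis=1, keepdims=True)
    h = deg_inv * (adj_hat @ feats)                  # mean aggregation
    return np.maximum(h @ weight, 0.0)               # ReLU

# Toy graph: node 0 connected to nodes 1 and 2.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
feats = rng.normal(size=(3, 4))                      # 3 nodes, 4 features
out = gcn_layer(adj, feats, rng.normal(size=(4, 2)))
print(out.shape)  # (3, 2)
```

The same transform is applied at every node regardless of position, which is the spatial invariance the paragraph refers to.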
Heterogeneous information networks:
A heterogeneous information network [11] is a well-established framework containing different types of nodes and relations, which carries much more semantic information than a traditional homogeneous network. To effectively guide information propagation in HINs, meta-path based models like PathSim [11] have been proposed. Meta-path based models demonstrate promising results in the HIN setting, which has stimulated many data mining tasks, such as recommendation [10] and similarity search and classification [1], to leverage the power of HINs. Beyond meta-path based models, the meta-structure has also become a new topic in HINs. In the recommendation area, social factors are important for successful recommendation [17], which calls for HINs to improve recommendation quality. RecoGCN [18] proposes a meta-path based attention GCN in an agent-initiated social e-commerce scenario, and FMG [21] adopts an MF+FM scheme to leverage meta-structures for recommendation. As the more general term, we use meta-structure to refer to both concepts.
Automated machine learning:
AutoML is a broad concept covering automated models that take the place of humans in identifying proper configurations for machine learning. Automated model selection [16] is a common example of AutoML. In this paper, the proposed GEMS leverages a genetic algorithm to find meaningful meta-structures, which is the first attempt at automated meta-structure search for HIN recommendation.
In this paper, we proposed a novel model that leverages the automated machine learning paradigm to search for promising meta-structures for HIN recommendation. To effectively explore possible meta-structures, we carefully designed the search space of the problem, which boosts search efficiency. For recommendation, we proposed a multi-view GCN armed with an attention mechanism to fuse the different semantic information guided by meta-structures. Extensive experiments demonstrate the performance gain of GEMS, and the optimized meta-structures also shed light on sophisticated recommendation model design in HIN scenarios. Important future work includes a more dedicated downstream scorer function to generate better recommendations.
ACKNOWLEDGMENTS
This work was supported in part by The National Key Research and Development Program of China under grant 2018YFB1800804, the National Nature Science Foundation of China under U1936217, 61971267, 61972223, 61941117, 61861136003, Beijing Natural Science Foundation under L182038, Beijing National Research Center for Information Science and Technology under 20031887521, and the research fund of the Tsinghua University - Tencent Joint Laboratory for Internet Innovation Technology.
REFERENCES
[1] Phiradet Bangcharoensap, Tsuyoshi Murata, Hayato Kobayashi, and Nobuyuki Shimizu. 2016. Transductive classification on heterogeneous information networks with edge betweenness-based normalization. In WSDM 2016. 437–446.
[2] David E Goldberg. 2006. Genetic algorithms. Pearson Education India.
[3] Ziniu Hu, Yuxiao Dong, Kuansan Wang, and Yizhou Sun. 2020. Heterogeneous graph transformer. In WWW 2020. 2704–2710.
[4] Zhipeng Huang, Yudian Zheng, Reynold Cheng, Yizhou Sun, Nikos Mamoulis, and Xiang Li. 2016. Meta structure: Computing relevance in large heterogeneous information networks. In KDD 2016. 1595–1604.
[5] Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
[6] Yehuda Koren. 2008. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD 2008. 426–434.
[7] Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42, 8 (2009), 30–37.
[8] Daniel D Lee and H Sebastian Seung. 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 6755 (1999), 788–791.
[9] Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. 2018. Progressive neural architecture search. In ECCV 2018. 19–34.
[10] Chuan Shi, Zhiqiang Zhang, Ping Luo, Philip S Yu, Yading Yue, and Bin Wu. 2015. Semantic path based personalized recommendation on weighted heterogeneous information networks. In CIKM 2015. 453–462.
[11] Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S Yu, and Tianyi Wu. 2011. PathSim: Meta path-based top-k similarity search in heterogeneous information networks. Proc. VLDB Endow. 4, 11 (2011), 992–1003.
[12] Fatemeh Vahedian, Robin Burke, and Bamshad Mobasher. 2016. Meta-path selection for extended multi-relational matrix factorization. In FLAIRS Conference 2016.
[13] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
[14] Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, and Philip S Yu. 2019. Heterogeneous graph attention network. In WWW 2019. 2022–2032.
[15] Le Wu, Peijie Sun, Yanjie Fu, Richang Hong, Xiting Wang, and Meng Wang. 2019. A neural influence diffusion model for social recommendation. In SIGIR 2019. 235–244.
[16] Lingxi Xie and Alan Yuille. 2017. Genetic CNN. In ICCV 2017. 1379–1388.
[17] Fengli Xu, Zhenyu Han, Jinghua Piao, and Yong Li. 2019. "I Think You'll Like It": Modelling the Online Purchase Behavior in Social E-commerce. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–23.
[18] Fengli Xu, Jianxun Lian, Zhenyu Han, Yong Li, Yujian Xu, and Xing Xie. 2019. Relation-aware graph convolutional networks for agent-initiated social e-commerce recommendation. In CIKM 2019. 529–538.
[19] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L Hamilton, and Jure Leskovec. 2018. Graph convolutional neural networks for web-scale recommender systems. In KDD 2018. 974–983.
[20] Seongjun Yun, Minbyul Jeong, Raehyun Kim, Jaewoo Kang, and Hyunwoo J Kim. 2019. Graph Transformer Networks. In NIPS 2019. 11960–11970.
[21] Huan Zhao, Quanming Yao, Jianda Li, Yangqiu Song, and Dik Lun Lee. 2017. Meta-graph based recommendation fusion over heterogeneous information networks. In KDD 2017. 635–644.