Reinforcement learning on graphs: A survey

Recent successes in reinforcement learning (RL) have enabled progress on problems in fields such as robotics, gaming, and natural language processing. RL studies how agents should learn to take actions that maximize cumulative reward through interaction with an environment. Its rapid advancement has prompted scholars to explore new RL models for real-world applications in domains like finance, healthcare, and transportation. Graph mining is also an active research area, as many real-world datasets are naturally represented as graphs. As RL methods have matured, scholars have become increasingly interested in combining graph mining with RL to address the decision problems that arise in graph mining tasks. Joint research on graph mining algorithms and RL models is on the rise, as evidenced by trends in published papers on Graph Reinforcement Learning (GRL) from January 2017 to April 2022.

Traditional and deep learning-based models for graph mining differ significantly from RL-based methods in model design and training, which makes it hard for scholars to apply RL to graph data. To address these challenges, researchers have made attempts in areas such as rumor detection, recommendation systems, and automated machine learning (AutoML). The authors define GRL as solutions for graph mining tasks that use RL methods to analyze critical components such as nodes, links, and subgraphs, exploring the topological structure and attribute information of graphs. A systematic review of this area is deemed necessary, and the authors believe their work is the first comprehensive survey of GRL methods.

Preliminaries

Graph Neural Networks

A Graph Neural Network (GNN) is a deep learning model used to process data represented in graph structures. This model learns relationships between nodes and edges (links) within a graph to perform various tasks within the graph.

Fundamentally, a GNN takes as input the feature vectors of each node in the graph, representing attributes of the nodes. For example, in a social network, these feature vectors might represent user profile information.

A GNN aggregates information from neighboring nodes to compute an embedding for each node, so that each representation reflects the local context of its node.

The basic operation of a GNN can be described as follows:

  1. Input Features: The input is a matrix $X$ of node feature vectors with dimensions $n \times d$, where $n$ is the number of nodes and $d$ is the dimensionality of each node’s feature vector.

  2. Neighbor Aggregation: Each node collects information from its neighboring nodes and integrates this information to create a new representation for itself. This can be expressed mathematically as:

    \[\begin{align*} h_v^{(l)} = \sigma \left( \sum_{u \in N(v)} f^{(l)}(h_u^{(l-1)}, h_v^{(l-1)}, e_{uv}) \right) \end{align*}\]

    Here, $ h_v^{(l)} $ represents the embedding of node $v$ in layer $l$ (with $ h_v^{(0)} $ given by the row of $X$ for node $v$), $ N(v) $ is the set of neighboring nodes of node $v$, $ f^{(l)} $ is the aggregation function at layer $l$, $ e_{uv} $ represents information about the edge between nodes $u$ and $v$, and $ \sigma $ is a nonlinear activation function.

  3. Recursive Computation: This neighbor aggregation step is repeated over multiple layers to iteratively update the embeddings of each node. After $l$ layers, the embedding of a node incorporates information from nodes up to $l$ hops away, so deeper stacks capture more of the global graph structure.

  4. Output Computation: The final embeddings of each node are then used to perform the desired task (e.g., classification, regression, prediction).

In this way, a Graph Neural Network effectively infers and predicts node attributes while considering the complexity of the graph structure, serving as a powerful model for various applications within graphs.
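
The four steps above can be illustrated with a small sketch in plain Python. It is not a trainable GNN: the message function `message`, the ReLU used for $\sigma$, the fixed mixing weight of 0.5, the mean readout, and the toy path graph are all assumptions chosen for illustration, standing in for the learned $f^{(l)}$ of a real model.

```python
def relu(x):
    return max(0.0, x)

def message(h_u, h_v, e_uv):
    # Hypothetical stand-in for the learned f^(l): the neighbor embedding
    # scaled by the edge weight, plus half of the receiving node's embedding.
    return [e_uv * a + 0.5 * b for a, b in zip(h_u, h_v)]

def gnn_layer(h, neighbors, edge_weight):
    """One aggregation round: h_v <- relu(sum over u in N(v) of f(h_u, h_v, e_uv))."""
    new_h = {}
    for v, h_v in h.items():
        total = [0.0] * len(h_v)
        for u in neighbors[v]:
            msg = message(h[u], h_v, edge_weight[(u, v)])
            total = [t + m for t, m in zip(total, msg)]
        new_h[v] = [relu(t) for t in total]
    return new_h

def run_gnn(features, neighbors, edge_weight, num_layers):
    """Step 3: repeat the aggregation so information travels further."""
    h = features
    for _ in range(num_layers):
        h = gnn_layer(h, neighbors, edge_weight)
    return h

# Step 1: toy path graph 0 - 1 - 2 with 2-dimensional node features.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
edge_weight = {(u, v): 1.0 for v in neighbors for u in neighbors[v]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}

one_hop = run_gnn(features, neighbors, edge_weight, num_layers=1)
two_hop = run_gnn(features, neighbors, edge_weight, num_layers=2)

# Step 4 (assumed readout): average the final node embeddings into one graph vector.
graph_vector = [sum(col) / len(two_hop) for col in zip(*two_hop.values())]
```

After one layer, node 0 only sees node 1; after two layers its embedding also depends on the features of node 2, which is the growth in receptive field that repeated aggregation provides.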

Representation Learning

Network Representation Learning, also known as graph embedding or graph representation learning, aims to learn low-dimensional vector representations (embeddings) for nodes in a network such that nodes with similar network neighborhoods are closer together in the embedding space.

Let’s delve into the details using mathematical expressions:

  1. Input: Given an undirected graph $ G = (V, E) $ where $ V $ is the set of nodes and $ E $ is the set of edges, represented by an adjacency matrix $ A $ where $ A_{ij} = 1 $ if there exists an edge between nodes $ i $ and $ j $, and $ A_{ij} = 0 $ otherwise.

  2. Objective Function: Network representation learning typically minimizes an objective function that measures the discrepancy between the similarity of nodes in the original graph and their similarity in the embedding space. A simple example is the sum of pairwise distances in the embedding space over all pairs of connected nodes in the graph:

    \[\begin{align*} \text{minimize} \sum_{(i, j) \in E} d(f(v_i), f(v_j)) \end{align*}\]

    Here, $ f(v_i) $ and $ f(v_j) $ represent the embeddings of nodes $ i $ and $ j $, respectively, and $ d(\cdot, \cdot) $ denotes a distance measure, such as Euclidean distance or cosine distance.

  3. Optimization: The objective function is minimized using optimization algorithms such as stochastic gradient descent (SGD) or its variants. During optimization, the embeddings of nodes are updated iteratively to minimize the objective function.

  4. Embedding Space: The learned embeddings capture structural and semantic information of the nodes in the graph. Nodes with similar network neighborhoods will have embeddings that are closer together in the embedding space, enabling downstream tasks such as node classification, link prediction, or community detection.

In summary, Network Representation Learning leverages mathematical formulations to learn low-dimensional vector representations for nodes in a network, aiming to preserve network structure and facilitate downstream network analysis tasks.
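
As a minimal sketch of the objective and optimization steps above, the following plain-Python code runs SGD on the edge-wise objective with squared Euclidean distance as $d$. The toy graph, learning rate, and embedding dimension are assumptions. Note that this objective on its own is minimized by mapping every node to the same point, so practical methods add negative samples or other constraints; the sketch deliberately omits them to keep the update rule visible.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def edge_loss(emb, edges):
    """Objective: sum of d(f(v_i), f(v_j)) over all edges (i, j)."""
    return sum(sq_dist(emb[i], emb[j]) for i, j in edges)

def sgd_epoch(emb, edges, lr=0.1):
    """One pass of per-edge gradient steps on the squared distance."""
    for i, j in edges:
        # The gradient of ||z_i - z_j||^2 with respect to z_i is 2 (z_i - z_j);
        # the gradient with respect to z_j is its negative.
        diff = [x - y for x, y in zip(emb[i], emb[j])]
        emb[i] = [x - lr * 2 * d for x, d in zip(emb[i], diff)]
        emb[j] = [y + lr * 2 * d for y, d in zip(emb[j], diff)]

# Toy graph (assumed): two triangles joined by the bridge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
rng = random.Random(0)
emb = {v: [rng.uniform(-1, 1) for _ in range(2)] for v in range(6)}

loss_before = edge_loss(emb, edges)
for _ in range(50):
    sgd_epoch(emb, edges)
loss_after = edge_loss(emb, edges)
```

Comparing `loss_before` with `loss_after` shows the connected nodes being pulled together; with a repulsive term for non-neighbors added, the same loop would keep the two triangles apart instead of collapsing them.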

Knowledge Graphs

Large-scale knowledge graphs like DBpedia, Freebase, and Yago serve as essential infrastructure for various AI applications such as recommendation systems and dialogue generation. A knowledge graph is defined as a set of triples $G = \{(h, r, t)\}$, where $h$ represents the head entity, $t$ denotes the tail entity, and $r$ denotes the relationship between them. Scholars have proposed methods for knowledge graph completion, including knowledge graph embedding and multi-hop path reasoning. These methods fall into three categories: path ranking-based, representation learning-based, and RL-based methods. RL-based methods treat knowledge graph reasoning as a Markov Decision Process (MDP).
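
To illustrate how triples can be read as an MDP, here is a minimal sketch: the state is the current entity, the available actions are its outgoing $(r, t)$ edges, and a terminal reward of 1 is given when the target entity is reached. The toy triples, the function names, and the hand-written `first_action_policy` are assumptions for illustration; an RL method would learn the policy instead.

```python
# Toy knowledge graph as a list of (head, relation, tail) triples (made-up data).
triples = [
    ("Alice", "works_at", "AcmeCorp"),
    ("Alice", "friend_of", "Bob"),
    ("AcmeCorp", "located_in", "Berlin"),
    ("Berlin", "capital_of", "Germany"),
]

def build_index(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    index = {}
    for h, r, t in triples:
        index.setdefault(h, []).append((r, t))
    return index

def first_action_policy(entity, choices):
    # Hypothetical fixed policy: always take the first outgoing edge.
    return choices[0]

def rollout(index, start, target, policy, max_hops=3):
    """Walk the graph for at most max_hops steps; reward 1.0 if target is reached."""
    entity, path = start, []
    for _ in range(max_hops):
        choices = index.get(entity, [])   # action space of the current state
        if not choices:
            break                          # dead end: no outgoing edges
        relation, entity = policy(entity, choices)
        path.append(relation)
        if entity == target:
            return path, 1.0
    return path, 0.0

index = build_index(triples)
path, reward = rollout(index, "Alice", "Berlin", first_action_policy)
```

The returned relation path doubles as an explanation of how the answer was reached, which is one reason path-based RL reasoning is attractive for knowledge graphs.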

Reinforcement Learning

Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment to achieve a goal. The agent receives feedback in the form of rewards or penalties based on its actions, which helps it learn optimal strategies over time.

Mathematically, in RL, the agent learns a policy $ \pi $, which maps states $ s $ to actions $ a $, in order to maximize its cumulative reward $ R $. This is often formulated as finding the optimal policy that maximizes the expected sum of future rewards:

\[\begin{align*} \max_\pi \mathbb{E} \left[ \sum_{t=0}^{\infty} \gamma^t r_t \right] \end{align*}\]

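Here, $ r_t $ represents the reward received at time step $ t $, and $ \gamma \in [0, 1) $ is a discount factor that controls the importance of future rewards: values close to 0 favor immediate rewards, while values close to 1 weight long-term rewards more heavily.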
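
For a finite episode, the discounted sum inside the expectation can be computed directly. The following is a small sketch; the function name and the example rewards are illustrative.

```python
def discounted_return(rewards, gamma):
    """Compute sum_t gamma^t * r_t for one finite episode."""
    g = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three steps with reward 1 each and gamma = 0.5: 1 + 0.5 + 0.25
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

The backward recursion is the same relation that value-based methods exploit when they bootstrap from the value of the next state.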

Recent advancements in RL have led to the development of several state-of-the-art techniques. These include deep reinforcement learning, which involves using deep neural networks to approximate complex decision-making processes, and algorithms like Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and Soft Actor-Critic (SAC), which have demonstrated superior performance in various domains such as robotics, gaming, and natural language processing. Additionally, techniques like meta-reinforcement learning, where agents learn to adapt their strategies across different tasks, and model-based reinforcement learning, which leverages learned models of the environment to improve sample efficiency, are also gaining traction in the field. Overall, RL continues to be an active area of research with ongoing advancements and applications in diverse domains.

RL on Graphs

Existing methods that apply RL to graph data mining problems focus on network representation learning, adversarial attacks, and relational reasoning. In addition, many real-world applications study the GRL problem from different perspectives.

Datasets & Open-source

Graph Mining with RL

Representation Learning

Network representation learning is the process of learning a mapping that embeds nodes of a graph as low-dimensional vectors, capturing various structural and semantic information. These methods aim to optimize representations so that geometric relationships in the embedding space preserve the original graph’s structure.

The obtained node representations effectively support tasks such as node classification, clustering, link prediction, and graph classification. However, existing methods face challenges such as low feature discrimination, demand for prior knowledge, and low explainability.

To address these challenges, approaches like SUGAR utilize hierarchical learning to retain structural information and achieve discriminative representations. Furthermore, there is increasing interest in Graph Neural Networks (GNNs), with novel techniques focusing on node sampling strategies and message passing mechanisms.

Additionally, efforts are being made to improve performance through data augmentation techniques and methods that leverage reinforcement learning for network representation learning. These advancements contribute to enhancing the performance of network representation learning across various real-world networks.

Furthermore, GPA uses active learning to decide which nodes to label, reducing the annotation cost of training GNNs and improving the learned graph representations.

Relational Reasoning

Discovering and understanding causal mechanisms involves searching for Directed Acyclic Graphs (DAGs) that minimize defined score functions. While reinforcement learning (RL) methods have shown promising results in causal discovery from observed data, navigating the space of DAGs or uncovering implied conditions presents significant complexity.

RL agents with stochastic policies can shape their own search strategies from the uncertainty information they learn, and update them rapidly using reward signals. Zhu et al. propose leveraging RL to find the underlying DAG without requiring a smooth score function. Their algorithm uses an actor-critic method as the search procedure and outputs the graph with the best reward among all graphs generated during training.

However, this method is computationally demanding, and exploring an action space made up of directed graphs is generally difficult. Wang et al. introduce the CORL method, which incorporates RL into the ordering-based paradigm. They formulate the ordering search problem as a multi-step MDP and implement the ordering generation process with an encoder-decoder structure. Sun et al. combine transfer learning and RL to leverage prior causal knowledge in causal reasoning tasks.
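
One reason an ordering-based formulation is attractive can be shown in a few lines: any graph whose edges only point from earlier to later variables in an ordering is acyclic by construction, so the search never has to check or penalize cycles. The sketch below is an illustration of that idea only, not the CORL implementation; the function names and the candidate edge list are assumptions.

```python
def is_dag(adj):
    """Kahn's algorithm on an adjacency matrix: True if the graph has no directed cycle."""
    n = len(adj)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = [v for v in range(n) if indegree[v] == 0]
    visited = 0
    while ready:
        v = ready.pop()
        visited += 1
        for w in range(n):
            if adj[v][w]:
                indegree[w] -= 1
                if indegree[w] == 0:
                    ready.append(w)
    return visited == n

def graph_from_ordering(order, candidate_edges):
    """Keep only candidate edges that agree with the variable ordering."""
    position = {v: k for k, v in enumerate(order)}
    n = len(order)
    adj = [[0] * n for _ in range(n)]
    for i, j in candidate_edges:
        if position[i] < position[j]:
            adj[i][j] = 1
    return adj

# A directed 3-cycle is not a valid causal graph ...
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
# ... but filtering the same edges through the ordering 2, 0, 1 yields a DAG.
ordered = graph_from_ordering([2, 0, 1], [(0, 1), (1, 2), (2, 0)])
```

Searching directly over graphs, as in the approach of Zhu et al., instead needs a mechanism such as a penalty term in the reward to steer the agent away from cyclic graphs.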

Task-oriented Spoken Dialogue Systems (SDS) continuously interact with humans to accomplish predefined tasks. Chen et al. propose a method that combines the DQN algorithm with structured neural network architectures for dialogue policy adaptation. Natural question generation models have also been proposed to improve Q&A task performance. For this task, Chen et al. introduce an RL-based Graph-to-Sequence (Graph2Seq) model that uses the self-critical sequence training (SCST) algorithm to optimize evaluation metrics directly.

Explicitly obtaining user preferences for recommended items and attributes through interactive conversations is the goal of conversational recommender systems. Deng et al. leverage a graph structure to integrate recommendation and conversation components. They use a dynamic weighted graph to model changing interrelationships during conversations and consider a graph-based MDP environment for simultaneous relationship processing.

With the rise of artificial intelligence, knowledge graphs have become crucial data infrastructure for various real-world applications, including dialogue systems and knowledge reasoning. Lin et al. propose a policy-based agent for multi-hop reasoning tasks through RL approaches. MINERVA utilizes RL to train end-to-end models for answering questions on multi-hop knowledge graphs.

To address the problem of incomplete knowledge graphs, scholars commonly employ multi-hop reasoning. Wan et al. suggest a hierarchical RL method that simulates human thinking patterns by decomposing the reasoning process into a hierarchy of RL policies. Recommendation systems, which are critical to online applications, incorporate deep RL (DRL) models like DQN and DDPG for decision-making and long-term planning in dynamic environments. KGQR integrates graph learning and sequential decision problems in interactive recommender systems, enhancing RL performance through semantic correlations in knowledge graphs.

Real World Applications

Research on Graph Reinforcement Learning (GRL) has surged in recent years and has attracted attention from scholars for its real-world applications. These span domains including transportation network optimization, e-commerce recommendation systems, drug structure prediction, molecular structure generation, and COVID-19 control strategies. Notable application areas of GRL methods are outlined below:

1) Explainability: Scholars are focusing on enhancing the interpretability of Graph Neural Networks (GNNs) to enable their use in critical applications such as medicine, privacy, and security. Explaining GNNs at both the instance and model levels is crucial for building trust. Methods like SubgraphX and RioGNN offer explanations at the subgraph level, enhancing the interpretability of multi-relational GNNs.

2) City Services: GRL methods help address urban challenges like traffic congestion and communication inefficiency. Applications such as traffic flow prediction, traffic signal control, and electronic toll collection optimization improve city services. GRL also enhances routing in packet-switching networks and channel allocation in Wireless Local Area Networks (WLANs).

3) Epidemic Control: In controlling epidemics, GRL plays a pivotal role in predicting information diffusion, dynamically allocating resources, and identifying key nodes for intervention. Algorithms like RAI assist in curbing virus spread by leveraging social relationships in the Internet of Things (IoT).

4) Combinatorial Optimization: GRL methods are employed in solving combinatorial optimization problems efficiently. OpenGraphGym and S2V-DQN tackle combinatorial graph optimization, while Action Schema Network learns generalized policies for probabilistic planning problems.

5) Medicine: GRL techniques find applications in Clinical Decision Support (CDS), Medicine Combination Prediction (MCP), chemical reaction product prediction, and brain network analysis. Models like Graph Convolution RL and Graph Transformation Policy Network aid in medicine correlation prediction and chemical molecule generation, contributing to drug discovery research.

These applications underscore the versatility and impact of GRL across diverse fields, paving the way for innovative solutions to complex real-world challenges.
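
To make the combinatorial optimization setting concrete, here is a minimal sketch of how a graph problem such as minimum vertex cover can be cast as a sequential decision process: the state is the partial solution, each action adds one node, and every step costs a reward of -1. The function names and the degree-based `degree_score` are illustrative assumptions; in an RL method such as S2V-DQN, a learned Q-function would play the role of this hand-written score.

```python
def degree_score(node, uncovered):
    # Hypothetical stand-in for a learned Q-value: number of uncovered edges the node touches.
    return sum(1 for edge in uncovered if node in edge)

def vertex_cover_episode(edges, score):
    """Greedily build a vertex cover; each added node yields a reward of -1."""
    uncovered = set(edges)          # state: edges not yet covered by the partial solution
    cover, total_reward = [], 0
    while uncovered:                # episode ends when every edge is covered
        candidates = sorted({v for edge in uncovered for v in edge})
        action = max(candidates, key=lambda v: score(v, uncovered))
        cover.append(action)
        uncovered = {edge for edge in uncovered if action not in edge}
        total_reward -= 1
    return cover, total_reward

# Toy graph (assumed): a star centered at node 0 plus the extra edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
cover, total_reward = vertex_cover_episode(edges, degree_score)
```

Maximizing the cumulative reward is then equivalent to minimizing the size of the cover, which is what lets a standard RL objective drive the search.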

Collections by domain

Representation Learning

| Year | Venue | Model | Title | Algorithm | Paper | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 2018 | ACM SIGKDD | GAM | Graph Classification using structural attention | Partially Observable Markov Decision Process (POMDP) | Paper | |
| 2019 | IEEE TNSM | DDPG-HFA | A Deep Reinforcement Learning Approach for VNF Forwarding Graph Embedding | DDPG | Paper | |
| 2019 | arXiv | GraphNAS | GraphNAS: Graph Neural Architecture Search with Reinforcement Learning | MDP | Paper | Code |
| 2019 | arXiv | AGNN | Auto-GNN: Neural Architecture Search of Graph Neural Networks | REINFORCE | Paper | |
| 2019 | AISTATS | GRPI | Representation Learning on Graphs: A Reinforcement Learning Application | MDP | Paper | Code |
| 2019 | ICDM | GDPNet | Learning Robust Representations with Graph Denoising Policy Network | MDP | Paper | |
| 2020 | ICPR | DAGCN | Reinforcement learning with dual attention guided graph convolution for relation extraction | MDP | Paper | |
| 2020 | IEEE J-SAC | A3C+GCN | Automatic Virtual Network Embedding: A Deep Reinforcement Learning Approach With Graph Convolutional Networks | A3C | Paper | |
| 2020 | ICASSP | RLNet | Learning network representation through reinforcement learning | MDP | Paper | |
| 2020 | NeurIPS | GPA | Graph Policy Network for Transferable Active Learning on Graphs | MDP | Paper | Code |
| 2020 | KDD | Policy-GNN | Policy-GNN: Aggregation Optimization for Graph Neural Networks | DQN | Paper | Code |
| 2020 | ICLR | DGN | Graph Convolutional Reinforcement Learning | Q-Learning | Paper | Code |
| 2020 | AAAI/ACMAI | GAEA | GAEA: Graph Augmentation for Equitable Access via Reinforcement Learning | MDP | Paper | Code |
| 2021 | DASFAA | IMGER | A reinforcement learning model for influence maximization in social networks | DDQN | Paper | |
| 2021 | IEEE ICDM | GQNAS | GQNAS: Graph Q Network for Neural Architecture Search | DQN | Paper | |
| 2021 | IEEE TKDE | Netrl | Netrl: Task-aware network denoising via deep reinforcement learning | DQN | Paper | Code |
| 2021 | IEEE ICDM | ACE-HGNN | ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network | Nash Q-learning | Paper | |
| 2021 | WWW | SUGAR | SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism | Q-Learning | Paper | Code |
| 2021 | ACM TOIS | RioGNN | Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks | MDP | Paper | Code |
| 2022 | Knowledge-Based Systems | AFGSL | AFGSL: Automatic Feature Generation based on Graph Structure Learning | Q-Learning | Paper | |
| 2022 | arXiv | GraphAug | Automated Data Augmentations for Graph Classification | MDP | Paper | |
| 2022 | IEEE TKDE | RTGNN | Multi-view Tensor Graph Neural Networks Through Reinforced Aggregation | MDP | Paper | Code |
| 2022 | Neurocomputing | Treeago | Treeago: Tree-structure aggregation and optimization for graph neural network | DQN | Paper | |
| 2022 | IEEE TKDE | GraphNAS++ | GraphNAS++: Distributed Architecture Search for Graph Neural Networks | REINFORCE | Paper | |
| 2022 | Neural Computing and Applications | Kyriakides et al. | Evolving graph convolutional networks for neural architecture search | MDP | Paper | |
| 2022 | Information Sciences | GraphTUL | Contextual spatio-temporal graph representation learning for reinforced human mobility mining | MDP | Paper | |
| 2022 | AAAI | BiGeNe | Batch Active Learning with Graph Neural Networks via Multi-Agent Deep Reinforcement Learning | DQN | Paper | |
| 2022 | arXiv | AdaNet | Robust Knowledge Adaptation for Dynamic Graph Neural Networks | REINFORCE | Paper | |
| 2022 | Annals of Operations Research | CRL | Counterfactual based reinforcement learning for graph neural networks | MolDQN | Paper | |
| 2023 | ICML | DeepIM | Deep Graph Representation Learning and Optimization for Influence Maximization | MDP | Paper | Code |
| 2023 | IEEE TKDE | HGNAS++ | HGNAS++: Efficient Architecture Search for Heterogeneous Graph Neural Networks | MDP | Paper | |
| 2023 | Information Sciences | DeepGNAS | Search for deep graph neural networks | DQN | Paper | |
| 2023 | MLSys | X-RLflow | X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs Transformation | PPO | Paper | Code |

Adversarial Attacks

| Year | Venue | Model | Title | Algorithm | Paper | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 2018 | ICML | RL-S2V | Adversarial Attack on Graph Structured Data | Q-learning | Paper | |
| 2018 | TrustCom/BigDataSE | Yousefi et al. | A Reinforcement Learning Approach for Attack Graph Analysis | Q-learning | Paper | |
| 2019 | arXiv | ReWatt | Attacking Graph Convolutional Networks via Rewiring | MDP | Paper | |
| 2020 | CIKM | CARE-GNN | Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters | BMAB | Paper | Code |
| 2020 | WWW | NIPA | Adversarial Attacks on Graph Neural Networks via Node Injections: A Hierarchical Reinforcement Learning Approach | DQN | Paper | |
| 2021 | SBP-BRiMS | Dineen et al. | Reinforcement Learning for Data Poisoning on Graph Neural Networks | REINFORCE | Paper | |
| 2022 | Neural Computing and Applications | Wu et al. | Poisoning attacks against knowledge graph-based recommendation systems using deep reinforcement learning | MDP | Paper | |
| 2022 | KDD | KGAttack | Knowledge-enhanced Black-box Attacks for Recommendations | AC | Paper | |
| 2022 | IEEE TKDE | RL-GraphMI | Model Inversion Attacks Against Graph Neural Networks | Q-Learning | Paper | Code |
| 2023 | IJCNN | AdRumor-RL | Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection | MDP | Paper | |

Relational Reasoning

| Year | Venue | Model | Title | Algorithm | Paper | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 2017 | arXiv | Deeppath | Deeppath: A reinforcement learning method for knowledge graph reasoning | DQN | Paper | Code |
| 2017 | arXiv | KBGAN | KBGAN: Adversarial Learning for Knowledge Graph Embeddings | REINFORCE | Paper | Code |
| 2017 | ICLR | MINERVA | Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning | REINFORCE | Paper | Code |
| 2018 | IEEE ICDMW | MARLPaR | Path Reasoning over Knowledge Graph: A Multi-agent and Reinforcement Learning Based Method | MDP | Paper | |
| 2018 | KDD | DEERS | Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning | MDP | Paper | |
| 2018 | COLING | Chen et al. | Structured Dialogue Policy with Graph Neural Networks | REINFORCE | Paper | |
| 2018 | EMNLP | Lin et al. | Multi-Hop Knowledge Graph Reasoning with Reward Shaping | REINFORCE | Paper | |
| 2019 | arXiv | Graph2Seq | Reinforcement learning based graph-to-sequence model for natural question generation | MDP | Paper | Code |
| 2019 | arXiv | Ekar | Ekar: An Explainable Method for Knowledge Aware Recommendation | MDP | Paper | |
| 2019 | ACM SIGIR | PGPR | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | REINFORCE | Paper | Code |
| 2020 | ACM SIGIR | KGQR | Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning | DQN | Paper | |
| 2020 | arXiv | KG-A2C | Graph Constrained Reinforcement Learning for Natural Language Action Spaces | A2C | Paper | Code |
| 2020 | ACM SIGKDD | IMUP | Incremental Mobile User Profiling: Reinforcement Learning with Spatial Knowledge Graph for Modeling Event Streams | DQN | Paper | |
| 2020 | ICLR | RL-BIC | Causal Discovery with Reinforcement Learning | AC | Paper | Code |
| 2020 | arXiv | RL-HGNN | Reinforcement Learning Enhanced Heterogeneous Graph Neural Network | DQN | Paper | |
| 2020 | ISPA/BDCloud/SocialCom/SustainCom | DKDR | DKDR: An Approach of Knowledge Graph and Deep Reinforcement Learning for Disease Diagnosis | Q-Learning | Paper | |
| 2020 | KDD | NIRec | An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph | MDP | Paper | |
| 2020 | SIGIR | GCQN | Reinforcement Learning based Recommendation with Graph Convolutional Q-network | Q-Learning | Paper | |
| 2020 | Knowledge-Based Systems | GRL | GRL: Knowledge graph completion with GAN-based reinforcement learning | DDPG | Paper | |
| 2021 | ACM SIGIR | UNICORN | Unified conversational recommendation policy learning via graph-based reinforcement learning | DDQN | Paper | |
| 2021 | IEEE TNNLS | Sun et al. | Model-based transfer reinforcement learning based on graphical model representations | DDPG | Paper | |
| 2021 | arXiv | TITer | TimeTraveler: Reinforcement Learning for Temporal Knowledge Graph Forecasting | REINFORCE | Paper | Code |
| 2021 | Neurocomputing | MemoryPath | MemoryPath: A deep reinforcement learning framework for incorporating memory component into knowledge graph reasoning | MDP | Paper | |
| 2021 | IJCAI | CORL | Ordering-Based Causal Discovery with Reinforcement Learning | MDP | Paper | Code |
| 2021 | EMNLP-IJCNLP | AttnPath | Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning | MDP | Paper | |
| 2021 | Neural Networks | Dapath | Dapath: Distance-aware knowledge graph reasoning based on deep reinforcement learning | REINFORCE | Paper | Code |
| 2021 | IJCAI | RLH | Reasoning like human: Hierarchical reinforcement learning for knowledge graph reasoning | MDP | Paper | |
| 2020 | Knowledge-Based Systems | ADRL | ADRL: An attention-based deep reinforcement learning framework for knowledge graph reasoning | AC | Paper | |
| 2021 | IJCKG | PAAR | Multi-hop Knowledge Graph Reasoning Based on Hyperbolic Knowledge Graph Embedding and Reinforcement Learning | MDP | Paper | Code |
| 2021 | KSEM | Zheng et al. | Hierarchical Policy Network with Multi-agent for Knowledge Graph Reasoning Based on Reinforcement Learning | REINFORCE | Paper | |
| 2022 | Knowledge-Based Systems | RF | Dynamic knowledge graph reasoning based on deep reinforcement learning | AC | Paper | |
| 2022 | Applied Intelligence | RLPath | RLPath: a knowledge graph link prediction method using reinforcement learning based attentive relation path searching and representation learning | MDP | Paper | |
| 2022 | Soft Computing | GNNRC | A novel embedding learning framework for relation completion and recommendation based on graph neural network and multi-task learning | MDP | Paper | |
| 2022 | ACM/IMS Transactions on Data Science | TRGIR | A Text-based Deep Reinforcement Learning Framework Using Self-supervised Graph Representation for Interactive Recommendation | DDPG | Paper | |
| 2022 | arXiv | KGRGRL | KGRGRL: A User’s Permission Reasoning Method Based on Knowledge Graph Reward Guidance Reinforcement Learning | MDP | Paper | |
| 2022 | AAAI | CURL | Learning to Walk with Dual Agents for Knowledge Graph Reasoning | MDP | Paper | Code |
| 2022 | arXiv | FreeKD | FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks | MDP | Paper | |
| 2022 | ACM Transactions on Information Systems | Feng et al. | Reinforcement Routing on Proximity Graph for Efficient Recommendation | MDP | Paper | |
| 2022 | DASFAA | ExKGR | ExKGR: Explainable Multi-hop Reasoning for Evolving Knowledge Graph | MDP | Paper | |
| 2022 | DASFAA | Zhang et al. | A Joint Framework for Explainable Recommendation with Knowledge Reasoning and Graph Representation | A2C | Paper | |
| 2022 | Artificial Intelligence in Medicine | GTGAT | Gated Tree-based Graph Attention Network (GTGAT) for medical knowledge graph reasoning | MDP | Paper | |
| 2022 | Education and Information Technologies | MEUR | Graph path fusion and reinforcement reasoning for recommendation in MOOCs | MDP | Paper | |
| 2022 | AICAT | Wu et al. | A construction technology of automatic reasoning system based on knowledge graph | MDP | Paper | |
| 2022 | Information Processing & Management | SparKGR | Iterative rule-guided reasoning over sparse knowledge graphs with deep reinforcement learning | DQN | Paper | |
| 2022 | arXiv | GRADER | Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning | MDP | Paper | |
| 2022 | arXiv | APPO | Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach | MDP | Paper | |
| 2022 | arXiv | CERec | Reinforced Path Reasoning for Counterfactual Explainable Recommendation | MDP | Paper | Code |
| 2022 | ACM SIGIR | CGKR | Alleviating Spurious Correlations in Knowledge-aware Recommendations through Counterfactual Generator | MDP | Paper | Code |
| 2022 | ACM SIGIR | HICR | Conversational Recommendation via Hierarchical Information Modeling | DQN | Paper | |
| 2022 | ACM SIGIR | MARIS | Multi-Agent RL-based Information Selection Model for Sequential Recommendation | MDP | Paper | |
| 2022 | ICME | ROGC | ROGC: Role-Oriented Graph Convolution Based Multi-Agent Reinforcement Learning | MARL | Paper | |
| 2022 | AAAI | HiTKG | HiTKG: Towards Goal-Oriented Conversations via Multi-Hierarchy Learning | MDP | Paper | |
| 2022 | Applied Soft Computing | SSRL | Self-Supervised Reinforcement Learning with dual-reward for knowledge-aware recommendation | Actor-Critic | Paper | |
| 2022 | Knowledge-Based Systems | KAiPP | KAiPP: An interaction recommendation approach for knowledge aided intelligent process planning with reinforcement learning | MDP | Paper | |
| 2022 | CIKM | KRAF | A Flexible Advertising Framework using Knowledge Graph-Enriched Multi-Agent Reinforcement Learning | MARL | Paper | |
| 2022 | CIKM | GPR | Two-Level Graph Path Reasoning for Conversational Recommendation with User Realistic Preference | DQN | Paper | |
| 2022 | SSRN | VRNet | Knowledge Graph Relation Reasoning with Variational Reinforcement Network | MDP | Paper | |
| 2022 | KBS | Zhu et al. | Step by step: A hierarchical framework for multi-hop knowledge graph reasoning with reinforcement learning | MDP | Paper | Code |
| 2023 | SDM | GARL | Causal Discovery by Graph Attention Reinforcement Learning | MDP | Paper | |
| 2023 | Education and Information Technologies | MEUR | Graph path fusion and reinforcement reasoning for recommendation in MOOCs | Actor-Critic | Paper | |
| 2023 | IEEE TKDE | TMER-RL | Reinforcement Learning based Path Exploration for Sequential Explainable Recommendation | MDP | Paper | |
| 2023 | CLeaR | MCD | A Meta-Reinforcement Learning Algorithm for Causal Discovery | Actor-Critic | Paper | Code |
| 2023 | Applied Intelligence | RED | Reinforcement learning-based denoising network for sequential recommendation | MDP | Paper | |
| 2023 | Conference of the European Chapter of the Association for Computational Linguistics | Jiang et al. | Path Spuriousness-aware Reinforcement Learning for Multi-Hop Knowledge Graph Reasoning | REINFORCE | Paper | Code |

Real-World Applications

Explainability

| Year | Venue | Model | Title | Algorithm | Paper | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 2019 | NeurIPS | GMETAEXP | Learning Transferable Graph Exploration | MDP | Paper | |
| 2019 | arXiv | Ekar | Ekar: An Explainable Method for Knowledge Aware Recommendation | MDP | Paper | |
| 2019 | ACM SIGIR | PGPR | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | REINFORCE | Paper | Code |
| 2020 | KDD | XGNN | XGNN: Towards Model-Level Explanations of Graph Neural Networks | MDP | Paper | |
| 2021 | ICML | SubgraphX | On Explainability of Graph Neural Networks via Subgraph Explorations | MCTS | Paper | Code |
| 2021 | arXiv | SparRL | SparRL: Graph Sparsification via Deep Reinforcement Learning | MDP | Paper | Code |
| 2021 | ACM TOIS | RioGNN | Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks | MDP | Paper | Code |
| 2022 | ICLR | G2RL | Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning | Q-Learning | Paper | |
| 2022 | IEEE TNNLS | LEGIT | Explaining Deep Graph Networks via Input Perturbation | MDP | Paper | Code |
| 2022 | ADC | Mishra et al. | Predicting Taxi Hotspots in Dynamic Conditions Using Graph Neural Network | MDP | Paper | |
| 2022 | CIKM | Saha et al. | A Model-Centric Explainer for Graph Neural Network based Node Classification | REINFORCE | Paper | Code |

City Services

Year Venue Model Title Algorithm Paper Code
2018 IEEE Big Data Obara et al. Deep Reinforcement Learning Approach for Train Rescheduling Utilizing Graph Theory DQN Paper  
2018 PMLR Zhang et al. Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. Actor-Critic Paper  
2018 IEEE ITSC NFQI Traffic Signal Control Based on Reinforcement Learning with Graph Convolutional Neural Nets NFQI Paper  
2019 SOSR Rusek et al. Unveiling the potential of Graph Neural Networks for network modeling and optimization in SDN MDP Paper  
2019 arXiv DRL+GNN Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case DQN Paper Code
2019 SIGCOMM RouteNet Challenging the generalization capabilities of graph neural networks for network modeling MDP Paper  
2020 IJCAI eGCN Dynamic Electronic Toll Collection via Multi-Agent Deep Reinforcement Learning with Edge-Based Graph Convolutional Networks MDP Paper  
2020 IEEE TMC STMARL STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control DQN Paper  
2020 IEEE Access NAKASHIMA et al. Deep Reinforcement Learning-Based Channel Allocation for Wireless LANs With Graph Convolutional Networks DDQN Paper  
2020 Artificial Intelligence in China DR-DCG Coordinated Learning for Lane Changing Based on Coordination Graph and Reinforcement Learning MDP Paper  
2021 IEEE ICCCS GraphLight GraphLight: Graph-based Reinforcement Learning for Traffic Signal Control REINFORCE Paper  
2021 IEEE T-ITS IG-RL IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control MDP Paper Code
2021 Computer‐Aided Civil and Infrastructure Engineering GCQ Graph neural network and reinforcement learning for multi-agent cooperative control of connected autonomous vehicles DQN Paper  
2021 IEEE T-ITS SAGE-Graph Deep Reinforcement Learning With Graph Representation for Vehicle Repositioning DDQN Paper  
2021 Information Sciences Dynamic graph Dynamic graph convolutional network for long-term traffic flow prediction with reinforcement learning PPO Paper  
2022 Digital Signal Processing DQN-GCN-GAT A new ensemble deep graph reinforcement learning network for spatio-temporal traffic volume forecasting in a freeway network DQN Paper  
2022 IEEE TMC RedPacketBike RedPacketBike: A Graph-Based Demand Modeling and Crowd-Driven Station Rebalancing Framework for Bike Sharing Systems MDP Paper  
2022 International Journal of Electrical Power & Energy Systems GRL Real-time fast charging station recommendation for electric vehicles in coupled power-transportation networks: A graph reinforcement learning method DQN Paper  
2022 arXiv MuJAM Model-based graph reinforcement learning for inductive traffic signal control MDP Paper  
2022 Applied Intelligence GCQN-TSC Graph cooperation deep reinforcement learning for ecological urban traffic signal control MDP Paper  
2022 arXiv GRL Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments DQN Paper Code
2022 Knowledge-Based Systems MetaSTGAT Meta-learning based spatial-temporal graph attention network for traffic signal control DQN Paper  
2022 Applied Intelligence VARL VARL: a variational autoencoder-based reinforcement learning Framework for vehicle routing problems MDP Paper  
2022 Information Fusion IHA-MDGI An inductive heterogeneous graph attention-based multi-agent deep graph infomax algorithm for adaptive traffic signal control multi-agent ATSC Paper  
2022 Transportation Research Part C: Emerging Technologies RDGCNI A novel reinforced dynamic graph convolutional network model with data imputation for network-wide traffic flow prediction DDPG Paper  
2022 IJECE ERL-MA Evolutionary reinforcement learning multi-agents system for intelligent traffic light control: new approach and case of study Q-Learning Paper  
2022 IEEE Internet of Things Journal GCN-based DRL Joint Routing and Scheduling Optimization in Time-Sensitive Networks Using Graph Convolutional Network-based Deep Reinforcement Learning DQN Paper  
2022 Artificial Intelligence and Computing on Industrial Applications MB-GCN A Deep Coordination Graph Convolution Reinforcement Learning for Multi-Intelligent Vehicle Driving Policy MDP Paper  
2022 IET Communications GRL A generic intelligent routing method using deep reinforcement learning with graph neural networks PPO Paper  
2022 CIKM Lou et al. Meta-Reinforcement Learning for Multiple Traffic Signals Control MDP Paper  
2022 IEEE ICIEA GCN-DQN/GCN-DDQN Multi-Vehicles Decision-Making in Interactive Highway Exit: A Graph Reinforcement Learning Approach DQN/DDQN Paper  
2022 IEEE ITSC Liu et al. Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios MDP Paper Code
2023 IET Gener. Transm. Distrib. GraphSAGE-D3QN An emergency control strategy for undervoltage load shedding of power system: A graph deep reinforcement learning method D3QN Paper  

Epidemic Control

Year Venue Model Title Algorithm Paper Code
2020 Nature Machine Intelligence FINDER Finding key players in complex networks through deep reinforcement learning Q-Learning Paper Code
2021 ICML RLGN Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks PPO Paper  
2021 IEEE TNNLS FOREST Full-Scale Information Diffusion Prediction With Reinforced Recurrent Networks REINFORCE Paper Code
2021 arXiv HITTER Hypernetwork Dismantling via Deep Reinforcement Learning DQN Paper  
2022 IEEE TETCI EDRL-IM Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning DQN Paper  
2022 ACM Transactions on Knowledge Discovery from Data IDRLECA Contact Tracing and Epidemic Intervention via Deep Reinforcement Learning PPO Paper  
2022 KDD Vehicle Precise Mobility Intervention for Epidemic Control Using Unobservable Information via Deep Reinforcement Learning HRL Paper  

Combinatorial Optimization

Year Venue Model Title Algorithm Paper Code
2017 NIPS S2V-DQN Learning Combinatorial Optimization Algorithms over Graphs Q-Learning Paper Code
2018 AAAI ASNets Action Schema Networks: Generalised Policies with Deep Learning MDP Paper Code
2019 arXiv GPN Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning REINFORCE Paper Code
2020 Nature Machine Intelligence FINDER Finding key players in complex networks through deep reinforcement learning Q-Learning Paper Code
2020 IEEE Access Silva et al. Temporal Graph Traversals Using Reinforcement Learning With Proximal Policy Optimization PPO Paper  
2021 IEEE Communications Letters DeepOpt Combining Deep Reinforcement Learning With Graph Neural Networks for Optimal VNF Placement REINFORCE Paper  
2022 IEEE TETCI EDRL-IM Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning DQN Paper  
2022 arXiv GTA-RL Solving Dynamic Graph Problems with Multi-Attention Deep Reinforcement Learning REINFORCE Paper Code
2022 IEEE TII Song et al. Flexible Job Shop Scheduling via Graph Neural Network and Deep Reinforcement Learning PPO Paper  
2022 Engineering Applications of Artificial Intelligence GCE-MAD A graph convolutional encoder and multi-head attention decoder network for TSP via reinforcement learning REINFORCE Paper  
2022 IEEE TII DGERD A Deep Reinforcement Learning Framework Based on an Attention Mechanism and Disjunctive Graph Embedding for the Job Shop Scheduling Problem DQN Paper  
2022 arXiv ECORD Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration DQN Paper Code
2022 Information Sciences G3DQN A graph neural networks-based deep Q-learning approach for job shop scheduling problems in traffic management DQN Paper  
2022 KDD DGMP Enhancing Machine Learning Approaches for Graph Optimization Problems with Diversifying Graph Augmentation MDP Paper  
2022 Neurocomputing E-GAT Solve routing problems with a residual edge-graph attention neural network PPO Paper Code
2022 arXiv N-BLS Subgraph Matching via Query-Conditioned Subgraph Matching Neural Networks and Bi-Level Tree Search MCTS Paper  
2022 TechRxiv TOFA You Only Train Once: A highly generalizable reinforcement learning method for dynamic job shop scheduling problem MDP Paper Code
2022 arXiv LKH Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning MDP Paper Code
2022 IEEE TII DRL Flexible job-shop scheduling via graph neural network and deep reinforcement learning PPO Paper Code
2023 Knowledge-Based Systems DeepMAG DeepMAG: Deep reinforcement learning with multi-agent graphs for flexible job shop scheduling MARL Paper  
2023 Information Sciences BDRL Solving combinatorial optimization problems over graphs with BERT-Based Deep Reinforcement Learning REINFORCE Paper  

Medicine

Year Venue Model Title Algorithm Paper Code
2018 NeurIPS GCPN Graph convolutional policy network for goal-directed molecular graph generation MDP Paper Code
2018 arXiv MolGAN MolGAN: An implicit generative model for small molecular graphs MDP Paper  
2019 CIKM CompNet Order-free Medicine Combination Prediction with Graph Convolutional Reinforcement Learning DQN Paper Code
2019 KDD GTPN Graph Transformation Policy Network for Chemical Reaction Prediction A2C Paper  
2020 IEEE Access Wang et al. Risk-Aware Identification of Highly Suspected COVID-19 Cases in Social IoT: A Joint Graph Theory and Reinforcement Learning Approach Q-Learning Paper  
2020 ISPA/BDCloud/SocialCom/SustainCom DKDR DKDR: An Approach of Knowledge Graph and Deep Reinforcement Learning for Disease Diagnosis Q-Learning Paper  
2020 Journal of Cheminformatics DeepGraphMolGen DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach PPO Paper Code
2022 arXiv BN-GNN Deep Reinforcement Learning Guided Graph Neural Networks for Brain Network Analysis DDQN Paper  
2023 AIChE Journal Stops et al. Flowsheet generation through hierarchical reinforcement learning and graph neural networks Actor-Critic Paper  

Others

Year Venue Model Title Algorithm Paper Code
2017 IEEE TNNLS FRDNN Deep direct reinforcement learning for financial signal representation and trading DRL Paper  
2018 ICLR NerveNet NerveNet: Learning Structured Policy with Graph Neural Networks PPO Paper  
2020 ACM/IEEE DAC GCN-RL GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning AC Paper  
2021 Expert Systems with Applications DeepPocket Deep graph convolutional reinforcement learning for financial portfolio management - DeepPocket AC Paper  
2021 arXiv Gnn-rl compression Gnn-rl compression: Topology-aware network pruning using multi-stage graph embedding and reinforcement learning DDPG Paper  
2021 ICCV AGMC Auto graph encoder-decoder for neural network pruning DQN Paper  
2022 Applied Intelligence GraphPruning Graph pruning for model compression DDPG Paper  
2022 ICLR AGILE Know Your Action Set: Learning Action Relations for Reinforcement Learning PPO/DQN/CDQN Paper Code
2022 ICLR MAPSRL-2 Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory Q-Learning Paper  
2022 ICLR SWAT Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning AC Paper  
2022 IEEE TPAMI DRL-DBSCAN Reinforced, Incremental and Cross-lingual Event Detection From Social Messages MarGNN Paper Code
2022 ACM Transactions on Asian and Low-Resource Language Information Processing GA-SCS GA-SCS: Graph-Augmented Source Code Summarization MDP Paper  
2022 IPCCC RCGNN Reinforced Contrastive Graph Neural Networks (RCGNN) for Anomaly Detection Recursive Scalable Reinforcement Learning (RSRL) Paper  
2022 IEEE Transactions on Computational Social Systems MADDPG Misinformation Propagation in Online Social Networks: Game Theoretic and Reinforcement Learning Approaches MARL Paper  

Reference