Recent successes in reinforcement learning (RL) have enabled problem-solving across fields such as robotics, gaming, and natural language processing. RL concerns how agents should learn to take actions that maximize cumulative reward through interaction with an environment. The rapid advancement of RL has prompted scholars to explore new RL models for real-world applications in domains like finance, healthcare, and transportation. Data mining on graph structures is likewise an active research area, since many real-world datasets are naturally represented as graphs. With the continuous development of RL methods in recent years, scholars are increasingly interested in combining graph mining with RL to address the decision problems arising in graph mining tasks. Collaborative research on graph mining algorithms and RL models is on the rise, as evidenced by the trend in papers on Graph Reinforcement Learning (GRL) published from January 2017 to April 2022.
Traditional methods and deep learning-based models for graph mining tasks differ significantly from RL-based methods in model design and training, which poses challenges for scholars seeking to analyze graph data with RL. Extensive research spanning RL and graph mining has addressed these challenges, with attempts in areas such as rumor detection, recommendation systems, and automated machine learning (AutoML). The authors define GRL as the set of solutions and measures for solving graph mining tasks by analyzing critical components of graphs, such as nodes, links, and subgraphs, with RL methods, in order to exploit the topological structure and attribute information of the graphs. A systematic review of this area is deemed necessary, and the authors believe their work represents the first comprehensive survey of GRL methods.
Preliminaries
Graph Neural Networks
A Graph Neural Network (GNN) is a deep learning model used to process data represented as graph structures. The model learns the relationships between nodes and edges (links) within a graph in order to perform various tasks on that graph.
Fundamentally, a GNN takes as input the feature vectors of each node in the graph, representing attributes of the nodes. For example, in a social network, these feature vectors might represent user profile information.
A GNN aggregates information from neighboring nodes to compute an embedding for each node, producing representations that capture each node's local context.
The basic operation of a GNN can be described as follows:
- Input Features: Given a matrix $X$ representing the feature vectors of each node, with dimensions $n \times d$, where $n$ is the number of nodes and $d$ is the dimensionality of each node's feature vector.
- Neighbor Aggregation: Each node collects information from its neighboring nodes and integrates this information to create a new representation for itself. This can be expressed mathematically as:
\[\begin{align*} h_v^{(l)} = \sigma \left( \sum_{u \in N(v)} f^{(l)}(h_u^{(l-1)}, h_v^{(l-1)}, e_{uv}) \right) \end{align*}\]

Here, $ h_v^{(l)} $ represents the embedding of node $v$ at layer $l$, $ N(v) $ is the set of neighboring nodes of node $v$, $ f^{(l)} $ is the aggregation function at layer $l$, and $ e_{uv} $ represents information about the edge between nodes $u$ and $v$.
- Recursive Computation: This neighbor aggregation step is repeated multiple times to iteratively update the embeddings of each node, taking into account the global graph structure.
- Output Computation: Finally, the final embeddings of each node are used to perform the desired task (e.g., classification, regression, prediction).
In this way, a Graph Neural Network can infer and predict node attributes while accounting for the complexity of the graph structure, making it a powerful model for a wide range of graph applications.
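To make the aggregation step concrete, below is a minimal NumPy sketch of a two-layer GNN with mean aggregation and a ReLU nonlinearity. The toy graph, weight shapes, and the choice of mean aggregation are illustrative assumptions; practical GNNs (GCN, GraphSAGE, GAT, etc.) use more elaborate aggregation functions and learned weights.

```python
import numpy as np

def gnn_layer(H, A, W):
    """One round of neighbor aggregation: each node averages its
    neighbors' embeddings (a simple choice of f), applies a linear
    map W, then a ReLU nonlinearity (the sigma in the formula above)."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                   # isolated nodes: avoid divide-by-zero
    H_agg = (A @ H) / deg                 # mean over the neighbor set N(v)
    return np.maximum(0.0, H_agg @ W)     # sigma = ReLU

# Toy undirected graph with 4 nodes: edges (0-1), (1-2), (2-3)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.randn(4, 8)                 # n x d input feature matrix
W1, W2 = np.random.randn(8, 16), np.random.randn(16, 16)

H1 = gnn_layer(X, A, W1)                  # embeddings with 1-hop context
H2 = gnn_layer(H1, A, W2)                 # stacking layers widens the receptive field
```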
Representation Learning
Network Representation Learning, also known as graph embedding or graph representation learning, aims to learn low-dimensional vector representations (embeddings) for nodes in a network such that nodes with similar network neighborhoods are closer together in the embedding space.
Let’s delve into the details using mathematical expressions:
- Input: Given an undirected graph $ G = (V, E) $ where $ V $ is the set of nodes and $ E $ is the set of edges, represented by an adjacency matrix $ A $ where $ A_{ij} = 1 $ if there exists an edge between nodes $ i $ and $ j $, and $ A_{ij} = 0 $ otherwise.
- Objective Function: Network representation learning typically aims to minimize an objective function that measures the discrepancy between the similarity of nodes in the original graph and their similarity in the embedding space. One common objective function is the pairwise distance between nodes in the embedding space, minimized over all pairs of connected nodes in the graph:
\[\begin{align*} \text{minimize} \sum_{(i, j) \in E} d(f(v_i), f(v_j)) \end{align*}\]

Here, $ f(v_i) $ and $ f(v_j) $ represent the embeddings of nodes $ i $ and $ j $, respectively, and $ d(\cdot, \cdot) $ denotes a distance metric, such as Euclidean distance (or a dissimilarity derived from cosine similarity).
- Optimization: The objective function is minimized using optimization algorithms such as stochastic gradient descent (SGD) or its variants. During optimization, the embeddings of nodes are updated iteratively to minimize the objective function.
- Embedding Space: The learned embeddings capture structural and semantic information of the nodes in the graph. Nodes with similar network neighborhoods will have embeddings that are closer together in the embedding space, enabling downstream tasks such as node classification, link prediction, or community detection.
In summary, Network Representation Learning leverages mathematical formulations to learn low-dimensional vector representations for nodes in a network, aiming to preserve network structure and facilitate downstream network analysis tasks.
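As a concrete illustration, the sketch below minimizes the squared Euclidean distance between embeddings of connected nodes with plain SGD. The toy edge list, learning rate, and the small negative-sampling term (added so the embeddings do not all collapse to a single point, in the spirit of DeepWalk/LINE-style objectives) are assumptions for the example, not a specific published method.

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]    # the edge set E of a toy graph
n, k = 4, 2                                  # number of nodes, embedding dimension
Z = rng.normal(scale=0.1, size=(n, k))       # Z[i] is the embedding f(v_i)

lr, neg_weight = 0.05, 0.1
for step in range(200):
    for i, j in edges:
        diff = Z[i] - Z[j]                   # grad of ||z_i - z_j||^2 is proportional to diff
        Z[i] -= lr * diff                    # pull connected nodes together
        Z[j] += lr * diff
        neg = rng.integers(n)                # push apart a random "negative" node
        if neg not in (i, j):
            diff_n = Z[i] - Z[neg]
            Z[i] += lr * neg_weight * diff_n
            Z[neg] -= lr * neg_weight * diff_n

# Connected nodes now sit closer together in the embedding space
```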
Knowledge Graphs
Large-scale knowledge graphs like DBpedia, Freebase, and Yago serve as essential infrastructure for various AI applications such as recommendation systems and dialogue generation. A knowledge graph is a set of factual triples $(h, r, t)$, where $h$ represents the head entity, $t$ denotes the tail entity, and $r$ denotes the relation between them. Scholars have proposed methods for knowledge graph completion, including knowledge graph embedding and multi-hop path reasoning. These methods fall into three categories: path ranking-based, representation learning-based, and RL-based methods. RL-based methods treat knowledge graph reasoning as a Markov Decision Process (MDP).
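A tiny illustration of the triple representation (the entities and relations below are invented for the example):

```python
# A knowledge graph as a set of (head, relation, tail) triples
triples = {
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
}

# Knowledge graph completion asks for the missing element of a triple,
# e.g. (Paris, capital_of, ?). Here it is answered by direct lookup;
# embedding and multi-hop reasoning methods answer it when no explicit
# triple exists in the graph.
tails = [t for (h, r, t) in triples if h == "Paris" and r == "capital_of"]
print(tails)  # ['France']
```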
Reinforcement Learning
Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment to achieve a goal. The agent receives feedback in the form of rewards or penalties based on its actions, which helps it learn optimal strategies over time.
Mathematically, in RL, the agent learns a policy $ \pi $, which maps states $ s $ to actions $ a $, in order to maximize its cumulative reward $ R $. This is often formulated as finding the optimal policy that maximizes the expected sum of future rewards:
\[\begin{align*} \max_\pi \mathbb{E} \left[ \sum_{t=0}^{\infty} \gamma^t r_t \right] \end{align*}\]

Here, $ r_t $ represents the reward received at time step $ t $, and $ \gamma $ is a discount factor that controls the importance of future rewards.
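For instance, the discounted return of a finite episode, and a single tabular Q-learning update that nudges a value estimate toward that objective, look like this (state/action counts and numbers are arbitrary toy values):

```python
import numpy as np

gamma = 0.9
rewards = [1.0, 0.0, 0.5, 2.0]                         # r_0 ... r_3 from one episode
G = sum(gamma**t * r for t, r in enumerate(rewards))   # discounted return

# One tabular Q-learning update, a standard way to approach the
# maximization above without knowing the environment's dynamics:
Q = np.zeros((5, 2))                                   # 5 states, 2 actions (toy)
s, a, r, s_next, alpha = 0, 1, 1.0, 2, 0.1
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```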
Recent advancements in RL have led to the development of several state-of-the-art techniques. These include deep reinforcement learning, which involves using deep neural networks to approximate complex decision-making processes, and algorithms like Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and Soft Actor-Critic (SAC), which have demonstrated superior performance in various domains such as robotics, gaming, and natural language processing. Additionally, techniques like meta-reinforcement learning, where agents learn to adapt their strategies across different tasks, and model-based reinforcement learning, which leverages learned models of the environment to improve sample efficiency, are also gaining traction in the field. Overall, RL continues to be an active area of research with ongoing advancements and applications in diverse domains.
RL on Graphs
Existing methods that solve graph data mining problems with RL focus on network representation learning, adversarial attacks, and relational reasoning. In addition, many real-world applications study the GRL problem from different perspectives.
Datasets & Open-source
Graph Mining with RL
Representation Learning
Network representation learning is the process of learning a mapping that embeds nodes of a graph as low-dimensional vectors, capturing various structural and semantic information. These methods aim to optimize representations so that geometric relationships in the embedding space preserve the original graph’s structure.
The obtained node representations effectively support tasks such as node classification, clustering, link prediction, and graph classification. However, existing methods face challenges such as low feature discrimination, demand for prior knowledge, and low explainability.
To address these challenges, approaches like SUGAR utilize hierarchical learning to retain structural information and achieve discriminative representations. Furthermore, there is increasing interest in Graph Neural Networks (GNNs), with novel techniques focusing on node sampling strategies and message passing mechanisms.
Additionally, data augmentation techniques and RL-based methods are being explored to further improve network representation learning across various real-world networks.
Furthermore, GPA, which aims to improve the performance of graph representation learning, studies how to efficiently select nodes to label when training GNNs, using an active learning method to reduce annotation cost.
Relational Reasoning
Discovering and understanding causal mechanisms involves searching for Directed Acyclic Graphs (DAGs) that minimize defined score functions. While reinforcement learning (RL) methods have shown promising results in causal discovery from observed data, navigating the space of DAGs or uncovering implied conditions presents significant complexity.
Starting from randomly initialized policies, RL agents can define search policies from the information learned so far and rapidly update them with reward signals. Zhu et al. propose leveraging RL to find underlying DAGs without relying on smooth score functions. Their algorithm employs Actor-Critic as the search algorithm and outputs the graph with the best reward among all graphs generated during training.
However, this method is computationally demanding, and exploring an action space composed of directed graphs is generally difficult. Wang et al. introduce the CORL method, which incorporates RL into the ordering-based paradigm: they formulate ordering search as a multi-step MDP and implement the ordering generation process with encoder-decoder structures. Sun et al. combine transfer learning and RL for co-learning, leveraging prior causal knowledge for causal reasoning tasks.
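To give a flavor of the score-as-reward formulation these methods share, here is a heavily simplified sketch (assuming NumPy/SciPy): candidate graphs are scored by a BIC-style reward (least-squares fit of each variable on its parents, penalized by edge count), acyclicity is checked with the matrix-exponential test popularized by NOTEARS, and the best-reward graph seen is kept. Random proposals stand in for the learned Actor-Critic policy purely for brevity; all constants are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def is_dag(A):
    # tr(exp(A * A)) equals d exactly when the graph is acyclic (NOTEARS test)
    return np.isclose(np.trace(expm(A * A)), A.shape[0])

def reward(A, X):
    # BIC-style score: residuals of regressing each variable on its parents,
    # plus an edge-count penalty; higher reward = better-fitting, sparser DAG
    n, d = X.shape
    resid = 0.0
    for j in range(d):
        parents = np.flatnonzero(A[:, j])
        if parents.size:
            beta, *_ = np.linalg.lstsq(X[:, parents], X[:, j], rcond=None)
            resid += ((X[:, j] - X[:, parents] @ beta) ** 2).sum()
        else:
            resid += (X[:, j] ** 2).sum()
    return -resid - 0.5 * np.log(n) * A.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                      # observed data (toy)
best_A, best_r = None, -np.inf
for _ in range(500):
    A = (rng.random((4, 4)) < 0.2).astype(float)   # random proposal in place of
    np.fill_diagonal(A, 0)                         # the trained search policy
    if is_dag(A) and (r := reward(A, X)) > best_r:
        best_A, best_r = A, r                      # keep the best-reward graph
```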
Task-oriented Spoken Dialogue Systems (SDS) interact continuously with humans to accomplish predefined tasks. Chen et al. propose an alternative method that incorporates the DQN algorithm with novel neural network structures for dialogue policy adaptation. For natural question generation, which improves performance on Q&A tasks, Chen et al. introduce an RL-based Graph-to-Sequence (Graph2Seq) model that employs the self-critical sequence training (SCST) algorithm to optimize evaluation metrics directly.
Explicitly obtaining user preferences for recommended items and attributes through interactive conversations is the goal of conversational recommender systems. Deng et al. leverage a graph structure to integrate recommendation and conversation components. They use a dynamic weighted graph to model changing interrelationships during conversations and consider a graph-based MDP environment for simultaneous relationship processing.
With the rise of artificial intelligence, knowledge graphs have become crucial data infrastructure for various real-world applications, including dialogue systems and knowledge reasoning. Lin et al. propose a policy-based agent for multi-hop reasoning tasks trained with RL. MINERVA uses RL to train an end-to-end model that answers questions by walking multi-hop paths over the knowledge graph.
To address incomplete knowledge graphs, scholars commonly employ multi-hop reasoning. Wan et al. propose a hierarchical RL method that simulates human thinking patterns by decomposing the reasoning process into a hierarchy of RL policies. Recommendation systems, critical to online applications, incorporate DRL models like DQN and DDPG for decision-making and long-term planning in dynamic environments. KGQR integrates graph learning with sequential decision problems in interactive recommender systems, enhancing RL performance through the semantic correlations in knowledge graphs.
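A minimal sketch of the MDP formulation these multi-hop methods share: the state is the current entity, actions are outgoing (relation, entity) edges, and a terminal reward of 1 is given when the agent reaches the query's answer. The toy triples are invented, and the random action choice stands in for a trained policy (e.g., MINERVA's REINFORCE-trained agent):

```python
import random

triples = [("A", "friend_of", "B"), ("B", "works_at", "C"),
           ("A", "lives_in", "D"), ("D", "near", "C")]

def actions(entity):
    """Outgoing edges of the current entity -- the agent's action set."""
    return [(r, t) for (h, r, t) in triples if h == entity]

def rollout(start, target, max_hops=3):
    """One reasoning episode: walk the graph, reward 1 on reaching the target."""
    entity, path = start, []
    for _ in range(max_hops):
        acts = actions(entity)
        if not acts:
            break
        rel, entity = random.choice(acts)    # a trained policy replaces this
        path.append((rel, entity))
        if entity == target:
            return path, 1.0                 # success: answer entity reached
    return path, 0.0                         # failure: no reward

path, r = rollout("A", "C")                  # e.g. [('friend_of','B'), ('works_at','C')], 1.0
```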
Real World Applications
Research on Graph Reinforcement Learning (GRL) has surged in recent years, with significant implications for real-world applications. Applications span various domains including transportation network optimization, E-commerce recommendation systems, drug structure prediction, molecular structure generation, and COVID-19 control strategies. Notable advancements in GRL methods are outlined below:
1) Explainability: Scholars are focusing on enhancing the interpretability of Graph Neural Networks (GNNs) to enable their use in critical applications such as medicine, privacy, and security. Explaining GNNs at both the instance and model levels is crucial for building trust. Methods like SubgraphX and RioGNN offer explanations at the subgraph level, enhancing the interpretability of multi-relational GNNs.
2) City Services: GRL methods help address urban challenges such as traffic congestion and communication inefficiency, with applications in traffic flow prediction, traffic signal control, and electronic toll collection optimization. GRL also enhances routing in packet-switching networks and channel allocation in Wireless Local Area Networks (WLANs).
3) Epidemic Control: In controlling epidemics, GRL plays a pivotal role in predicting information diffusion, dynamically allocating resources, and identifying key nodes for intervention. Algorithms like RAI assist in curbing virus spread by leveraging social relationships in the Internet of Things (IoT).
4) Combinatorial Optimization: GRL methods are employed in solving combinatorial optimization problems efficiently. OpenGraphGym and S2V-DQN tackle combinatorial graph optimization, while Action Schema Network learns generalized policies for probabilistic planning problems.
5) Medicine: GRL techniques find applications in Clinical Decision Support (CDS), Medicine Combination Prediction (MCP), chemical reaction product prediction, and brain network analysis. Models like Graph Convolution RL and Graph Transformation Policy Network aid in medicine correlation prediction and chemical molecule generation, contributing to drug discovery research.
These applications underscore the versatility and impact of GRL across diverse fields, paving the way for innovative solutions to complex real-world challenges.
Collections by domain
Representation Learning
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | ACM SIGKDD | GAM | Graph Classification using Structural Attention | Partially Observable Markov Decision Process (POMDP) | Paper | |
2019 | IEEE TNSM | DDPG-HFA | A Deep Reinforcement Learning Approach for VNF Forwarding Graph Embedding | DDPG | Paper | |
2019 | arXiv | GraphNAS | GraphNAS: Graph Neural Architecture Search with Reinforcement Learning | MDP | Paper | Code |
2019 | arXiv | AGNN | Auto-GNN: Neural Architecture Search of Graph Neural Networks | REINFORCE | Paper | |
2019 | AISTATS | GRPI | Representation Learning on Graphs: A Reinforcement Learning Application | MDP | Paper | Code |
2019 | ICDM | GDPNet | Learning Robust Representations with Graph Denoising Policy Network | MDP | Paper | |
2020 | ICPR | DAGCN | Reinforcement learning with dual attention guided graph convolution for relation extraction | MDP | Paper | |
2020 | IEEE J-SAC | A3C+GCN | Automatic Virtual Network Embedding: A Deep Reinforcement Learning Approach With Graph Convolutional Networks | A3C | Paper | |
2020 | ICASSP | RLNet | Learning network representation through reinforcement learning | MDP | Paper | |
2020 | NeurIPS | GPA | Graph Policy Network for Transferable Active Learning on Graphs | MDP | Paper | Code |
2020 | KDD | Policy-GNN | Policy-GNN: Aggregation Optimization for Graph Neural Networks | DQN | Paper | Code |
2020 | ICLR | DGN | Graph Convolutional Reinforcement Learning | Q-Learning | Paper | Code |
2020 | AAAI/ACMAI | GAEA | GAEA: Graph Augmentation for Equitable Access via Reinforcement Learning | MDP | Paper | Code |
2021 | DASFAA | IMGER | A reinforcement learning model for influence maximization in social networks | DDQN | Paper | |
2021 | IEEE ICDM | GQNAS | GQNAS: Graph Q Network for Neural Architecture Search | DQN | Paper | |
2021 | IEEE TKDE | Netrl | Netrl: Task-aware network denoising via deep reinforcement learning | DQN | Paper | Code |
2021 | IEEE ICDM | ACE-HGNN | ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network | Nash Q-Learning | Paper | |
2021 | WWW | SUGAR | SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism | Q-Learning | Paper | Code |
2021 | ACM TOIS | RioGNN | Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks | MDP | Paper | Code |
2022 | Knowledge-Based Systems | AFGSL | AFGSL: Automatic Feature Generation based on Graph Structure Learning | Q-Learning | Paper | |
2022 | arXiv | GraphAug | Automated Data Augmentations for Graph Classification | MDP | Paper | |
2022 | IEEE TKDE | RTGNN | Multi-view Tensor Graph Neural Networks Through Reinforced Aggregation | MDP | Paper | Code |
2022 | Neurocomputing | Treeago | Treeago: Tree-structure aggregation and optimization for graph neural network | DQN | Paper | |
2022 | IEEE TKDE | GraphNAS++ | GraphNAS++: Distributed Architecture Search for Graph Neural Networks | REINFORCE | Paper | |
2022 | Neural Computing and Applications | Kyriakides et al. | Evolving graph convolutional networks for neural architecture search | MDP | Paper | |
2022 | Information Sciences | GraphTUL | Contextual spatio-temporal graph representation learning for reinforced human mobility mining | MDP | Paper | |
2022 | AAAI | BiGeNe | Batch Active Learning with Graph Neural Networks via Multi-Agent Deep Reinforcement Learning | DQN | Paper | |
2022 | arXiv | AdaNet | Robust Knowledge Adaptation for Dynamic Graph Neural Networks | REINFORCE | Paper | |
2022 | Annals of Operations Research | CRL | Counterfactual based reinforcement learning for graph neural networks | MolDQN | Paper | |
2023 | ICML | DeepIM | Deep Graph Representation Learning and Optimization for Influence Maximization | MDP | Paper | Code |
2023 | IEEE TKDE | HGNAS++ | Efficient Architecture Search for Heterogeneous Graph Neural Networks | MDP | Paper | |
2023 | Information Sciences | DeepGNAS | Search for deep graph neural networks | DQN | Paper | |
2023 | MLSys | X-RLflow | X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs Transformation | PPO | Paper | Code |
Adversarial Attacks
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | ICML | RL-S2V | Adversarial Attack on Graph Structured Data | Q-learning | Paper | |
2018 | TrustCom/BigDataSE | Yousefi et al. | A Reinforcement Learning Approach for Attack Graph Analysis | Q-learning | Paper | |
2019 | arXiv | ReWatt | Attacking Graph Convolutional Networks via Rewiring | MDP | Paper | |
2020 | CIKM | CARE-GNN | Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters | BMAB | Paper | Code |
2020 | WWW | NIPA | Adversarial Attacks on Graph Neural Networks via Node Injections: A Hierarchical Reinforcement Learning Approach | DQN | Paper | |
2021 | SBP-BRiMS | Dineen et al. | Reinforcement Learning for Data Poisoning on Graph Neural Networks | REINFORCE | Paper | |
2022 | Neural Computing and Applications | Wu et al. | Poisoning attacks against knowledge graph-based recommendation systems using deep reinforcement learning | MDP | Paper | |
2022 | KDD | KGAttack | Knowledge-enhanced Black-box Attacks for Recommendations | AC | Paper | |
2022 | IEEE TKDE | RL-GraphMI | Model Inversion Attacks Against Graph Neural Networks | Q-Learning | Paper | Code |
2023 | IJCNN | AdRumor-RL | Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection | MDP | Paper |
Relational Reasoning
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2017 | arXiv | Deeppath | Deeppath: A reinforcement learning method for knowledge graph reasoning | DQN | Paper | Code |
2017 | arXiv | KBGAN | KBGAN: Adversarial Learning for Knowledge Graph Embeddings | REINFORCE | Paper | Code |
2017 | ICLR | MINERVA | Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning | REINFORCE | Paper | Code |
2018 | IEEE ICDMW | MARLPaR | Path Reasoning over Knowledge Graph: A Multi-agent and Reinforcement Learning Based Method | MDP | Paper | |
2018 | KDD | DEERS | Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning | MDP | Paper | |
2018 | COLING | Chen et al. | Structured Dialogue Policy with Graph Neural Networks | REINFORCE | Paper | |
2018 | EMNLP | Lin et al. | Multi-Hop Knowledge Graph Reasoning with Reward Shaping | REINFORCE | Paper | |
2019 | arXiv | Graph2Seq | Reinforcement learning based graph-to-sequence model for natural question generation | MDP | Paper | Code |
2019 | arXiv | Ekar | Ekar: An Explainable Method for Knowledge Aware Recommendation | MDP | Paper | |
2019 | ACM SIGIR | PGPR | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | REINFORCE | Paper | Code |
2020 | ACM SIGIR | KGQR | Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning | DQN | Paper | |
2020 | arXiv | KG-A2C | Graph Constrained Reinforcement Learning for Natural Language Action Spaces | A2C | Paper | Code |
2020 | ACM SIGKDD | IMUP | Incremental Mobile User Profiling: Reinforcement Learning with Spatial Knowledge Graph for Modeling Event Streams | DQN | Paper | |
2020 | ICLR | RL-BIC | Causal Discovery with Reinforcement Learning | AC | Paper | Code |
2020 | arXiv | RL-HGNN | Reinforcement Learning Enhanced Heterogeneous Graph Neural Network | DQN | Paper | |
2020 | ISPA/BDCloud/SocialCom/SustainCom | DKDR | DKDR: An Approach of Knowledge Graph and Deep Reinforcement Learning for Disease Diagnosis | Q-Learning | Paper | |
2020 | KDD | NIRec | An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph | MDP | Paper | |
2020 | SIGIR | GCQN | Reinforcement Learning based Recommendation with Graph Convolutional Q-network | Q-Learning | Paper | |
2020 | Knowledge-Based Systems | GRL | GRL: Knowledge graph completion with GAN-based reinforcement learning | DDPG | Paper | |
2021 | ACM SIGIR | UNICORN | Unified conversational recommendation policy learning via graph-based reinforcement learning | DDQN | Paper | |
2021 | IEEE TNNLS | Sun et al. | Model-based transfer reinforcement learning based on graphical model representations | DDPG | Paper | |
2021 | arXiv | TITer | TimeTraveler: Reinforcement Learning for Temporal Knowledge Graph Forecasting | REINFORCE | Paper | Code |
2021 | Neurocomputing | MemoryPath | MemoryPath: A deep reinforcement learning framework for incorporating memory component into knowledge graph reasoning | MDP | Paper | |
2021 | IJCAI | CORL | Ordering-Based Causal Discovery with Reinforcement Learning | MDP | Paper | Code |
2021 | EMNLP-IJCNLP | AttnPath | Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning | MDP | Paper | |
2021 | Neural Networks | Dapath | Dapath: Distance-aware knowledge graph reasoning based on deep reinforcement learning | REINFORCE | Paper | Code |
2021 | IJCAI | RLH | Reasoning like human: Hierarchical reinforcement learning for knowledge graph reasoning | MDP | Paper | |
2020 | Knowledge-Based Systems | ADRL | ADRL: An attention-based deep reinforcement learning framework for knowledge graph reasoning | AC | Paper | |
2021 | IJCKG | PAAR | Multi-hop Knowledge Graph Reasoning Based on Hyperbolic Knowledge Graph Embedding and Reinforcement Learning | MDP | Paper | Code |
2021 | KSEM | Zheng et al. | Hierarchical Policy Network with Multi-agent for Knowledge Graph Reasoning Based on Reinforcement Learning | REINFORCE | Paper | |
2022 | Knowledge-Based Systems | RF | Dynamic knowledge graph reasoning based on deep reinforcement learning | AC | Paper | |
2022 | Applied Intelligence | RLPath | RLPath: a knowledge graph link prediction method using reinforcement learning based attentive relation path searching and representation learning | MDP | Paper | |
2022 | Soft Computing | GNNRC | A novel embedding learning framework for relation completion and recommendation based on graph neural network and multi-task learning | MDP | Paper | |
2022 | ACM/IMS Transactions on Data Science | TRGIR | A Text-based Deep Reinforcement Learning Framework Using Self-supervised Graph Representation for Interactive Recommendation | DDPG | Paper | |
2022 | arXiv | KGRGRL | KGRGRL: A User’s Permission Reasoning Method Based on Knowledge Graph Reward Guidance Reinforcement Learning | MDP | Paper | |
2022 | AAAI | CURL | Learning to Walk with Dual Agents for Knowledge Graph Reasoning | MDP | Paper | Code |
2022 | arXiv | FreeKD | FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks | MDP | Paper | |
2022 | ACM Transactions on Information Systems | Feng et al. | Reinforcement Routing on Proximity Graph for Efficient Recommendation | MDP | Paper | |
2022 | DASFAA | ExKGR | ExKGR: Explainable Multi-hop Reasoning for Evolving Knowledge Graph | MDP | Paper | |
2022 | DASFAA | Zhang et al. | A Joint Framework for Explainable Recommendation with Knowledge Reasoning and Graph Representation | A2C | Paper | |
2022 | Artificial Intelligence in Medicine | GTGAT | Gated Tree-based Graph Attention Network (GTGAT) for medical knowledge graph reasoning | MDP | Paper | |
2022 | Education and Information Technologies | MEUR | Graph path fusion and reinforcement reasoning for recommendation in MOOCs | MDP | Paper | |
2022 | AICAT | Wu et al. | A construction technology of automatic reasoning system based on knowledge graph | MDP | Paper | |
2022 | Information Processing & Management | SparKGR | Iterative rule-guided reasoning over sparse knowledge graphs with deep reinforcement learning | DQN | Paper | |
2022 | arXiv | GRADER | Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning | MDP | Paper | |
2022 | arXiv | APPO | Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach | MDP | Paper | |
2022 | arXiv | CERec | Reinforced Path Reasoning for Counterfactual Explainable Recommendation | MDP | Paper | Code |
2022 | ACM SIGIR | CGKR | Alleviating Spurious Correlations in Knowledge-aware Recommendations through Counterfactual Generator | MDP | Paper | Code |
2022 | ACM SIGIR | HICR | Conversational Recommendation via Hierarchical Information Modeling | DQN | Paper | |
2022 | ACM SIGIR | MARIS | Multi-Agent RL-based Information Selection Model for Sequential Recommendation | MDP | Paper | |
2022 | ICME | ROGC | ROGC: Role-Oriented Graph Convolution Based Multi-Agent Reinforcement Learning | MARL | Paper | |
2022 | AAAI | HiTKG | HiTKG: Towards Goal-Oriented Conversations via Multi-Hierarchy Learning | MDP | Paper | |
2022 | Applied Soft Computing | SSRL | Self-Supervised Reinforcement Learning with dual-reward for knowledge-aware recommendation | Actor-Critic | Paper | |
2022 | Knowledge-Based Systems | KAiPP | KAiPP: An interaction recommendation approach for knowledge aided intelligent process planning with reinforcement learning | MDP | Paper | |
2022 | CIKM | KRAF | A Flexible Advertising Framework using Knowledge Graph-Enriched Multi-Agent Reinforcement Learning | MARL | Paper | |
2022 | CIKM | GPR | Two-Level Graph Path Reasoning for Conversational Recommendation with User Realistic Preference | DQN | Paper | |
2022 | SSRN | VRNet | Knowledge Graph Relation Reasoning with Variational Reinforcement Network | MDP | Paper | |
2022 | KBS | Zhu et al. | Step by step: A hierarchical framework for multi-hop knowledge graph reasoning with reinforcement learning | MDP | Paper | Code |
2023 | SDM | GARL | Causal Discovery by Graph Attention Reinforcement Learning | MDP | Paper | |
2023 | Education and Information Technologies | MEUR | Graph path fusion and reinforcement reasoning for recommendation in MOOCs | Actor-Critic | Paper | |
2023 | IEEE TKDE | TMER-RL | Reinforcement Learning based Path Exploration for Sequential Explainable Recommendation | MDP | Paper | |
2023 | CLeaR | MCD | A Meta-Reinforcement Learning Algorithm for Causal Discovery | Actor-Critic | Paper | Code |
2023 | Applied Intelligence | RED | Reinforcement learning-based denoising network for sequential recommendation | MDP | Paper | |
2023 | Conference of the European Chapter of the Association for Computational Linguistics | Jiang et al. | Path Spuriousness-aware Reinforcement Learning for Multi-Hop Knowledge Graph Reasoning | REINFORCE | Paper | Code |
Real-World Applications
Explainability
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2019 | NeurIPS | GMETAEXP | Learning Transferable Graph Exploration | MDP | Paper | |
2019 | arXiv | Ekar | Ekar: An Explainable Method for Knowledge Aware Recommendation | MDP | Paper | |
2019 | ACM SIGIR | PGPR | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | REINFORCE | Paper | Code |
2020 | KDD | XGNN | XGNN: Towards Model-Level Explanations of Graph Neural Networks | MDP | Paper | |
2021 | ICML | SubgraphX | On Explainability of Graph Neural Networks via Subgraph Explorations | MCTS | Paper | Code |
2021 | arXiv | SparRL | SparRL: Graph Sparsification via Deep Reinforcement Learning | MDP | Paper | Code |
2021 | ACM TOIS | RioGNN | Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks | MDP | Paper | Code |
2022 | ICLR | G2RL | Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning | Q-Learning | Paper | |
2022 | IEEE TNNLS | LEGIT | Explaining Deep Graph Networks via Input Perturbation | MDP | Paper | Code |
2022 | ADC | Mishra et al. | Predicting Taxi Hotspots in Dynamic Conditions Using Graph Neural Network | MDP | Paper | |
2022 | CIKM | Saha et al. | A Model-Centric Explainer for Graph Neural Network based Node Classification | REINFORCE | Paper | Code |
City Services
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | IEEE Big Data | Obara et al. | Deep Reinforcement Learning Approach for Train Rescheduling Utilizing Graph Theory | DQN | Paper | |
2018 | PMLR | Zhang et al. | Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. | Actor-Critic | Paper | |
2018 | IEEE ITSC | NFQI | Traffic Signal Control Based on Reinforcement Learning with Graph Convolutional Neural Nets | NFQI | Paper | |
2019 | SOSR | Rusek et al. | Unveiling the potential of Graph Neural Networks for network modeling and optimization in SDN | MDP | Paper | |
2019 | arXiv | DRL+GNN | Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case | DQN | Paper | Code |
2019 | SIGCOMM | RouteNet | Challenging the generalization capabilities of graph neural networks for network modeling | MDP | Paper | |
2020 | IJCAI | eGCN | Dynamic Electronic Toll Collection via Multi-Agent Deep Reinforcement Learning with Edge-Based Graph Convolutional Networks | MDP | Paper | |
2020 | IEEE TMC | STMARL | STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control | DQN | Paper | |
2020 | IEEE Access | Nakashima et al. | Deep Reinforcement Learning-Based Channel Allocation for Wireless LANs With Graph Convolutional Networks | DDQN | Paper | |
2020 | Artificial Intelligence in China | DR-DCG | Coordinated Learning for Lane Changing Based on Coordination Graph and Reinforcement Learning | MDP | Paper | |
2021 | IEEE ICCCS | GraphLight | GraphLight: Graph-based Reinforcement Learning for Traffic Signal Control | REINFORCE | Paper | |
2021 | IEEE T-ITS | IG-RL | IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control | MDP | Paper | Code |
2021 | Computer‐Aided Civil and Infrastructure Engineering | GCQ | Graph neural network and reinforcement learning for multi-agent cooperative control of connected autonomous vehicles | DQN | Paper | |
2021 | IEEE T-ITS | SAGE-Graph | Deep Reinforcement Learning With Graph Representation for Vehicle Repositioning | DDQN | Paper | |
2021 | Information Sciences | Dynamic graph | Dynamic graph convolutional network for long-term traffic flow prediction with reinforcement learning | PPO | Paper | |
2022 | Digital Signal Processing | DQN-GCN-GAT | A new ensemble deep graph reinforcement learning network for spatio-temporal traffic volume forecasting in a freeway network | DQN | Paper | |
2022 | IEEE TMC | RedPacketBike | RedPacketBike: A Graph-Based Demand Modeling and Crowd-Driven Station Rebalancing Framework for Bike Sharing Systems | MDP | Paper | |
2022 | International Journal of Electrical Power & Energy Systems | GRL | Real-time fast charging station recommendation for electric vehicles in coupled power-transportation networks: A graph reinforcement learning method | DQN | Paper | |
2022 | arXiv | MuJAM | Model-based graph reinforcement learning for inductive traffic signal control | MDP | Paper | |
2022 | Applied Intelligence | GCQN-TSC | Graph cooperation deep reinforcement learning for ecological urban traffic signal control | MDP | Paper | |
2022 | arXiv | GRL | Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments | DQN | Paper | Code |
2022 | Knowledge-Based Systems | MetaSTGAT | Meta-learning based spatial-temporal graph attention network for traffic signal control | DQN | Paper | |
2022 | Applied Intelligence | VARL | VARL: a variational autoencoder-based reinforcement learning Framework for vehicle routing problems | MDP | Paper | |
2022 | Information Fusion | IHA-MDGI | An inductive heterogeneous graph attention-based multi-agent deep graph infomax algorithm for adaptive traffic signal control | multi-agent ATSC | Paper | |
2022 | Transportation Research Part C: Emerging Technologies | RDGCNI | A novel reinforced dynamic graph convolutional network model with data imputation for network-wide traffic flow prediction | DDPG | Paper | |
2022 | IJECE | ERL-MA | Evolutionary reinforcement learning multi-agents system for intelligent traffic light control: new approach and case of study | Q-Learning | Paper | |
2022 | IEEE Internet of Things Journal | GCN-based DRL | Joint Routing and Scheduling Optimization in Time-Sensitive Networks Using Graph Convolutional Network-based Deep Reinforcement Learning | DQN | Paper | |
2022 | Artificial Intelligence and Computing on Industrial Applications | MB-GCN | A Deep Coordination Graph Convolution Reinforcement Learning for Multi-Intelligent Vehicle Driving Policy | MDP | Paper | |
2022 | IET Communications | GRL | A generic intelligent routing method using deep reinforcement learning with graph neural networks | PPO | Paper | |
2022 | CIKM | Lou et al. | Meta-Reinforcement Learning for Multiple Traffic Signals Control | MDP | Paper | |
2022 | IEEE ICIEA | GCN-DQN/GCN-DDQN | Multi-Vehicles Decision-Making in Interactive Highway Exit: A Graph Reinforcement Learning Approach | DQN/DDQN | Paper | |
2022 | IEEE ITSC | Liu et al. | Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios | MDP | Paper | Code |
2023 | IET Gener. Transm. Distrib. | GraphSAGE-D3QN | An emergency control strategy for undervoltage load shedding of power system: A graph deep reinforcement learning method | D3QN | Paper |
Epidemic Control
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2020 | Nature Machine Intelligence | FINDER | Finding key players in complex networks through deep reinforcement learning | Q-Learning | Paper | Code |
2021 | ICML | RLGN | Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks | PPO | Paper | |
2021 | IEEE TNNLS | FOREST | Full-Scale Information Diffusion Prediction With Reinforced Recurrent Networks | REINFORCE | Paper | Code |
2021 | arXiv | HITTER | Hypernetwork Dismantling via Deep Reinforcement Learning | DQN | Paper | |
2022 | IEEE TETCI | EDRL-IM | Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning | DQN | Paper | |
2022 | ACM Transactions on Knowledge Discovery from Data | IDRLECA | Contact Tracing and Epidemic Intervention via Deep Reinforcement Learning | PPO | Paper | |
2022 | KDD | Vehicle | Precise Mobility Intervention for Epidemic Control Using Unobservable Information via Deep Reinforcement Learning | HRL | Paper |
Combinatorial Optimization
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2017 | NIPS | S2V-DQN | Learning Combinatorial Optimization Algorithms over Graphs | Q-Learning | Paper | Code |
2018 | AAAI | ASNets | Action Schema Networks: Generalised Policies with Deep Learning | MDP | Paper | Code |
2019 | arXiv | GPN | Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning | REINFORCE | Paper | Code |
2020 | Nature Machine Intelligence | FINDER | Finding key players in complex networks through deep reinforcement learning | Q-Learning | Paper | Code |
2021 | IEEE Communications Letters | DeepOpt | Combining Deep Reinforcement Learning With Graph Neural Networks for Optimal VNF Placement | REINFORCE | Paper | |
2020 | IEEE Access | Silva et al. | Temporal Graph Traversals Using Reinforcement Learning With Proximal Policy Optimization | PPO | Paper | |
2022 | IEEE TETCI | EDRL-IM | Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning | DQN | Paper | |
2022 | arXiv | GTA-RL | Solving Dynamic Graph Problems with Multi-Attention Deep Reinforcement Learning | REINFORCE | Paper | Code |
2022 | IEEE TII | Song et al. | Flexible Job Shop Scheduling via Graph Neural Network and Deep Reinforcement Learning | PPO | Paper | |
2022 | Engineering Applications of Artificial Intelligence | GCE-MAD | A graph convolutional encoder and multi-head attention decoder network for TSP via reinforcement learning | REINFORCE | Paper | |
2022 | IEEE TII | DGERD | A Deep Reinforcement Learning Framework Based on an Attention Mechanism and Disjunctive Graph Embedding for the Job Shop Scheduling Problem | DQN | Paper | |
2022 | arXiv | ECORD | Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration | DQN | Paper | Code |
2022 | Information Sciences | G3DQN | A graph neural networks-based deep Q-learning approach for job shop scheduling problems in traffic management | DQN | Paper | |
2022 | KDD | DGMP | Enhancing Machine Learning Approaches for Graph Optimization Problems with Diversifying Graph Augmentation | MDP | Paper | |
2022 | Neurocomputing | E-GAT | Solve routing problems with a residual edge-graph attention neural network | PPO | Paper | Code |
2022 | arXiv | N-BLS | Subgraph Matching via Query-Conditioned Subgraph Matching Neural Networks and Bi-Level Tree Search | MCTS | Paper | |
2022 | techrxiv | TOFA | You Only Train Once: A highly generalizable reinforcement learning method for dynamic job shop scheduling problem | MDP | Paper | Code |
2022 | arXiv | LKH | Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning | MDP | Paper | Code |
2022 | IEEE TII | DRL | Flexible job-shop scheduling via graph neural network and deep reinforcement learning | PPO | Paper | Code |
2023 | Knowledge-Based Systems | DeepMAG | DeepMAG: Deep reinforcement learning with multi-agent graphs for flexible job shop scheduling | MARL | Paper | |
2023 | Information Sciences | BDRL | Solving combinatorial optimization problems over graphs with BERT-Based Deep Reinforcement Learning | REINFORCE | Paper |
Medicine
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | NeurIPS | GCPN | Graph convolutional policy network for goal-directed molecular graph generation | MDP | Paper | Code |
2018 | arXiv | MolGAN | MolGAN: An implicit generative model for small molecular graphs | MDP | Paper | |
2019 | CIKM | CompNet | Order-free Medicine Combination Prediction with Graph Convolutional Reinforcement Learning | DQN | Paper | Code |
2019 | KDD | GTPN | Graph Transformation Policy Network for Chemical Reaction Prediction | A2C | Paper | |
2020 | IEEE Access | Wang et al. | Risk-Aware Identification of Highly Suspected COVID-19 Cases in Social IoT: A Joint Graph Theory and Reinforcement Learning Approach | Q-Learning | Paper | |
2020 | ISPA/BDCloud/SocialCom/SustainCom | DKDR | DKDR: An Approach of Knowledge Graph and Deep Reinforcement Learning for Disease Diagnosis | Q-Learning | Paper | |
2020 | Journal of cheminformatics | DeepGraphMolGen | DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach | PPO | Paper | Code |
2022 | arXiv | BN-GNN | Deep Reinforcement Learning Guided Graph Neural Networks for Brain Network Analysis | DDQN | Paper | |
2023 | Aiche Journal | Stops et al. | Flowsheet generation through hierarchical reinforcement learning and graph neural networks | Actor-Critic | Paper |
Others
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2017 | IEEE TNNLS | FRDNN | Deep direct reinforcement learning for financial signal representation and trading | DRL | Paper | |
2018 | ICLR | NerveNet | NerveNet: Learning Structured Policy with Graph Neural Networks | PPO | Paper | |
2020 | ACM/IEEE DAC | GCN-RL | GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning | AC | Paper | |
2021 | Expert Systems with Applications | DeepPocket | Deep graph convolutional reinforcement learning for financial portfolio management-deeppocket | AC | Paper | |
2021 | arXiv | Gnn-rl compression | Gnn-rl compression: Topologyaware network pruning using multi-stage graph embedding and reinforcement learning | DDPG | Paper | |
2021 | ICCV | AGMC | Auto graph encoder-decoder for neural network pruning | DQN | Paper | |
2022 | Applied Intelligence | GraphPruning | Graph pruning for model compression | DDPG | Paper | |
2022 | ICLR | AGILE | Know Your Action Set: Learning Action Relations for Reinforcement Learning | PPO, DQN, CDQN | Paper | Code |
2022 | ICLR | MAPSRL-2 | Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory | Q-Learning | Paper | |
2022 | ICLR | SWAT | Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning | AC | Paper | |
2022 | IEEE TPAMI | DRL-DBSCAN | Reinforced, Incremental and Cross-lingual Event Detection From Social Messages | MarGNN | Paper | Code |
2022 | ACM Transactions on Asian and Low-Resource Language Information Processing | GA-SCS | GA-SCS: Graph-Augmented Source Code Summarization | MDP | Paper | |
2022 | IP CCC | RCGNN | Reinforced Contrastive Graph Neural Networks (RCGNN) for Anomaly Detection | Recursive Scalable Reinforcement Learning (RSRL) | Paper | |
2022 | IEEE Transactions on Computational Social Systems | MADDPG | Misinformation Propagation in Online Social Networks: Game Theoretic and Reinforcement Learning Approaches | MARL | Paper |
References
- M. Nie, D. Chen and D. Wang, "Reinforcement Learning on Graphs: A Survey," in IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 7, no. 4, pp. 1065-1082, Aug. 2023, doi: 10.1109/TETCI.2022.3222545.
- S. Zhu, I. Ng, and Z. Chen, “Causal discovery with reinforcement learning,” arXiv preprint arXiv:1906.04477, 2019.
- X. Wang, Y. Du, S. Zhu, L. Ke, Z. Chen, J. Hao, and J. Wang, “Ordering-based causal discovery with reinforcement learning,” arXiv preprint arXiv:2105.06631, 2021.
- Y. Sun, K. Zhang, and C. Sun, “Model-based transfer reinforcement learning based on graphical model representations,” IEEE Transactions on Neural Networks and Learning Systems, pp. 1–14, 2021.
- L. Chen, B. Tan, S. Long, and K. Yu, “Structured dialogue policy with graph neural networks,” in Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA, 2018, pp. 1257–1268.
- Y. Chen, L. Wu, and M. J. Zaki, “Reinforcement learning based graph-to-sequence model for natural question generation,” arXiv preprint arXiv:1908.04942, 2019.
- Y. Deng, Y. Li, F. Sun, B. Ding, and W. Lam, “Unified conversational recommendation policy learning via graph-based reinforcement learning,” in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 1431–1441.
- X. V. Lin, R. Socher, and C. Xiong, “Multi-hop knowledge graph reasoning with reward shaping,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 2018, pp. 3243–3253.
- G. Wan, S. Pan, C. Gong, C. Zhou, and G. Haffari, “Reasoning like human: Hierarchical reinforcement learning for knowledge graph reasoning,” in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, 2021, pp. 1926–1932.