Recent successes in reinforcement learning (RL) have enabled problem-solving across fields such as robotics, gaming, and natural language processing. RL concerns how agents should learn to take actions that maximize cumulative reward through interaction with an environment. The rapid advancement of RL has prompted scholars to explore new RL models for real-world applications in domains like finance, healthcare, and transportation. Data mining on graph structures is likewise an active research area, since many real-world datasets are naturally represented as graphs. With the continuous development of RL methods in recent years, scholars are increasingly interested in combining graph mining with RL to address the decision problems arising in graph mining tasks. Collaborative research on graph mining algorithms and RL models is on the rise, as evidenced by the trend in papers on Graph Reinforcement Learning (GRL) published from January 2017 to April 2022.
Traditional methods and deep learning-based models for graph mining tasks differ significantly from RL-based methods in model design and training, which poses challenges for scholars seeking to analyze graph data with RL. Extensive research spanning RL and graph mining has addressed these challenges, with attempts in areas such as rumor detection, recommendation systems, and automated machine learning (AutoML). The authors define GRL as the set of solutions and measures for solving graph mining tasks by analyzing critical components of graphs, such as nodes, links, and subgraphs, with RL methods, in order to exploit the topological structure and attribute information of the graphs. A systematic review of this area is deemed necessary, and the authors believe their work represents the first comprehensive survey of GRL methods.
Preliminaries
Graph Neural Networks
A Graph Neural Network (GNN) is a deep learning model used to process data represented as graph structures. The model learns the relationships between nodes and edges (links) within a graph in order to perform various tasks on that graph.
Fundamentally, a GNN takes as input the feature vectors of each node in the graph, representing attributes of the nodes. For example, in a social network, these feature vectors might represent user profile information.
A GNN aggregates information from neighboring nodes to compute an embedding for each node, producing representations that capture each node's local context.
The basic operation of a GNN can be described as follows:
- Input Features: Given a matrix $X$ representing the feature vectors of each node, with dimensions $n \times d$, where $n$ is the number of nodes and $d$ is the dimensionality of each node's feature vector.
- Neighbor Aggregation: Each node collects information from its neighboring nodes and integrates this information to create a new representation for itself. This can be expressed mathematically as:
\[\begin{align*} h_v^{(l)} = \sigma \left( \sum_{u \in N(v)} f^{(l)}(h_u^{(l-1)}, h_v^{(l-1)}, e_{uv}) \right) \end{align*}\]

Here, $ h_v^{(l)} $ represents the embedding of node $v$ at layer $l$, $ N(v) $ is the set of neighboring nodes of node $v$, $ f^{(l)} $ is the aggregation function at layer $l$, and $ e_{uv} $ represents information about the edge between nodes $u$ and $v$.
- Recursive Computation: This neighbor aggregation step is repeated multiple times to iteratively update the embeddings of each node, taking into account the global graph structure.
- Output Computation: Finally, the final embeddings of each node are used to perform the desired task (e.g., classification, regression, prediction).
In this way, a Graph Neural Network can infer and predict node attributes while accounting for the complexity of the graph structure, making it a powerful model for a wide range of graph applications.
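To make the aggregation step concrete, below is a minimal NumPy sketch of a two-layer GNN with mean aggregation and a ReLU nonlinearity. The toy graph, weight shapes, and the choice of mean aggregation are illustrative assumptions; practical GNNs (GCN, GraphSAGE, GAT, etc.) use more elaborate aggregation functions and learned weights.

```python
import numpy as np

def gnn_layer(H, A, W):
    """One round of neighbor aggregation: each node averages its
    neighbors' embeddings (a simple choice of f), applies a linear
    map W, then a ReLU nonlinearity (the sigma in the formula above)."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                   # isolated nodes: avoid divide-by-zero
    H_agg = (A @ H) / deg                 # mean over the neighbor set N(v)
    return np.maximum(0.0, H_agg @ W)     # sigma = ReLU

# Toy undirected graph with 4 nodes: edges (0-1), (1-2), (2-3)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.randn(4, 8)                 # n x d input feature matrix
W1, W2 = np.random.randn(8, 16), np.random.randn(16, 16)

H1 = gnn_layer(X, A, W1)                  # embeddings with 1-hop context
H2 = gnn_layer(H1, A, W2)                 # stacking layers widens the receptive field
```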
Representation Learning
Network Representation Learning, also known as graph embedding or graph representation learning, aims to learn low-dimensional vector representations (embeddings) for nodes in a network such that nodes with similar network neighborhoods are closer together in the embedding space.
Let’s delve into the details using mathematical expressions:
- Input: Given an undirected graph $ G = (V, E) $ where $ V $ is the set of nodes and $ E $ is the set of edges, represented by an adjacency matrix $ A $ where $ A_{ij} = 1 $ if there exists an edge between nodes $ i $ and $ j $, and $ A_{ij} = 0 $ otherwise.
- Objective Function: Network representation learning typically aims to minimize an objective function that measures the discrepancy between the similarity of nodes in the original graph and their similarity in the embedding space. One common objective function is the pairwise distance between nodes in the embedding space, minimized over all pairs of connected nodes in the graph:
\[\begin{align*} \text{minimize} \sum_{(i, j) \in E} d(f(v_i), f(v_j)) \end{align*}\]

Here, $ f(v_i) $ and $ f(v_j) $ represent the embeddings of nodes $ i $ and $ j $, respectively, and $ d(\cdot, \cdot) $ denotes a distance metric, such as Euclidean distance (or a dissimilarity derived from cosine similarity).
- Optimization: The objective function is minimized using optimization algorithms such as stochastic gradient descent (SGD) or its variants. During optimization, the embeddings of nodes are updated iteratively to minimize the objective function.
- Embedding Space: The learned embeddings capture structural and semantic information of the nodes in the graph. Nodes with similar network neighborhoods will have embeddings that are closer together in the embedding space, enabling downstream tasks such as node classification, link prediction, or community detection.
In summary, Network Representation Learning leverages mathematical formulations to learn low-dimensional vector representations for nodes in a network, aiming to preserve network structure and facilitate downstream network analysis tasks.
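As a concrete illustration, the sketch below minimizes the squared Euclidean distance between embeddings of connected nodes with plain SGD. The toy edge list, learning rate, and the small negative-sampling term (added so the embeddings do not all collapse to a single point, in the spirit of DeepWalk/LINE-style objectives) are assumptions for the example, not a specific published method.

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]    # the edge set E of a toy graph
n, k = 4, 2                                  # number of nodes, embedding dimension
Z = rng.normal(scale=0.1, size=(n, k))       # Z[i] is the embedding f(v_i)

lr, neg_weight = 0.05, 0.1
for step in range(200):
    for i, j in edges:
        diff = Z[i] - Z[j]                   # grad of ||z_i - z_j||^2 is proportional to diff
        Z[i] -= lr * diff                    # pull connected nodes together
        Z[j] += lr * diff
        neg = rng.integers(n)                # push apart a random "negative" node
        if neg not in (i, j):
            diff_n = Z[i] - Z[neg]
            Z[i] += lr * neg_weight * diff_n
            Z[neg] -= lr * neg_weight * diff_n

# Connected nodes now sit closer together in the embedding space
```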
Knowledge Graphs
Large-scale knowledge graphs like DBpedia, Freebase, and Yago serve as essential infrastructure for various AI applications such as recommendation systems and dialogue generation. A knowledge graph is a set of factual triples $(h, r, t)$, where $h$ represents the head entity, $t$ denotes the tail entity, and $r$ denotes the relation between them. Scholars have proposed methods for knowledge graph completion, including knowledge graph embedding and multi-hop path reasoning. These methods fall into three categories: path ranking-based, representation learning-based, and RL-based methods. RL-based methods treat knowledge graph reasoning as a Markov Decision Process (MDP).
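A tiny illustration of the triple representation (the entities and relations below are invented for the example):

```python
# A knowledge graph as a set of (head, relation, tail) triples
triples = {
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
}

# Knowledge graph completion asks for the missing element of a triple,
# e.g. (Paris, capital_of, ?). Here it is answered by direct lookup;
# embedding and multi-hop reasoning methods answer it when no explicit
# triple exists in the graph.
tails = [t for (h, r, t) in triples if h == "Paris" and r == "capital_of"]
print(tails)  # ['France']
```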
Reinforcement Learning
Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment to achieve a goal. The agent receives feedback in the form of rewards or penalties based on its actions, which helps it learn optimal strategies over time.
Mathematically, in RL, the agent learns a policy $ \pi $, which maps states $ s $ to actions $ a $, in order to maximize its cumulative reward $ R $. This is often formulated as finding the optimal policy that maximizes the expected sum of future rewards:
\[\begin{align*} \max_\pi \mathbb{E} \left[ \sum_{t=0}^{\infty} \gamma^t r_t \right] \end{align*}\]

Here, $ r_t $ represents the reward received at time step $ t $, and $ \gamma $ is a discount factor that controls the importance of future rewards.
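For instance, the discounted return of a finite episode, and a single tabular Q-learning update that nudges a value estimate toward that objective, look like this (state/action counts and numbers are arbitrary toy values):

```python
import numpy as np

gamma = 0.9
rewards = [1.0, 0.0, 0.5, 2.0]                         # r_0 ... r_3 from one episode
G = sum(gamma**t * r for t, r in enumerate(rewards))   # discounted return

# One tabular Q-learning update, a standard way to approach the
# maximization above without knowing the environment's dynamics:
Q = np.zeros((5, 2))                                   # 5 states, 2 actions (toy)
s, a, r, s_next, alpha = 0, 1, 1.0, 2, 0.1
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```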
Recent advancements in RL have led to the development of several state-of-the-art techniques. These include deep reinforcement learning, which involves using deep neural networks to approximate complex decision-making processes, and algorithms like Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and Soft Actor-Critic (SAC), which have demonstrated superior performance in various domains such as robotics, gaming, and natural language processing. Additionally, techniques like meta-reinforcement learning, where agents learn to adapt their strategies across different tasks, and model-based reinforcement learning, which leverages learned models of the environment to improve sample efficiency, are also gaining traction in the field. Overall, RL continues to be an active area of research with ongoing advancements and applications in diverse domains.
RL on Graphs
Existing methods that solve graph data mining problems with RL focus on network representation learning, adversarial attacks, and relational reasoning. In addition, many real-world applications study the GRL problem from different perspectives.
Datasets & Open-source
Graph Mining with RL
Representation Learning
Network representation learning is the process of learning a mapping that embeds nodes of a graph as low-dimensional vectors, capturing various structural and semantic information. These methods aim to optimize representations so that geometric relationships in the embedding space preserve the original graph’s structure.
The obtained node representations effectively support tasks such as node classification, clustering, link prediction, and graph classification. However, existing methods face challenges such as low feature discrimination, demand for prior knowledge, and low explainability.
To address these challenges, approaches like SUGAR utilize hierarchical learning to retain structural information and achieve discriminative representations. Furthermore, there is increasing interest in Graph Neural Networks (GNNs), with novel techniques focusing on node sampling strategies and message passing mechanisms.
Additionally, data augmentation techniques and RL-based methods are being explored to further improve network representation learning across various real-world networks.
Furthermore, GPA, which aims to improve the performance of graph representation learning, studies how to efficiently select nodes to label when training GNNs, using an active learning method to reduce annotation cost.
Relational Reasoning
Discovering and understanding causal mechanisms involves searching for Directed Acyclic Graphs (DAGs) that minimize defined score functions. While reinforcement learning (RL) methods have shown promising results in causal discovery from observed data, navigating the space of DAGs or uncovering implied conditions presents significant complexity.
Starting from randomly initialized policies, RL agents can define search policies from the information learned so far and rapidly update them with reward signals. Zhu et al. propose leveraging RL to find underlying DAGs without relying on smooth score functions. Their algorithm employs Actor-Critic as the search algorithm and outputs the graph with the best reward among all graphs generated during training.
However, this method is computationally demanding, and exploring an action space composed of directed graphs is generally difficult. Wang et al. introduce the CORL method, which incorporates RL into the ordering-based paradigm: they formulate ordering search as a multi-step MDP and implement the ordering generation process with encoder-decoder structures. Sun et al. combine transfer learning and RL for co-learning, leveraging prior causal knowledge for causal reasoning tasks.
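To give a flavor of the score-as-reward formulation these methods share, here is a heavily simplified sketch (assuming NumPy/SciPy): candidate graphs are scored by a BIC-style reward (least-squares fit of each variable on its parents, penalized by edge count), acyclicity is checked with the matrix-exponential test popularized by NOTEARS, and the best-reward graph seen is kept. Random proposals stand in for the learned Actor-Critic policy purely for brevity; all constants are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def is_dag(A):
    # tr(exp(A * A)) equals d exactly when the graph is acyclic (NOTEARS test)
    return np.isclose(np.trace(expm(A * A)), A.shape[0])

def reward(A, X):
    # BIC-style score: residuals of regressing each variable on its parents,
    # plus an edge-count penalty; higher reward = better-fitting, sparser DAG
    n, d = X.shape
    resid = 0.0
    for j in range(d):
        parents = np.flatnonzero(A[:, j])
        if parents.size:
            beta, *_ = np.linalg.lstsq(X[:, parents], X[:, j], rcond=None)
            resid += ((X[:, j] - X[:, parents] @ beta) ** 2).sum()
        else:
            resid += (X[:, j] ** 2).sum()
    return -resid - 0.5 * np.log(n) * A.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                      # observed data (toy)
best_A, best_r = None, -np.inf
for _ in range(500):
    A = (rng.random((4, 4)) < 0.2).astype(float)   # random proposal in place of
    np.fill_diagonal(A, 0)                         # the trained search policy
    if is_dag(A) and (r := reward(A, X)) > best_r:
        best_A, best_r = A, r                      # keep the best-reward graph
```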
Task-oriented Spoken Dialogue Systems (SDS) interact continuously with humans to accomplish predefined tasks. Chen et al. propose an alternative method that incorporates the DQN algorithm with novel neural network structures for dialogue policy adaptation. For natural question generation, which improves performance on Q&A tasks, Chen et al. introduce an RL-based Graph-to-Sequence (Graph2Seq) model that employs the self-critical sequence training (SCST) algorithm to optimize evaluation metrics directly.
Explicitly obtaining user preferences for recommended items and attributes through interactive conversations is the goal of conversational recommender systems. Deng et al. leverage a graph structure to integrate recommendation and conversation components. They use a dynamic weighted graph to model changing interrelationships during conversations and consider a graph-based MDP environment for simultaneous relationship processing.
With the rise of artificial intelligence, knowledge graphs have become crucial data infrastructure for various real-world applications, including dialogue systems and knowledge reasoning. Lin et al. propose a policy-based agent for multi-hop reasoning tasks trained with RL. MINERVA uses RL to train an end-to-end model that answers questions by walking multi-hop paths over the knowledge graph.
To address incomplete knowledge graphs, scholars commonly employ multi-hop reasoning. Wan et al. propose a hierarchical RL method that simulates human thinking patterns by decomposing the reasoning process into a hierarchy of RL policies. Recommendation systems, critical to online applications, incorporate DRL models like DQN and DDPG for decision-making and long-term planning in dynamic environments. KGQR integrates graph learning with sequential decision problems in interactive recommender systems, enhancing RL performance through the semantic correlations in knowledge graphs.
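A minimal sketch of the MDP formulation these multi-hop methods share: the state is the current entity, actions are outgoing (relation, entity) edges, and a terminal reward of 1 is given when the agent reaches the query's answer. The toy triples are invented, and the random action choice stands in for a trained policy (e.g., MINERVA's REINFORCE-trained agent):

```python
import random

triples = [("A", "friend_of", "B"), ("B", "works_at", "C"),
           ("A", "lives_in", "D"), ("D", "near", "C")]

def actions(entity):
    """Outgoing edges of the current entity -- the agent's action set."""
    return [(r, t) for (h, r, t) in triples if h == entity]

def rollout(start, target, max_hops=3):
    """One reasoning episode: walk the graph, reward 1 on reaching the target."""
    entity, path = start, []
    for _ in range(max_hops):
        acts = actions(entity)
        if not acts:
            break
        rel, entity = random.choice(acts)    # a trained policy replaces this
        path.append((rel, entity))
        if entity == target:
            return path, 1.0                 # success: answer entity reached
    return path, 0.0                         # failure: no reward

path, r = rollout("A", "C")                  # e.g. [('friend_of','B'), ('works_at','C')], 1.0
```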
Real World Applications
Research on Graph Reinforcement Learning (GRL) has surged in recent years, with significant implications for real-world applications. Applications span various domains including transportation network optimization, E-commerce recommendation systems, drug structure prediction, molecular structure generation, and COVID-19 control strategies. Notable advancements in GRL methods are outlined below:
1) Explainability: Scholars are focusing on enhancing the interpretability of Graph Neural Networks (GNNs) to enable their use in critical applications such as medicine, privacy, and security. Explaining GNNs at both the instance and model levels is crucial for building trust. Methods like SubgraphX and RioGNN offer explanations at the subgraph level, enhancing the interpretability of multi-relational GNNs.
2) City Services: GRL methods help address urban challenges such as traffic congestion and communication inefficiency, with applications in traffic flow prediction, traffic signal control, and electronic toll collection optimization. GRL also enhances routing in packet-switching networks and channel allocation in Wireless Local Area Networks (WLANs).
3) Epidemic Control: In controlling epidemics, GRL plays a pivotal role in predicting information diffusion, dynamically allocating resources, and identifying key nodes for intervention. Algorithms like RAI assist in curbing virus spread by leveraging social relationships in the Internet of Things (IoT).
4) Combinatorial Optimization: GRL methods are employed in solving combinatorial optimization problems efficiently. OpenGraphGym and S2V-DQN tackle combinatorial graph optimization, while Action Schema Network learns generalized policies for probabilistic planning problems.
5) Medicine: GRL techniques find applications in Clinical Decision Support (CDS), Medicine Combination Prediction (MCP), chemical reaction product prediction, and brain network analysis. Models like Graph Convolution RL and Graph Transformation Policy Network aid in medicine correlation prediction and chemical molecule generation, contributing to drug discovery research.
These applications underscore the versatility and impact of GRL across diverse fields, paving the way for innovative solutions to complex real-world challenges.
Collections by domain
Representation Learning
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | ACM SIGKDD | GAM | Graph Classification using Structural Attention | Partially Observable Markov Decision Process (POMDP) | Paper | |
2019 | IEEE TNSM | DDPG-HFA | A Deep Reinforcement Learning Approach for VNF Forwarding Graph Embedding | DDPG | Paper | |
2019 | arXiv | GraphNAS | GraphNAS: Graph Neural Architecture Search with Reinforcement Learning | MDP | Paper | Code |
2019 | arXiv | AGNN | Auto-GNN: Neural Architecture Search of Graph Neural Networks | REINFORCE | Paper | |
2019 | AISTATS | GRPI | Representation Learning on Graphs: A Reinforcement Learning Application | MDP | Paper | Code |
2019 | ICDM | GDPNet | Learning Robust Representations with Graph Denoising Policy Network | MDP | Paper | |
2020 | ICPR | DAGCN | Reinforcement learning with dual attention guided graph convolution for relation extraction | MDP | Paper | |
2020 | IEEE J-SAC | A3C+GCN | Automatic Virtual Network Embedding: A Deep Reinforcement Learning Approach With Graph Convolutional Networks | A3C | Paper | |
2020 | ICASSP | RLNet | Learning network representation through reinforcement learning | MDP | Paper | |
2020 | NeurIPS | GPA | Graph Policy Network for Transferable Active Learning on Graphs | MDP | Paper | Code |
2020 | KDD | Policy-GNN | Policy-GNN: Aggregation Optimization for Graph Neural Networks | DQN | Paper | Code |
2020 | ICLR | DGN | Graph Convolutional Reinforcement Learning | Q-Learning | Paper | Code |
2020 | AAAI/ACMAI | GAEA | GAEA: Graph Augmentation for Equitable Access via Reinforcement Learning | MDP | Paper | Code |
2021 | DASFAA | IMGER | A reinforcement learning model for influence maximization in social networks | DDQN | Paper | |
2021 | IEEE ICDM | GQNAS | GQNAS: Graph Q Network for Neural Architecture Search | DQN | Paper | |
2021 | IEEE TKDE | Netrl | Netrl: Task-aware network denoising via deep reinforcement learning | DQN | Paper | Code |
2021 | IEEE ICDM | ACE-HGNN | ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network | Nash Q-Learning | Paper | |
2021 | WWW | SUGAR | SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism | Q-Learning | Paper | Code |
2021 | ACM TOIS | RioGNN | Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks | MDP | Paper | Code |
2022 | Knowledge-Based Systems | AFGSL | AFGSL: Automatic Feature Generation based on Graph Structure Learning | Q-Learning | Paper | |
2022 | arXiv | GraphAug | Automated Data Augmentations for Graph Classification | MDP | Paper | |
2022 | IEEE TKDE | RTGNN | Multi-view Tensor Graph Neural Networks Through Reinforced Aggregation | MDP | Paper | Code |
2022 | Neurocomputing | Treeago | Treeago: Tree-structure aggregation and optimization for graph neural network | DQN | Paper | |
2022 | IEEE TKDE | GraphNAS++ | GraphNAS++: Distributed Architecture Search for Graph Neural Networks | REINFORCE | Paper | |
2022 | Neural Computing and Applications | Kyriakides et al. | Evolving graph convolutional networks for neural architecture search | MDP | Paper | |
2022 | Information Sciences | GraphTUL | Contextual spatio-temporal graph representation learning for reinforced human mobility mining | MDP | Paper | |
2022 | AAAI | BiGeNe | Batch Active Learning with Graph Neural Networks via Multi-Agent Deep Reinforcement Learning | DQN | Paper | |
2022 | arXiv | AdaNet | Robust Knowledge Adaptation for Dynamic Graph Neural Networks | REINFORCE | Paper | |
2022 | Annals of Operations Research | CRL | Counterfactual based reinforcement learning for graph neural networks | MolDQN | Paper | |
2023 | ICML | DeepIM | Deep Graph Representation Learning and Optimization for Influence Maximization | MDP | Paper | Code |
2023 | IEEE TKDE | HGNAS++ | Efficient Architecture Search for Heterogeneous Graph Neural Networks | MDP | Paper | |
2023 | Information Sciences | DeepGNAS | Search for deep graph neural networks | DQN | Paper | |
2023 | MLSys | X-RLflow | X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs Transformation | PPO | Paper | Code |
Adversarial Attacks
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | ICML | RL-S2V | Adversarial Attack on Graph Structured Data | Q-learning | Paper | |
2018 | TrustCom/BigDataSE | Yousefi et al. | A Reinforcement Learning Approach for Attack Graph Analysis | Q-learning | Paper | |
2019 | arXiv | ReWatt | Attacking Graph Convolutional Networks via Rewiring | MDP | Paper | |
2020 | CIKM | CARE-GNN | Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters | BMAB | Paper | Code |
2020 | WWW | NIPA | Adversarial Attacks on Graph Neural Networks via Node Injections: A Hierarchical Reinforcement Learning Approach | DQN | Paper | |
2021 | SBP-BRiMS | Dineen et al. | Reinforcement Learning for Data Poisoning on Graph Neural Networks | REINFORCE | Paper | |
2022 | Neural Computing and Applications | Wu et al. | Poisoning attacks against knowledge graph-based recommendation systems using deep reinforcement learning | MDP | Paper | |
2022 | KDD | KGAttack | Knowledge-enhanced Black-box Attacks for Recommendations | AC | Paper | |
2022 | IEEE TKDE | RL-GraphMI | Model Inversion Attacks Against Graph Neural Networks | Q-Learning | Paper | Code |
2023 | IJCNN | AdRumor-RL | Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection | MDP | Paper |
Relational Reasoning
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2017 | arXiv | Deeppath | Deeppath: A reinforcement learning method for knowledge graph reasoning | DQN | Paper | Code |
2017 | arXiv | KBGAN | KBGAN: Adversarial Learning for Knowledge Graph Embeddings | REINFORCE | Paper | Code |
2017 | ICLR | MINERVA | Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning | REINFORCE | Paper | Code |
2018 | IEEE ICDMW | MARLPaR | Path Reasoning over Knowledge Graph: A Multi-agent and Reinforcement Learning Based Method | MDP | Paper | |
2018 | KDD | DEERS | Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning | MDP | Paper | |
2018 | COLING | Chen et al. | Structured Dialogue Policy with Graph Neural Networks | REINFORCE | Paper | |
2018 | EMNLP | Lin et al. | Multi-Hop Knowledge Graph Reasoning with Reward Shaping | REINFORCE | Paper | |
2019 | arXiv | Graph2Seq | Reinforcement learning based graph-to-sequence model for natural question generation | MDP | Paper | Code |
2019 | arXiv | Ekar | Ekar: An Explainable Method for Knowledge Aware Recommendation | MDP | Paper | |
2019 | ACM SIGIR | PGPR | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | REINFORCE | Paper | Code |
2020 | ACM SIGIR | KGQR | Interactive Recommender System via Knowledge Graph-enhanced Reinforcement Learning | DQN | Paper | |
2020 | arXiv | KG-A2C | Graph Constrained Reinforcement Learning for Natural Language Action Spaces | A2C | Paper | Code |
2020 | ACM SIGKDD | IMUP | Incremental Mobile User Profiling: Reinforcement Learning with Spatial Knowledge Graph for Modeling Event Streams | DQN | Paper | |
2020 | ICLR | RL-BIC | Causal Discovery with Reinforcement Learning | AC | Paper | Code |
2020 | arXiv | RL-HGNN | Reinforcement Learning Enhanced Heterogeneous Graph Neural Network | DQN | Paper | |
2020 | ISPA/BDCloud/SocialCom/SustainCom | DKDR | DKDR: An Approach of Knowledge Graph and Deep Reinforcement Learning for Disease Diagnosis | Q-Learning | Paper | |
2020 | KDD | NIRec | An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph | MDP | Paper | |
2020 | SIGIR | GCQN | Reinforcement Learning based Recommendation with Graph Convolutional Q-network | Q-Learning | Paper | |
2020 | Knowledge-Based Systems | GRL | GRL: Knowledge graph completion with GAN-based reinforcement learning | DDPG | Paper | |
2021 | ACM SIGIR | UNICORN | Unified conversational recommendation policy learning via graph-based reinforcement learning | DDQN | Paper | |
2021 | IEEE TNNLS | Sun et al. | Model-based transfer reinforcement learning based on graphical model representations | DDPG | Paper | |
2021 | arXiv | TITer | TimeTraveler: Reinforcement Learning for Temporal Knowledge Graph Forecasting | REINFORCE | Paper | Code |
2021 | Neurocomputing | MemoryPath | MemoryPath: A deep reinforcement learning framework for incorporating memory component into knowledge graph reasoning | MDP | Paper | |
2021 | IJCAI | CORL | Ordering-Based Causal Discovery with Reinforcement Learning | MDP | Paper | Code |
2021 | EMNLP-IJCNLP | AttnPath | Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning | MDP | Paper | |
2021 | Neural Networks | Dapath | Dapath: Distance-aware knowledge graph reasoning based on deep reinforcement learning | REINFORCE | Paper | Code |
2021 | IJCAI | RLH | Reasoning like human: Hierarchical reinforcement learning for knowledge graph reasoning | MDP | Paper | |
2020 | Knowledge-Based Systems | ADRL | ADRL: An attention-based deep reinforcement learning framework for knowledge graph reasoning | AC | Paper | |
2021 | IJCKG | PAAR | Multi-hop Knowledge Graph Reasoning Based on Hyperbolic Knowledge Graph Embedding and Reinforcement Learning | MDP | Paper | Code |
2021 | KSEM | Zheng et al. | Hierarchical Policy Network with Multi-agent for Knowledge Graph Reasoning Based on Reinforcement Learning | REINFORCE | Paper | |
2022 | Knowledge-Based Systems | RF | Dynamic knowledge graph reasoning based on deep reinforcement learning | AC | Paper | |
2022 | Applied Intelligence | RLPath | RLPath: a knowledge graph link prediction method using reinforcement learning based attentive relation path searching and representation learning | MDP | Paper | |
2022 | Soft Computing | GNNRC | A novel embedding learning framework for relation completion and recommendation based on graph neural network and multi-task learning | MDP | Paper | |
2022 | ACM/IMS Transactions on Data Science | TRGIR | A Text-based Deep Reinforcement Learning Framework Using Self-supervised Graph Representation for Interactive Recommendation | DDPG | Paper | |
2022 | arXiv | KGRGRL | KGRGRL: A User’s Permission Reasoning Method Based on Knowledge Graph Reward Guidance Reinforcement Learning | MDP | Paper | |
2022 | AAAI | CURL | Learning to Walk with Dual Agents for Knowledge Graph Reasoning | MDP | Paper | Code |
2022 | arXiv | FreeKD | FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks | MDP | Paper | |
2022 | ACM Transactions on Information Systems | Feng et al. | Reinforcement Routing on Proximity Graph for Efficient Recommendation | MDP | Paper | |
2022 | DASFAA | ExKGR | ExKGR: Explainable Multi-hop Reasoning for Evolving Knowledge Graph | MDP | Paper | |
2022 | DASFAA | Zhang et al. | A Joint Framework for Explainable Recommendation with Knowledge Reasoning and Graph Representation | A2C | Paper | |
2022 | Artificial Intelligence in Medicine | GTGAT | Gated Tree-based Graph Attention Network (GTGAT) for medical knowledge graph reasoning | MDP | Paper | |
2022 | Education and Information Technologies | MEUR | Graph path fusion and reinforcement reasoning for recommendation in MOOCs | MDP | Paper | |
2022 | AICAT | Wu et al. | A construction technology of automatic reasoning system based on knowledge graph | MDP | Paper | |
2022 | Information Processing & Management | SparKGR | Iterative rule-guided reasoning over sparse knowledge graphs with deep reinforcement learning | DQN | Paper | |
2022 | arXiv | GRADER | Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning | MDP | Paper | |
2022 | arXiv | APPO | Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach | MDP | Paper | |
2022 | arXiv | CERec | Reinforced Path Reasoning for Counterfactual Explainable Recommendation | MDP | Paper | Code |
2022 | ACM SIGIR | CGKR | Alleviating Spurious Correlations in Knowledge-aware Recommendations through Counterfactual Generator | MDP | Paper | Code |
2022 | ACM SIGIR | HICR | Conversational Recommendation via Hierarchical Information Modeling | DQN | Paper | |
2022 | ACM SIGIR | MARIS | Multi-Agent RL-based Information Selection Model for Sequential Recommendation | MDP | Paper | |
2022 | ICME | ROGC | ROGC: Role-Oriented Graph Convolution Based Multi-Agent Reinforcement Learning | MARL | Paper | |
2022 | AAAI | HiTKG | HiTKG: Towards Goal-Oriented Conversations via Multi-Hierarchy Learning | MDP | Paper | |
2022 | Applied Soft Computing | SSRL | Self-Supervised Reinforcement Learning with dual-reward for knowledge-aware recommendation | Actor-Critic | Paper | |
2022 | Knowledge-Based Systems | KAiPP | KAiPP: An interaction recommendation approach for knowledge aided intelligent process planning with reinforcement learning | MDP | Paper | |
2022 | CIKM | KRAF | A Flexible Advertising Framework using Knowledge Graph-Enriched Multi-Agent Reinforcement Learning | MARL | Paper | |
2022 | CIKM | GPR | Two-Level Graph Path Reasoning for Conversational Recommendation with User Realistic Preference | DQN | Paper | |
2022 | SSRN | VRNet | Knowledge Graph Relation Reasoning with Variational Reinforcement Network | MDP | Paper | |
2022 | KBS | Zhu et al. | Step by step: A hierarchical framework for multi-hop knowledge graph reasoning with reinforcement learning | MDP | Paper | Code |
2023 | SDM | GARL | Causal Discovery by Graph Attention Reinforcement Learning | MDP | Paper | |
2023 | Education and Information Technologies | MEUR | Graph path fusion and reinforcement reasoning for recommendation in MOOCs | Actor-Critic | Paper | |
2023 | IEEE TKDE | TMER-RL | Reinforcement Learning based Path Exploration for Sequential Explainable Recommendation | MDP | Paper | |
2023 | CLeaR | MCD | A Meta-Reinforcement Learning Algorithm for Causal Discovery | Actor-Critic | Paper | Code |
2023 | Applied Intelligence | RED | Reinforcement learning-based denoising network for sequential recommendation | MDP | Paper | |
2023 | Conference of the European Chapter of the Association for Computational Linguistics | Jiang et al. | Path Spuriousness-aware Reinforcement Learning for Multi-Hop Knowledge Graph Reasoning | REINFORCE | Paper | Code |
Real-World Applications
Explainability
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2019 | NeurIPS | GMETAEXP | Learning Transferable Graph Exploration | MDP | Paper | |
2019 | arXiv | Ekar | Ekar: An Explainable Method for Knowledge Aware Recommendation | MDP | Paper | |
2019 | ACM SIGIR | PGPR | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation | REINFORCE | Paper | Code |
2020 | KDD | XGNN | XGNN: Towards Model-Level Explanations of Graph Neural Networks | MDP | Paper | |
2021 | ICML | SubgraphX | On Explainability of Graph Neural Networks via Subgraph Explorations | MCTS | Paper | Code |
2021 | arXiv | SparRL | SparRL: Graph Sparsification via Deep Reinforcement Learning | MDP | Paper | Code |
2021 | ACM TOIS | RioGNN | Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks | MDP | Paper | Code |
2022 | ICLR | G2RL | Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning | Q-Learning | Paper | |
2022 | IEEE TNNLS | LEGIT | Explaining Deep Graph Networks via Input Perturbation | MDP | Paper | Code |
2022 | ADC | Mishra et al. | Predicting Taxi Hotspots in Dynamic Conditions Using Graph Neural Network | MDP | Paper | |
2022 | CIKM | Saha et al. | A Model-Centric Explainer for Graph Neural Network based Node Classification | REINFORCE | Paper | Code |
City Services
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | IEEE Big Data | Obara et al. | Deep Reinforcement Learning Approach for Train Rescheduling Utilizing Graph Theory | DQN | Paper | |
2018 | PMLR | Zhang et al. | Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. | Actor-Critic | Paper | |
2018 | IEEE ITSC | NFQI | Traffic Signal Control Based on Reinforcement Learning with Graph Convolutional Neural Nets | NFQI | Paper | |
2019 | SOSR | Rusek et al. | Unveiling the potential of Graph Neural Networks for network modeling and optimization in SDN | MDP | Paper | |
2019 | arXiv | DRL+GNN | Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case | DQN | Paper | Code |
2019 | SIGCOMM | RouteNet | Challenging the generalization capabilities of graph neural networks for network modeling | MDP | Paper | |
2020 | IJCAI | eGCN | Dynamic Electronic Toll Collection via Multi-Agent Deep Reinforcement Learning with Edge-Based Graph Convolutional Networks | MDP | Paper | |
2020 | IEEE TMC | STMARL | STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control | DQN | Paper | |
2020 | IEEE Access | Nakashima et al. | Deep Reinforcement Learning-Based Channel Allocation for Wireless LANs With Graph Convolutional Networks | DDQN | Paper | |
2020 | Artificial Intelligence in China | DR-DCG | Coordinated Learning for Lane Changing Based on Coordination Graph and Reinforcement Learning | MDP | Paper | |
2021 | IEEE ICCCS | GraphLight | GraphLight: Graph-based Reinforcement Learning for Traffic Signal Control | REINFORCE | Paper | |
2021 | IEEE T-ITS | IG-RL | IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control | MDP | Paper | Code |
2021 | Computer‐Aided Civil and Infrastructure Engineering | GCQ | Graph neural network and reinforcement learning for multi-agent cooperative control of connected autonomous vehicles | DQN | Paper | |
2021 | IEEE T-ITS | SAGE-Graph | Deep Reinforcement Learning With Graph Representation for Vehicle Repositioning | DDQN | Paper | |
2021 | Information Sciences | Dynamic graph | Dynamic graph convolutional network for long-term traffic flow prediction with reinforcement learning | PPO | Paper | |
2022 | Digital Signal Processing | DQN-GCN-GAT | A new ensemble deep graph reinforcement learning network for spatio-temporal traffic volume forecasting in a freeway network | DQN | Paper | |
2022 | IEEE TMC | RedPacketBike | RedPacketBike: A Graph-Based Demand Modeling and Crowd-Driven Station Rebalancing Framework for Bike Sharing Systems | MDP | Paper | |
2022 | International Journal of Electrical Power & Energy Systems | GRL | Real-time fast charging station recommendation for electric vehicles in coupled power-transportation networks: A graph reinforcement learning method | DQN | Paper | |
2022 | arXiv | MuJAM | Model-based graph reinforcement learning for inductive traffic signal control | MDP | Paper | |
2022 | Applied Intelligence | GCQN-TSC | Graph cooperation deep reinforcement learning for ecological urban traffic signal control | MDP | Paper | |
2022 | arXiv | GRL | Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments | DQN | Paper | Code |
2022 | Knowledge-Based Systems | MetaSTGAT | Meta-learning based spatial-temporal graph attention network for traffic signal control | DQN | Paper | |
2022 | Applied Intelligence | VARL | VARL: a variational autoencoder-based reinforcement learning Framework for vehicle routing problems | MDP | Paper | |
2022 | Information Fusion | IHA-MDGI | An inductive heterogeneous graph attention-based multi-agent deep graph infomax algorithm for adaptive traffic signal control | multi-agent ATSC | Paper | |
2022 | Transportation Research Part C: Emerging Technologies | RDGCNI | A novel reinforced dynamic graph convolutional network model with data imputation for network-wide traffic flow prediction | DDPG | Paper | |
2022 | IJECE | ERL-MA | Evolutionary reinforcement learning multi-agents system for intelligent traffic light control: new approach and case of study | Q-Learning | Paper | |
2022 | IEEE Internet of Things Journal | GCN-based DRL | Joint Routing and Scheduling Optimization in Time-Sensitive Networks Using Graph Convolutional Network-based Deep Reinforcement Learning | DQN | Paper | |
2022 | Artificial Intelligence and Computing on Industrial Applications | MB-GCN | A Deep Coordination Graph Convolution Reinforcement Learning for Multi-Intelligent Vehicle Driving Policy | MDP | Paper | |
2022 | IET Communications | GRL | A generic intelligent routing method using deep reinforcement learning with graph neural networks | PPO | Paper | |
2022 | CIKM | Lou et al. | Meta-Reinforcement Learning for Multiple Traffic Signals Control | MDP | Paper | |
2022 | IEEE ICIEA | GCN-DQN/GCN-DDQN | Multi-Vehicles Decision-Making in Interactive Highway Exit: A Graph Reinforcement Learning Approach | DQN/DDQN | Paper | |
2022 | IEEE ITSC | Liu et al. | Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Interactive Traffic Scenarios | MDP | Paper | Code |
2023 | IET Gener. Transm. Distrib. | GraphSAGE-D3QN | An emergency control strategy for undervoltage load shedding of power system: A graph deep reinforcement learning method | D3QN | Paper |
Epidemic Control
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2020 | Nature Machine Intelligence | FINDER | Finding key players in complex networks through deep reinforcement learning | Q-Learning | Paper | Code |
2021 | ICML | RLGN | Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks | PPO | Paper | |
2021 | IEEE TNNLS | FOREST | Full-Scale Information Diffusion Prediction With Reinforced Recurrent Networks | REINFORCE | Paper | Code |
2021 | arXiv | HITTER | Hypernetwork Dismantling via Deep Reinforcement Learning | DQN | Paper | |
2022 | IEEE TETCI | EDRL-IM | Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning | DQN | Paper | |
2022 | ACM Transactions on Knowledge Discovery from Data | IDRLECA | Contact Tracing and Epidemic Intervention via Deep Reinforcement Learning | PPO | Paper | |
2022 | KDD | Vehicle | Precise Mobility Intervention for Epidemic Control Using Unobservable Information via Deep Reinforcement Learning | HRL | Paper |
Combinatorial Optimization
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2017 | NIPS | S2V-DQN | Learning Combinatorial Optimization Algorithms over Graphs | Q-Learning | Paper | Code |
2018 | AAAI | ASNets | Action Schema Networks: Generalised Policies with Deep Learning | MDP | Paper | Code |
2019 | arXiv | GPN | Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning | REINFORCE | Paper | Code |
2020 | Nature Machine Intelligence | FINDER | Finding key players in complex networks through deep reinforcement learning | Q-Learning | Paper | Code |
2021 | IEEE Communications Letters | DeepOpt | Combining Deep Reinforcement Learning With Graph Neural Networks for Optimal VNF Placement | REINFORCE | Paper | |
2020 | IEEE Access | Silva et al. | Temporal Graph Traversals Using Reinforcement Learning With Proximal Policy Optimization | PPO | Paper | |
2022 | IEEE TETCI | EDRL-IM | Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning | DQN | Paper | |
2022 | arXiv | GTA-RL | Solving Dynamic Graph Problems with Multi-Attention Deep Reinforcement Learning | REINFORCE | Paper | Code |
2022 | IEEE TII | Song et al. | Flexible Job Shop Scheduling via Graph Neural Network and Deep Reinforcement Learning | PPO | Paper | |
2022 | Engineering Applications of Artificial Intelligence | GCE-MAD | A graph convolutional encoder and multi-head attention decoder network for TSP via reinforcement learning | REINFORCE | Paper | |
2022 | IEEE TII | DGERD | A Deep Reinforcement Learning Framework Based on an Attention Mechanism and Disjunctive Graph Embedding for the Job Shop Scheduling Problem | DQN | Paper | |
2022 | arXiv | ECORD | Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration | DQN | Paper | Code |
2022 | Information Sciences | G3DQN | A graph neural networks-based deep Q-learning approach for job shop scheduling problems in traffic management | DQN | Paper | |
2022 | KDD | DGMP | Enhancing Machine Learning Approaches for Graph Optimization Problems with Diversifying Graph Augmentation | MDP | Paper | |
2022 | Neurocomputing | E-GAT | Solve routing problems with a residual edge-graph attention neural network | PPO | Paper | Code |
2022 | arXiv | N-BLS | Subgraph Matching via Query-Conditioned Subgraph Matching Neural Networks and Bi-Level Tree Search | MCTS | Paper | |
2022 | techrxiv | TOFA | You Only Train Once: A highly generalizable reinforcement learning method for dynamic job shop scheduling problem | MDP | Paper | Code |
2022 | arXiv | LKH | Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning | MDP | Paper | Code |
2022 | IEEE TII | DRL | Flexible job-shop scheduling via graph neural network and deep reinforcement learning | PPO | Paper | Code |
2023 | Knowledge-Based Systems | DeepMAG | DeepMAG: Deep reinforcement learning with multi-agent graphs for flexible job shop scheduling | MARL | Paper | |
2023 | Information Sciences | BDRL | Solving combinatorial optimization problems over graphs with BERT-Based Deep Reinforcement Learning | REINFORCE | Paper |
Medicine
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2018 | NeurIPS | GCPN | Graph convolutional policy network for goal-directed molecular graph generation | MDP | Paper | Code |
2018 | arXiv | MolGAN | MolGAN: An implicit generative model for small molecular graphs | MDP | Paper | |
2019 | CIKM | CompNet | Order-free Medicine Combination Prediction with Graph Convolutional Reinforcement Learning | DQN | Paper | Code |
2019 | KDD | GTPN | Graph Transformation Policy Network for Chemical Reaction Prediction | A2C | Paper | |
2020 | IEEE Access | Wang et al. | Risk-Aware Identification of Highly Suspected COVID-19 Cases in Social IoT: A Joint Graph Theory and Reinforcement Learning Approach | Q-Learning | Paper | |
2020 | ISPA/BDCloud/SocialCom/SustainCom | DKDR | DKDR: An Approach of Knowledge Graph and Deep Reinforcement Learning for Disease Diagnosis | Q-Learning | Paper | |
2020 | Journal of cheminformatics | DeepGraphMolGen | DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach | PPO | Paper | Code |
2022 | arXiv | BN-GNN | Deep Reinforcement Learning Guided Graph Neural Networks for Brain Network Analysis | DDQN | Paper | |
2023 | Aiche Journal | Stops et al. | Flowsheet generation through hierarchical reinforcement learning and graph neural networks | Actor-Critic | Paper |
Others
Year | Venue | Model | Title | Algorithm | Paper | Code |
---|---|---|---|---|---|---|
2017 | IEEE TNNLS | FRDNN | Deep direct reinforcement learning for financial signal representation and trading | DRL | Paper | |
2018 | ICLR | NerveNet | NerveNet: Learning Structured Policy with Graph Neural Networks | PPO | Paper | |
2020 | ACM/IEEE DAC | GCN-RL | GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning | AC | Paper | |
2021 | Expert Systems with Applications | DeepPocket | Deep graph convolutional reinforcement learning for financial portfolio management-deeppocket | AC | Paper | |
2021 | arXiv | Gnn-rl compression | Gnn-rl compression: Topologyaware network pruning using multi-stage graph embedding and reinforcement learning | DDPG | Paper | |
2021 | ICCV | AGMC | Auto graph encoder-decoder for neural network pruning | DQN | Paper | |
2022 | Applied Intelligence | GraphPruning | Graph pruning for model compression | DDPG | Paper | |
2022 | ICLR | AGILE | Know Your Action Set: Learning Action Relations for Reinforcement Learning | PPO, DQN, CDQN | Paper | Code |
2022 | ICLR | MAPSRL-2 | Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory | Q-Learning | Paper | |
2022 | ICLR | SWAT | Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning | AC | Paper | |
2022 | IEEE TPAMI | DRL-DBSCAN | Reinforced, Incremental and Cross-lingual Event Detection From Social Messages | MarGNN | Paper | Code |
2022 | ACM Transactions on Asian and Low-Resource Language Information Processing | GA-SCS | GA-SCS: Graph-Augmented Source Code Summarization | MDP | Paper | |
2022 | IP CCC | RCGNN | Reinforced Contrastive Graph Neural Networks (RCGNN) for Anomaly Detection | Recursive Scalable Reinforcement Learning (RSRL) | Paper | |
2022 | IEEE Transactions on Computational Social Systems | MADDPG | Misinformation Propagation in Online Social Networks: Game Theoretic and Reinforcement Learning Approaches | MARL | Paper |
References
- M. Nie, D. Chen and D. Wang, "Reinforcement Learning on Graphs: A Survey," in IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 7, no. 4, pp. 1065-1082, Aug. 2023, doi: 10.1109/TETCI.2022.3222545.
- S. Zhu, I. Ng, and Z. Chen, “Causal discovery with reinforcement learning,” arXiv preprint arXiv:1906.04477, 2019.
- X. Wang, Y. Du, S. Zhu, L. Ke, Z. Chen, J. Hao, and J. Wang, “Ordering-based causal discovery with reinforcement learning,” arXiv preprint arXiv:2105.06631, 2021.
- Y. Sun, K. Zhang, and C. Sun, “Model-based transfer reinforcement learning based on graphical model representations,” IEEE Transactions on Neural Networks and Learning Systems, pp. 1–14, 2021.
- L. Chen, B. Tan, S. Long, and K. Yu, “Structured dialogue policy with graph neural networks,” in Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA, 2018, pp. 1257–1268.
- Y. Chen, L. Wu, and M. J. Zaki, “Reinforcement learning based graph-to-sequence model for natural question generation,” arXiv preprint arXiv:1908.04942, 2019.
- Y. Deng, Y. Li, F. Sun, B. Ding, and W. Lam, “Unified conversational recommendation policy learning via graph-based reinforcement learning,” in Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 1431–1441.
- X. V. Lin, R. Socher, and C. Xiong, “Multi-hop knowledge graph reasoning with reward shaping,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 2018, pp. 3243–3253.
- G. Wan, S. Pan, C. Gong, C. Zhou, and G. Haffari, “Reasoning like human: Hierarchical reinforcement learning for knowledge graph reasoning,” in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, 2021, pp. 1926–1932.