Authors:
Matthew Morris, David J. Tena Cucala, Bernardo Cuenca Grau, Ian Horrocks
Paper:
https://arxiv.org/abs/2408.10261
Introduction
Knowledge graphs (KGs) are structured knowledge bases where nodes and edges represent entities and their relationships, respectively. They are widely used in various applications, but often suffer from incompleteness. To address this, the field of KG completion has emerged, aiming to predict missing facts in KGs. Graph neural networks (GNNs), particularly Relational Graph Convolutional Networks (R-GCNs), have shown promise in this area. However, the explainability of their predictions remains a challenge. This study investigates whether R-GCNs can learn sound rules that explain their predictions and proposes methods to extract and verify such rules.
Related Work
Knowledge Graph Completion
KG completion is typically approached as a classification problem, where the goal is to predict whether a candidate fact holds in the completion of an incomplete KG. Various methods have been proposed, including embedding-based approaches, tensor products, recurrent neural networks, and differentiable reasoning. Among these, GNNs, especially R-GCNs, have gained significant attention.
Explainability in GNNs
The lack of explainability in GNNs has led to efforts to explain their predictions using logic-based formalisms like Datalog. Previous works have focused on extracting Datalog rules from certain subclasses of GNNs, ensuring that the rules are sound and complete with respect to the model. However, these methods often impose restrictions on model weights and aggregation functions, limiting their applicability to popular models like R-GCNs.
Research Methodology
Sum-GNNs
This study focuses on sum-GNNs, a class of GNNs that uses sum aggregation and places no restrictions on model weights. The goal is to identify output channels that exhibit monotonic behavior, for which sound Datalog rules can be extracted, and to identify unbounded channels for which no sound Datalog rule exists, indicating non-monotonic behavior.
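The following is a minimal PyTorch sketch of one sum-GNN layer, assuming relation-specific weight matrices and sum aggregation over neighbours; the class name, tensor shapes, and edge representation are illustrative and not taken from the paper.

```python
# A minimal sketch of a single sum-GNN layer (hypothetical names and shapes;
# the paper's exact parameterisation may differ). Each relation r has its own
# weight matrix; messages from r-neighbours are aggregated by summation,
# with no degree normalisation.
import torch
import torch.nn as nn


class SumGNNLayer(nn.Module):
    def __init__(self, num_relations: int, in_dim: int, out_dim: int):
        super().__init__()
        self.self_weight = nn.Linear(in_dim, out_dim, bias=True)
        self.rel_weights = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(num_relations)]
        )

    def forward(self, x: torch.Tensor, edges_by_rel: list) -> torch.Tensor:
        # x: (num_nodes, in_dim) node features
        # edges_by_rel[r]: (2, num_edges_r) source/target indices for relation r
        out = self.self_weight(x)
        for r, edges in enumerate(edges_by_rel):
            src, dst = edges
            messages = self.rel_weights[r](x[src])   # transform source features
            out = out.index_add(0, dst, messages)    # sum aggregation per target
        return torch.relu(out)
```

With non-negative weights and a monotone activation such as ReLU, adding facts to the input KG can only increase a channel's value, which is the kind of monotonic behavior that makes sound rule extraction possible.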
Soundness and Completeness
A Datalog rule is sound for a sum-GNN if, on every input KG, every fact derived by the rule is also predicted by the GNN. Conversely, a rule is complete if every fact predicted by the GNN is derived by the rule. The study provides methods to verify the soundness of Datalog rules for sum-GNNs.
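As an illustration of the soundness definition (not the paper's verification algorithm), the sketch below checks a single rule of the form head(x, y) ← body(x, y) against a model's predictions on one KG; the triple representation and the `model_predicts` callback are assumptions. Since soundness quantifies over all input KGs, a check on one KG is only a necessary condition, not a proof.

```python
# Minimal sketch of the soundness definition for a rule head(x, y) <- body(x, y).
# `model_predicts` and the triple representation are illustrative assumptions.

def rule_is_sound_on_kg(kg, body_rel, head_rel, model_predicts):
    """kg: set of (subject, relation, object) triples.
    model_predicts(kg, triple) -> bool: whether the GNN predicts the triple."""
    for (s, r, o) in kg:
        if r == body_rel:
            derived = (s, head_rel, o)
            if not model_predicts(kg, derived):
                return False   # the rule derives a fact the GNN does not predict
    return True
```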
Experimental Design
Datasets
The experiments use benchmark datasets (WN18RRv1, FB237v1, NELLv1) and datasets created using the LogInfer framework, which augments datasets with consequences of Datalog rules. The datasets are split into training, validation, and test sets, with negative examples generated using predicate corruption.
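A minimal sketch of negative sampling by predicate corruption, under the assumption that each positive triple's relation is replaced by a different, randomly chosen relation; the function and variable names are illustrative.

```python
# Sketch of negative-example generation by predicate corruption: for each
# positive triple, the relation is swapped for a different, random relation.
import random

def corrupt_predicates(positive_triples, all_relations, rng=random):
    negatives = []
    for (s, r, o) in positive_triples:
        candidates = [r2 for r2 in all_relations if r2 != r]
        negatives.append((s, rng.choice(candidates), o))
    return negatives
```

In practice one would also filter out corrupted triples that happen to be known positives.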
Training Paradigms
The study trains sum-GNNs with ReLU activation functions and biases, using binary cross-entropy loss and the Adam optimizer. Two variations of the training paradigm are proposed to encourage the learning of sound rules (both are sketched in code after the list):
1. MGCN: Clamps all negative weights to zero during training, ensuring monotonic behavior.
2. R-X: Iteratively clamps weights below a certain threshold to zero, balancing model accuracy and the number of sound rules.
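The sketch below illustrates both clamping schemes as in-place operations on a PyTorch model's parameters; the function names, the exact threshold semantics, whether biases are also clamped, and the points in training at which clamping is applied are assumptions, not the paper's exact procedure.

```python
# Hypothetical sketch of the two clamping schemes described above.
import torch

@torch.no_grad()
def clamp_mgcn(model):
    # MGCN: keep all parameters non-negative so the model behaves monotonically.
    for p in model.parameters():
        p.clamp_(min=0.0)

@torch.no_grad()
def clamp_rx(model, threshold):
    # R-X: set weights below the threshold to zero; applied repeatedly during
    # training, trading accuracy for a larger set of extractable sound rules.
    for p in model.parameters():
        p[p < threshold] = 0.0
```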
Results and Analysis
Monotonic LogInfer Datasets
The results show that R-GCNs, when trained in the standard way, do not learn any sound rules, even with near-perfect accuracy on monotonic datasets. In contrast, MGCNs, which enforce monotonic behavior, learn many sound rules and achieve high accuracy on test sets.
Benchmark Datasets
On benchmark datasets, R-GCNs achieve higher AUPRC on validation sets but lower accuracy on test sets compared to MGCNs. The R-X models demonstrate a trade-off between accuracy and the number of sound rules, with higher thresholds leading to more sound rules but lower accuracy.
Mixed LogInfer Datasets
In mixed datasets with both monotonic and non-monotonic rules, R-GCNs again fail to learn sound rules. MGCNs achieve high recall but low precision, indicating that they learn negation-free versions of the non-monotonic rules, which produces false positives.
Channel Classification
The proportions of channels classified as unbounded, stable, and increasing almost always sum to 100%, indicating that the proposed analysis characterizes nearly all channel behavior. The study also shows that the finer-grained classification into stable, increasing, decreasing, and undetermined channels is necessary for accurate rule extraction.
Overall Conclusion
This study demonstrates that R-GCNs, when trained in the standard way, do not learn sound rules, raising concerns about their generalizability and explainability. The proposed training variations, MGCN and R-X, provide alternatives that encourage the learning of sound rules, with MGCNs being effective for monotonic datasets and R-X models offering a balance between accuracy and rule extraction. Future work will explore other GNN architectures, extend rule extraction to non-monotonic logics, and consider relaxed definitions of soundness.