
Structure-Aware Transformer for Graph Representation Learning: Brief Notes

Paper: Structure-Aware Transformer for Graph Representation Learning (SAT), 2022

Motivations

1、Transformers with positional encoding do not necessarily capture structural similarity between nodes (nodes that sit at different positions but have similar local environments and structures should receive similar representations)

2、ordinary GNNs suffer from limited expressiveness, over-smoothing, and over-squashing (they cannot be made very deep)

3、proposed solution: a new self-attention mechanism that extracts a representation of the subgraph rooted at each node before computing the attention

Contributions

1、reformulate the self-attention mechanism as a kernel smoother (see the sketch after this list)

2、automatically generate the subgraph representations

3、make SAT an effortless enhancer of any existing GNN

4、SAT is more interpretable
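
As a quick reference for point 1, a minimal LaTeX sketch of the kernel-smoother view (my own notation, not necessarily the paper's symbols): plain self-attention is a Nadaraya-Watson-style smoother whose kernel sees only node features.

```latex
% Self-attention as a kernel smoother over node features (sketch).
\text{Attn}(x_v)
  = \sum_{u \in V}
      \frac{\kappa(x_v, x_u)}{\sum_{w \in V} \kappa(x_v, x_w)}\, f(x_u),
\qquad
\kappa(x, x') = \exp\!\left(\frac{\langle W_Q x,\; W_K x' \rangle}{\sqrt{d_{\mathrm{out}}}}\right)
```

with value map f(x_u) = W_V x_u. Because κ depends only on the node features, two nodes with identical features but very different neighborhoods produce the same attention pattern, which motivates the structure-aware kernel below.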

Methods

Structure-aware self-attention

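A LaTeX sketch of the structure-aware attention (again my notation): the smoother above is kept, but the kernel now compares the subgraphs rooted at the two nodes rather than their raw features.

```latex
% Structure-aware attention: kernel on rooted subgraphs (sketch).
\text{SA-Attn}(v)
  = \sum_{u \in V}
      \frac{\kappa_{\mathrm{graph}}\big(S_G(v), S_G(u)\big)}
           {\sum_{w \in V} \kappa_{\mathrm{graph}}\big(S_G(v), S_G(w)\big)}\, f(x_u)
```

where S_G(v) denotes the subgraph of G centered at node v, κ_graph is a kernel comparing such subgraphs, and f is the value map as before.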

Structure extractor

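As I read the paper, the practical instantiation computes a vector φ(G, v) per node with a GNN-based extractor and feeds it into the usual exponential dot-product kernel; a sketch in my notation (treat the exact query/key placement as an assumption):

```latex
% Graph kernel instantiated through a learned structure extractor (sketch).
\kappa_{\mathrm{graph}}\big(S_G(v), S_G(u)\big)
  = \exp\!\left(\frac{\big\langle W_Q\, \varphi(G, v),\; W_K\, \varphi(G, u) \big\rangle}{\sqrt{d_{\mathrm{out}}}}\right)
```

where φ(G, v) is produced by one of the two extractors below.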

1、k-subtree GNN extractor

run a base GNN with k layers on the whole graph and take its output node representation at u as the subgraph representation at u
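
A minimal, dependency-free Python sketch of the k-subtree idea, assuming a plain mean-over-neighbors GNN as the base extractor (the paper allows any GNN here); the function name `k_subtree_repr` and the toy graph are mine, not from an official implementation.

```python
# Minimal sketch of a k-subtree structure extractor.
# Assumption: the base GNN is plain mean-over-neighbors aggregation
# (with a self-loop); any message-passing GNN could be substituted.

def k_subtree_repr(adj, feats, u, k):
    """Run k rounds of neighborhood aggregation on the whole graph and
    return node u's final representation (its k-subtree summary).

    adj   : dict node -> list of neighbor nodes
    feats : dict node -> list[float] of initial features
    """
    h = {v: list(x) for v, x in feats.items()}
    for _ in range(k):
        new_h = {}
        for v in adj:
            neigh = adj[v] + [v]                       # include the node itself
            new_h[v] = [sum(h[n][i] for n in neigh) / len(neigh)
                        for i in range(len(h[v]))]     # mean aggregation
        h = new_h
    return h[u]


# Toy usage: a 4-node path graph 0-1-2-3 with scalar features.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
feats = {v: [float(v)] for v in adj}
print(k_subtree_repr(adj, feats, u=1, k=2))
```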

2、k-subgraph GNN extractor

extract the k-hop subgraph around u, update node representations with the base GNN, and aggregate the updated representations of all nodes within the k-hop neighborhood using a pooling function (e.g. mean or sum)
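
In the same spirit, a sketch of the k-subgraph variant: aggregation is restricted to the induced k-hop subgraph around u, and all of its node representations are mean-pooled (mean pooling is my arbitrary choice; the paper leaves the pooling function generic).

```python
# Minimal sketch of a k-subgraph structure extractor (same conventions
# as the k-subtree sketch above; mean pooling is an arbitrary choice here).

from collections import deque

def k_hop_nodes(adj, u, k):
    """All nodes within k hops of u (u included), via breadth-first search."""
    seen, frontier = {u}, deque([(u, 0)])
    while frontier:
        v, d = frontier.popleft()
        if d == k:
            continue
        for n in adj[v]:
            if n not in seen:
                seen.add(n)
                frontier.append((n, d + 1))
    return seen

def k_subgraph_repr(adj, feats, u, k):
    """Aggregate for k rounds inside the induced k-hop subgraph of u,
    then mean-pool the representations of all its nodes."""
    nodes = k_hop_nodes(adj, u, k)
    sub_adj = {v: [n for n in adj[v] if n in nodes] for v in nodes}
    h = {v: list(feats[v]) for v in nodes}
    for _ in range(k):
        h = {v: [sum(h[n][i] for n in sub_adj[v] + [v]) / (len(sub_adj[v]) + 1)
                 for i in range(len(h[v]))]
             for v in nodes}
    return [sum(h[v][i] for v in nodes) / len(nodes) for i in range(len(h[u]))]


# Toy usage on the same 4-node path graph.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
feats = {v: [float(v)] for v in adj}
print(k_subgraph_repr(adj, feats, u=1, k=2))
```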

Structure-aware transformer

1、include the degree factor in the skip-connection, reducing the overwhelming influence of highly connected graph components

Combination with absolute encoding

1、absolute positional encoding is not guaranteed to generate similar node representations even if two nodes have similar local structures

2、subgraph representations used in the structure-aware attention can be tailored to measure the structural similarity between nodes
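
A sketch of how the two are combined; treat the exact form as an assumption on my part, since absolute encodings are most commonly just added to the input node features before the transformer layers:

```latex
% Absolute positional encoding added to the input features (sketch).
x_v^{(0)} = x_v + p_v
```

where p_v is, e.g., a random-walk or Laplacian positional encoding of node v; the structure-aware attention then operates on these enriched features, so the two kinds of information are complementary rather than redundant.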

Conclusions

1、The structure-aware framework achieves SOTA performance

2、k-subtree and k-subgraph SAT improve upon the base GNN

3、incorporating the structure via our structure-aware attention brings a notable improvement

4、a small value of k already leads to good performance, while not suffering from over-smoothing or over-squashing

5、a proper absolute positional encoding and readout method improve performance, but to a much lesser extent than incorporating the structure into the approach

Limitations

SAT suffers from the same drawback as the standard Transformer, namely the quadratic complexity of the self-attention computation
