# Multi-Level Graph Contrastive Learning

https://arxiv.org/pdf/2107.02639

Multi-Level Graph Contrastive Learning, 2021, arXiv preprint

# 1. Introduction

## 1.1 Abstract

Graph representation learning has recently attracted a surge of interest; its target is to learn a discriminative embedding for each node in the graph. Most of these representation methods focus on supervised learning and depend heavily on label information. However, annotated graphs are expensive to obtain in the real world, especially in specialized domains (e.g., biology), since labeling a graph requires the annotator to have domain knowledge. To approach this problem, self-supervised learning provides a feasible solution for graph representation learning. In this paper, we propose a Multi-Level Graph Contrastive Learning (MLGCL) framework for learning robust representations of graph data by contrasting space views of graphs. Specifically, we introduce a novel pair of contrastive views: the topological and feature-space views. The original graph is a first-order approximation structure and contains uncertainty or error, while the kNN graph generated from encoded features preserves high-order proximity. Thus the kNN graph generated from encoded features not only provides a complementary view, but is also more suitable for a GNN encoder to extract discriminative representations. Furthermore, we develop a multi-level contrastive mode to preserve the local similarity and semantic similarity of graph-structured data simultaneously. Extensive experiments indicate that MLGCL achieves promising results compared with existing state-of-the-art graph representation learning methods on seven datasets.

# 2. Method

• GDA: graph data augmentation, which generates two correlated graphs from the input graph for contrast. Here the authors use the kNN graph as the augmented view to contrast with the original graph.
• GNN Encoder: graph encoder that learns node embeddings for downstream tasks.
• MLP: maps node representations from the embedding space to the contrastive space.
• Pool: pools all node representations to compute the graph representation.
• Loss function: the proposed multi-level loss, which simultaneously preserves low-level "local" and high-level "global" consistency.

## 2.1 Data Augmentation

1. Compute the similarity matrix $S$ from the node features $Z$. The similarity can be measured by Euclidean distance, cosine similarity, and so on.
2. For each node, select its top-$k$ most similar nodes and add edges between them, finally obtaining the adjacency matrix $A_f$ of the kNN graph.

• Mahalanobis distance: where $M$ is a positive semi-definite matrix that plays the role of an inverse covariance matrix

$S_{i j}=\sqrt{\left(x_{i}-x_{j}\right)^{T} M\left(x_{i}-x_{j}\right)}$

• Cosine similarity

$S_{i j}=\frac{x_{i} \cdot x_{j}}{\left|x_{i}\right|\left|x_{j}\right|}$

• Gaussian kernel

$S_{i j}=e^{-\frac{\left\|x_{i}-x_{j}\right\|^{2}}{2 \sigma^{2}}}$
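The two-step construction above can be sketched in a few lines of NumPy (a minimal hypothetical sketch, not the authors' code). Cosine similarity is used here; the Mahalanobis or Gaussian-kernel measures plug in the same way by replacing `S`:

```python
import numpy as np

def knn_graph(Z, k):
    """Build the kNN-graph adjacency A_f from node features Z (N x d)."""
    # Step 1: cosine similarity matrix S (N x N)
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    Zn = Z / np.clip(norms, 1e-12, None)
    S = Zn @ Zn.T
    np.fill_diagonal(S, -np.inf)           # exclude self-similarity

    # Step 2: connect each node to its top-k most similar nodes
    N = Z.shape[0]
    A_f = np.zeros((N, N))
    idx = np.argsort(-S, axis=1)[:, :k]    # indices of the k nearest neighbours
    rows = np.repeat(np.arange(N), k)
    A_f[rows, idx.ravel()] = 1.0
    # Symmetrize so the result is an undirected graph
    return np.maximum(A_f, A_f.T)
```

With three 2-D features and `k=1`, the two near-parallel vectors get linked while the orthogonal one attaches only to its single nearest neighbour.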

## 2.2 Encoder

$Z^{l+1}=f(A, X)=\sigma\left(\widetilde{A} Z^{l} W^{l}\right)$

$c=P(H)=\sigma\left(\frac{1}{N} \sum_{i=1}^{N} h_{i}\right)$
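The two equations above (a GCN propagation layer and a mean-pooling readout) can be sketched as follows. This is an assumed NumPy implementation, taking $\sigma$ to be ReLU in the layer and the logistic sigmoid in the readout, a common choice; $\widetilde{A}$ is the symmetrically normalized adjacency with self-loops:

```python
import numpy as np

def gcn_layer(A, Z, W):
    """One GCN layer: Z' = ReLU(A_tilde Z W), A_tilde = D^{-1/2}(A+I)D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_tilde = D_inv_sqrt @ A_hat @ D_inv_sqrt      # symmetric normalization
    return np.maximum(A_tilde @ Z @ W, 0.0)        # ReLU non-linearity

def readout(H):
    """Graph representation c = sigmoid(mean over node embeddings h_i)."""
    return 1.0 / (1.0 + np.exp(-H.mean(axis=0)))
```

For a two-node path graph with identity features and weights, every normalized-adjacency entry is 0.5, so all embeddings come out 0.5 and the readout is `sigmoid(0.5)` per dimension.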

## 2.3 Multi-Level Loss

• Node-level contrast

Given a positive pair $(z_i^a, z_i^b)$, the node-level loss is defined as:

$\mathcal{L}_{\text {node }}\left(z_{i}^{a}, z_{i}^{b}\right)=-\log \frac{\exp \left(\left(z_{i}^{a}\right)^{T} z_{i}^{b} / \tau\right)}{\exp \left(\left(z_{i}^{a}\right)^{T} z_{i}^{b} / \tau\right)+\sum_{j=1, j \neq i}^{K}\left[\exp \left(\left(z_{i}^{a}\right)^{T} z_{j}^{a} / \tau\right)+\exp \left(\left(z_{i}^{a}\right)^{T} z_{j}^{b} / \tau\right)\right]}$

Similarly, we can define $\mathcal{L}_{\text {node }}\left(z_{i}^{b}, z_{i}^{a}\right)$. The overall node-level contrastive loss is then:

$\mathcal{L}_{\text {node }}=\mathcal{L}_{\text {node }}\left(z_{i}^{a}, z_{i}^{b}\right)+\mathcal{L}_{\text {node }}\left(z_{i}^{b}, z_{i}^{a}\right)$
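The symmetric node-level loss can be sketched in NumPy as follows (a hypothetical vectorized implementation, assuming `Za` and `Zb` are the $K \times d$ contrastive-space embeddings of the two views; for each anchor, the positive is the same node in the other view, and the negatives are all other nodes in both views):

```python
import numpy as np

def node_nce_loss(Za, Zb, tau=0.5):
    """Symmetric node-level contrastive loss L(a,b) + L(b,a)."""
    def one_side(Z1, Z2):
        sim12 = np.exp(Z1 @ Z2.T / tau)   # cross-view similarities
        sim11 = np.exp(Z1 @ Z1.T / tau)   # intra-view similarities
        pos = np.diag(sim12)              # positive pairs (same node, other view)
        # Negatives: all j != i, in both the other view and the same view
        neg = (sim12.sum(axis=1) - pos) + (sim11.sum(axis=1) - np.diag(sim11))
        return -np.log(pos / (pos + neg)).mean()
    return one_side(Za, Zb) + one_side(Zb, Za)
```

With two identical orthonormal views and `tau=1`, each side reduces to $\log(1 + 2/e)$, so the symmetric loss is twice that.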

• Graph-level contrast

Given the positive pair $(s^a,s^b)$ and the negative pairs $(s^a,\widetilde s^a)$ and $(s^a,\widetilde s^b)$, the contrastive loss is defined as:

$\mathcal{L}_{\text {graph }}\left(s^{a}, s^{b}\right)=-\log \frac{\exp \left(\left(s^{a}\right)^{T} s^{b} / \tau\right)}{\exp \left(\left(s^{a}\right)^{T} s^{b} / \tau\right)+\exp \left(\left(s^{a}\right)^{T} \tilde{s}^{a} / \tau\right)+\exp \left(\left(s^{a}\right)^{T} \tilde{s}^{b} / \tau\right)}$

The negative samples are generated by randomly shuffling the node features, yielding the corrupted graphs with adjacency matrices $\widetilde A$ and $\widetilde A_f$. For the other view, $\mathcal{L}_{\text {graph }}\left(s^{b}, s^{a}\right)$ can be defined in the same way. The overall graph-level contrastive loss is then:

$\mathcal{L}_{\text {graph }}=\mathcal{L}_{\text {graph }}\left(s^{a}, s^{b}\right)+\mathcal{L}_{\text {graph }}\left(s^{b}, s^{a}\right)$
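One direction of this loss, together with the row-shuffle corruption used to produce the negatives, can be sketched as follows (a hypothetical NumPy sketch; `sa`, `sb` are the pooled summaries $s^a, s^b$ and `sa_neg`, `sb_neg` are the summaries $\widetilde s^a, \widetilde s^b$ of the corrupted graphs, written with the conventional minus sign so the loss is minimized):

```python
import numpy as np

def graph_nce_loss(sa, sb, sa_neg, sb_neg, tau=0.5):
    """Graph-level contrastive loss for one direction, L(sa, sb)."""
    pos = np.exp(sa @ sb / tau)          # positive: the two clean views
    neg1 = np.exp(sa @ sa_neg / tau)     # negative: corrupted view a
    neg2 = np.exp(sa @ sb_neg / tau)     # negative: corrupted view b
    return -np.log(pos / (pos + neg1 + neg2))

def shuffle_features(X, rng):
    """Corrupt a view by randomly permuting the rows of the feature matrix X."""
    return X[rng.permutation(X.shape[0])]
```

Shuffling keeps the same set of feature rows but detaches them from the graph structure, which is what makes the corrupted summaries useful negatives.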

The overall MLGCL loss is defined as the sum of the two:

$\mathcal{L}=\mathcal{L}_{\text {node }}+\lambda \mathcal{L}_{\text {graph }}$

# 3. Experiments

## 3.2 Ablation Study
