Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
- 1. Main contributions
- 2. Algorithm
  - 2.1 Learning localized spectral filters
    - K-th order approximation and ChebNet
  - 2.2 Graph pooling
    - Graph coarsening
    - Fast pooling
    - The full GCN pipeline
1. Main contributions
1. A spectral convolution formulation: a formal statement of CNNs on graphs, built on the spectral approach and graph signal processing (GSP).
2. Strictly localized filters: localization is controlled by a parameter K, the number of hops from a center vertex (i.e. vertices within distance K).
3. Low computational complexity: the complexity is linear in K and in the number of edges. Most real-world graphs are highly sparse, so the number of edges is far below the square of the number of vertices, roughly |E| ≈ k·|V|, where k can be read as the number of nearest neighbors per vertex. The complexity is therefore linear in the input size. The method also avoids computing and storing the Fourier basis; it only needs the Laplacian, a sparse matrix with |E| non-zero entries.
4. Efficient pooling: pooling is performed by arranging the vertices as a binary tree.
2. Algorithm
Implementing a CNN on graphs requires the following ingredients:
- (1) the design of localized convolutional filters on graphs
- (2) a grouping of similar vertices
- (3) a graph pooling operation
2.1 Learning localized spectral filters
K-th order approximation and ChebNet
Chebyshev polynomials and their applications (background reading):
http://www.docin.com/p-524643989.html
http://bookshelf.docin.com/p-1702466525.html
https://www.ixueshu.com/document/2c40f2ae055f2240318947a18e7f9386.html
https://wenku.baidu.com/view/c9566591f56527d3240c844769eae009591ba2ce.html
https://wenku.baidu.com/view/544dab7502768e9951e73842.html
The ChebNet authors point out two major drawbacks of the original spectral graph convolution:
① The filters are global and heavily parameterized (a filter is as large as the input signal, with one parameter per graph vertex).
② The convolution is expensive to evaluate (it relies on an eigendecomposition of the Laplacian, which has a high computational cost).
First, to overcome the "large" filters, ChebNet approximates the graph convolution with a polynomial expansion, i.e. the parameterized frequency response is approximated by a polynomial:
$$g_\theta(\Lambda) \approx \sum_{k=0}^{K-1} \theta_k \Lambda^k$$
Here K is the highest polynomial order. This approximation reduces the number of filter parameters from n down to K, making the filter "small" and localized. The full convolution, however, still costs O(n^2), because the input signal must be multiplied by the Fourier basis U. To cut this cost further, the authors use the recursively defined Chebyshev polynomials as the approximation and show that the complexity drops to O(K|E|), where K is the polynomial order and |E| the number of edges. The filter defined with Chebyshev polynomials is called the Chebyshev filter, and the corresponding operation the Chebyshev convolution; they are defined as follows:
$$g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K-1} \theta'_k\, T_k(\tilde{\Lambda}), \qquad \tilde{\Lambda} = \frac{2\Lambda}{\lambda_{max}} - I_n$$

$$y = g_{\theta'}(L)\, x \approx \sum_{k=0}^{K-1} \theta'_k\, T_k(\tilde{L})\, x, \qquad \tilde{L} = \frac{2L}{\lambda_{max}} - I_n$$
Here \lambda_{max} is the largest eigenvalue of the Laplacian, and the prime on \theta' marks these as the approximate (Chebyshev) parameters. From the substituted and simplified expression one can see that the Chebyshev polynomial terms are fixed, so they can be handled in a preprocessing stage; and because the Laplacian is usually sparse, the products can be accelerated with sparse matrix multiplication, which greatly reduces the cost of the whole graph convolution. Besides the Chebyshev graph convolution, the ChebNet authors also propose a graph pooling layer.
Derivation:
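The derivation figures from the original post are missing; the following is a compact reconstruction of the standard argument, using the paper's notation (L = UΛU^T is the normalized Laplacian with Fourier basis U):

$$y = g_\theta(L)\,x = U\, g_\theta(\Lambda)\, U^T x = U\Big(\sum_{k=0}^{K-1}\theta_k \Lambda^k\Big)U^T x = \sum_{k=0}^{K-1}\theta_k\, L^k x$$

so the Fourier basis U cancels and no eigendecomposition is needed. Replacing the monomials L^k by Chebyshev polynomials of the rescaled Laplacian \tilde{L} = 2L/\lambda_{max} - I_n (whose spectrum lies in [-1, 1]) gives

$$y \approx \sum_{k=0}^{K-1}\theta'_k\, T_k(\tilde{L})\, x, \qquad \bar{x}_k = T_k(\tilde{L})\,x = 2\tilde{L}\,\bar{x}_{k-1} - \bar{x}_{k-2}, \quad \bar{x}_0 = x,\ \bar{x}_1 = \tilde{L}x$$

Each recurrence step is one sparse matrix-vector product, which is where the O(K|E|) complexity comes from.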
Power iteration for eigenvalues and eigenvectors
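The original illustration is missing; below is a minimal NumPy sketch (not taken from the paper's code) of power iteration, which estimates the dominant eigenpair of a symmetric matrix such as the Laplacian and can be used, for example, to estimate \lambda_{max} for the rescaling above. The function name power_iteration is introduced here for illustration only.

import numpy as np

def power_iteration(A, num_iters=200, tol=1e-10):
    """Estimate the dominant eigenvalue/eigenvector of a symmetric matrix A."""
    v = np.random.rand(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(num_iters):
        w = A @ v                      # one matrix-vector product per iteration
        lam_new = v @ w                # Rayleigh quotient estimate of the eigenvalue
        v = w / np.linalg.norm(w)
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam, v

# Tiny check: the unnormalized Laplacian of the 3-vertex path graph has eigenvalues 0, 1, 3.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
lam_max, _ = power_iteration(L)
print(lam_max)   # approaches 3.0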
Learning the filter: the filter parameters are simply learned by gradient descent (backpropagation):
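In the paper the coefficients θ are trained by backpropagating the network loss; as a stand-alone illustration (this is not the paper's loss or code), the sketch below fits Chebyshev coefficients to a target filtered signal by plain gradient descent. The helper cheb_basis and all other names here are introduced for this example only.

import numpy as np

def cheb_basis(L_tilde, x, K):
    """Stack the vectors T_k(L_tilde) x, k = 0..K-1, computed with the Chebyshev recurrence."""
    Xt = [x, L_tilde @ x]
    for _ in range(2, K):
        Xt.append(2 * L_tilde @ Xt[-1] - Xt[-2])
    return np.stack(Xt[:K], axis=1)                 # shape (n, K)

rng = np.random.default_rng(0)
n, K = 8, 4
A = rng.random((n, n))
A = (A + A.T) / 2
np.fill_diagonal(A, 0)
L = np.diag(A.sum(axis=1)) - A                      # combinatorial Laplacian of a random weighted graph
lmax = np.linalg.eigvalsh(L).max()
L_tilde = 2 * L / lmax - np.eye(n)                  # rescale the spectrum into [-1, 1]

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                              # input signal
theta_true = np.array([1.0, -0.5, 0.25, 0.1])
B = cheb_basis(L_tilde, x, K)                       # fixed matrix of filtered signals T_k(L_tilde) x
y = B @ theta_true                                  # target output of the unknown filter

theta = np.zeros(K)
for _ in range(2000):                               # gradient descent on 0.5 * ||B @ theta - y||^2
    grad = B.T @ (B @ theta - y)
    theta -= 0.1 * grad
print(np.round(theta, 3))                           # converges toward theta_true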
Application example

Solving with the Chebyshev recurrence:
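The worked figures from the original post are missing; here is a small hand-computed example on the 3-vertex path graph (assuming the unnormalized, combinatorial Laplacian, whose largest eigenvalue is 3):

$$L = \begin{pmatrix} 1 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 1 \end{pmatrix}, \qquad \lambda_{max} = 3, \qquad \tilde{L} = \tfrac{2}{3}L - I = \begin{pmatrix} -\tfrac{1}{3} & -\tfrac{2}{3} & 0\\ -\tfrac{2}{3} & \tfrac{1}{3} & -\tfrac{2}{3}\\ 0 & -\tfrac{2}{3} & -\tfrac{1}{3} \end{pmatrix}$$

For an impulse at vertex 1, x = (1, 0, 0)^T, the recurrence gives

$$\bar{x}_0 = x = (1, 0, 0)^T, \qquad \bar{x}_1 = \tilde{L}x = \left(-\tfrac{1}{3}, -\tfrac{2}{3}, 0\right)^T, \qquad \bar{x}_2 = 2\tilde{L}\bar{x}_1 - \bar{x}_0 = \left(\tfrac{1}{9}, 0, \tfrac{8}{9}\right)^T$$

and the filtered signal is y = θ'_0 \bar{x}_0 + θ'_1 \bar{x}_1 + θ'_2 \bar{x}_2. Note that vertex 3, which is two hops away from vertex 1, receives a non-zero contribution only from the k = 2 term: a filter of order K reaches exactly K hops.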
2.2 Graph pooling
Graph coarsening
Traditional approach to graph pooling: spectral clustering.
Method used in this paper: the Graclus multilevel clustering algorithm.
Weighted graph cuts without eigenvectors: a multilevel approach
Also noted by the authors:
Algebraic multigrid techniques on graphs and the Kron reduction are two methods worth exploring in future works.
Fast pooling
The coarsening above produces a sequence of coarser graphs; the vertices are then reordered and arranged as a balanced binary tree.
[Figure from the paper: the sequence of coarsened graphs and the corresponding balanced binary tree, flattened into a 1D signal.]
Each coarsening level yields a coarsened graph (G_0, G_1, G_2, G_3, G_4). Fake auxiliary vertices (the blue circles in the upper-right part of the figure) are added so that every vertex has exactly two children, turning the hierarchy into a binary tree; flattening the tree gives a 1D signal (lower-right part of the figure), and sampling that signal implements one graph pooling step.
- Consider a max pooling of size 4 (or two poolings of size 2) of a signal x ∈ R^8 living on G_0, the finest graph, given as input. It initially has n_0 = |V_0| = 8 vertices, arbitrarily ordered.
- A pooling of size 4 requires two coarsenings of size 2: Graclus gives G_1 of size n_1 = |V_1| = 5, and then G_2 of size n_2 = |V_2| = 3, the coarsest graph.
- The sizes are therefore set to n_2 = 3, n_1 = 6, n_0 = 12, and fake nodes (blue) are added to V_1 (1 node) and V_0 (4 nodes) to pair with the singletons (orange), so that every node has exactly two children.
- Nodes in V_2 are ordered arbitrarily, and nodes in V_1 and V_0 are then ordered accordingly. At that point the arrangement of V_0 permits a regular 1D pooling on x ∈ R^{12} such that z = [max(x_0, x_1), max(x_4, x_5, x_6), max(x_8, x_9, x_{10})] ∈ R^3, where the signal components x_2, x_3, x_7, x_{11} are set to a neutral value (a minimal NumPy sketch of this last step follows below).
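A minimal NumPy sketch of that last 1D pooling step (this is not the library code): the four fake vertices are set to -inf here so they never win the max, whereas the library uses 0 as the neutral value for the non-negative MNIST data.

import numpy as np

# Reordered signal on G_0 after adding the 4 fake vertices (12 values in total).
x = np.array([0.3, 0.9,            # real x_0, x_1
              -np.inf, -np.inf,    # fake x_2, x_3
              0.2, 0.7, 0.5,       # real x_4, x_5, x_6
              -np.inf,             # fake x_7
              0.1, 0.8, 0.4,       # real x_8, x_9, x_10
              -np.inf])            # fake x_11

# One graph pooling of size 4 is just a regular 1D max pooling with stride 4.
z = x.reshape(3, 4).max(axis=1)
print(z)   # [0.9 0.7 0.8] == [max(x_0, x_1), max(x_4, x_5, x_6), max(x_8, x_9, x_10)]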
The full GCN pipeline
# -*- coding: utf-8 -*-
import sys, os
sys.path.insert(0, '..')
from lib import models, graph, coarsening, utils
import tensorflow.compat.v1 as tf
import numpy as np
import time
# %matplotlib inline

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_integer('number_edges', 8, 'Graph: minimum number of edges per vertex.')
flags.DEFINE_string('metric', 'euclidean', 'Graph: similarity measure (between features).')  # similarity metric
flags.DEFINE_bool('normalized_laplacian', True, 'Graph Laplacian: normalized.')
flags.DEFINE_integer('coarsening_levels', 4, 'Number of coarsened graphs.')
flags.DEFINE_string('dir_data', os.path.join('..', 'data', 'mnist'), 'Directory to store data.')

"""# Feature graph: describe the graph structure, i.e. prepare the Laplacian L of the adjacency matrix A"""

def grid_graph(m, corners=False):
    z = graph.grid(m)  # simply generates the coordinates of the points of an m x m (28 x 28) grid
    dist, idx = graph.distance_sklearn_metrics(z, k=FLAGS.number_edges, metric=FLAGS.metric)  # k nearest neighbors of every vertex
    A = graph.adjacency(dist, idx)  # build the adjacency matrix A that represents the graph
    # Connections are only vertical or horizontal on the grid.
    # Corner vertices are connected to 2 neighbors only.
    if corners:
        import scipy.sparse
        A = A.toarray()
        A[A < A.max() / 1.5] = 0
        A = scipy.sparse.csr_matrix(A)
        print('{} edges'.format(A.nnz))
    # A.nnz is the number of non-zero entries of the matrix.
    print("{} > {} edges".format(A.nnz // 2, FLAGS.number_edges * m ** 2 // 2))
    return A

t_start = time.process_time()
A = grid_graph(28, corners=False)     # adjacency matrix
A = graph.replace_random_edges(A, 0)  # adjacency matrix with randomly replaced (noise) edges; noise level 0 here
# Graph coarsening.
graphs, perm = coarsening.coarsen(A, levels=FLAGS.coarsening_levels, self_connections=False)  # coarsened graphs
# graphs is a list: one (sparse) adjacency matrix per coarsening level, stored as vertex index pairs plus distance weights.
# perm is the list of vertex indices of the finest level; the image pixels must be reordered
# according to it so that they match the multilevel coarsening.
# print('type of graphs:', type(graphs), '\nlen(graphs):', len(graphs), '\ngraphs:\n', graphs, '\ngraphs[0]:', graphs[0])
# print('type of perm:', type(perm), '\nlen(perm):', len(perm), '\nperm:\n', perm)
L = [graph.laplacian(A, normalized=True) for A in graphs]  # Laplacian of the adjacency matrix of every coarsening level
print('Execution time: {:.2f}s'.format(time.process_time() - t_start))
# graph.plot_spectrum(L)
del A

"""# Data preparation"""
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(FLAGS.dir_data, one_hot=False)

train_data = mnist.train.images.astype(np.float32)
# print('train_data shape:', train_data.shape)  # (55000, 784)
val_data = mnist.validation.images.astype(np.float32)
# print('val_data shape:', val_data.shape)  # (5000, 784)
test_data = mnist.test.images.astype(np.float32)
# print('test_data shape:', test_data.shape)  # (10000, 784)
train_labels = mnist.train.labels
# print('train_labels shape:', train_labels.shape)  # (55000,)
val_labels = mnist.validation.labels
# print('val_labels shape:', val_labels.shape)  # (5000,)
test_labels = mnist.test.labels
# print('test_labels shape:', test_labels.shape)  # (10000,)

t_start = time.process_time()
train_data = coarsening.perm_data(train_data, perm)  # reorder the pixels to match the first (finest) coarsening level
val_data = coarsening.perm_data(val_data, perm)
test_data = coarsening.perm_data(test_data, perm)
print('Execution time: {:.2f}s'.format(time.process_time() - t_start))
del perm

"""# Neural networks"""
common = {}
common['dir_name'] = 'mnist/'
common['num_epochs'] = 20
common['batch_size'] = 100
common['decay_steps'] = mnist.train.num_examples / common['batch_size']
common['eval_frequency'] = 30 * common['num_epochs']
common['brelu'] = 'b1relu'
common['pool'] = 'mpool1'
C = max(mnist.train.labels) + 1  # number of classes

model_perf = utils.model_perf()  # helper that collects and compares the results of several models

# Parameters for LeNet5-like networks.
common['regularization'] = 5e-4
common['dropout'] = 0.5
common['learning_rate'] = 0.02  # 0.03 in the paper but sgconv_sgconv_fc_softmax has difficulty to converge
common['decay_rate'] = 0.95
common['momentum'] = 0.9
common['F'] = [32, 64]   # number of output features of each convolution
common['K'] = [25, 25]   # number of polynomial terms (support) of each filter
common['p'] = [4, 4]     # pooling size after each convolution (one pooling per convolution)
common['M'] = [512, C]   # output sizes of the fully connected layers

# Architecture of TF MNIST conv model (LeNet-5-like).
if True:
    name = 'Chebyshev_test'  # 'Chebyshev'
    params = common.copy()
    params['dir_name'] += name
    params['filter'] = 'chebyshev5'  # use the filter built by chebyshev5
    model_perf.test(models.cgcnn(L, **params), name, params,
                    train_data, train_labels, val_data, val_labels, test_data, test_labels)
model_perf.show()
print('end')

model = models.cgcnn(L, **params)
accuracy, loss, t_step = model.fit(train_data, train_labels, val_data, val_labels)
'''
# Alternatively, use the code below to train directly without printing the comparison of evaluation results.
model = models.cgcnn(L, **params)
accuracy, loss, t_step = model.fit(train_data, train_labels, val_data, val_labels)
'''
Annotated excerpts from graph.py
import sklearn.metrics
import sklearn.neighbors
import matplotlib.pyplot as plt
import scipy.sparse
import scipy.sparse.linalg
import scipy.spatial.distance
import numpy as np
def grid(m, dtype=np.float32):
"""Return the embedding of a grid graph."""
# print('m passed in to build the grid:', m) # 28
M = m**2
x = np.linspace(0, 1, m, dtype=dtype)
# print('x shape:', x.shape) # (28,), values spread over [0, 1]
y = np.linspace(0, 1, m, dtype=dtype)
# print('y shape:', y.shape) # (28,), values spread over [0, 1]
xx, yy = np.meshgrid(x, y)
# print('xx shape:', xx.shape) # the 28 values repeated 28 times: a grid of 784 points
# print('yy shape:', yy.shape) # the 28 values repeated 28 times: a grid of 784 points
z = np.empty((M, 2), dtype)
z[:, 0] = xx.reshape(M)
z[:, 1] = yy.reshape(M)
# print('z shape:', z.shape) # (784, 2): the flattened grid, one (x, y) coordinate pair per point
return z
def distance_scipy_spatial(z, k=4, metric='euclidean'):
"""Compute exact pairwise distances."""
d = scipy.spatial.distance.pdist(z, metric)
d = scipy.spatial.distance.squareform(d)
# k-NN graph.
# argsort() sorts the elements in ascending order and returns their indices.
idx = np.argsort(d)[:, 1:k+1]
d.sort()
d = d[:, 1:k+1]
return d, idx
# Compute the k nearest neighbors of every vertex.
def distance_sklearn_metrics(z, k=4, metric='euclidean'):
"""Compute exact pairwise distances."""
# print('z shape:', z.shape) # (784, 2): the coordinates of the 784 points passed in
d = sklearn.metrics.pairwise.pairwise_distances(
z, metric=metric, n_jobs=-2)
# print('d shape:', d.shape) # (784, 784): distance of every point to every other point; symmetric, zero diagonal
# k-NN graph.
idx = np.argsort(d)[:, 1:k+1] # sort each row ascending and keep the indices of the k nearest nodes; start at 1 to exclude the node itself
# print('idx type:', type(idx), 'len(idx):', len(idx)) # numpy.ndarray, 784 rows of 8 neighbor indices
d.sort()
d = d[:, 1:k+1]
# print('d shape:', d.shape) # (784, 8): distances to the k nearest neighbors of each node
# print('len(idx):', len(idx)) # 784 rows of 8 neighbor indices
return d, idx
def distance_lshforest(z, k=4, metric='cosine'):
"""Return an approximation of the k-nearest cosine distances."""
assert metric == 'cosine'
lshf = sklearn.neighbors.LSHForest()
lshf.fit(z)
dist, idx = lshf.kneighbors(z, n_neighbors=k+1)
assert dist.min() < 1e-10
dist[dist < 0] = 0
return dist, idx
# TODO: other ANNs s.a. NMSLIB, EFANNA, FLANN, Annoy, sklearn neighbors, PANN
# Build the adjacency matrix from dist (distance matrix of the k nearest neighbors of every vertex)
# and idx (index list of the k nearest neighbors of every vertex).
# dist: 784 x 8 array
# idx: 784 x 8 array
def adjacency(dist, idx):
"""Return the adjacency matrix of a kNN graph."""
M, k = dist.shape
# print('M:', M) # 784
# print('k:', k) # 8
assert (M, k) == idx.shape
assert dist.min() >= 0
# Weights.
sigma2 = np.mean(dist[:, -1])**2
# print('sigma2:', sigma2) # square of the mean distance to the farthest kept neighbor, used as the Gaussian kernel width
dist = np.exp(- dist**2 / sigma2)
# print('dist shape:', dist.shape) # distances mapped to Gaussian edge weights
# Weight matrix.
I = np.arange(0, M).repeat(k) # row indices: each of the 784 vertices repeated k times; 6272 = 784*8
# print('I shape:', I.shape)
J = idx.reshape(M*k) # 6272 = 784*8
# print('J shape:', J.shape)
V = dist.reshape(M*k) # 6272 = 784*8
# print('V shape:', V.shape)
W = scipy.sparse.coo_matrix((V, (I, J)), shape=(M, M)) # build the adjacency matrix from the vertex index pairs (I, J) and the weights V
# print('W shape:', W.shape)
# No self-connections.
W.setdiag(0) # ignore self-connections: set the diagonal to 0
# Non-directed graph.
bigger = W.T > W
W = W - W.multiply(bigger) + W.T.multiply(bigger)
assert W.nnz % 2 == 0
assert np.abs(W - W.T).mean() < 1e-10
assert type(W) is scipy.sparse.csr.csr_matrix
# print('W shape:', W.shape) # (784, 784)
return W
# Adjacency matrix with randomly replaced (noise) edges.
def replace_random_edges(A, noise_level):
"""Replace randomly chosen edges by random edges."""
M, M = A.shape # 784,784
n = int(noise_level * A.nnz // 2)
# print(n) # noise_level is passed as 0 here, so n = 0
indices = np.random.permutation(A.nnz//2)[:n] # np.random.permutation(): random shuffle; with noise_level 0, indices = []
rows = np.random.randint(0, M, n) # empty when noise_level is 0
cols = np.random.randint(0, M, n) # empty when noise_level is 0
vals = np.random.uniform(0, 1, n) # empty when noise_level is 0
assert len(indices) == len(rows) == len(cols) == len(vals)
A_coo = scipy.sparse.triu(A, format='coo') # scipy.sparse.triu(A, k=0, format=None): upper-triangular part of the matrix in COO sparse format
# print('A_coo shape:', A_coo.shape) # (784, 784) upper-triangular
# print(A_coo) # stored as (row, col) vertex index pairs plus the distance weight
assert A_coo.nnz == A.nnz // 2
assert A_coo.nnz >= n
A = A.tolil() # convert to LIL format before element-wise edits of the sparse matrix, for efficiency
for idx, row, col, val in zip(indices, rows, cols, vals):
old_row = A_coo.row[idx]
old_col = A_coo.col[idx]
A[old_row, old_col] = 0
A[old_col, old_row] = 0
A[row, col] = 1
A[col, row] = 1
A.setdiag(0) # set the diagonal to 0: self-connections are not considered
A = A.tocsr() # convert back to CSR format
A.eliminate_zeros()
return A
def laplacian(W, normalized=True):
"""Return the Laplacian of the weigth matrix."""
# Degree matrix.
d = W.sum(axis=0) # degree of every vertex: sum of the edge weights of its row
# print('d shape:', d.shape) # one degree value per vertex
# Laplacian matrix.
if not normalized:
D = scipy.sparse.diags(d.A.squeeze(), 0)
L = D - W
else:
d += np.spacing(np.array(0, W.dtype))
d = 1 / np.sqrt(d)
# print(d)
D = scipy.sparse.diags(d.A.squeeze(), 0) # diagonal matrix built from 1/sqrt(degree), i.e. D^{-1/2}
# print('D shape:', D.shape)
# print('D:', D)
I = scipy.sparse.identity(d.size, dtype=W.dtype)
L = I - D * W * D
# assert np.abs(L - L.T).mean() < 1e-9
assert type(L) is scipy.sparse.csr.csr_matrix
return L
def lmax(L, normalized=True):
"""Upper-bound on the spectrum."""
if normalized:
return 2
else:
return scipy.sparse.linalg.eigsh(
L, k=1, which='LM', return_eigenvectors=False)[0]
def fourier(L, algo='eigh', k=1):
"""Return the Fourier basis, i.e. the EVD of the Laplacian."""
def sort(lamb, U):
idx = lamb.argsort()
return lamb[idx], U[:, idx]
if algo == 'eig':
lamb, U = np.linalg.eig(L.toarray())
lamb, U = sort(lamb, U)
elif algo == 'eigh':
lamb, U = np.linalg.eigh(L.toarray())
elif algo == 'eigs':
lamb, U = scipy.sparse.linalg.eigs(L, k=k, which='SM')
lamb, U = sort(lamb, U)
elif algo == 'eigsh':
lamb, U = scipy.sparse.linalg.eigsh(L, k=k, which='SM')
return lamb, U
def plot_spectrum(L, algo='eig'):
"""Plot the spectrum of a list of multi-scale Laplacians L."""
# Algo is eig to be sure to get all eigenvalues.
plt.figure(figsize=(17, 5))
for i, lap in enumerate(L):
lamb, U = fourier(lap, algo)
step = 2**i
x = range(step//2, L[0].shape[0], step)
lb = 'L_{} spectrum in [{:1.2e}, {:1.2e}]'.format(i, lamb[0], lamb[-1])
plt.plot(x, lamb, '.', label=lb)
plt.legend(loc='best')
plt.xlim(0, L[0].shape[0])
plt.ylim(ymin=0)
def lanczos(L, X, K):
"""
Given the graph Laplacian and a data matrix, return a data matrix which can
be multiplied by the filter coefficients to filter X using the Lanczos
polynomial approximation.
"""
M, N = X.shape
assert L.dtype == X.dtype
def basis(L, X, K):
"""
Lanczos algorithm which computes the orthogonal matrix V and the
tri-diagonal matrix H.
"""
a = np.empty((K, N), L.dtype)
b = np.zeros((K, N), L.dtype)
V = np.empty((K, M, N), L.dtype)
V[0, ...] = X / np.linalg.norm(X, axis=0)
for k in range(K-1):
W = L.dot(V[k, ...])
a[k, :] = np.sum(W * V[k, ...], axis=0)
W = W - a[k, :] * V[k, ...] - (
b[k, :] * V[k-1, ...] if k > 0 else 0)
b[k+1, :] = np.linalg.norm(W, axis=0)
V[k+1, ...] = W / b[k+1, :]
a[K-1, :] = np.sum(L.dot(V[K-1, ...]) * V[K-1, ...], axis=0)
return V, a, b
def diag_H(a, b, K):
"""Diagonalize the tri-diagonal H matrix."""
H = np.zeros((K*K, N), a.dtype)
H[:K**2:K+1, :] = a
H[1:(K-1)*K:K+1, :] = b[1:, :]
H.shape = (K, K, N)
Q = np.linalg.eigh(H.T, UPLO='L')[1]
Q = np.swapaxes(Q, 1, 2).T
return Q
V, a, b = basis(L, X, K)
Q = diag_H(a, b, K)
Xt = np.empty((K, M, N), L.dtype)
for n in range(N):
Xt[..., n] = Q[..., n].T.dot(V[..., n])
Xt *= Q[0, :, np.newaxis, :]
Xt *= np.linalg.norm(X, axis=0)
return Xt # Q[0, ...]
def rescale_L(L, lmax=2):
"""Rescale the Laplacian eigenvalues in [-1,1]."""
M, M = L.shape
I = scipy.sparse.identity(M, format='csr', dtype=L.dtype)
L /= lmax / 2
L -= I
return L
def chebyshev(L, X, K):
"""Return T_k X where T_k are the Chebyshev polynomials of order up to K.
Complexity is O(KMN)."""
M, N = X.shape
assert L.dtype == X.dtype
# L = rescale_L(L, lmax)
# Xt = T @ X: MxM @ MxN.
Xt = np.empty((K, M, N), L.dtype)
# Xt_0 = T_0 X = I X = X.
Xt[0, ...] = X
# Xt_1 = T_1 X = L X.
if K > 1:
Xt[1, ...] = L.dot(X)
# Xt_k = 2 L Xt_k-1 - Xt_k-2.
for k in range(2, K):
Xt[k, ...] = 2 * L.dot(Xt[k-1, ...]) - Xt[k-2, ...]
return Xt
import numpy as np
import scipy.sparse
def coarsen(A, levels, self_connections=False):
"""
Coarsen a graph, represented by its adjacency matrix A, at multiple
levels.
"""
# Coarsening: compute the coarsened graphs and the parent of every vertex.
graphs, parents = metis(A, levels)
# Compute the vertex permutations that turn the hierarchy into a binary tree.
perms = compute_perm(parents)
for i, A in enumerate(graphs):
M, M = A.shape
if not self_connections:
A = A.tocoo()
A.setdiag(0)
if i < levels:
A = perm_adjacency(A, perms[i])
A = A.tocsr()
A.eliminate_zeros()
graphs[i] = A
Mnew, Mnew = A.shape
print('Layer {0}: M_{0} = |V| = {1} nodes ({2} added),'
'|E| = {3} edges'.format(i, Mnew, Mnew-M, A.nnz//2))
return graphs, perms[0] if levels > 0 else None
def metis(W, levels, rid=None):
"""
Coarsen a graph multiple times using the METIS algorithm.
INPUT
W: symmetric sparse weight (adjacency) matrix
levels: the number of coarsened graphs
OUTPUT
graph[0]: original graph of size N_1
graph[2]: coarser graph of size N_2 < N_1
graph[levels]: coarsest graph of Size N_levels < ... < N_2 < N_1
parents[i] is a vector of size N_i with entries ranging from 1 to N_{i+1}
which indicate the parents in the coarser graph[i+1]
nd_sz{i} is a vector of size N_i that contains the size of the supernode in the graph{i}
NOTE
if "graph" is a list of length k, then "parents" will be a list of length k-1
"""
N, N = W.shape
if rid is None:
rid = np.random.permutation(range(N))
parents = []
#Degree matrix
degree = W.sum(axis=0) - W.diagonal()
graphs = []
graphs.append(W)
#supernode_size = np.ones(N)
#nd_sz = [supernode_size]
#count = 0
#while N > maxsize:
for _ in range(levels):
#count += 1
# CHOOSE THE WEIGHTS FOR THE PAIRING
# weights = ones(N,1) # metis weights
weights = degree # graclus weights
# weights = supernode_size # other possibility
# squeeze(): remove single-dimensional entries from the array shape, i.e. drop axes whose size is 1.
weights = np.array(weights).squeeze()
# PAIR THE VERTICES AND CONSTRUCT THE ROOT VECTOR
idx_row, idx_col, val = scipy.sparse.find(W)
perm = np.argsort(idx_row)
rr = idx_row[perm]
cc = idx_col[perm]
vv = val[perm]
cluster_id = metis_one_level(rr,cc,vv,rid,weights) # rr is ordered
parents.append(cluster_id)
# TO DO
# COMPUTE THE SIZE OF THE SUPERNODES AND THEIR DEGREE
#supernode_size = full( sparse(cluster_id, ones(N,1) , supernode_size ) )
#print(cluster_id)
#print(supernode_size)
#nd_sz{count+1}=supernode_size;
# COMPUTE THE EDGES WEIGHTS FOR THE NEW GRAPH
nrr = cluster_id[rr]
ncc = cluster_id[cc]
nvv = vv
Nnew = cluster_id.max() + 1
# CSR is more appropriate: row,val pairs appear multiple times
W = scipy.sparse.csr_matrix((nvv,(nrr,ncc)), shape=(Nnew,Nnew))
W.eliminate_zeros()
# Add new graph to the list of all coarsened graphs
graphs.append(W)
N, N = W.shape
# COMPUTE THE DEGREE (OMIT OR NOT SELF LOOPS)
degree = W.sum(axis=0)
#degree = W.sum(axis=0) - W.diagonal()
# CHOOSE THE ORDER IN WHICH VERTICES WILL BE VISTED AT THE NEXT PASS
#[~, rid]=sort(ss); # arthur strategy
#[~, rid]=sort(supernode_size); # thomas strategy
#rid=randperm(N); # metis/graclus strategy
ss = np.array(W.sum(axis=0)).squeeze()
rid = np.argsort(ss)
return graphs, parents
# Coarsen a graph given by rr,cc,vv. rr is assumed to be ordered
def metis_one_level(rr,cc,vv,rid,weights):
nnz = rr.shape[0]
N = rr[nnz-1] + 1
marked = np.zeros(N, dtype=bool)
rowstart = np.zeros(N, np.int32)
rowlength = np.zeros(N, np.int32)
cluster_id = np.zeros(N, np.int32)
oldval = rr[0]
count = 0
clustercount = 0
# Count, for every vertex, the number of entries (neighbors) in its row.
for ii in range(nnz):
rowlength[count] = rowlength[count] + 1
if rr[ii] > oldval:
oldval = rr[ii]
rowstart[count+1] = ii
count = count + 1
# rid is a random permutation that shuffles the order in which the vertices are visited.
for ii in range(N):
tid = rid[ii]
# Following the Graclus algorithm, marked records which vertices have already been matched;
# for an unmarked vertex x we look for an unmarked neighbor y to pair it with.
if not marked[tid]:
wmax = 0.0
rs = rowstart[tid]
marked[tid] = True
bestneighbor = -1
# rowlength[tid] is the number of neighbors of this vertex;
# the neighbor maximizing e(x,y) * (1.0/weight(x) + 1.0/weight(y)) is chosen.
for jj in range(rowlength[tid]):
nid = cc[rs+jj]
if marked[nid]:
tval = 0.0
else:
tval = vv[rs+jj] * (1.0/weights[tid] + 1.0/weights[nid])
if tval > wmax:
wmax = tval
bestneighbor = nid
cluster_id[tid] = clustercount
if bestneighbor > -1:
cluster_id[bestneighbor] = clustercount
marked[bestneighbor] = True
clustercount += 1
return cluster_id
def compute_perm(parents):
"""
Return a list of indices to reorder the adjacency and data matrices so
that the union of two neighbors from layer to layer forms a binary tree.
"""
# Order of last layer is random (chosen by the clustering algorithm).
indices = []
if len(parents) > 0:
M_last = max(parents[-1]) + 1
indices.append(list(range(M_last)))
# Iterate over the parent vectors in reverse order (from coarsest to finest).
for parent in parents[::-1]:
#print('parent: {}'.format(parent))
# Fake nodes go after real ones.
pool_singeltons = len(parent)
indices_layer = []
for i in indices[-1]:
indices_node = list(np.where(parent == i)[0])
assert 0 <= len(indices_node) <= 2
#print('indices_node: {}'.format(indices_node))
# Add a node to go with a singelton.
if len(indices_node) == 1:
indices_node.append(pool_singeltons)
pool_singeltons += 1
#print('new singelton: {}'.format(indices_node))
# Add two nodes as children of a singelton in the parent.
elif len(indices_node) == 0:
indices_node.append(pool_singeltons+0)
indices_node.append(pool_singeltons+1)
pool_singeltons += 2
#print('singelton childrens: {}'.format(indices_node))
indices_layer.extend(indices_node)
indices.append(indices_layer)
# Sanity checks.
for i,indices_layer in enumerate(indices):
M = M_last*2**i
# Reduction by 2 at each layer (binary tree).
assert len(indices_layer) == M
# The new ordering does not omit an indice.
assert sorted(indices_layer) == list(range(M))
return indices[::-1]
assert (compute_perm([np.array([4,1,1,2,2,3,0,0,3]),np.array([2,1,0,1,0])])
== [[3,4,0,9,1,2,5,8,6,7,10,11],[2,4,1,3,0,5],[0,1,2]])
def perm_data(x, indices):
"""
Permute data matrix, i.e. exchange node ids,
so that binary unions form the clustering tree.
"""
if indices is None:
return x
N, M = x.shape
Mnew = len(indices)
assert Mnew >= M
xnew = np.empty((N, Mnew))
for i,j in enumerate(indices):
# Existing vertex, i.e. real data.
if j < M:
xnew[:,i] = x[:,j]
# Fake vertex because of singeltons.
# They will stay 0 so that max pooling chooses the singelton.
# Or -infty ?
else:
xnew[:,i] = np.zeros(N)
return xnew
def perm_adjacency(A, indices):
"""
Permute adjacency matrix, i.e. exchange node ids,
so that binary unions form the clustering tree.
"""
if indices is None:
return A
M, M = A.shape
Mnew = len(indices)
assert Mnew >= M
A = A.tocoo()
# Add Mnew - M isolated vertices.
if Mnew > M:
rows = scipy.sparse.coo_matrix((Mnew-M, M), dtype=np.float32)
cols = scipy.sparse.coo_matrix((Mnew, Mnew-M), dtype=np.float32)
A = scipy.sparse.vstack([A, rows])
A = scipy.sparse.hstack([A, cols])
# Permute the rows and the columns.
perm = np.argsort(indices)
A.row = np.array(perm)[A.row]
A.col = np.array(perm)[A.col]
# assert np.abs(A - A.T).mean() < 1e-9
assert type(A) is scipy.sparse.coo.coo_matrix
return A

class cgcnn(base_model) in models.py
    def build_graph(self, M_0):
        """Build the computational graph of the model."""
        self.graph = tf.Graph()
        with self.graph.as_default():
            # Inputs.
            with tf.name_scope('inputs'):
                self.ph_data = tf.placeholder(tf.float32, (self.batch_size, M_0), 'data')
                self.ph_labels = tf.placeholder(tf.int32, (self.batch_size), 'labels')
                self.ph_dropout = tf.placeholder(tf.float32, (), 'dropout')
            # Model.
            op_logits = self.inference(self.ph_data, self.ph_dropout)  #!!!!!!#
            self.op_loss, self.op_loss_average = self.loss(op_logits, self.ph_labels, self.regularization)
            self.op_train = self.training(self.op_loss, self.learning_rate,
                                          self.decay_steps, self.decay_rate, self.momentum)
            self.op_prediction = self.prediction(op_logits)
            # Initialize variables, i.e. weights and biases.
            self.op_init = tf.global_variables_initializer()
            # Summaries for TensorBoard and Save for model parameters.
            self.op_summary = tf.summary.merge_all()
            self.op_saver = tf.train.Saver(max_to_keep=5)
        self.graph.finalize()
We take _inference of cgcnn as the example for analysis (the other classes in models.py implement the other models and configurations mentioned in the paper); cgcnn is the reference model proposed by the authors.
    def _inference(self, x, dropout):
        # Graph convolutional layers.
        x = tf.expand_dims(x, 2)  # N x M x F=1
        for i in range(len(self.p)):  # one convolution per pooling step (pooling and convolution come in pairs)
            with tf.variable_scope('conv{}'.format(i+1)):  # this scope is one graph convolution
                with tf.name_scope('filter'):  # the filtering step of the convolution
                    x = self.filter(x, self.L[i], self.F[i], self.K[i])
                with tf.name_scope('bias_relu'):  # bias and activation
                    x = self.brelu(x)
                with tf.name_scope('pooling'):  # pooling after the convolution
                    x = self.pool(x, self.p[i])
        # Fully connected hidden layers.
        N, M, F = x.get_shape()
        x = tf.reshape(x, [int(N), int(M*F)])  # N x M
        for i, M in enumerate(self.M[:-1]):
            with tf.variable_scope('fc{}'.format(i+1)):
                x = self.fc(x, M)
                x = tf.nn.dropout(x, dropout)
        # Logits linear layer, i.e. softmax without normalization.
        with tf.variable_scope('logits'):
            x = self.fc(x, self.M[-1], relu=False)
        return x
Unpacking the scope('filter') above: self.filter is bound to a concrete function via getattr(), for example to chebyshev2(). Here we take chebyshev2() as the example to explain how the filter is written (note: this function is the filter; it is the classic variant proposed by the authors and its behavior should match chebyshev5(), except that it is implemented with NumPy while chebyshev5 is implemented with TF; the NumPy version is easier to read).
    def chebyshev2(self, x, L, Fout, K):
        """
        Filtering with Chebyshev interpolation
        Implementation: numpy.
        Data: x of size N x M x F
            N: number of signals
            M: number of vertices
            F: number of features per signal per vertex
        """
        N, M, Fin = x.get_shape()
        N, M, Fin = int(N), int(M), int(Fin)
        # Rescale Laplacian. Copy to not modify the shared L.
        L = scipy.sparse.csr_matrix(L)
        L = graph.rescale_L(L, lmax=2)
        # Transform to Chebyshev basis.
        x = tf.transpose(x, perm=[1, 2, 0])  # M x Fin x N
        x = tf.reshape(x, [M, Fin*N])  # M x Fin*N: all features of all signals along one axis
        def chebyshev(x):  # apply the Chebyshev recurrence to X and return T_k X (py_func runs it on a numpy array)
            return graph.chebyshev(L, x, K)
        x = tf.py_func(chebyshev, [x], [tf.float32])[0]  # K x M x Fin*N
        x = tf.reshape(x, [K, M, Fin, N])  # K x M x Fin x N
        x = tf.transpose(x, perm=[3, 1, 2, 0])  # N x M x Fin x K
        x = tf.reshape(x, [N*M, Fin*K])  # N*M x Fin*K
        # Filter: Fin*Fout filters of order K, i.e. one filterbank per feature.
        W = self._weight_variable([Fin*K, Fout], regularization=False)  # multiply by the learned weights
        x = tf.matmul(x, W)  # N*M x Fout
        return tf.reshape(x, [N, M, Fout])  # N x M x Fout
As can be seen above, the Laplacian L is first rescaled (as a sparse matrix) and x is reshaped; graph.chebyshev() is then applied to run the iterative Chebyshev computation and obtain the T_k terms, i.e. the Chebyshev polynomials are used to compute T_k in order to approximate diag(λ^k) (the figures from the paper referenced here are not reproduced). Note also that the formula contains a sum over k while the code does not: the sum is folded into the weight multiplication by enlarging the weight matrix W from [Fin, Fout] to [Fin*K, Fout], so the whole operation becomes x @ W with shape(W) = [Fin*K, Fout].

def chebyshev(L, X, K):
    """Return T_k X where T_k are the Chebyshev polynomials of order up to K.
    Complexity is O(KMN).
    (From graph.py: returns T_k X, where T_k is the Chebyshev polynomial of order k, for k up to K.)"""
    M, N = X.shape
    assert L.dtype == X.dtype
    # L = rescale_L(L, lmax)
    # Xt = T @ X: MxM @ MxN.
    Xt = np.empty((K, M, N), L.dtype)
    # Xt_0 = T_0 X = I X = X.
    Xt[0, ...] = X
    # Xt_1 = T_1 X = L X.
    if K > 1:
        Xt[1, ...] = L.dot(X)
    # Xt_k = 2 L Xt_k-1 - Xt_k-2.
    for k in range(2, K):
        Xt[k, ...] = 2 * L.dot(Xt[k-1, ...]) - Xt[k-2, ...]
    return Xt
The chebyshev() above is written directly from the recurrence property of the Chebyshev polynomials. Once the T_k X terms are obtained and suitably reshaped, they are multiplied by the parameters θ to be learned and the result is returned; this completes one call of filter().
The pooling operation itself is not the authors' contribution; what it requires is the coarsening and reordering above, which reorganize the graph signal into a 1D signal.
In addition, the coarsened and reordered graphs are converted into Laplacian matrices and used in the convolution step.
    def mpool1(self, x, p):
        """Max pooling of size p. Should be a power of 2."""
        if p > 1:
            x = tf.expand_dims(x, 3)  # N x M x F x 1
            x = tf.nn.max_pool(x, ksize=[1, p, 1, 1], strides=[1, p, 1, 1], padding='SAME')
            # tf.maximum
            return tf.squeeze(x, [3])  # N x M/p x F
        else:
            return x

    def apool1(self, x, p):
        """Average pooling of size p. Should be a power of 2."""
        if p > 1:
            x = tf.expand_dims(x, 3)  # N x M x F x 1
            x = tf.nn.avg_pool(x, ksize=[1, p, 1, 1], strides=[1, p, 1, 1], padding='SAME')
            return tf.squeeze(x, [3])  # N x M/p x F
        else:
            return x


