
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering


  • 1. Main contributions
  • 2. Algorithm
    • 2.1 Learning localized spectral filters

      • K-th order approximation and ChebNet
    • 2.2 Graph pooling

      • Graph coarsening
      • Fast pooling
      • The full GCN pipeline

1. Main contributions

1. A spectral convolution formulation: a formal statement of CNNs on graphs based on spectral methods and graph signal processing (GSP).

2. Strictly localized filters: locality is controlled by a parameter K, meaning a filter covers the K-hop neighborhood of a node (i.e., vertices within distance K of it).

3. Low computational complexity: the cost is proportional to K and to the number of edges |E|. Most real-world graphs are highly sparse, so |E| is far below the square of the number of vertices; roughly |E| ≈ k·n, where k can be read as the number of nearest neighbors of a vertex. The complexity is therefore linear in the size of the input data. In addition, the method avoids computing the Fourier basis (which would have to be computed and stored); it only needs to store the Laplacian, a sparse matrix with |E| non-zero entries.

4. Efficient pooling: pooling is performed by arranging the vertices as a binary tree.

2. Algorithm

Implementing a CNN on graphs requires the following ingredients:

  • (1) the design of localized convolutional filters
  • (2) grouping of similar vertices (graph coarsening)
  • (3) a graph pooling operation

2.1 Learning localized spectral filters

K-th order approximation and ChebNet

Background reading on Chebyshev polynomials and their applications:
http://www.docin.com/p-524643989.html
http://bookshelf.docin.com/p-1702466525.html
https://www.ixueshu.com/document/2c40f2ae055f2240318947a18e7f9386.html
https://wenku.baidu.com/view/c9566591f56527d3240c844769eae009591ba2ce.html
https://wenku.baidu.com/view/544dab7502768e9951e73842.html
The ChebNet authors point out two major drawbacks of the original spectral graph convolution:
The filters are global and have many parameters (each filter is as large as the input signal, with as many parameters as graph nodes).
The convolution is expensive to evaluate (it requires an eigendecomposition of the Laplacian, which has high computational complexity).

First, to overcome the "large filter" problem, ChebNet approximates the graph convolution with a polynomial expansion, i.e. the parameterized frequency response is approximated by a polynomial:

$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k$$
Here K is the highest order of the polynomial. This approximation reduces the number of filter parameters from n to K, making the filter "small" and localized. The overall graph convolution, however, still costs O(n^2), because the input signal must be multiplied by the Fourier basis U. To reduce the complexity further, the authors approximate the filter with the recursively defined Chebyshev polynomials and show that the computation can be brought down to O(K|E|), where K is the polynomial order and |E| is the number of edges. Filters defined via Chebyshev polynomials are called Chebyshev filters, and the corresponding operation is the Chebyshev convolution:

$$g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K-1} \theta'_k\, T_k(\tilde{\Lambda}), \qquad \tilde{\Lambda} = \frac{2\Lambda}{\lambda_{max}} - I_n$$

Here \lambda_{max} is the largest eigenvalue of the Laplacian, and the prime on \theta' indicates that these are the parameters of the approximation. After substitution and simplification one can see that the Chebyshev polynomial terms are fixed and can be precomputed, and since the Laplacian is usually sparse the products can be accelerated with sparse matrix multiplication, so the overall cost of the graph convolution drops considerably. Besides the Chebyshev graph convolution, the ChebNet authors also propose a graph pooling layer.
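As a concrete illustration of why no eigendecomposition is needed, here is a minimal NumPy sketch (not the authors' implementation; the toy graph, signal and coefficients below are made up): the filtered signal y = Σ_k θ'_k T_k(L̃) x is obtained with the three-term recurrence T_k(L̃)x = 2 L̃ T_{k-1}(L̃)x − T_{k-2}(L̃)x, using only sparse matrix-vector products.

    import numpy as np
    import scipy.sparse
    import scipy.sparse.linalg

    def chebyshev_filter(L, x, theta, lmax=2.0):
        """Apply y = sum_k theta[k] * T_k(L_tilde) @ x via the Chebyshev recurrence.
        L: sparse Laplacian (n x n), x: signal (n,), theta: (K,) coefficients."""
        n = L.shape[0]
        I = scipy.sparse.identity(n, format='csr', dtype=x.dtype)
        L_tilde = (2.0 / lmax) * L - I           # rescale the spectrum into [-1, 1]
        Tx_prev, Tx = x, L_tilde @ x             # T_0 x = x, T_1 x = L_tilde x
        y = theta[0] * Tx_prev
        if len(theta) > 1:
            y = y + theta[1] * Tx
        for k in range(2, len(theta)):
            Tx_prev, Tx = Tx, 2 * (L_tilde @ Tx) - Tx_prev   # T_k = 2 L_tilde T_{k-1} - T_{k-2}
            y = y + theta[k] * Tx
        return y

    # Toy example: path graph with 4 nodes, K = 3 (hypothetical data).
    W = scipy.sparse.csr_matrix(np.array([[0, 1, 0, 0],
                                          [1, 0, 1, 0],
                                          [0, 1, 0, 1],
                                          [0, 0, 1, 0]], dtype=np.float32))
    D = scipy.sparse.diags(np.asarray(W.sum(axis=0)).squeeze())
    L = (D - W).tocsr()                          # unnormalized Laplacian
    x = np.array([1.0, 0.0, 0.0, 0.0], dtype=np.float32)
    theta = np.array([0.5, 0.3, 0.2], dtype=np.float32)
    lmax = scipy.sparse.linalg.eigsh(L, k=1, return_eigenvectors=False)[0]
    print(chebyshev_filter(L, x, theta, lmax=lmax))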
Derivation: the step-by-step derivation is given as figures in the original post and is omitted here.
Power iteration for eigenvalues and eigenvectors (illustrated with figures in the original post; a minimal sketch follows).
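A small NumPy sketch of power iteration (illustrative only; the matrix below is made up): repeatedly applying the matrix to a vector and renormalizing converges to the eigenvector of the largest-magnitude eigenvalue, which is one way to estimate a quantity such as \lambda_{max} of the Laplacian without a full eigendecomposition.

    import numpy as np

    def power_iteration(A, num_iters=100, tol=1e-8):
        """Estimate the dominant eigenvalue/eigenvector of a symmetric matrix A."""
        v = np.random.default_rng(0).normal(size=A.shape[0])
        v /= np.linalg.norm(v)
        lam = 0.0
        for _ in range(num_iters):
            w = A @ v
            lam_new = v @ w                      # Rayleigh quotient estimate
            v = w / np.linalg.norm(w)
            if abs(lam_new - lam) < tol:
                break
            lam = lam_new
        return lam, v

    # Hypothetical symmetric matrix.
    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    lam, v = power_iteration(A)
    print(lam)   # close to the largest eigenvalue (~3.618)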
Learning the filters: the filter coefficients θ are learned by gradient descent (the gradient expressions are given as figures in the original post).
Worked example (figures in the original post).
Solving the example with Chebyshev polynomials (figures in the original post).

2.2 Graph pooling

Graph coarsening

Traditional approach to graph pooling: spectral clustering.
Method used in this paper: the Graclus multilevel clustering algorithm
("Weighted graph cuts without eigenvectors: a multilevel approach").

In addition:
Algebraic multigrid techniques on graphs and the Kron reduction are two methods worth exploring in future work.

Fast pooling

The coarsening above produces successively coarser graphs; the vertices are then reordered so that they form a balanced binary tree.

(Figure from the paper: graph coarsening and pooling.)

Each coarsening level yields a coarsened graph (G_0, G_1, G_2, G_3, G_4). Fake vertices (the blue circles in the upper-right part of the figure) are added so that every vertex has exactly two children, turning the hierarchy into a binary tree. Flattening this binary tree gives a one-dimensional signal (lower-right part of the figure), and sampling that signal then amounts to one graph pooling operation.

  • Consider a max pooling of size 4 (or two poolings of size 2) of a signal x ∈ R^8 living on G_0, the finest graph given as input. It originally has n_0 = |V_0| = 8 vertices, arbitrarily ordered.
  • A pooling of size 4 requires two coarsenings of size 2: Graclus gives G_1 of size n_1 = |V_1| = 5 and G_2 of size n_2 = |V_2| = 3, the coarsest graph.
  • Sizes are therefore set to n_2 = 3, n_1 = 6, n_0 = 12, and fake nodes (blue) are added to V_1 (1 node) and V_0 (4 nodes) to pair with the singletons (orange), so that each node has exactly two children.
  • The nodes of V_2 are ordered arbitrarily, which induces an ordering of V_1 and then of V_0. At that point the vertex arrangement of V_0 permits a regular 1D pooling of x ∈ R^{12} such that z = [max(x_0, x_1), max(x_4, x_5, x_6), max(x_8, x_9, x_{10})] ∈ R^3, where the fake signal components x_2, x_3, x_7, x_{11} are set to a neutral value (see the sketch after this list).

The full GCN pipeline

(Figure: the overall GCN pipeline.)

Code location: given as an image in the original post.

MNIST.py

    # -*- coding: utf-8 -*-
    
    import sys, os
    
    sys.path.insert(0, '..')
    from lib import models, graph, coarsening, utils
    import tensorflow.compat.v1 as tf
    import numpy as np
    import time
    
    # %matplotlib inline
    flags = tf.app.flags
    FLAGS = flags.FLAGS
    flags.DEFINE_integer('number_edges', 8, 'Graph: minimum number of edges per vertex.')
    flags.DEFINE_string('metric', 'euclidean', 'Graph: similarity measure (between features).')  # similarity metric
    flags.DEFINE_bool('normalized_laplacian', True, 'Graph Laplacian: normalized.')
    flags.DEFINE_integer('coarsening_levels', 4, 'Number of coarsened graphs.')
    flags.DEFINE_string('dir_data', os.path.join('..', 'data', 'mnist'), 'Directory to store data.')
    
    """# Feature graph 图结构描述,即准备邻接矩阵A的拉普拉斯L"""
    
    
    def grid_graph(m, corners=False):
        z = graph.grid(m)  # generate the coordinates of every vertex of the 28*28 grid
        dist, idx = graph.distance_sklearn_metrics(z, k=FLAGS.number_edges, metric=FLAGS.metric)  # k-nearest-neighbour distances per vertex
        A = graph.adjacency(dist, idx)  # build the adjacency matrix A of the graph

        # Connections are only vertical or horizontal on the grid.
        # Corner vertices are connected to 2 neighbors only.
        if corners:
            import scipy.sparse
            A = A.toarray()
            A[A < A.max() / 1.5] = 0
            A = scipy.sparse.csr_matrix(A)
            print('{} edges'.format(A.nnz))
        # A.nnz is the number of non-zero entries of the matrix.
        print("{} > {} edges".format(A.nnz // 2, FLAGS.number_edges * m ** 2 // 2))
        return A
    
    
    t_start = time.process_time()
    A = grid_graph(28, corners=False)  # adjacency matrix
    A = graph.replace_random_edges(A, 0)  # adjacency matrix with (optionally) perturbed edges
    # Graph coarsening.
    graphs, perm = coarsening.coarsen(A, levels=FLAGS.coarsening_levels, self_connections=False)  # coarsened graphs
    # graphs is a list: one adjacency matrix (vertex indices + distance weights) per coarsening level.
    # perm is a list of vertex indices of the finest level; the image nodes must be reordered
    # according to it so that they match the multi-level coarsened graphs.
    # print('type of graphs', type(graphs), '\nlen of graphs', len(graphs), '\ngraphs\n', graphs, '\ngraphs[0]', graphs[0])
    # print('type of perm', type(perm), '\nlen of perm', len(perm), '\nperm\n', perm)
    L = [graph.laplacian(A, normalized=True) for A in graphs]  # Laplacian of the adjacency matrix at every coarsening level
    print('Execution time: {:.2f}s'.format(time.process_time() - t_start))
    # graph.plot_spectrum(L)
    del A
    
    """# Data 准备"""
    from tensorflow.examples.tutorials.mnist import input_data
    
    mnist = input_data.read_data_sets(FLAGS.dir_data, one_hot=False)
    #
    train_data = mnist.train.images.astype(np.float32)
    # print('train_data shape:', train_data.shape)    # (55000, 784)
    val_data = mnist.validation.images.astype(np.float32)
    # print('val_data shape:', val_data.shape)    # (5000, 784)
    test_data = mnist.test.images.astype(np.float32)
    # print('test_data shape:', test_data.shape)  # (10000, 784)
    train_labels = mnist.train.labels
    # print('train_labels shape:', train_labels.shape)    # (55000,)
    val_labels = mnist.validation.labels
    # print('val_labels shape:', val_labels.shape)    # (5000,)
    test_labels = mnist.test.labels
    # print('test_labels shape:', test_labels.shape)  # (10000,)

    t_start = time.process_time()
    train_data = coarsening.perm_data(train_data, perm)     # reorder the image pixels to match the vertex ordering of the coarsened graphs
    val_data = coarsening.perm_data(val_data, perm)
    test_data = coarsening.perm_data(test_data, perm)
    print('Execution time: {:.2f}s'.format(time.process_time() - t_start))
    del perm
    
    """# Neural networks"""
    
    common = {}
    common['dir_name'] = 'mnist/'
    common['num_epochs'] = 20
    common['batch_size'] = 100
    common['decay_steps'] = mnist.train.num_examples / common['batch_size']
    common['eval_frequency'] = 30 * common['num_epochs']
    common['brelu'] = 'b1relu'
    common['pool'] = 'mpool1'
    C = max(mnist.train.labels) + 1  # number of classes
    model_perf = utils.model_perf()  # helper that collects and compares model performance
    
    # Parameters for LeNet5-like networks.
    common['regularization'] = 5e-4
    common['dropout'] = 0.5
    common['learning_rate'] = 0.02  # 0.03 in the paper but sgconv_sgconv_fc_softmax has difficulty to converge
    common['decay_rate'] = 0.95
    common['momentum'] = 0.9
    common['F'] = [32, 64]  # number of output features of each convolutional layer
    common['K'] = [25, 25]  # number of polynomial terms (filter support) of each convolution
    common['p'] = [4, 4]  # pooling size after each convolution (one pooling per convolution)
    common['M'] = [512, C]  # output sizes of the fully connected layers
    
    # Architecture of TF MNIST conv model (LeNet-5-like).
    
    if True:
        name = 'Chebyshev_test'  # 'Chebyshev'
        params = common.copy()
        params['dir_name'] += name
        params['filter'] = 'chebyshev5'  # filter built with chebyshev5
        model_perf.test(models.cgcnn(L, **params), name, params,
                        train_data, train_labels, val_data, val_labels, test_data, test_labels)

    model_perf.show()
    print('end')
    model = models.cgcnn(L, **params)
    accuracy, loss, t_step = model.fit(train_data, train_labels, val_data, val_labels)
    '''
    # or: use the two lines below to train directly, without the model comparison report.
    model = models.cgcnn(L, **params)
    accuracy, loss, t_step = model.fit(train_data, train_labels, val_data, val_labels)
    '''
    
    
    

graph.py (partially annotated)

    import sklearn.metrics
    import sklearn.neighbors
    import matplotlib.pyplot as plt
    import scipy.sparse
    import scipy.sparse.linalg
    import scipy.spatial.distance
    import numpy as np
    
    
    def grid(m, dtype=np.float32):
        """Return the embedding of a grid graph."""
        # m = 28 when called from grid_graph
        M = m**2
        x = np.linspace(0, 1, m, dtype=dtype)
        # x has shape (28,): values evenly spaced in [0, 1]
        y = np.linspace(0, 1, m, dtype=dtype)
        # y has shape (28,): values evenly spaced in [0, 1]
        xx, yy = np.meshgrid(x, y)
        # xx, yy have shape (28, 28): the grid of 784 points
        z = np.empty((M, 2), dtype)
        z[:, 0] = xx.reshape(M)
        z[:, 1] = yy.reshape(M)
        # z has shape (784, 2): the flattened grid, one (x, y) coordinate pair per point
        return z
    
    
    def distance_scipy_spatial(z, k=4, metric='euclidean'):
        """Compute exact pairwise distances."""
        d = scipy.spatial.distance.pdist(z, metric)
        d = scipy.spatial.distance.squareform(d)
        # k-NN graph.
        # argsort() sorts the elements in ascending order and returns their indices.
        idx = np.argsort(d)[:, 1:k+1]
        d.sort()
        d = d[:, 1:k+1]
        return d, idx


    # k-nearest-neighbour computation for every vertex.
    def distance_sklearn_metrics(z, k=4, metric='euclidean'):
        """Compute exact pairwise distances."""
        # z has shape (784, 2): the coordinates of the 784 grid points.
        d = sklearn.metrics.pairwise.pairwise_distances(
                z, metric=metric, n_jobs=-2)
        # d has shape (784, 784): symmetric matrix of pairwise distances with zero diagonal.
        # k-NN graph.
        idx = np.argsort(d)[:, 1:k+1]   # indices of the k nearest neighbours; column 0 (the vertex itself) is skipped
        d.sort()
        d = d[:, 1:k+1]
        # d: (784, 8) distances to the k nearest neighbours of every vertex
        # idx: (784, 8) indices of the k nearest neighbours of every vertex
        return d, idx


    def distance_lshforest(z, k=4, metric='cosine'):
        """Return an approximation of the k-nearest cosine distances."""
        # Note: sklearn.neighbors.LSHForest was removed in recent scikit-learn versions.
        assert metric == 'cosine'
        lshf = sklearn.neighbors.LSHForest()
        lshf.fit(z)
        dist, idx = lshf.kneighbors(z, n_neighbors=k+1)
        assert dist.min() < 1e-10
        dist[dist < 0] = 0
        return dist, idx
    
    # TODO: other ANNs s.a. NMSLIB, EFANNA, FLANN, Annoy, sklearn neighbors, PANN
    
    
    # Build the adjacency matrix from dist (784 x 8 matrix of k-NN distances)
    # and idx (784 x 8 array of k-NN indices) of every vertex.
    def adjacency(dist, idx):
        """Return the adjacency matrix of a kNN graph."""
        M, k = dist.shape
        # M = 784, k = 8
        assert (M, k) == idx.shape
        assert dist.min() >= 0

        # Weights: Gaussian kernel of the distances.
        sigma2 = np.mean(dist[:, -1])**2
        dist = np.exp(- dist**2 / sigma2)

        # Weight matrix.
        I = np.arange(0, M).repeat(k)   # each of the 784 row indices repeated k times: 6272 = 784*8 entries
        J = idx.reshape(M*k)            # 6272 = 784*8 column indices
        V = dist.reshape(M*k)           # 6272 = 784*8 edge weights
        W = scipy.sparse.coo_matrix((V, (I, J)), shape=(M, M))      # adjacency matrix built from the (I, J) index pairs and the weights V

        # No self-connections.
        W.setdiag(0)        # self-loops are ignored, so the diagonal is set to 0

        # Non-directed graph: symmetrize by keeping the larger of W[i, j] and W[j, i].
        bigger = W.T > W
        W = W - W.multiply(bigger) + W.T.multiply(bigger)

        assert W.nnz % 2 == 0
        assert np.abs(W - W.T).mean() < 1e-10
        assert type(W) is scipy.sparse.csr.csr_matrix
        # W has shape (784, 784)
        return W
    
    
    # "添加噪声的邻接矩阵"
    def replace_random_edges(A, noise_level):
    """Replace randomly chosen edges by random edges."""
    M, M = A.shape  # 784,784
    n = int(noise_level * A.nnz // 2)
    # print(n)  noise_level传入参数为0
    
    indices = np.random.permutation(A.nnz//2)[:n]   # np.random.permutation():随机排列序列。 noise_level传入参数为0 indices=[]
    rows = np.random.randint(0, M, n)   # noise_level传入参数为0 rows=[]
    cols = np.random.randint(0, M, n)   # noise_level传入参数为0 cols=[]
    vals = np.random.uniform(0, 1, n)   # noise_level传入参数为0 vals=[]
    assert len(indices) == len(rows) == len(cols) == len(vals)
    
    A_coo = scipy.sparse.triu(A, format='coo')  # scipy.sparse.triu(A, k=0, format=None) 以稀疏格式返回矩阵的上三角部分。foo代表稀疏矩阵格式
    # print('A_coo的维度',A_coo.shape)     # 784*784上三角矩阵784,783...1
    # print(A_coo)                        # 存储格式为矩阵结点索引+距离权重
    assert A_coo.nnz == A.nnz // 2
    assert A_coo.nnz >= n
    A = A.tolil()   # 对稀疏矩阵进程操作前的必要转换,提高计算效率
    
    for idx, row, col, val in zip(indices, rows, cols, vals):
        old_row = A_coo.row[idx]
        old_col = A_coo.col[idx]
    
        A[old_row, old_col] = 0
        A[old_col, old_row] = 0
        A[row, col] = 1
        A[col, row] = 1
    
    A.setdiag(0)    # 将对角矩阵置为0,不考虑自身连接
    A = A.tocsr()   # 转换数据格式
    A.eliminate_zeros()
    return A
    
    
    def laplacian(W, normalized=True):
        """Return the Laplacian of the weight matrix."""

        # Degree matrix.
        d = W.sum(axis=0)   # degree of every vertex (column sums)

        # Laplacian matrix.
        if not normalized:
            D = scipy.sparse.diags(d.A.squeeze(), 0)
            L = D - W
        else:
            d += np.spacing(np.array(0, W.dtype))
            d = 1 / np.sqrt(d)
            # print(d)
            D = scipy.sparse.diags(d.A.squeeze(), 0)    # diagonal matrix D^{-1/2} built from the vertex degrees
            I = scipy.sparse.identity(d.size, dtype=W.dtype)
            L = I - D * W * D

        # assert np.abs(L - L.T).mean() < 1e-9
        assert type(L) is scipy.sparse.csr.csr_matrix
        return L


    def lmax(L, normalized=True):
        """Upper-bound on the spectrum."""
        if normalized:
            return 2
        else:
            return scipy.sparse.linalg.eigsh(
                    L, k=1, which='LM', return_eigenvectors=False)[0]


    def fourier(L, algo='eigh', k=1):
        """Return the Fourier basis, i.e. the EVD of the Laplacian."""

        def sort(lamb, U):
            idx = lamb.argsort()
            return lamb[idx], U[:, idx]

        if algo == 'eig':
            lamb, U = np.linalg.eig(L.toarray())
            lamb, U = sort(lamb, U)
        elif algo == 'eigh':
            lamb, U = np.linalg.eigh(L.toarray())
        elif algo == 'eigs':
            lamb, U = scipy.sparse.linalg.eigs(L, k=k, which='SM')
            lamb, U = sort(lamb, U)
        elif algo == 'eigsh':
            lamb, U = scipy.sparse.linalg.eigsh(L, k=k, which='SM')

        return lamb, U
    
    
    def plot_spectrum(L, algo='eig'):
        """Plot the spectrum of a list of multi-scale Laplacians L."""
        # Algo is eig to be sure to get all eigenvalues.
        plt.figure(figsize=(17, 5))
        for i, lap in enumerate(L):
            lamb, U = fourier(lap, algo)
            step = 2**i
            x = range(step//2, L[0].shape[0], step)
            lb = 'L_{} spectrum in [{:1.2e}, {:1.2e}]'.format(i, lamb[0], lamb[-1])
            plt.plot(x, lamb, '.', label=lb)
        plt.legend(loc='best')
        plt.xlim(0, L[0].shape[0])
        plt.ylim(ymin=0)
    
    
    def lanczos(L, X, K):
        """
        Given the graph Laplacian and a data matrix, return a data matrix which can
        be multiplied by the filter coefficients to filter X using the Lanczos
        polynomial approximation.
        """
        M, N = X.shape
        assert L.dtype == X.dtype

        def basis(L, X, K):
            """
            Lanczos algorithm which computes the orthogonal matrix V and the
            tri-diagonal matrix H.
            """
            a = np.empty((K, N), L.dtype)
            b = np.zeros((K, N), L.dtype)
            V = np.empty((K, M, N), L.dtype)
            V[0, ...] = X / np.linalg.norm(X, axis=0)
            for k in range(K-1):
                W = L.dot(V[k, ...])
                a[k, :] = np.sum(W * V[k, ...], axis=0)
                W = W - a[k, :] * V[k, ...] - (
                        b[k, :] * V[k-1, ...] if k > 0 else 0)
                b[k+1, :] = np.linalg.norm(W, axis=0)
                V[k+1, ...] = W / b[k+1, :]
            a[K-1, :] = np.sum(L.dot(V[K-1, ...]) * V[K-1, ...], axis=0)
            return V, a, b

        def diag_H(a, b, K):
            """Diagonalize the tri-diagonal H matrix."""
            H = np.zeros((K*K, N), a.dtype)
            H[:K**2:K+1, :] = a
            H[1:(K-1)*K:K+1, :] = b[1:, :]
            H.shape = (K, K, N)
            Q = np.linalg.eigh(H.T, UPLO='L')[1]
            Q = np.swapaxes(Q, 1, 2).T
            return Q

        V, a, b = basis(L, X, K)
        Q = diag_H(a, b, K)
        Xt = np.empty((K, M, N), L.dtype)
        for n in range(N):
            Xt[..., n] = Q[..., n].T.dot(V[..., n])
        Xt *= Q[0, :, np.newaxis, :]
        Xt *= np.linalg.norm(X, axis=0)
        return Xt  # Q[0, ...]
    
    
    def rescale_L(L, lmax=2):
        """Rescale the Laplacian eigenvalues in [-1,1]."""
        M, M = L.shape
        I = scipy.sparse.identity(M, format='csr', dtype=L.dtype)
        L /= lmax / 2
        L -= I
        return L


    def chebyshev(L, X, K):
        """Return T_k X where T_k are the Chebyshev polynomials of order up to K.
        Complexity is O(KMN)."""
        M, N = X.shape
        assert L.dtype == X.dtype

        # L = rescale_L(L, lmax)
        # Xt = T @ X: MxM @ MxN.
        Xt = np.empty((K, M, N), L.dtype)
        # Xt_0 = T_0 X = I X = X.
        Xt[0, ...] = X
        # Xt_1 = T_1 X = L X.
        if K > 1:
            Xt[1, ...] = L.dot(X)
        # Xt_k = 2 L Xt_k-1 - Xt_k-2.
        for k in range(2, K):
            Xt[k, ...] = 2 * L.dot(Xt[k-1, ...]) - Xt[k-2, ...]
        return Xt
    
    
    
    

coarsening.py

    import numpy as np
    import scipy.sparse
    
    
    def coarsen(A, levels, self_connections=False):
        """
        Coarsen a graph, represented by its adjacency matrix A, at multiple
        levels.
        """
        # Coarsening: cluster vertices level by level.
        graphs, parents = metis(A, levels)
        # Compute the permutations that turn the clustering into a binary tree.
        perms = compute_perm(parents)

        for i, A in enumerate(graphs):
            M, M = A.shape

            if not self_connections:
                A = A.tocoo()
                A.setdiag(0)

            if i < levels:
                A = perm_adjacency(A, perms[i])

            A = A.tocsr()
            A.eliminate_zeros()
            graphs[i] = A

            Mnew, Mnew = A.shape
            print('Layer {0}: M_{0} = |V| = {1} nodes ({2} added),'
                  '|E| = {3} edges'.format(i, Mnew, Mnew-M, A.nnz//2))

        return graphs, perms[0] if levels > 0 else None
    
    
    def metis(W, levels, rid=None):
        """
        Coarsen a graph multiple times using the METIS algorithm.

        INPUT
        W: symmetric sparse weight (adjacency) matrix
        levels: the number of coarsened graphs

        OUTPUT
        graph[0]: original graph of size N_1
        graph[1]: coarser graph of size N_2 < N_1
        graph[levels]: coarsest graph of size N_levels < ... < N_2 < N_1
        parents[i] is a vector of size N_i with entries ranging from 1 to N_{i+1}
            which indicate the parents in the coarser graph[i+1]
        nd_sz{i} is a vector of size N_i that contains the size of the supernode in the graph{i}

        NOTE
        if "graph" is a list of length k, then "parents" will be a list of length k-1
        """

        N, N = W.shape
        if rid is None:
            rid = np.random.permutation(range(N))
        parents = []
        # Degree matrix
        degree = W.sum(axis=0) - W.diagonal()
        graphs = []
        graphs.append(W)
        #supernode_size = np.ones(N)
        #nd_sz = [supernode_size]
        #count = 0

        #while N > maxsize:
        for _ in range(levels):

            #count += 1

            # CHOOSE THE WEIGHTS FOR THE PAIRING
            # weights = ones(N,1)       # metis weights
            weights = degree            # graclus weights
            # weights = supernode_size  # other possibility
            # squeeze() removes single-dimensional entries from the shape of the array.
            weights = np.array(weights).squeeze()

            # PAIR THE VERTICES AND CONSTRUCT THE ROOT VECTOR
            idx_row, idx_col, val = scipy.sparse.find(W)
            perm = np.argsort(idx_row)
            rr = idx_row[perm]
            cc = idx_col[perm]
            vv = val[perm]
            cluster_id = metis_one_level(rr,cc,vv,rid,weights)  # rr is ordered
            parents.append(cluster_id)

            # TO DO
            # COMPUTE THE SIZE OF THE SUPERNODES AND THEIR DEGREE
            #supernode_size = full(   sparse(cluster_id,  ones(N,1) , supernode_size )     )
            #print(cluster_id)
            #print(supernode_size)
            #nd_sz{count+1}=supernode_size;

            # COMPUTE THE EDGES WEIGHTS FOR THE NEW GRAPH
            nrr = cluster_id[rr]
            ncc = cluster_id[cc]
            nvv = vv
            Nnew = cluster_id.max() + 1
            # CSR is more appropriate: row,val pairs appear multiple times
            W = scipy.sparse.csr_matrix((nvv,(nrr,ncc)), shape=(Nnew,Nnew))
            W.eliminate_zeros()
            # Add new graph to the list of all coarsened graphs
            graphs.append(W)
            N, N = W.shape

            # COMPUTE THE DEGREE (OMIT OR NOT SELF LOOPS)
            degree = W.sum(axis=0)
            #degree = W.sum(axis=0) - W.diagonal()

            # CHOOSE THE ORDER IN WHICH VERTICES WILL BE VISITED AT THE NEXT PASS
            #[~, rid]=sort(ss);     # arthur strategy
            #[~, rid]=sort(supernode_size);    #  thomas strategy
            #rid=randperm(N);                  #  metis/graclus strategy
            ss = np.array(W.sum(axis=0)).squeeze()
            rid = np.argsort(ss)

        return graphs, parents
    
    
    # Coarsen a graph given by rr,cc,vv.  rr is assumed to be ordered
    def metis_one_level(rr,cc,vv,rid,weights):

        nnz = rr.shape[0]
        N = rr[nnz-1] + 1

        marked = np.zeros(N, bool)
        rowstart = np.zeros(N, np.int32)
        rowlength = np.zeros(N, np.int32)
        cluster_id = np.zeros(N, np.int32)

        oldval = rr[0]
        count = 0
        clustercount = 0

        # Count the number of neighbours (row length) of every vertex.
        for ii in range(nnz):
            rowlength[count] = rowlength[count] + 1
            if rr[ii] > oldval:
                oldval = rr[ii]
                rowstart[count+1] = ii
                count = count + 1

        # rid is a random permutation that shuffles the order in which vertices are visited.
        for ii in range(N):
            tid = rid[ii]
            # Following the Graclus algorithm, marked flags vertices that are already matched;
            # an unmarked neighbour y of x is chosen for the pairing.
            if not marked[tid]:
                wmax = 0.0
                rs = rowstart[tid]
                marked[tid] = True
                bestneighbor = -1
                # rowlength[tid] is the number of neighbours of the vertex;
                # the neighbour maximizing e(x,y) * (1.0/weight(x) + 1.0/weight(y)) is selected.
                for jj in range(rowlength[tid]):
                    nid = cc[rs+jj]
                    if marked[nid]:
                        tval = 0.0
                    else:
                        tval = vv[rs+jj] * (1.0/weights[tid] + 1.0/weights[nid])
                    if tval > wmax:
                        wmax = tval
                        bestneighbor = nid

                cluster_id[tid] = clustercount

                if bestneighbor > -1:
                    cluster_id[bestneighbor] = clustercount
                    marked[bestneighbor] = True

                clustercount += 1

        return cluster_id
    
    def compute_perm(parents):
        """
        Return a list of indices to reorder the adjacency and data matrices so
        that the union of two neighbors from layer to layer forms a binary tree.
        """

        # Order of last layer is random (chosen by the clustering algorithm).
        indices = []
        if len(parents) > 0:
            M_last = max(parents[-1]) + 1
            indices.append(list(range(M_last)))

        # Iterate over the parents from coarsest to finest (reverse order).
        for parent in parents[::-1]:
            #print('parent: {}'.format(parent))

            # Fake nodes go after real ones.
            pool_singeltons = len(parent)

            indices_layer = []
            for i in indices[-1]:
                indices_node = list(np.where(parent == i)[0])
                assert 0 <= len(indices_node) <= 2
                #print('indices_node: {}'.format(indices_node))

                # Add a node to go with a singleton.
                if len(indices_node) == 1:
                    indices_node.append(pool_singeltons)
                    pool_singeltons += 1
                    #print('new singleton: {}'.format(indices_node))
                # Add two nodes as children of a singleton in the parent.
                elif len(indices_node) == 0:
                    indices_node.append(pool_singeltons+0)
                    indices_node.append(pool_singeltons+1)
                    pool_singeltons += 2
                    #print('singleton children: {}'.format(indices_node))

                indices_layer.extend(indices_node)
            indices.append(indices_layer)

        # Sanity checks.
        for i,indices_layer in enumerate(indices):
            M = M_last*2**i
            # Reduction by 2 at each layer (binary tree).
            assert len(indices_layer) == M
            # The new ordering does not omit an index.
            assert sorted(indices_layer) == list(range(M))

        return indices[::-1]

    assert (compute_perm([np.array([4,1,1,2,2,3,0,0,3]),np.array([2,1,0,1,0])])
        == [[3,4,0,9,1,2,5,8,6,7,10,11],[2,4,1,3,0,5],[0,1,2]])
    
    def perm_data(x, indices):
        """
        Permute data matrix, i.e. exchange node ids,
        so that binary unions form the clustering tree.
        """
        if indices is None:
            return x

        N, M = x.shape
        Mnew = len(indices)
        assert Mnew >= M
        xnew = np.empty((N, Mnew))
        for i,j in enumerate(indices):
            # Existing vertex, i.e. real data.
            if j < M:
                xnew[:,i] = x[:,j]
            # Fake vertex because of singletons.
            # They will stay 0 so that max pooling chooses the singleton.
            # Or -infty ?
            else:
                xnew[:,i] = np.zeros(N)
        return xnew

    def perm_adjacency(A, indices):
        """
        Permute adjacency matrix, i.e. exchange node ids,
        so that binary unions form the clustering tree.
        """
        if indices is None:
            return A

        M, M = A.shape
        Mnew = len(indices)
        assert Mnew >= M
        A = A.tocoo()

        # Add Mnew - M isolated vertices.
        if Mnew > M:
            rows = scipy.sparse.coo_matrix((Mnew-M,    M), dtype=np.float32)
            cols = scipy.sparse.coo_matrix((Mnew, Mnew-M), dtype=np.float32)
            A = scipy.sparse.vstack([A, rows])
            A = scipy.sparse.hstack([A, cols])

        # Permute the rows and the columns.
        perm = np.argsort(indices)
        A.row = np.array(perm)[A.row]
        A.col = np.array(perm)[A.col]

        # assert np.abs(A - A.T).mean() < 1e-9
        assert type(A) is scipy.sparse.coo.coo_matrix
        return A
    
    

models.py: class cgcnn(base_model)

    def build_graph(self, M_0):
        """Build the computational graph of the model."""
        self.graph = tf.Graph()
        with self.graph.as_default():
     
            # Inputs.
            with tf.name_scope('inputs'):
                self.ph_data = tf.placeholder(tf.float32, (self.batch_size, M_0), 'data')
                self.ph_labels = tf.placeholder(tf.int32, (self.batch_size), 'labels')
                self.ph_dropout = tf.placeholder(tf.float32, (), 'dropout')
     
            # Model.
            op_logits = self.inference(self.ph_data, self.ph_dropout)  #!!!!!!#
            self.op_loss, self.op_loss_average = self.loss(op_logits, self.ph_labels, self.regularization)
            self.op_train = self.training(self.op_loss, self.learning_rate,
                    self.decay_steps, self.decay_rate, self.momentum)
            self.op_prediction = self.prediction(op_logits)
     
            # Initialize variables, i.e. weights and biases.
            self.op_init = tf.global_variables_initializer()
            
            # Summaries for TensorBoard and Save for model parameters.
            self.op_summary = tf.summary.merge_all()
            self.op_saver = tf.train.Saver(max_to_keep=5)
        
        self.graph.finalize()
    
    

Let us take _inference in cgcnn as the example to analyze (the other classes in models.py implement the other models and configurations mentioned in the paper); cgcnn is the reference model proposed by the authors.

    def _inference(self, x, dropout):
        # Graph convolutional layers.
        x = tf.expand_dims(x, 2)  # N x M x F=1
        for i in range(len(self.p)):  # one convolutional block per pooling step (pooling and convolution are paired one-to-one)
            with tf.variable_scope('conv{}'.format(i+1)):  # this scope is one graph convolution block
                with tf.name_scope('filter'):              # the graph convolutional filter
                    x = self.filter(x, self.L[i], self.F[i], self.K[i])
                with tf.name_scope('bias_relu'):           # bias and activation
                    x = self.brelu(x)
                with tf.name_scope('pooling'):             # pooling after the convolution
                    x = self.pool(x, self.p[i])
        
        # Fully connected hidden layers.
        N, M, F = x.get_shape()
        x = tf.reshape(x, [int(N), int(M*F)])  # N x M
        for i,M in enumerate(self.M[:-1]):
            with tf.variable_scope('fc{}'.format(i+1)):
                x = self.fc(x, M)
                x = tf.nn.dropout(x, dropout)
        
        # Logits linear layer, i.e. softmax without normalization.
        with tf.variable_scope('logits'):
            x = self.fc(x, self.M[-1], relu=False)
        return x
    
    

Breaking down the scope('filter') above: the filter is bound to the chosen function via getattr(), e.g. to chebyshev2(). Below we use chebyshev2() as the example to explain how the filter is written (this function is the filter itself, one of the author's reference variants; it should be functionally equivalent to chebyshev5(), except that it is implemented with NumPy while chebyshev5 uses TensorFlow, and the NumPy version is easier to read).
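The binding itself is plain attribute lookup by name. A minimal sketch of the idea (illustrative only; the class below is not the real models.py code):

    class SomeModel:
        def chebyshev2(self, x, L, Fout, K):
            ...   # NumPy-based Chebyshev filter
        def chebyshev5(self, x, L, Fout, K):
            ...   # TensorFlow-based Chebyshev filter

        def __init__(self, filter='chebyshev5'):
            # params['filter'] names the method that self.filter will point to,
            # so self.filter(x, L, F, K) dispatches to the chosen implementation.
            self.filter = getattr(self, filter)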

    def chebyshev2(self, x, L, Fout, K):
        """
        Filtering with Chebyshev interpolation
        Implementation: numpy.
        
        Data: x of size N x M x F
            N: number of signals
            M: number of vertices
            F: number of features per signal per vertex
        """
        N, M, Fin = x.get_shape()
        N, M, Fin = int(N), int(M), int(Fin)
        # Rescale Laplacian. Copy to not modify the shared L.
        L = scipy.sparse.csr_matrix(L)
        L = graph.rescale_L(L, lmax=2)
        # Transform to Chebyshev basis
        x = tf.transpose(x, perm=[1, 2, 0])  # M x Fin x N
        x = tf.reshape(x, [M, Fin*N])  # M x Fin*N: all features of all signals flattened into one axis
        def chebyshev(x):     # apply the Chebyshev recurrence to x and return T_k X (py_func runs it on a numpy array)
            return graph.chebyshev(L, x, K)
        x = tf.py_func(chebyshev, [x], [tf.float32])[0]  # K x M x Fin*N
        x = tf.reshape(x, [K, M, Fin, N])  # K x M x Fin x N            # reshape back
        x = tf.transpose(x, perm=[3,1,2,0])  # N x M x Fin x K
        x = tf.reshape(x, [N*M, Fin*K])  # N*M x Fin*K
        # Filter: Fin*Fout filters of order K, i.e. one filterbank per feature.
        W = self._weight_variable([Fin*K, Fout], regularization=False)  # weight matrix for the multiplication below
        x = tf.matmul(x, W)  # N*M x Fout
        return tf.reshape(x, [N, M, Fout])  # N x M x Fout
    
    

As the code shows, the Laplacian L is first converted to a sparse matrix and rescaled, and x is reshaped; graph.chebyshev() then applies the iterative Chebyshev computation to obtain T_k, i.e. the paper uses the Chebyshev polynomials T_k (left formula in the figure below) to approximate diag(λ^k) (right formula). Also note that the paper's formula contains a sum over k, while the code has no explicit sum: the weight matrix is enlarged instead (W used to have shape [Fin, Fout] and now has shape [Fin*K, Fout]), so the sum is absorbed into a single product x·W with shape(W) = [Fin*K, Fout].

(Figure: the two formulas referred to above.)
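A small NumPy check of this reshaping trick (purely illustrative, with made-up shapes): summing the θ_k-weighted terms over k and over input features gives the same result as stacking the K terms along the feature axis and multiplying by one [Fin*K, Fout] weight matrix.

    import numpy as np

    rng = np.random.default_rng(0)
    N, M, Fin, Fout, K = 2, 5, 3, 4, 3          # hypothetical sizes
    Xt = rng.normal(size=(K, M, Fin, N))         # T_k(L) x for k = 0..K-1
    W = rng.normal(size=(Fin, K, Fout))          # one filter per (input feature, order)

    # Explicit sum over k and input features.
    y_sum = np.einsum('kmfn,fko->nmo', Xt, W)    # N x M x Fout

    # Same result as a single matmul after flattening (Fin, K) into one axis.
    Xt_flat = Xt.transpose(3, 1, 2, 0).reshape(N * M, Fin * K)   # N*M x Fin*K
    W_flat = W.reshape(Fin * K, Fout)
    y_matmul = (Xt_flat @ W_flat).reshape(N, M, Fout)

    print(np.allclose(y_sum, y_matmul))          # True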

    def chebyshev(L, X, K):
        """Return T_k X where T_k are the Chebyshev polynomials of order up to K.
        Complexity is O(KMN)."""
        # Copied from graph.py.
        M, N = X.shape
        assert L.dtype == X.dtype

        # L = rescale_L(L, lmax)
        # Xt = T @ X: MxM @ MxN.
        Xt = np.empty((K, M, N), L.dtype)
        # Xt_0 = T_0 X = I X = X.
        Xt[0, ...] = X
        # Xt_1 = T_1 X = L X.
        if K > 1:
            Xt[1, ...] = L.dot(X)
        # Xt_k = 2 L Xt_k-1 - Xt_k-2.
        for k in range(2, K):
            Xt[k, ...] = 2 * L.dot(Xt[k-1, ...]) - Xt[k-2, ...]
        return Xt
    
    

The chebyshev function above is written directly from the three-term recurrence of the Chebyshev polynomials. Once T_k x is obtained and suitably reshaped, it is multiplied by the learnable parameters θ and the result is returned; that completes one call to filter(). (A quick check of the recurrence is sketched below.)
The pooling operation itself (a plain 1D max/average pooling) is not the authors' contribution; what the method contributes is the coarsening and reordering that reorganize the graph signal into a regular 1D signal, while the Laplacians of the coarsened graphs are the ones used in the convolution layers.
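As a quick sanity check on that recurrence (a standalone scalar example, not part of the original code), T_k built by T_k(x) = 2x·T_{k-1}(x) − T_{k-2}(x) matches the closed form T_k(x) = cos(k·arccos(x)) on [-1, 1], which is also why rescale_L maps the Laplacian's spectrum into [-1, 1]:

    import numpy as np

    x = np.linspace(-1, 1, 5)
    T = [np.ones_like(x), x]                     # T_0 = 1, T_1 = x
    for k in range(2, 6):
        T.append(2 * x * T[-1] - T[-2])          # three-term recurrence

    closed = [np.cos(k * np.arccos(x)) for k in range(6)]
    print(all(np.allclose(a, b) for a, b in zip(T, closed)))   # True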

    def mpool1(self, x, p):
        """Max pooling of size p. Should be a power of 2."""
        if p > 1:
            x = tf.expand_dims(x, 3)  # N x M x F x 1
            x = tf.nn.max_pool(x, ksize=[1,p,1,1], strides=[1,p,1,1], padding='SAME')
            #tf.maximum
            return tf.squeeze(x, [3])  # N x M/p x F
        else:
            return x
    
    def apool1(self, x, p):
        """Average pooling of size p. Should be a power of 2."""
        if p > 1:
            x = tf.expand_dims(x, 3)  # N x M x F x 1
            x = tf.nn.avg_pool(x, ksize=[1,p,1,1], strides=[1,p,1,1], padding='SAME')
            return tf.squeeze(x, [3])  # N x M/p x F
        else:
            return x
    
    
