AdamTechLouis's talk: Deep Learning on Graphs

Introduction

We are currently formulating a novel approach to advancing machine learning with an emphasis on the emerging data fabric paradigm.
In a past article I gave my new definition of machine learning:
Machine learning is the automated process of discovering hidden insights in the data fabric, using algorithms that can find those insights without being specifically programmed for the task, with the end goal of building models that solve a single problem or many problems.
Understanding this relies on one premise: that we have a data fabric in place. The tool I consider best suited to my needs is Anzo, which I have recommended in previous articles.

With Anzo, you can establish a system known as "The Enterprise Knowledge Graph," and naturally establish a data fabric for your organization.
Currently, my focus is on a specialized area within machine learning, specifically deep learning. In another article I gave a definition of it:
Deep learning is a dedicated subfield of machine learning, representing an innovative approach that focuses on the ability to learn hierarchical representations from data. This method particularly emphasizes the learning of successive stages or layers within neural networks, each stage contributing to increasingly meaningful representations.
Let's delve into an integration of deep learning and graph theory to explore the potential it holds for advancing our research.
Objectives
General
Set the basis of doing deep learning on the data fabric.
Specifics
Introduce the fundamentals of deep learning on graph structures.
Investigate the features and functionalities of Spektral.
Assess the feasibility of implementing deep learning techniques on a data fabric.
Main Hypothesis
If we can build a data fabric that incorporates all company-wide information, then an automated process for extracting insights through learning can be integrated into it, using neural networks (deep learning) to learn highly meaningful representations from the data.
Section 1. Deep learning on graphs?

Typically, neural networks are constructed using tensors. However, it is important to note that tensors can be represented as matrices, and graphs can be defined through these matrices.
In the documentation of the Spektral library, they mention that a graph is generally represented by three matrices:
A \in \{0,1\}^{N \times N}: a binary adjacency matrix defined such that A_{i,j} = 1 if nodes i and j are connected, and A_{i,j} = 0 otherwise;
X \in \mathbb{R}^{N \times F}: a matrix encoding node attributes (or features), where an F-dimensional attribute vector is assigned to each node;
E \in \mathbb{R}^{N \times N \times S}: a tensor encoding edge attributes, where an S-dimensional attribute vector is assigned to each edge.
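To make these definitions concrete, here is a tiny toy example in NumPy (the graph, the dimensions F and S, and all values are made up purely for illustration):

import numpy as np

N, F, S = 3, 2, 1  # 3 nodes, 2 node features, 1 edge feature

# Binary adjacency matrix A: node 0 is connected to nodes 1 and 2
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])

# Node attribute matrix X: one F-dimensional feature vector per node
X = np.array([[0.2, 0.7],
              [1.0, 0.1],
              [0.5, 0.5]])

# Edge attribute tensor E: one S-dimensional feature vector per edge
E = np.zeros((N, N, S))
E[0, 1] = E[1, 0] = [0.3]
E[0, 2] = E[2, 0] = [0.8]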
This brief overview introduces the fundamental concepts required for understanding deep learning on graphs. For a comprehensive exploration of this topic, particularly focusing on graph convolutional networks, visit the article by Tobias Skovgaard Jepsen.
The important part here is the concept of Graph Neural Networks (GNN).
Graph Neural Networks (GNN)

The fundamental concept behind GNN is straightforward: it aims to capture the inherent structural properties of the graph. Specifically, each node v_i within this framework is assigned a low-dimensional embedding vector s_i (where 1 ≤ i ≤ N). Note that vectors can be conceptualized as rank-1 tensors, while tensors themselves are often expressed using matrices.
When it comes to building and training deep graph models, the learning tasks can be broadly divided into two major categories:
- Node-focused tasks: These tasks involve individual nodes within a graph. Examples include node classification, link prediction, and node-based recommendation.
- Graph-focused tasks: These tasks revolve around the entire graph. Examples include graph classification, estimating graph properties, or generating new graph structures.
Section 2. Deep Learning with Spektral

The author of Spektral introduces it as a framework for relational representation learning, built on top of the Keras Application Programming Interface (API).
Installation
We are going to use MatrixDS as the tool for hosting and running our code. Remember that in Anzo you will be able to import this code and run it within their platform as well.
The first thing you need to do is fork the MatrixDS project: https://community.platform.matrixds.com/community/project/5c6ae7c8c1b06ba1e18f2a6e/files
Click on:

and you will have the library installed and everything working.
If you are running this outside MatrixDS, keep in mind that the setup was originally developed for Ubuntu 16.04 and 18.04, and it's advisable to install these dependencies first:
sudo apt install graphviz libgraphviz-dev libcgraph6
And then install the library with:
pip install spektral
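To check that the installation worked, here is a quick sanity check (assuming the package exposes a version attribute, which most releases do):

import spektral
print(spektral.__version__)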
Data representation
In Spektral, certain layers and functions are designed to operate on individual graphs. In contrast, other modules process collections (such as datasets or batches) of graphs.
The framework distinguishes between three main modes of operation:
In the single mode, we address the processing of a single graph, incorporating its topological structure and attributes.
For batch processing, we handle multiple graphs simultaneously, each with their unique topological features and attributes.
In the mixed mode, we process a single graph topology shared across different sets of attributes. This can be viewed as a particular case of batch mode (where all adjacency matrices are identical) but is treated separately for computational efficiency.
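As a rough sketch of how array shapes typically differ across these modes (illustrative NumPy shapes only; the authoritative conventions are in the Spektral documentation):

import numpy as np

N, F, B = 5, 3, 4  # nodes, node features, graphs in a batch (toy sizes)

# Single mode: one graph, one adjacency matrix and one attribute matrix
A_single = np.zeros((N, N))
X_single = np.zeros((N, F))

# Batch mode: B graphs, each with its own topology and attributes
A_batch = np.zeros((B, N, N))
X_batch = np.zeros((B, N, F))

# Mixed mode: one shared topology, B different sets of node attributes
A_mixed = np.zeros((N, N))
X_mixed = np.zeros((B, N, F))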

For example, if we run:
from spektral.datasets import citation
adj, node_features, edge_features, _, _, _, _, _ = citation.load_data('cora')
We will be loading the data in single mode:
Our Adjacency matrix is:
In [3]: adj.shape
Out[3]: (2708, 2708)
Our node attributes are:
In [3]: node_features.shape
Out[3]: (2708, 1433)
And our edge attributes are:
In [3]: edge_features.shape
Out[3]: (2708, 7)
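A couple of quick sanity checks on what we just loaded (assuming adj comes back as a SciPy sparse matrix and is symmetric, which is how I'd expect the loader to return the undirected Cora graph; adjust if your version differs):

num_nodes = adj.shape[0]         # 2708 papers in Cora
num_edges = int(adj.sum()) // 2  # each undirected edge counted once
print(num_nodes, num_edges)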
Semisupervised classification with Graph Attention layers (GAT)
Disclaimer: I'm assuming that you are already acquainted with Keras from this point forward.
For a full explanation of the model structure, including its architecture, components, and implementation details, please refer to the MatrixDS project; the platform gives developers an open, community-oriented environment in which to collaborate on this kind of work.
A Graph Attention Network (GAT) represents an innovative neural architecture designed to process data structured as graphs, employing masked self-attention mechanisms. Within Spektral, the GraphAttention layer performs convolutions akin to those in GraphConv layers but applies attention mechanisms to modulate the adjacency matrix rather than relying on normalized Laplacians.
By stacking such layers, nodes can attend over their neighborhoods' features, which implicitly allows them to assign different weights to different nodes within a neighborhood. This approach avoids costly matrix operations such as inversion and does not rely on prior knowledge of the graph structure.

As described in https://arxiv.org/pdf/1812.04202.pdf, the attention mechanism in the model is parameterized by a weight vector and uses a LeakyReLU activation function.
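For reference, this is the attention coefficient from the original GAT paper (Veličković et al., 2018); it describes the general mechanism rather than Spektral's exact implementation:

\alpha_{ij} = \mathrm{softmax}_j\left(\mathrm{LeakyReLU}\left(\mathbf{a}^\top [\mathbf{W} h_i \,\|\, \mathbf{W} h_j]\right)\right)

where W is the shared weight matrix applied to every node's features, a is the learnable attention weight vector mentioned above, and || denotes concatenation.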
The model we will use is simple enough:
# Imports needed by this snippet (assuming standalone Keras, which the original project uses)
from keras.layers import Input, Dropout
from keras.models import Model
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.callbacks import EarlyStopping, TensorBoard, ModelCheckpoint
from spektral.layers import GraphAttention

# Inputs: node features and adjacency matrix.
# F (number of node features), N (number of nodes), n_classes, and the hyperparameters
# (dropout_rate, gat_channels, n_attn_heads, l2_reg, learning_rate, es_patience, log_dir)
# are defined earlier in the MatrixDS project; they are referenced here for completeness.
X_in = Input(shape=(F, ))
A_in = Input(shape=(N, ))

# Layers
dropout_1 = Dropout(dropout_rate)(X_in)
graph_attention_1 = GraphAttention(gat_channels,
                                   attn_heads=n_attn_heads,
                                   attn_heads_reduction='concat',
                                   dropout_rate=dropout_rate,
                                   activation='elu',
                                   kernel_regularizer=l2(l2_reg),
                                   attn_kernel_regularizer=l2(l2_reg))([dropout_1, A_in])
dropout_2 = Dropout(dropout_rate)(graph_attention_1)
graph_attention_2 = GraphAttention(n_classes,
                                   attn_heads=1,
                                   attn_heads_reduction='average',
                                   dropout_rate=dropout_rate,
                                   activation='softmax',
                                   kernel_regularizer=l2(l2_reg),
                                   attn_kernel_regularizer=l2(l2_reg))([dropout_2, A_in])

# Build model
model = Model(inputs=[X_in, A_in], outputs=graph_attention_2)
optimizer = Adam(lr=learning_rate)
model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              weighted_metrics=['acc'])
model.summary()

# Callbacks
es_callback = EarlyStopping(monitor='val_weighted_acc', patience=es_patience)
tb_callback = TensorBoard(log_dir=log_dir, batch_size=N)
mc_callback = ModelCheckpoint(log_dir + 'best_model.h5',
                              monitor='val_weighted_acc',
                              save_best_only=True,
                              save_weights_only=True)
Btw, the model is quite big:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 1433) 0
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 1433) 0 input_1[0][0]
__________________________________________________________________________________________________
input_2 (InputLayer) (None, 2708) 0
__________________________________________________________________________________________________
graph_attention_1 (GraphAttenti (None, 64) 91904 dropout_1[0][0]
input_2[0][0]
__________________________________________________________________________________________________
dropout_18 (Dropout) (None, 64) 0 graph_attention_1[0][0]
__________________________________________________________________________________________________
graph_attention_2 (GraphAttenti (None, 7) 469 dropout_18[0][0]
input_2[0][0]
==================================================================================================
Total params: 92,373
Trainable params: 92,373
Non-trainable params: 0
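As a quick sanity check on those parameter counts, assuming (based on the (None, 64) output shape) that the first GAT layer uses 8 attention heads with 8 channels each; these hyperparameter values are my inference, not stated above:

# First GraphAttention layer: 1433 input features, 8 channels, 8 heads
F, channels, heads = 1433, 8, 8
kernels = F * channels * heads          # 91,712 feature-transform weights
attn_kernels = 2 * channels * heads     # 128 attention-vector weights (self + neighbor)
biases = channels * heads               # 64 biases
print(kernels + attn_kernels + biases)  # 91904, matching the summary

# Second GraphAttention layer: 64 inputs, 7 classes, 1 head
print(64 * 7 + 2 * 7 + 7)               # 469, matching the summary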
If you don't have substantial computational resources, it's advisable to lower the number of epochs and experiment with different configurations. Note that upgrading your MatrixDS account is a straightforward process.
Then we train it (this may take some hours if you don’t have enough power):
# Train model
validation_data = ([node_features, adj], y_val, val_mask)
model.fit([node_features, adj],
          y_train,
          sample_weight=train_mask,
          epochs=epochs,
          batch_size=N,
          validation_data=validation_data,
          shuffle=False,  # Shuffling data means shuffling the whole graph
          callbacks=[es_callback, tb_callback, mc_callback])
Get the best model:
model.load_weights(log_dir + 'best_model.h5')
And evaluate it:
print('Evaluating model.')
eval_results = model.evaluate([node_features, adj],
                              y_test,
                              sample_weight=test_mask,
                              batch_size=N)
print('Done.\n'
      'Test loss: {}\n'
      'Test accuracy: {}'.format(*eval_results))
Learn more within the MatrixDS project: https://community.platform.matrixds.com/community/project/5c6ae7c8c1b06ba1e18f2a6e/files
Section 3. Where does this fit in the data fabric?
If you remember from the last part, we have a data fabric:

An insight can be thought of as a dent in it:

If you go through this tutorial on the MatrixDS platform, you'll notice that our dataset isn't merely a straightforward CSV; instead, we supplied it with:
- a square adjacency matrix,
- an attribute matrix with D features per node, and
- a binary class matrix with E classes.
And that is stored in a series of files:
- ind.dataset_str.x => the feature vectors of the training instances, as a scipy.sparse.csr.csr_matrix object;
- ind.dataset_str.tx => the feature vectors of the test instances, as a scipy.sparse.csr.csr_matrix object;
- ind.dataset_str.allx => the feature vectors of both labeled and unlabeled training instances (a superset of ind.dataset_str.x), as a scipy.sparse.csr.csr_matrix object;
- ind.dataset_str.y => the one-hot labels of the labeled training instances, as a numpy.ndarray object;
- ind.dataset_str.ty => the one-hot labels of the test instances, as a numpy.ndarray object;
- ind.dataset_str.ally => the labels for instances in ind.dataset_str.allx, as a numpy.ndarray object;
- ind.dataset_str.graph => a dict in the format {index: [index_of_neighbor_nodes]}, as a collections.defaultdict object;
- ind.dataset_str.test.index => the indices of test instances in the graph, for the inductive setting, as a list object.
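For example, here is a rough sketch of reading one of those files, assuming the standard Planetoid pickle layout used by the original GCN code (the file name and the encoding argument are assumptions; adjust them to your copy of the dataset):

import pickle

# The graph file is a pickled dict: {node_index: [indices of neighbor nodes]}
with open('ind.cora.graph', 'rb') as f:
    graph = pickle.load(f, encoding='latin1')

print(len(graph))    # number of nodes
print(graph[0][:5])  # first few neighbors of node 0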
This data lives in a graph, and we got it there by importing it into the library. In fact, the library lets you import and export data in several formats, such as NetworkX, numpy, and sdf.
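As an illustration of the NetworkX route (using NetworkX's own conversion utilities here rather than any specific Spektral helper, since the exact conversion API can vary between versions):

import networkx as nx

# Build a tiny graph and turn it into the adjacency matrix a GNN layer expects
G = nx.Graph()
G.add_edges_from([(0, 1), (0, 2), (2, 3)])
A = nx.to_numpy_array(G)  # dense N x N adjacency matrix
print(A.shape)            # (4, 4)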
This means that if we store our data in a data fabric, i.e., we have a knowledge graph, we already have many of these features in place. What we need to do is find a way to connect the knowledge graph to the library, and that is the tricky part.
After that, we can start extracting insights from the data fabric by running deep learning algorithms on its graph structure, working directly with its nodes and connections.
An intriguing aspect of this endeavor is that there are ways to run these algorithms directly within the graph, which requires developing solutions tailored to data embedded in its architecture. One notable approach addressing this challenge is Lauren Shin's work with Neo4j:
Graphs and ML: Multiple Linear Regression
But it’s also an ongoing project, which can be envisioned as a stepwise progression.

Meaning that a neural network can live within the data fabric, using its resources to function.
One key point I'm not explicitly discussing here is the concept of non-Euclidean data; I will return to it in due course.
Conclusions
A feasible approach to leveraging deep learning algorithms on a data fabric involves deploying graph neural network models tailored for handling graph-structured datasets. This can be achieved by integrating a knowledge graph with libraries like Spektral, potentially alongside other frameworks. The integration hinges on successfully connecting these components within a unified system architecture.
Beyond conventional graph inference tasks such as node and graph classification, graph-based deep learning methods have been applied across numerous disciplines, including modeling social influence, recommendation systems, chemistry, physics, disease and drug prediction, natural language processing (NLP), computer vision, traffic forecasting, program induction, and solving graph-based NP problems. For further details, please refer to https://arxiv.org/pdf/1812.04202.pdf.
Possibilities are boundless, and this marks the start of a new thrilling chapter. Keep an eye out for more developments!
