
Neural Network from Scratch in Cangjie: Part 1


This tutorial series reimplements the artificial neural network from the textbook Neural Networks from Scratch by Harrison Kinsley (nnfs.io) in the Cangjie programming language instead of Python. It begins with the code for a single neuron and progressively builds layers of neurons, adds batching, and implements activation and loss functions step by step, explaining the backpropagation algorithm with detailed examples at each incremental implementation step.

Introduction

Many readers are probably already acquainted with the concept behind neural networks, commonly referred to as NN. As an illustration, below is a typical neural network architecture for binary classification tasks. This network comprises five distinct layers:

  • The first layer is the input layer, with 10 units; its dimensionality corresponds to your dataset (your X values).
  • Each subsequent layer processes the information further.
  • The final layer's output represents the classification outcome.

Each input corresponds to a data point. For example, in the context of Twitter bots, a set of features might include metrics like followers, friends, and active time measured in hours. Together, these indicators help determine whether an account belongs to a legitimate user or is automated: your y value, the predicted outcome.
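To make this concrete, a single account could be encoded as a vector of numbers; the values below are invented purely for illustration:

 main() {
     // Hypothetical feature vector for one Twitter account:
     // [followers, friends, active hours]
     let x: Array<Float64> = [1500.0, 300.0, 16.0]

     // The label y: 1.0 for a legitimate user, 0.0 for a bot (the prediction target)
     let y: Float64 = 1.0

     println(x)
     println(y)
 }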

Source Link: Bot or Not: A Comprehensive End-to-End Data Analysis in Python

It's crucial to observe that both X and the input layer's values must either be numeric or representable as such. For instance, as depicted above, the diversity feature refers to lexical diversity, the ratio of unique tokens to total tokens within a document, rather than to the actual unique words.
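As a small sketch of that idea, lexical diversity is just unique tokens divided by total tokens; the token list below is invented, and the uniqueness check uses nothing beyond plain loops:

 main() {
     // A naive token list standing in for a tokenized document
     let tokens: Array<String> = ["good", "morning", "good", "night", "good"]

     // Count tokens that have not appeared earlier in the array
     var unique: Int64 = 0
     for (i in 0..tokens.size) {
         var seen = false
         for (j in 0..i) {
             if (tokens[j] == tokens[i]) {
                 seen = true
             }
         }
         if (!seen) {
             unique += 1
         }
     }

     // Lexical diversity: unique tokens / total tokens (3 / 5 = 0.6 here)
     let diversity: Float64 = Float64(unique) / Float64(tokens.size)
     println(diversity)
 }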

Layers 2, 3, and 4, each of size 16, are dense hidden layers. These layers form the core architecture of the model. A neural network consists of interconnected layers of neurons that act as sieves or classifiers: input data is filtered through successive stages to produce an output. Each neuron is connected to the neurons in the preceding and succeeding layers.

The network initializes each neuron with random weights (typically within -1 to 1) and biases (set to zero). Input first propagates through the network in the "forward pass", which at this initial state produces essentially random output. The network then enters the "backward pass", or "backpropagation", where it iteratively adjusts each weight and bias to better fit the data and learn its underlying patterns. This adjustment is driven by an "optimizer", an algorithm that minimizes prediction error by refining these parameters over successive iterations.
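Here is a minimal sketch of that initialization step, assuming Cangjie's std.random package with a Random class whose nextFloat64() returns a value in [0, 1):

 import std.collection.*
 import std.random.*

 main() {
     let rng = Random()

     // Weights: one random value in [-1, 1) per input, for a neuron with 4 inputs
     // (nextFloat64() in [0, 1) is assumed here; adjust if your toolchain differs)
     let weights = ArrayList<Float64>([])
     for (_ in 0..4) {
         weights.append(rng.nextFloat64() * 2.0 - 1.0)
     }

     // The bias starts at zero; the backward pass will adjust both later
     let bias: Float64 = 0.0

     println(weights)
     println(bias)
 }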

Upon completion of training, we arrive at the output layer. In this network it contains two neurons whose states represent binary values: True/False, 1/0, or presence/absence. Multi-class classification models can accommodate a variety of output possibilities (for example, cats, dogs, and birds), and regression models can even produce continuous values such as a price. The functionality and configuration of this layer are determined by the y variable and the choice of activation function; both will be elaborated upon later in this tutorial series.

Coding a single neuron

Let's start simple. Each neuron consists of three elements: weights, a bias, and an output, which becomes the input for the next neuron in the sequence. Activation functions operate on the output but are not, strictly speaking, part of the neuron. Imagine we have 4 inputs; these are either your X values or the outputs of a previous layer of neurons. As a reminder, weights are initialized randomly and the bias is set to 0; these are the parameters that get re-adjusted during the backward pass, when the model actually learns from the data. The neuron below has already been adjusted by an optimizer.

What would be the output of this neuron?

 main() {
     let inputs: Array<Float64> = [1.0, 2.0, 3.0, 2.5]
     let weights: Array<Float64> = [0.2, 0.8, -0.5, 1.0]
     let bias: Float64 = 2.0

     let output: Float64 = inputs[0] * weights[0] +
         inputs[1] * weights[1] +
         inputs[2] * weights[2] +
         inputs[3] * weights[3] +
         bias

     println(output)
 }

 >>> 4.800000
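Concretely, the dot product plus bias expands to:

 output = 1.0 * 0.2 + 2.0 * 0.8 + 3.0 * (-0.5) + 2.5 * 1.0 + 2.0
        = 0.2 + 1.6 - 1.5 + 2.5 + 2.0
        = 4.8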

We calculate an output value of 4.8. The output equals the sum of products between each input and its corresponding weight (referred to as a dot product) plus a unique bias assigned to this neuron. Well done for implementing the fundamental building block of our neural network in Cangjie! Now, what about creating an entire layer of neurons?

Coding a layer of neurons

Representing a layer of neurons is no different from representing a single neuron. In this case, rather than one set of weights and one bias, we have three sets of weights and three biases, one per neuron.

 main() {
     let inputs: Array<Float64> = [1.0, 2.0, 3.0, 2.5]

     let weights1: Array<Float64> = [0.2, 0.8, -0.5, 1.0]
     let weights2: Array<Float64> = [0.5, -0.91, 0.26, -0.5]
     let weights3: Array<Float64> = [-0.26, -0.27, 0.17, 0.87]

     let bias1: Float64 = 2.0
     let bias2: Float64 = 3.0
     let bias3: Float64 = 0.5

     let outputs: Array<Float64> = [
         inputs[0] * weights1[0] + inputs[1] * weights1[1] + inputs[2] * weights1[2] + inputs[3] * weights1[3] + bias1,
         inputs[0] * weights2[0] + inputs[1] * weights2[1] + inputs[2] * weights2[2] + inputs[3] * weights2[3] + bias2,
         inputs[0] * weights3[0] + inputs[1] * weights3[1] + inputs[2] * weights3[2] + inputs[3] * weights3[3] + bias3
     ]

     println(outputs)
 }

 >>> [4.800000, 1.210000, 2.385000]

The picture should now be clearer: the output of one layer serves as the input to the next layer, and so on.
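To make that chaining concrete, here is a minimal sketch feeding the three outputs above into a hypothetical second layer of two neurons; the second layer's weights and biases are invented for illustration:

 main() {
     // Outputs of the first layer, as computed above
     let layer1Outputs: Array<Float64> = [4.8, 1.21, 2.385]

     // A hypothetical second layer: 2 neurons, each with 3 weights (one per input)
     let weights1: Array<Float64> = [0.1, -0.14, 0.5]
     let weights2: Array<Float64> = [-0.5, 0.12, -0.33]
     let bias1: Float64 = -1.0
     let bias2: Float64 = 2.0

     let layer2Outputs: Array<Float64> = [
         layer1Outputs[0] * weights1[0] + layer1Outputs[1] * weights1[1] + layer1Outputs[2] * weights1[2] + bias1,
         layer1Outputs[0] * weights2[0] + layer1Outputs[1] * weights2[1] + layer1Outputs[2] * weights2[2] + bias2
     ]

     println(layer2Outputs)
 }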

The issue is that this hard-coded chain of multiplications and additions grows longer with every neuron added to the layer. We need a more effective method.

Coding a layer of neurons, a better way

 import std.collection.*

 main() {
     let inputs = ArrayList<Float64>([1.0, 2.0, 3.0, 2.5])
     let weights = ArrayList<Array<Float64>>([[0.2, 0.8, -0.5, 1.0], [0.5, -0.91, 0.26, -0.5], [-0.26, -0.27, 0.17, 0.87]])
     let biases = ArrayList<Float64>([2.0, 3.0, 0.5])

     let layerOutputs = ArrayList<Float64>([])

     for (i in 0..weights.size) {
         let neuronWeights = weights[i]
         let neuronBias = biases[i]

         var neuronOutput = 0.0
         for (j in 0..inputs.size) {
             let nInput = inputs[j]
             let weight = neuronWeights[j]

             neuronOutput += nInput * weight
         }
         neuronOutput += neuronBias

         layerOutputs.append(neuronOutput)
     }

     println(layerOutputs)
 }

 >>> [4.800000, 1.210000, 2.385000]

Here, the weights are represented as a multi-dimensional array, with one sub-array per neuron. The biases form a separate array, with one value per neuron.

Note the shift in variable declaration. In this updated version of the code, we use the ArrayList class from the collection standard library, which provides a data structure that supports appending values, akin to Python lists; the standard Array class does not. ArrayList is a distinct type rather than an Array: we construct one by passing an array literal (here, our nested array) to its constructor.
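As a minimal illustration of the difference, using only operations we have already seen:

 import std.collection.*

 main() {
     // Array has a fixed length once created
     let fixed: Array<Float64> = [1.0, 2.0, 3.0]

     // ArrayList is growable and can be built from an array literal
     let growable = ArrayList<Float64>([1.0, 2.0, 3.0])
     growable.append(4.0)

     println(fixed.size)    // 3
     println(growable.size) // 4
 }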

The code has become slightly more intricate. How can we pair each input with its corresponding weight, add the bias, and repeat this operation for every neuron? Imagine writing this process out by hand 100 times!

Regrettably, Cangjie lacks the practical zip function found in Python, which enables iterating through multiple arrays simultaneously and processing their elements in a single pass. However, there is an alternative approach!

Correction: I had made a mistake earlier; Cangjie does provide a zip function. See the blog post "How to use Cangjie's filter, contain, forEach, map, zip, enumerate, and reduce Array functions" for how to employ it.

First, we establish a range from the element count of the longest array: weights, with a length of 3, corresponding to our 3 neurons. We then access the elements of the weights and biases arrays by their index i, which lets us systematically visit every neuron's parameters.

Then, in a nested loop, we pair each input value with its corresponding weight and accumulate their products, computing the dot product.

After computing the first neuron, we append its result to the layer output, and then repeat the same operation for each of the remaining two neurons.

This second version of the code lets the layer scale to hundreds of neurons. And since weights are initialized randomly, we will delegate that task to a function in a later tutorial as well.
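As a hedged preview of that refactoring, here is one possible shape for a reusable layer function built from the same loop; the name denseLayer is ours, not part of any library:

 import std.collection.*

 // Hypothetical helper: computes the outputs of one dense layer, given the
 // inputs, one row of weights per neuron, and one bias per neuron
 func denseLayer(inputs: Array<Float64>, weights: Array<Array<Float64>>, biases: Array<Float64>): ArrayList<Float64> {
     let outputs = ArrayList<Float64>([])
     for (i in 0..weights.size) {
         var neuronOutput = 0.0
         for (j in 0..inputs.size) {
             neuronOutput += inputs[j] * weights[i][j]
         }
         neuronOutput += biases[i]
         outputs.append(neuronOutput)
     }
     return outputs
 }

 main() {
     let inputs: Array<Float64> = [1.0, 2.0, 3.0, 2.5]
     let weights: Array<Array<Float64>> = [[0.2, 0.8, -0.5, 1.0], [0.5, -0.91, 0.26, -0.5], [-0.26, -0.27, 0.17, 0.87]]
     let biases: Array<Float64> = [2.0, 3.0, 0.5]

     println(denseLayer(inputs, weights, biases))
 }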

The upcoming tutorial will introduce the matrix4cj package along with batching: the model tends to perform better when it processes multiple samples per batch rather than one sample at a time.
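As a rough preview of the idea (the matrix4cj specifics are left for the next tutorial), a batch is simply several input vectors pushed through the same layer loop; the sample values below are invented:

 import std.collection.*

 main() {
     // A batch of 3 samples, each with 4 features (values invented for illustration)
     let batch: Array<Array<Float64>> = [
         [1.0, 2.0, 3.0, 2.5],
         [2.0, 5.0, -1.0, 2.0],
         [-1.5, 2.7, 3.3, -0.8]
     ]
     let weights: Array<Array<Float64>> = [[0.2, 0.8, -0.5, 1.0], [0.5, -0.91, 0.26, -0.5], [-0.26, -0.27, 0.17, 0.87]]
     let biases: Array<Float64> = [2.0, 3.0, 0.5]

     // Run every sample through the same layer computation as before
     for (s in 0..batch.size) {
         let outputs = ArrayList<Float64>([])
         for (i in 0..weights.size) {
             var neuronOutput = 0.0
             for (j in 0..batch[s].size) {
                 neuronOutput += batch[s][j] * weights[i][j]
             }
             outputs.append(neuronOutput + biases[i])
         }
         println(outputs)
     }
 }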
