
Activation Functions in Neural Networks


This article draws inspiration from two sources: here and here.

The main purpose of an activation function is to introduce non-linearity into a neural network. In neuroscience terms, it models the firing mechanism of a single neuron.

A neural network without activation functions is essentially equivalent to a linear regression model. Neural networks are known as universal function approximators: they can compute and learn functions of widely varying complexity, so nearly any process we might want to model can be represented as a functional computation within a network.
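To make the first point concrete, here is a minimal NumPy sketch (layer sizes and random values are arbitrary, chosen only for illustration) showing that two stacked linear layers with no activation in between compute exactly the same function as a single linear layer:

```python
# Stacking linear layers *without* activation functions collapses into a
# single linear map, i.e. nothing more expressive than linear regression.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))          # a batch of 5 inputs with 4 features

W1 = rng.normal(size=(4, 8)); b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 3)); b2 = rng.normal(size=3)

# Two "layers" with no activation in between.
h = x @ W1 + b1
y = h @ W2 + b2

# The same result from a single equivalent linear layer.
W = W1 @ W2
b = b1 @ W2 + b2
y_single = x @ W + b

print(np.allclose(y, y_single))      # True: the extra depth added no expressive power
```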

- Activation functions must be differentiable so that gradient descent can be applied.

- Some common activation functions (sketched in code after the Maxout discussion below):

  1. Sigmoid (largely deprecated). Output in [0, 1], which is not zero-centered and hinders optimization. It suffers from the vanishing gradient problem: gradients get smaller and smaller during back-propagation, so early layers cannot learn.
  2. Tanh (largely deprecated). Output in [-1, 1]. Generally preferable to the sigmoid, but it suffers from the same vanishing gradient problem.
  3. ReLU (Rectified Linear Unit). f(x) = max(0, x). A very simple and effective remedy for the vanishing gradient problem. It should not be used in the output layer (use a linear unit or softmax there), and it can produce dead neurons (when x < 0, the neuron never activates). Leaky ReLU or Maxout can be used to address this problem.
    1. Leaky ReLU: introduces a small slope for negative inputs to keep the updates alive.
    2. Benefits of ReLU: cheap to compute, converges faster, and can output a true zero, which allows representational sparsity.
  4. Maxout: a generalization of the ReLU and leaky ReLU functions.

This activation function is adaptive: the shape of the non-linearity is itself learned from the data.

It is a piecewise linear function that returns the maximum of its inputs, and it was designed to be used alongside dropout regularization, where units are randomly dropped (set to zero) with a certain probability.

Both Rectified Linear Unit (ReLU) and leaky ReLU are special cases of this activation function. The Maxout neuron benefits from the advantages of ReLU units (linear regime of operation without saturation), while avoiding their drawbacks (dying ReLU).

However, this approach doubles the number of parameters per neuron, so the network has more parameters to train.
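Below is a small NumPy sketch of the activation functions discussed above (my own illustration, not code from the referenced posts); the leaky-ReLU slope of 0.01 and the choice of k = 2 maxout pieces are common defaults rather than values fixed by this article:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))        # squashes to (0, 1); saturates for large |x|

def tanh(x):
    return np.tanh(x)                      # squashes to (-1, 1); zero-centered but still saturates

def relu(x):
    return np.maximum(0.0, x)              # zero for x < 0, identity otherwise

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope keeps gradients alive for x < 0

def maxout(x, W, b):
    """Maxout: take the element-wise max over k affine pieces.

    x: (batch, d_in); W: (k, d_in, d_out); b: (k, d_out).
    With k = 2 and suitable W, b this reproduces ReLU or leaky ReLU.
    """
    z = np.einsum('bi,kio->bko', x, W) + b  # (batch, k, d_out)
    return z.max(axis=1)                    # max over the k pieces

# Toy usage: maxout with k = 2 pieces.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
W = rng.normal(size=(2, 3, 5))              # twice the parameters of one linear layer
b = np.zeros((2, 5))
print(maxout(x, W, b).shape)                # (4, 5)
```

Note how the maxout weight tensor W holds k times the parameters of a single linear layer, which is exactly the extra cost mentioned above.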

Conclusion

- Use ReLU for hidden-layer activations, but pay attention to learning rates and track dead neurons (a small monitoring sketch follows this list).
- If ReLU is causing issues, consider alternatives such as Leaky ReLU, PReLU, or Maxout; avoid using the sigmoid.
- Normalize the data to maximize validation accuracy; standardize it when expediency is key.
- Sigmoid and tanh are unsuitable for deep networks because their gradients diminish during backpropagation.
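As an example of what "tracking dead neurons" might look like in practice, here is a hypothetical helper (not from the referenced posts) that reports the fraction of ReLU units in a layer that never fire over a batch of pre-activation values:

```python
import numpy as np

def dead_relu_fraction(pre_activations):
    """pre_activations: array of shape (num_samples, num_units)."""
    ever_active = (pre_activations > 0).any(axis=0)   # unit fired at least once
    return 1.0 - ever_active.mean()

# Toy usage: column 2 is forced negative everywhere, so it counts as dead.
z = np.random.default_rng(0).normal(size=(256, 4))
z[:, 2] = -np.abs(z[:, 2])
print(dead_relu_fraction(z))                          # 0.25
```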
