So far we have focused our discussion on design choices for neural networks that are common to most parametric machine learning models trained with gradient-based optimization. Now we turn to an issue that is unique to feedforward neural networks: how to choose the type of hidden unit to use in the hidden layers of the model.
The design of hidden units is an extremely active area of research and does not yet have many definitive guiding theoretical principles.
Unless indicated otherwise, most hidden units can be described as accepting a vector of inputs x, computing an affine transformation z = W⊤x + b, and then applying an element-wise nonlinear function g(z).
Rectified linear units use the activation function g(z) = max{0, z}.
Rectified linear units are typically used on top of an affine transformation: h = g(W⊤x + b).
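To make this concrete, here is a minimal NumPy sketch of a rectified linear hidden layer; the names relu_hidden_layer, W, b, and x are placeholders chosen for illustration, not identifiers from the text.

```python
import numpy as np

def relu(z):
    # g(z) = max{0, z}, applied element-wise
    return np.maximum(0.0, z)

def relu_hidden_layer(x, W, b):
    # Affine transformation followed by the rectified linear activation:
    # h = g(W^T x + b)
    z = W.T @ x + b
    return relu(z)

# Example with arbitrary shapes: 4 inputs, 3 hidden units
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(4, 3))
b = np.zeros(3)
h = relu_hidden_layer(x, W, b)   # shape (3,), all entries >= 0
```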
One drawback to rectified linear units is that they cannot learn via gradient-based methods on examples for which their activation is zero.
Several generalizations address this by using a nonzero slope when the unit is inactive, such as leaky ReLUs and parametric ReLUs (sketched below). Rectified linear units and all of these generalizations are based on the principle that models are easier to optimize if their behavior is closer to linear.
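These near-linear generalizations share the common form h = max(0, z) + α·min(0, z), where α is the slope used when z is negative. The following sketch is illustrative only; the slope 0.01 is just a typical leaky-ReLU choice, and the function name is a placeholder.

```python
import numpy as np

def generalized_relu(z, alpha):
    # h_i = max(0, z_i) + alpha_i * min(0, z_i)
    # alpha = 0                    -> ordinary ReLU
    # alpha = small fixed value    -> leaky ReLU
    # alpha learned per unit       -> parametric ReLU (PReLU)
    return np.maximum(0.0, z) + alpha * np.minimum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(generalized_relu(z, alpha=0.0))    # ReLU:       [ 0.    0.    0.   1.5]
print(generalized_relu(z, alpha=0.01))   # leaky ReLU: [-0.02 -0.005 0.   1.5]
```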
Prior to the introduction of rectified linear units, most neural networks used the logistic sigmoid activation function g(z) = σ(z) = 1/(1 + exp(−z)) or the hyperbolic tangent activation function g(z) = tanh(z).
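For reference, a brief sketch of these two squashing nonlinearities; the function names are placeholders chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid: squashes its input into (0, 1).
    # Saturates (gradient near zero) for inputs of large magnitude,
    # which makes gradient-based learning of hidden units difficult.
    return 1.0 / (1.0 + np.exp(-z))

def tanh_unit(z):
    # Hyperbolic tangent: squashes into (-1, 1); tanh(z) = 2*sigmoid(2z) - 1.
    return np.tanh(z)
```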
Many other types of hidden units are possible, but are used less frequently.
In general, a wide variety of differentiable functions perform perfectly well. Many unpublished activation functions perform just as well as the popular ones.
During research and development of new techniques, it is common to test many different activation functions and find that several variations on standard practice perform comparably. For this reason, new hidden unit types are usually published only when they are clearly demonstrated to provide a significant improvement.
A few other reasonably common hidden unit types include radial basis function (RBF) units, the softplus function g(z) = log(1 + exp(z)), and the hard tanh, which clips its input to the interval [−1, 1]; a short sketch of these follows below.
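Here is a hedged sketch of these units under the definitions given above; the function names and the bandwidth parameter sigma are placeholders for illustration.

```python
import numpy as np

def softplus(z):
    # Smooth approximation of the rectifier: log(1 + exp(z)).
    # np.logaddexp(0, z) = log(exp(0) + exp(z)), computed stably.
    return np.logaddexp(0.0, z)

def hard_tanh(z):
    # Piecewise-linear approximation of tanh: clips z to [-1, 1].
    return np.clip(z, -1.0, 1.0)

def rbf_unit(x, w, sigma=1.0):
    # Radial basis function unit: most active when x is close to the template w,
    # decaying toward zero as x moves away.
    return np.exp(-np.sum((x - w) ** 2) / sigma ** 2)
```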