What deep learning really means

Martin Heller | Feb. 7, 2017
GPUs in the cloud put the predictive power of deep neural networks within reach of every developer

By the way, the only way to find the best algorithm is to try them all. Some ML packages and services, such as Spark.ML, can parallelize that for you and help pick the best result.

Note that neural networks are an option for any of the three kinds of prediction problems. Also note that neural networks are known both for accuracy and long training times. So what are neural networks, other than one of the more time-consuming but accurate approaches to machine learning?

Neural networks

The ideas for neural networks go back to the 1940s. The essential concept is that a network of artificial neurons built out of interconnected threshold switches can learn to recognize patterns in the same way that an animal brain and nervous system (including the retina) does.

The learning occurs basically by strengthening the connection between two neurons when both are active at the same time during training; in modern neural network software this is most commonly a matter of adjusting the weight values for the connections between neurons, up or down, using a rule called back propagation of error, backprop, or BP.

How are the neurons modeled? Each has a propagation function that transforms the outputs of the connected neurons, often with a weighted sum. The output of the propagation function passes to an activation function, which fires when its input exceeds a threshold value.
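As a rough sketch of that model, the Python below feeds a weighted-sum propagation function into a threshold activation. The function names, the example weights, and the threshold of 1.0 are illustrative assumptions, not something any particular library prescribes.

```python
def propagate(inputs, weights):
    """Propagation function: weighted sum of the upstream neurons' outputs."""
    return sum(x * w for x, w in zip(inputs, weights))


def step_activation(net_input, threshold):
    """Fire (return 1.0) only when the net input exceeds the threshold."""
    return 1.0 if net_input > threshold else 0.0


def neuron_output(inputs, weights, threshold):
    """One artificial neuron: propagation followed by activation."""
    return step_activation(propagate(inputs, weights), threshold)


# Hypothetical weights: with a threshold of 1.0 the neuron fires only
# when both inputs are active, so it behaves like a logical AND.
weights = [0.6, 0.6]
both_on = neuron_output([1.0, 1.0], weights, threshold=1.0)
one_on = neuron_output([1.0, 0.0], weights, threshold=1.0)
```

With both inputs active the weighted sum is 1.2 and the neuron fires; with only one active the sum is 0.6 and it stays silent.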

In the 1940s and '50s artificial neurons used a step activation function and were called perceptrons. Modern neural networks may reference perceptrons, but actually use continuous activation functions, such as the logistic (sigmoid) function, the hyperbolic tangent, and the Rectified Linear Unit (ReLU). ReLU is usually the best choice for fast convergence, although it has an issue of neurons "dying" (getting stuck at a zero output) during training if the learning rate is set too high.
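For reference, the three activation functions named above can be written in a few lines of plain Python. This is a minimal sketch; the helper names are my own, and `relu_gradient` is included only to hint at why a ReLU neuron can "die": once its input is negative, its gradient is zero, so gradient-based corrections stop reaching its weights.

```python
import math


def logistic(x):
    """Logistic (sigmoid) function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hyperbolic_tangent(x):
    """Hyperbolic tangent: like the logistic, but ranging from -1 to 1."""
    return math.tanh(x)


def relu(x):
    """Rectified Linear Unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def relu_gradient(x):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0
```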

The output of the activation function can pass to an output function for additional shaping. Often, however, the output function is the identity function, meaning that the output of the activation function is passed to the downstream connected neurons.

Now that we know about the neurons, we need to learn about the common neural network topologies. In a feed-forward network, the neurons are organized into distinct layers: one input layer, N hidden processing layers, and one output layer, and the outputs from each layer go only to the next layer.
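A feed-forward pass can be sketched as a loop over layers, where each layer's outputs become the next layer's inputs. The representation here (each layer as a list of weight rows, one row per neuron) and the choice of a logistic activation are assumptions made for brevity; real frameworks use matrix operations and usually add a bias term per neuron.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, layer_weights):
    """Compute one layer: layer_weights[j] holds neuron j's incoming weights."""
    return [
        logistic(sum(x * w for x, w in zip(inputs, neuron_weights)))
        for neuron_weights in layer_weights
    ]


def feed_forward(inputs, layers):
    """Pass the inputs through every layer in order, with no skips or loops."""
    activations = inputs
    for layer_weights in layers:
        activations = layer_forward(activations, layer_weights)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    [[0.5, -0.5], [0.1, 0.2], [-0.3, 0.8]],  # hidden layer
    [[1.0, -1.0, 0.5]],                      # output layer
]
prediction = feed_forward([1.0, 2.0], layers)
```

The input layer does no computation in this sketch; it is simply the list of values handed to `feed_forward`.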

In a feed-forward network with shortcut connections, some connections can jump over one or more intermediate layers. In recurrent neural networks, neurons can influence themselves, either directly, or indirectly through the next layer.
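To make direct recurrence concrete, here is a minimal sketch of a single neuron whose previous output is fed back as one of its own inputs. The scalar weights, the zero initial state, and the tanh activation are illustrative assumptions, not a description of any specific recurrent architecture.

```python
import math


def run_recurrent_neuron(sequence, input_weight, recurrent_weight):
    """Feed a sequence through one neuron that also sees its own last output."""
    state = 0.0  # assumed initial state before any input arrives
    states = []
    for x in sequence:
        state = math.tanh(input_weight * x + recurrent_weight * state)
        states.append(state)
    return states


# A single pulse followed by silence: the recurrent connection keeps an
# echo of the pulse alive in later time steps.
states = run_recurrent_neuron([1.0, 0.0, 0.0], input_weight=1.0, recurrent_weight=1.0)
```

Setting `recurrent_weight` to zero removes the feedback, and the neuron forgets the pulse as soon as the input returns to zero.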

Supervised learning of a neural network is done exactly like any other machine learning: You present the network with groups of training data, compare the network output with the desired output, generate an error vector, and apply corrections to the network based on the error vector. Groups of training data that are run together before applying corrections are called batches; one complete pass through the entire training set is called an epoch.
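The loop described above can be sketched for the simplest possible case: a single logistic neuron learning the logical OR function. In the code, a batch is the group of samples run before each correction, and an epoch is one full pass over all the samples. The correction rule is plain gradient descent with the error (output minus target) as the gradient signal, which assumes a cross-entropy loss; the bias term, learning rate, batch size, and epoch count are likewise assumptions picked for illustration, and a multilayer network would need backprop to push the error through its hidden layers.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(inputs, weights, bias):
    return logistic(sum(x * w for x, w in zip(inputs, weights)) + bias)


def train(samples, epochs=2000, batch_size=2, learning_rate=0.5):
    """Train one logistic neuron with mini-batch gradient descent."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):  # one epoch = one pass over all samples
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            weight_grads = [0.0] * len(weights)
            bias_grad = 0.0
            for inputs, target in batch:
                # Compare the network output with the desired output.
                error = predict(inputs, weights, bias) - target
                for i, x in enumerate(inputs):
                    weight_grads[i] += error * x
                bias_grad += error
            # Apply the averaged correction once per batch.
            weights = [
                w - learning_rate * g / len(batch)
                for w, g in zip(weights, weight_grads)
            ]
            bias -= learning_rate * bias_grad / len(batch)
    return weights, bias


# Training data for logical OR: (inputs, desired output).
or_samples = [
    ([0.0, 0.0], 0.0),
    ([0.0, 1.0], 1.0),
    ([1.0, 0.0], 1.0),
    ([1.0, 1.0], 1.0),
]
weights, bias = train(or_samples)
```

After training, `predict` should return a value below 0.5 for the all-zero input and above 0.5 for the other three.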

