Neural networks aren’t a new algorithm, but the availability of large data sets and more powerful processing (especially GPUs, which can handle large streams of data in parallel) have only recently made them useful in practice. Despite the name, neural networks are based only loosely on biological neurons. Each node in a neural network has connections to other nodes that are triggered by inputs. When triggered, each node adds a weight to its input to mark the probability that it does or doesn’t match that node’s function. The nodes are organized in fixed layers that the data flows through, unlike the brain, which creates, removes and reorganizes synapse connections regularly.
Deep learning is a subset of machine learning based on deep neural networks. Deep neural networks are neural network that have many layers for performing learning in multiple steps. Convolutional deep neural networks often perform image recognition by processing a hierarchy of features where each layer looks for more complicated objects. For example, the first layer of a deep network that recognizes dog breeds might be trained to find the shape of the dog in an image, the second layer might look at textures like fur and teeth, with other layers recognizing ears, eyes, tails and other characteristics, and the final level distinguishing different breeds. Recursive deep neural networks are used for speech recognition and natural language processing, where the sequence and context are important.
There are many open source deep learning toolkits available that you can use to build your own systems. Theano, Torch and Caffe are popular choices, and Google’s TensorFlow and Microsoft Cognitive Toolkit let you use multiple servers to build more powerful systems with more layers in your network.
Microsoft’s Distributed Machine Learning Toolkit packages up several of these deep learning toolkits with other machine learning libraries, and both AWS and Azure offer VMs with deep learning toolkits pre-installed.
Machine learning in practice
Machine learning results are a percentage certainty that the data you’re looking at matches what your machine learning model is trained to find. So, a deep network trained to identify emotions from photographs and videos of people’s faces might score an image as “97.6% happiness 0.1% sadness 5.2% surprise 0.5% neutral 0.2% anger 0.3% contempt 0.01% disgust 12% fear.” Using that information means working with probabilities and uncertainty, not exact results.
Probabilistic machine learning uses the concept of probability to enable you to perform machine learning without writing algorithms at all. Instead of the set values of variables in standard programming, some variables in probabilistic programming have values that fall in a known range and others have unknown values. Treat the data you want to understand as if it was the output of this code and you can work backwards to fill in what those unknown values would have to be to produce that result. With less coding, you can do more prototyping and experimenting; probabilistic machine learning is also easier to debug.
Sign up for CIO Asia eNewsletters.