Advances in machine learning are making security systems easier to train and more flexible in dealing with changing conditions, but not all use cases are benefitting at the same rate.
Machine learning and artificial intelligence have been getting a lot of attention lately, and there's a lot of justified excitement about the technology.
One of the side effects is that pretty much everything is now being relabeled as "machine learning," making the term extremely difficult to pin down. Just as the word "cloud" has come to mean pretty much anything that happens online, so "artificial intelligence" is rapidly moving to the point where almost anything involving a computer is getting that label slapped on it.
"There is also a lot of hype," said Anand Rao, innovation lead for US analytics at PricewaterhouseCoopers LLC. "People talk about AI becoming super intelligent and will take over humanity and human decision making so on."
One common security task is to determine whether newly downloaded or installed applications are malicious. The traditional approach is a very basic expert system -- does the application's signature match that of known malware?
The downside of this standard antivirus approach, however, is that it needs to be updated constantly as new malware shows up, and it is extremely brittle. A piece of malware that has only minor modifications in it can easily evade detection.
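To make that brittleness concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a "signature" is a SHA-256 hash of the whole file; real antivirus signatures are often more elaborate (byte patterns, heuristics), and the sample bytes and function names below are hypothetical.

```python
import hashlib

def signature(data: bytes) -> str:
    # Assumption: the signature is a hash of the entire file contents.
    return hashlib.sha256(data).hexdigest()

# Hypothetical stand-in for a known malicious file.
known_sample = b"MZ" + b"hypothetical malicious payload"

# The signature database: hashes of every sample seen so far.
KNOWN_SIGNATURES = {signature(known_sample)}

def is_known_malware(data: bytes) -> bool:
    # Exact-match lookup: flags a file only if its hash is in the database.
    return signature(data) in KNOWN_SIGNATURES

# A variant that differs from the known sample by a single appended byte.
variant = known_sample + b"\x00"

print(is_known_malware(known_sample))  # True
print(is_known_malware(variant))       # False
```

The one-byte change produces a completely different hash, so the variant passes unflagged until its own signature is added to the database.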
One startup, Deep Instinct, is looking to apply deep learning techniques to the problem, taking advantage of the fact that there are now close to 1 billion samples of known malware that can be used for training purposes.
"Deep learning has revolutionized many areas," said Eli David, the company's CTO. "Computer vision has improved 20 to 30 percent a year, to super-human vision in no time. Speech recognition. Why shouldn't that work in cyber security?"
Even a probability-based machine learning system is limited, he said. There are only so many factors that can be identified by experts, weighed and then tuned for optimum results. Meanwhile, uncounted other factors are dismissed as too minor or irrelevant.
"You're throwing away most of the data," he said.
Deep Instinct trains its deep learning system in the laboratory on all the known samples of malware.
The process takes about a day, he said, and requires heavy-duty graphics processing units to analyze the data.
The resulting trained system is about a gigabyte in size, he said, too big for most applications, but then the company prunes it down to about 20 megabytes. It can then be installed on any endpoint device, including mobile, and can analyze incoming threats in a few milliseconds on the slowest machine.
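The article does not say how Deep Instinct prunes its model. One widely used approach to shrinking a trained network is magnitude pruning, which zeroes out the weights closest to zero so that only the remaining ones need to be stored. The sketch below illustrates that general idea on a toy list of weights; the function name, the weights and the keep fraction are all made up for illustration and are not the company's method.

```python
def prune_by_magnitude(weights, keep_fraction):
    """Zero out all but the largest-magnitude weights (illustrative only)."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Number of weights to keep, at least one.
    keep = max(1, round(len(weights) * keep_fraction))
    # Rank weight positions from largest to smallest absolute value.
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]),
                    reverse=True)
    kept = set(ranked[:keep])
    # Keep the selected weights, replace the rest with zero.
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]

# Toy "layer" of ten weights; a real model has millions.
weights = [0.9, -0.02, 0.4, 0.01, -0.7, 0.03, 0.05, -0.3, 0.002, 0.6]
pruned = prune_by_magnitude(weights, keep_fraction=0.3)

print(pruned)
# [0.9, 0.0, 0.0, 0.0, -0.7, 0.0, 0.0, 0.0, 0.0, 0.6]
```

Going from about a gigabyte to about 20 megabytes is roughly a fiftyfold reduction, far more aggressive than the 30 percent kept in this toy example. In practice, pruned models are usually retrained or fine-tuned afterward to recover accuracy.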