Speed and accuracy aren’t just bragging points. If Microsoft’s system is faster than TensorFlow by default, it means people have more options than just to throw more hardware at the problem—e.g., hardware acceleration of TensorFlow, via Google’s custom (and proprietary) TPU processors. It also means third-party projects that interface with both TensorFlow and CNTK, such as Spark, will gain a boost. TensorFlow and Spark already work together, courtesy of Yahoo, but if CNTK and Spark offer more payoff for less work, CNTK becomes an appealing option in all of those places that Spark has already conquered.
Graphcore and Wave Computing: The hardware’s the thing
One of the downsides to Google’s TPUs is that they’re only available in the Google cloud. For those already invested in GCP, that might not be an issue, but for everyone else, and there’s a lot of “everyone else,” it’s a potential blocker. Dedicated silicon for deep learning, such as general-purpose GPUs from Nvidia, is available with fewer strings attached.
Several companies have recently unveiled specialized silicon that outperforms GPUs for deep learning applications. Startup Graphcore has a deep learning processor, a specialized piece of silicon designed to process the graph data used in neural networks. The challenge, according to the company, is to create hardware optimized to run networks that recur or feed into each other and into other networks.
One of the ways Graphcore has sped things up is by keeping the model for the network as close to the silicon as possible, and avoiding round trips to external memory. Avoiding data movement whenever possible is a common approach to speeding up machine learning, but Graphcore is taking that approach to another level.
Wave Computing is another startup offering special-purpose hardware for deep learning. Like Graphcore, the company believes GPUs can be pushed only so far for such applications before their inherent limitations reveal themselves. Wave Computing’s plan is to build “dataflow appliances,” rackmount systems using custom silicon that can deliver 2.9 petaops of compute (note that’s “petaops” for fixed-point operations, not “petaflops” for floating-point operations). Such speeds are more than an order of magnitude beyond the 92 teraops provided by Google’s TPU.
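To put the two quoted figures on a common scale, here is a minimal sketch in Python. It uses only the numbers cited above; the constant and function names are illustrative, and the result compares vendor-quoted peak figures, not measured performance.

```python
import math

# Vendor-quoted peak figures cited in the article (fixed-point operations per second).
WAVE_APPLIANCE_PETAOPS = 2.9   # Wave Computing "dataflow appliance"
GOOGLE_TPU_TERAOPS = 92.0      # Google TPU

TERAOPS_PER_PETAOP = 1000.0    # 1 petaop = 1,000 teraops


def peak_ratio(petaops: float, teraops: float) -> float:
    """Return how many times larger the petaops figure is than the teraops figure."""
    return petaops * TERAOPS_PER_PETAOP / teraops


ratio = peak_ratio(WAVE_APPLIANCE_PETAOPS, GOOGLE_TPU_TERAOPS)
orders_of_magnitude = math.log10(ratio)

print(f"{ratio:.1f}x, about {orders_of_magnitude:.1f} orders of magnitude")
```

On these numbers the gap works out to a factor of roughly 31.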
Claims like that will need independent benchmarks to bear them out, and it isn’t yet clear if the price-per-petaop will be competitive with other solutions. But Wave is ensuring that, price aside, prospective users will be well supported. TensorFlow is to be the first framework supported by the product, with CNTK, Amazon’s MXNet, and others to follow.
Brodmann17: Less model, more speed
Whereas Graphcore and Wave Computing are out to one-up TPUs with better hardware, other third parties are out to demonstrate how better frameworks and better algorithms can deliver more powerful machine learning. Some are addressing environments that lack ready access to gobs of processing power, such as smartphones.