Microsoft's custom voice recognition service hits public beta

Blair Hanley Frank | Feb. 8, 2017
The service lets developers tailor cloud voice recognition for specific scenarios

Companies building applications that rely on speech recognition have a new machine-learning-based tool to improve their work. Microsoft is opening the public beta for its Custom Speech Service, the company said Tuesday.

The service, formerly known as CRIS, lets customers train a speech recognition system for a specific scenario so that it produces more accurate results. For example, the Custom Speech Service can be trained to provide better results in a noisy airport or set up to work better with voices from a particular group, like kids or people with different accents.

Right now, the Custom Speech Service works with English and Chinese, but one of its advantages is that it can be trained to work with accents from non-native speakers.

Microsoft is making it available as part of its suite of Cognitive Services, a set of cloud-based tools aimed at opening up the fruits of the company’s artificial intelligence and machine learning research to the rest of the world.

Right now, there are eight such cognitive services generally available, and an additional 17 in beta. More than 424,000 developers have tried the services since they launched, Microsoft said. Developers all over the world can access the services, many of which are available for purchase through Microsoft Azure.

Each of the services has a free tier with heavy limits on its use, so developers have the freedom to test the APIs out without spending a cent. The Custom Speech Service has a complicated, tiered pricing model that includes a subscription fee along with charges based on the number of voice samples fed into the system and the amount of acoustic adaptation training.

The Custom Speech Service is a key tool in the arsenal of Human Interact, a small game development shop using voice commands as the sole means of interaction for its forthcoming game Starship Commander. Custom speech recognition, along with Microsoft’s Language Understanding Intelligent Service (LUIS), makes up key parts of the voice recognition and understanding system that players use to guide their ship.

The service allows Human Interact to create its own dictionary specific to Starship Commander, which means the system can understand players when they ask about the Ecknians, the game’s alien antagonists. After players' speech has been converted into machine-readable text, LUIS processes it and translates it into game commands.
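The two-stage flow described above, speech to text and then text to game command, can be sketched roughly as follows. This is a minimal illustration only, not Human Interact's code and not the Custom Speech Service or LUIS APIs: both stages are replaced by local stand-in functions, and every name here (`CUSTOM_VOCABULARY`, `INTENT_KEYWORDS`, `normalize_transcript`, `to_command`, `handle_utterance`) as well as the intents themselves are assumptions made for the example. Only the word "Ecknians" comes from the article.

```python
from dataclasses import dataclass

# Hypothetical game-specific dictionary; only "Ecknians" appears in the article.
CUSTOM_VOCABULARY = {"ecknians": "Ecknians"}

# Hypothetical keyword table standing in for a trained language-understanding model.
INTENT_KEYWORDS = {
    "ASK_ABOUT_ALIENS": ("ecknians",),
    "FIRE_WEAPONS": ("fire", "shoot"),
    "SET_COURSE": ("course", "navigate"),
}


@dataclass
class GameCommand:
    intent: str
    transcript: str


def normalize_transcript(raw_text: str) -> str:
    """Stand-in for stage 1: apply game-specific spellings to recognized text."""
    words = []
    for word in raw_text.split():
        key = word.lower().strip(".,?!")
        if key in CUSTOM_VOCABULARY:
            # Swap in the dictionary spelling, keeping trailing punctuation.
            word = word.lower().replace(key, CUSTOM_VOCABULARY[key])
        words.append(word)
    return " ".join(words)


def to_command(transcript: str) -> GameCommand:
    """Stand-in for stage 2: map the text to a game command by keyword match."""
    tokens = {w.lower().strip(".,?!") for w in transcript.split()}
    for intent, keywords in INTENT_KEYWORDS.items():
        if tokens & set(keywords):
            return GameCommand(intent, transcript)
    return GameCommand("UNKNOWN", transcript)


def handle_utterance(raw_text: str) -> GameCommand:
    """Run both stages in order: text cleanup first, then intent mapping."""
    return to_command(normalize_transcript(raw_text))
```

In a real system each stand-in would be a call to a trained cloud model rather than a lookup table; the sketch only shows why the stages are separate, with the first concerned with getting the words right and the second with deciding what the player wants.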

Both systems are important to the core gameplay of Starship Commander. Human Interact set out to make an interactive virtual reality experience that was accessible to a wide range of players, not just those who have been playing video games for years, creative director Alexander Mejia said.

