"The answer was stupidly clear," Mejia said. "What if you just talk to somebody? I mean, if we put a person in front of you, and they start talking to you, would you talk back?"
To that end, the company opted to use the microphones that are built into the Oculus Rift and Gear VR systems and create a game that feels like a much more open-ended and immersive choose-your-own-adventure book.
Microsoft is far from the only company providing machine learning-based cloud voice recognition, but its services were the best fit for what the team is doing, Mejia said. They give the team not only custom dictionaries but also fast response times and the ability to see and validate the results that the voice recognition system produces.
Two other cognitive services from Microsoft will reach general availability next month. The Content Moderator service is designed to automatically block objectionable content in text, videos, and images while allowing for human review of questionable cases. It can detect profanity in more than 100 languages and allows customers to include custom lists of objectionable text as well.
The Bing Speech API is designed to give developers an easy, generalized way to convert speech to text and vice versa. It supports voice recognition in 18 languages, with dialects from 28 countries, including German, French, Chinese, Spanish, and Arabic. Developers can also use the API for text-to-speech in 10 languages, with support for dialects from 18 countries.
Microsoft is battling a number of other cloud companies in this area, including Google, Amazon, and IBM, each of which has its own set of machine intelligence-based tools.