Inside Siri's brain: The challenges of extending Apple's virtual assistant

Marco Tabini | April 9, 2013
Siri is one of the biggest features to hit iOS in recent years, and yet it remains severely limited in its capabilities. Alas, Apple--and third-party developers--must overcome many obstacles before voice interaction becomes a pervasive part of the mobile experience.

Siri is, by far, my favorite recent addition to iOS. In an age when electronic devices keep getting smaller, faster, and thinner, the humble keyboard feels increasingly like a relic of a bygone era--an era when computers were designed to occupy entire rooms instead of the palm of a hand. Apple's virtual assistant has changed how I interact with my mobile devices in a small but significant way.

Alas, Siri is frustratingly limited. It's integrated well enough with Apple's own apps, as well as with the services that the company decides to support, but it's otherwise nearly impossible to use with third-party software of any kind. That's disappointing, because third-party apps are exactly where it could be a game changer--especially for people who have difficulty interacting with a normal keyboard because of a disability.

In an ideal world, Siri would be the primary way for me to interact with many aspects of my iPhone and iPad, and the keyboard would be available as a backup when needed. I'm sure plenty of developers would love to be able to take advantage of Siri, if only Apple would make it possible for them to do so. Unfortunately, the technology behind Siri makes that a significant challenge for the company.

There and back again

What we know as "Siri" is not just an app built into our phones and tablets, but rather a collection of software and Internet-based services that Apple operates in cooperation with a number of partners. By keeping the majority of the functionality server-side, the company can offload much of the work to powerful computers rather than taxing the limited resources of its mobile devices; plus, Apple can use the data it collects to continuously improve the service and offer new functionality without having to release a new version of iOS.

When you issue Siri a command, your device is mainly responsible for collecting the sound of your voice and converting it into an audio file, which it then sends to Apple's data center for processing. This is not as trivial a task as it sounds--you'd be surprised at how much noise a microphone picks up, even when you're in what seems like a quiet environment. For this reason, Apple has been investing heavily in technologies that make that sound as clear as possible: Most recent iOS devices feature multiple microphones, along with sophisticated hardware that analyzes the mics' input to produce a signal scrubbed of most of its noise--a cleaner signal requires less data to transmit and is easier to process.
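
To picture the device's half of that round trip, here is a minimal Swift sketch of the same idea: record a short, compressed voice clip, then upload it to a speech service. The endpoint URL is hypothetical--Apple's actual Siri protocol is private and closed to third parties--and only the standard AVFoundation and URLSession calls are real APIs.

    import AVFoundation
    import Foundation

    // Sketch of the client-side flow: capture a short voice clip as a
    // compressed audio file, then upload it for server-side processing.
    // (Microphone permission and AVAudioSession setup are omitted.)
    func captureAndUpload() throws {
        let fileURL = FileManager.default.temporaryDirectory
            .appendingPathComponent("utterance.m4a")

        // Mono AAC at a speech-friendly sample rate keeps the file small;
        // as noted above, a cleaner, smaller signal is cheaper to transmit
        // and easier to process.
        let settings: [String: Any] = [
            AVFormatIDKey: kAudioFormatMPEG4AAC,
            AVSampleRateKey: 16_000,
            AVNumberOfChannelsKey: 1
        ]

        let recorder = try AVAudioRecorder(url: fileURL, settings: settings)
        recorder.record(forDuration: 5)  // capture up to five seconds of speech

        // A real app would wait for the AVAudioRecorderDelegate callback
        // before uploading; the request below is shown immediately for brevity.
        var request = URLRequest(url: URL(string: "https://speech.example.com/recognize")!)
        request.httpMethod = "POST"
        request.setValue("audio/mp4", forHTTPHeaderField: "Content-Type")

        URLSession.shared.uploadTask(with: request, fromFile: fileURL) { data, _, error in
            guard let data = data, error == nil else { return }
            // The reply would carry the recognized text or a structured action.
            print(String(data: data, encoding: .utf8) ?? "")
        }.resume()
    }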

Once it reaches Apple's servers, your audio file goes through a series of steps that progressively transform it into an action that a computer program can perform--such as figuring out what the weather looks like. The output of that action is then transformed back into text that can be read to you in a natural way.
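
Stripped of all the hard parts, those server-side stages can be sketched as a chain of Swift functions. Everything below is invented for illustration--none of these names are Apple APIs, and the stubbed bodies stand in for the recognition, parsing, and fulfilment systems that actually run in Apple's data center:

    import Foundation

    // A structured, machine-actionable representation of a request.
    struct Intent {
        let action: String
        let parameters: [String: String]
    }

    // Speech recognition: audio waveform -> plain text. (Stubbed.)
    func transcribe(_ audio: Data) -> String {
        return "what's the weather like today"
    }

    // Natural-language parsing: text -> a structured intent. (Stubbed.)
    func parse(_ text: String) -> Intent {
        return Intent(action: "weather.lookup", parameters: ["when": "today"])
    }

    // Fulfilment: perform the action, e.g. by querying a weather service. (Stubbed.)
    func perform(_ intent: Intent) -> String {
        return "72°F and sunny"
    }

    // Response generation: wrap the raw result in a natural-sounding sentence,
    // ready for text-to-speech on the trip back to the device.
    func phrase(_ result: String) -> String {
        return "It's currently \(result)."
    }

    let text = transcribe(Data())   // stand-in for the uploaded audio file
    let intent = parse(text)
    let reply = phrase(perform(intent))
    print(reply)                    // "It's currently 72°F and sunny."

The point of the chain is the hand-off: each stage consumes the previous stage's output, which is why a problem early on--a noisy recording, say--ripples through everything downstream.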

 
