Correct me if I’m wrong
October 14th, 2008 by Tobias Göbel
Ill-designed voice dialogs can get unnecessarily slow and tedious when they try to over-compensate for speech recognition problems, whether those problems occur or not. This can drive me nuts: they verify each and every input individually instead of first collecting all input and then confirming all of it at once. With 4 items to collect, for instance, confirming once at the end would save 3 input states (!), not to mention the time those three steps would take. Using VoiceObjects Analyzer, you could easily measure that time and see that it is time spent in vain, which might actually cost your organisation money. But what can you do?
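To make the arithmetic concrete, here is a minimal sketch that counts input states under both strategies. It assumes that every collection and every confirmation is exactly one input state; the function names are made up for illustration and are not part of any VoiceObjects product.

```python
def states_with_individual_confirmation(items):
    """One collection state plus one confirmation state per item."""
    return 2 * items


def states_with_single_confirmation(items):
    """One collection state per item, then one confirmation of everything."""
    return items + 1


items = 4
saved = (states_with_individual_confirmation(items)
         - states_with_single_confirmation(items))
print(saved)  # 3
```

Under these assumptions the saving is always one state less than the number of items, so it grows with every additional item you collect.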
Well, VoiceObjects 7.3 makes it pretty easy to apply concepts like implicit confirmation, and with it correction, to your dialogs. As this problem occurs so frequently in everyday voice applications, I thought I’d write a post on this.
So check out the following call flow excerpt:

The Input object Get Credit Card Type asks the initial question: “What is the type of your new credit card?” and accepts responses such as “Visa”, “It’s mastercard”, “I have an Amex card”, etc. The following Input object Get Credit Card Number collects the number only – at first sight, that is. In reality, it does more. It
- implicitly confirms the collected card type by prompting the caller with “And what is the number of your Mastercard?” (The speech bubble icon behind the object name – denoting a comment on that object – hints at the additional functionality of this object; hovering over it in VoiceObjects Desktop would show you the developer’s comment as a tool-tip).
- allows a correction of the card type in case it was misrecognized.
This behaviour is not visible per se from the flow, as VoiceObjects call flows are usually optimized for readability and deliberately omit certain details. Showing too much of the innards would make it harder to follow what’s happening on the surface. (I have had good experience with this approach. But here’s an idea: to point out the correction capability of this input state, the developer could have called the Input object Get Credit Card Number (or Correct Card Type) instead.)
Now imagine the recognizer got it wrong and the caller actually said “AmEx card”. What could be the caller’s reaction to this question? Maybe something like “No I said AmEx!” (Dammit!). The Input object Get Credit Card Number has an additional grammar defined (via a second Grammar item in the Grammar section of the Input object) that matches corrections like this. But how can the grammar instruct the server to accept this as a correction of the credit card type, reset the corresponding variable, apologize, and ask again for the number, as in “I’m sorry. So what’s the number of your Amex card?” VoiceObjects 7.3 introduced the notion of grammar-driven application control to accomplish this. (Did you notice that the server even adapts to the caller’s choice of words in this example, by saying “Amex” instead of “American Express”? I guess that would make for another nice post on naturalness…)
The grammar can return instructions for the server via a special slot vogrammarcontrol; instructions such as “change the value of variable CCType” and “re-process object Get Credit Card Number”. In our example, the corresponding grammar snippet could look like this:

After caller input, the server detects the slot vogrammarcontrol in the speech platform’s request and parses the slot value to arrive at three instructions:
- varCCType=Amex (set variable CCType to “Amex”)
- gosub=Apology (process the object with ReferenceID “Apology”, which happens to be an Output object)
- continuation=return (continue by returning to the current object and re-processing it)
As a result, the server naturally responds to the caller’s correction with “Sorry for that. So then, what’s the number of your Amex card?”
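For readers who like to see the mechanics, here is a rough Python simulation of how the three instructions above could be parsed and applied. This is not the VoiceObjects server implementation. The semicolon separator, the rule that a key starting with "var" sets a variable, and the function names are all assumptions made for this sketch; check the Object Reference for the real slot syntax.

```python
def parse_control_slot(slot_value, separator=";"):
    """Split a control string into ordered (key, value) instructions.

    The separator is an assumption; the real slot syntax may differ.
    """
    instructions = []
    for part in slot_value.split(separator):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        instructions.append((key.strip(), value.strip()))
    return instructions


def apply_instructions(instructions, variables, outputs, current_prompt):
    """Apply the instructions in order and return what the caller hears."""
    spoken = []
    for key, value in instructions:
        if key.startswith("var"):
            # Assumed convention: "varCCType" sets the variable "CCType".
            variables[key[len("var"):]] = value
        elif key == "gosub":
            # Process the referenced Output object (here: just a prompt text).
            spoken.append(outputs[value])
        elif key == "continuation" and value == "return":
            # Return to the current Input object and play its prompt again.
            spoken.append(current_prompt.format(**variables))
    return spoken


variables = {"CCType": "Mastercard"}       # the misrecognized card type
outputs = {"Apology": "Sorry for that."}   # Output object by ReferenceID
prompt = "So then, what's the number of your {CCType} card?"

slot = "varCCType=Amex;gosub=Apology;continuation=return"
spoken = apply_instructions(parse_control_slot(slot), variables, outputs, prompt)
print(" ".join(spoken))
# Sorry for that. So then, what's the number of your Amex card?
```

The point of the sketch is the ordering: the variable is corrected first, so that the re-played prompt already uses the caller’s wording when the object is processed again.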
This is one of many possible steps towards more natural man-machine interaction. We at VoiceObjects like to call it Natural Dialog Management. You can find more details on grammar-driven application control in the Input object section of the Object Reference, which is part of the VoiceObjects product documentation. I plan to provide some more examples of other Natural Dialog Management features in upcoming posts… Stay tuned!



