Only on the dark side of the moon could you have missed the impact of the iPhone. Its sweeping success has brought mobile services to the mainstream. As the first device to convincingly integrate traditional phone capabilities with Web access, it highlights the multi-channel shape of things to come.
Mobile Web currently has its time in the limelight. But in truth, “mobile applications” have been with us for quite some time. Think phone banking, or sending a text message to check on your remaining pre-paid minutes.
Doesn’t sound right? That’s because in terms of convenience and usability, phone applications have historically played in the minor league. When visiting eBay, have you ever been asked to “look carefully, because the order of our menu options may have changed”? When returning to Amazon, did you ever need to navigate through five levels of menus just to see whether your order has shipped? It’s with the focus on the user that Web applications have set a new standard: dynamic adjustments based on identity, preferences, and past interactions, as well as on-the-fly personalization create a custom experience that makes you want to come back.
In this article, we explore how to build innovative applications that bring the success of the Web to all phone channels. And it’s easier than you may think, by utilizing the latest tools and techniques available to developers.
As different as the phone channels mobile Web, voice, and text may be, they actually have quite a lot in common. Users on the go are users with a goal. Instead of just browsing to pass the time, they want to get a specific job done. They want to track an order, pay a bill, or check a movie show time. Some may want to do it by sending a text message, some by going to a Web site, and some by calling an 800 number. Yet all of them want to do it as efficiently as possible, with a focus on their goal.
Applications ought to be mindful of this need for efficiency, as it relates to both the caller interaction and presentation design. Limited bandwidth across all the different phone channels needs to be taken into account from the start to achieve an optimal caller experience. The W3C has assembled a valuable set of guidelines that can serve as a check list:
- Keep content consistent and structurally simple
- Provide easy means of navigation
- Avoid free text input whenever possible
- Use small individual markup documents
- Avoid embedded objects or scripts
Developers face the flip side of the callers’ requirements: They need to efficiently build and maintain applications that serve multiple phone channels and act consistently across all of them.
The backbone of application development is the core path of interaction between caller and system, the “dialog flow”. As pointed out above, mobile applications are focused on achieving a caller’s goal. They typically go through a number of steps to gather information (such as an amount of money and a recipient) to then perform a transaction (such as transferring money). This basic flow remains the same across different phone channels, though what is presented as a yes/no question in the voice channel may be a radio button in the mobile Web channel. The development environment needs to be able to isolate the channel differences and allow developers to first focus on the commonalities when building the dialog flow. Once this is in place, there must be an efficient way of selectively applying channel- and caller-specific modifications to achieve the adaptive and personalized experience callers have rightly come to expect.
Likewise, the communication with backend systems needs to be integrated across all phone channels. Since this is the place where most of the custom coding is required, seamless interaction with proven SOA frameworks must be ensured. Finally, interoperability with complementary tools to serve the needs of individual channels, such as audio file or speech recognition grammar management, is required.
Through the success of standards like VoiceXML, the previously disparate worlds of Interactive Voice Response (IVR) and Web have merged. So not only do callers get to benefit from better applications, but developers’ lives have been made easier by a unified architecture that provides more flexibility, scalability, and interoperability while allowing for faster turnaround than ever before.
The standard architecture for multi-channel phone applications today consists of the following:
- a unified service creation environment based on the Eclipse framework,
- a unified service execution environment based on a phone application server, and
- a unified backend infrastructure based on a service-oriented architecture (SOA).
In the remainder of this article we focus on the service creation environment to see how it helps developers create a better experience for callers.
Eclipse provides an open framework to combine best-of-breed tools into a powerful integrated workbench. Instead of having to make compromises when selecting a monolithic IDE, developers can pick and choose what suits them best from a wide variety of open-source and commercial components. Acting as plug-ins within the Eclipse architecture, these components blend into the overall workbench and smoothly interact with each other.
The benefits of this approach are particularly strong when building multi-channel phone applications, since dedicated tools can be used for the specific technical needs of the different channels. Even better, many of these tools are available as free downloads such as the ones we take a closer look at here.
VoiceObjects Developer Edition is a comprehensive multi-channel framework, which provides integrated support for the voice, video, text, and Web channels. Included are a graphical IDE as well as an embedded phone application server for one-click testing and deployment. Applications are built using an object-oriented approach on the basis of a set of core components modeling caller interactions as well as backend integration and application logic. Rapid prototyping and object re-use are facilitated by a drag-and-drop GUI.
Adaptive personalization is achieved through the concept of “layers”, which also covers topics such as multi-lingual or multi-persona applications. Integrated testing and debugging is available for all phone channels, including a Phone Simulator that shows text and Web applications as they would look on a variety of mobile handsets. To test voice applications end-to-end Voxeo’s Prophecy is the ideal choice.
Grammars are an important aspect of voice application development. Things a caller might say, such as “There’s a problem with my bill” or “Transfer five hundred dollars”, must be modeled so that the speech recognition engine can successfully understand them. Nu Echo’s NuGram IDE offers a suite of tools to efficiently manage these grammars. Productivity features such as auto-completion and on-the-fly validation assist in building grammar rules. For testing and tuning, sample caller utterances can be parsed to analyze grammar coverage and ensure correct semantic interpretation.
Access to backend systems is a crucial part of development regardless of the channels served by an application. Within the Eclipse eco-system, several frameworks are available to help with this task. Two important ones are the Web Tools Platform (WTP) and the SOA Tools Platform (STP).
For simple or one-off tasks, JavaServer Pages (JSPs) are often the solution of choice because of their low overhead and straightforward integration of static and dynamic content. The WTP offers a rich set of features to support their development, testing, and deployment.
For more complex and reusable tasks, Web services are the preferred way to go. The STP provides a broad scope of capabilities covering SOA aspects from business process modeling and service orchestration to code generation, deployment, testing, and documentation.
The Eclipse plug-ins highlighted here, apart from being excellent tools in their own right, offer the added benefit of smooth interoperability within the Eclipse workbench: You can check and tweak a speech recognition grammar while looking at the dialog flow that reacts to the corresponding caller input. You can adjust the application logic with simple drag-and-drops while building the Web service code that connects to the backend. For the first time, developers have simultaneous control of all application aspects without the need to switch between IDEs, or to compromise on features when selecting a single environment.
Mobile applications are here to stay.
Users have come to depend on retrieving information and performing transactions on-the-go. And they expect the same level of convenience and efficiency they know from home – regardless of whether they call an 800 number, send a text message, or visit a mobile Web site. The challenge lies with the developers to efficiently deliver multi-channel phone applications that adapt dynamically to each caller’s needs and expectations.
The merging of IVR and internet technologies has made it possible to apply the lessons learned on the Web to all phone channels: Benefit from a scalable multi-tier architecture centered on an application server. Unify backend access through the use of Web services and SOA.
On the IDE side, the Eclipse framework has provided the fertile ground on which a multitude of interoperable plug-ins has sprung up that presents developers with a comprehensive suite of capabilities. Every aspect of multi-channel application development can be addressed – and not in isolation, but in correspondence and coordination with each other.
Just as importantly, most of these Eclipse plug-ins can be downloaded for free, giving developers a choice and allowing them to evaluate each tool’s respective benefits.
Never before has it been easier to get from idea to implementation. The flexible and scalable infrastructure is in place, and the tools to realize innovation are in the developers’ hands.
The time for better phone applications has finally come.