Posts Tagged ‘phone channels’

How to IMify your voice application

Tuesday, August 25th, 2009

One thing that’s so great about VoiceObjects Server is its multi-channel capabilities, which were introduced way back in 2007. Design an application once – deploy it on any channel available on modern handsets, including voice, video, text, and mobile Web, and benefit from common maintenance, deployment, reporting, and analytics. Customers like T-Mobile Czech are embracing this to provide better support to their customer base (see also this announcement).

One thing that’s so great about Voxeo is our commitment to emerging technologies. Voxeo’s recent acquisition of IMified again demonstrates that we are at the forefront of any development within the industry that promises a better self-service experience for the mobile customer. We have coined a term for this:

Unified Self-Service

With IMified, developers can build applications that interact with users over instant messaging (IMR – Interactive Messaging Response). The beauty of IMified lies in the fact that it provides a staggeringly simple API to access a whole range of IM providers: AIM, MSN, Yahoo, Google Talk, Jabber, … plus even Twitter and, brand-new, SMS text messages. Together with VoiceObjects, IMified extends the scope of our Phone Application Server to all these new modalities, which are technically all part of VoiceObjects’ text channel.

The following picture shows the high-level architecture of IMified with VoiceObjects:

IMified architecture including VoiceObjects

This blog post is about how to build your first IM bot using VoiceObjects. Believe me, it takes longer to read this text than to build, deploy and test the app. If you are a slow reader, that is…

How to set it up

Short version:

  1. Create a text application in VoiceObjects
  2. Set up a bot in IMified and point it to your app

That’s it!

Well ok, here are slightly more verbose instructions:

  1. Install VoiceObjects Developer Edition 9.0 R1 (or use your existing VoiceObjects 9.0 R1 installation)
  2. Create an application of your choice and provide prompts and grammars for the text channel
    1. Hint: If your application isn’t going to become multi-channel, you can leave the Channel layer at Default.
    2. If you have never built an application for the text channel before, you might want to read chapter 7 (How to Use Layers) and chapter 10 (How to Support Multiple Phone Channels) of the Design Guide.
  3. Deploy that application (e.g. by clicking Test Application in the context menu of your root Module object, in case you’re using VoiceObjects Desktop for Eclipse)
  4. Register at www.IMified.com and create a developer account
  5. Click Create a New Bot
  6. Configure your bot
    1. Give it a bot name
    2. Give it a screen name. This will be the name (plus @bot.im) under which your bot will automatically be accessible via Jabber and Google Talk. For all other networks, you need to create accounts there first, which you can then associate with your new bot.
    3. Configure the bot URL. This must be the URL pointing to your VoiceObjects Server on which you deployed your app. Example: http://myserver.com:8070/VoiceObjects/DialogMapping?VSN=testService&User-Agent=IMified&vsDriver=173
      1. Replace testService with your own VSN, unless you deploy to the VoiceObjects Server embedded in VoiceObjects Desktop for Eclipse (there, testService is the fixed name for your service)
      2. Add User-Agent=IMified&vsDriver=173 to the URL, so that VoiceObjects Server knows this is an IM session based on IMified
      3. Make sure your VoiceObjects Server is reachable from the public Internet
    4. Click Create new Bot

That’s it! Now open Google Talk or any Jabber client, invite your bot to your contacts (the screen name plus @bot.im) and start a chat! If you have created accounts on other IM networks and activated them on IMified, invite these contacts and start a chat there!
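
If you would rather script the bot URL from step 6 than type it by hand, here is a minimal Python sketch. The parameter names VSN, User-Agent and vsDriver and the value 173 come straight from the example URL above; the helper names build_bot_url and looks_like_imified_url are made up for this illustration and are not part of VoiceObjects or IMified.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_bot_url(host, vsn="testService", port=8070):
    """Assemble a bot URL in the same shape as the example in step 6.3.

    host, vsn and port are placeholders; use the values of your own server.
    """
    query = urlencode({
        "VSN": vsn,               # service name; testService on the embedded server
        "User-Agent": "IMified",  # marks the session as an IMified IM session
        "vsDriver": "173",        # driver id taken from the example URL
    })
    return f"http://{host}:{port}/VoiceObjects/DialogMapping?{query}"


def looks_like_imified_url(url):
    """Check that the two IMified-specific parameters are present."""
    params = parse_qs(urlsplit(url).query)
    return params.get("User-Agent") == ["IMified"] and params.get("vsDriver") == ["173"]


url = build_bot_url("myserver.com")
print(url)
print(looks_like_imified_url(url))
```

Paste the printed URL into the bot URL field. The check function is only a quick sanity test for the two extra parameters; it does not verify that your server is actually reachable.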

How to move on from here

Wondering what kind of apps you can build with this? Well, think about any customer self-service or other portals that you could automate – or that you have automated already over IVR. Go ahead and IMify those! You are basically extending your customer interaction to other channels, giving your customers more options, accommodating the young generation that might be chatting more than talking these days… Have a look at the following interaction. It stems from our sample application, Prime Telecom. This application has been text-enabled for a long time now. With the IMified integration, chat is another means of communication that is – simply – THERE. Just use it!

PrimeTelecom sample chat

Stay tuned for more things to read about the IMification of customer self-service applications. I plan to write something about usability, UI, and implementation best practices soon. Also, we plan a Developer Jam Session on this topic in October. Announcements will follow soon!

As usual, feel free to add comments or questions to this post.

Finally, if you’re in New York City during this week of August 24, please make sure to visit us at SpeechTEK in the Marriott Marquis! It’s a unique chance to see our 20-netbook cluster in action, serving thousands of calls on multiple phone channels using cheap and power-sipping hardware. You can find us in booth 800. We’re looking forward to talking to you.

Building Better Phone Applications with Eclipse

Wednesday, January 28th, 2009

Only on the dark side of the moon could you have missed the impact of the iPhone. Its sweeping success has brought mobile services to the mainstream. As the first device to convincingly integrate traditional phone capabilities with Web access, it highlights the multi-channel shape of things to come.

Mobile Web currently has its time in the limelight. But in truth, “mobile applications” have been with us for quite some time. Think phone banking, or sending a text message to check on your remaining pre-paid minutes.

Doesn’t sound right? That’s because in terms of convenience and usability, phone applications have historically played in the minor league. When visiting eBay, have you ever been asked to “look carefully, because the order of our menu options may have changed”? When returning to Amazon, did you ever need to navigate through five levels of menus just to see whether your order has shipped? It’s with the focus on the user that Web applications have set a new standard: dynamic adjustments based on identity, preferences, and past interactions, as well as on-the-fly personalization create a custom experience that makes you want to come back.

In this article, we explore how to build innovative applications that bring the success of the Web to all phone channels. And with the latest tools and techniques available to developers, it’s easier than you may think.

Phone Channels

As different as the phone channels mobile Web, voice, and text may be, they actually have quite a lot in common. Users on the go are users with a goal. Instead of just browsing to pass the time, they want to get a specific job done. They want to track an order, pay a bill, or check a movie show time. Some may want to do it by sending a text message, some by going to a Web site, and some by calling an 800 number. Yet all of them want to do it as efficiently as possible, with a focus on their goal.

Applications ought to be mindful of this need for efficiency, as it relates to both the caller interaction and presentation design. Limited bandwidth across all the different phone channels needs to be taken into account from the start to achieve an optimal caller experience. The W3C has assembled a valuable set of guidelines that can serve as a check list:

  • Keep content consistent and structurally simple
  • Provide easy means of navigation
  • Avoid free text input whenever possible
  • Use small individual markup documents
  • Avoid embedded objects or scripts

Developers face the flip side of the callers’ requirements: They need to efficiently build and maintain applications that serve multiple phone channels and act consistently across all of them.

The backbone of application development is the core path of interaction between caller and system, the “dialog flow”. As pointed out above, mobile applications are focused on achieving a caller’s goal. They typically go through a number of steps to gather information (such as an amount of money and a recipient) to then perform a transaction (such as transferring money). This basic flow remains the same across different phone channels, though what is presented as a yes/no question in the voice channel may be a radio button in the mobile Web channel. The development environment needs to be able to isolate the channel differences and allow developers to first focus on the commonalities when building the dialog flow. Once this is in place, there must be an efficient way of selectively applying channel- and caller-specific modifications to achieve the adaptive and personalized experience callers have rightly come to expect.
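
To make the idea of one dialog flow with channel-specific presentation a bit more concrete, here is a small Python sketch. It is not how any particular product implements this; the class YesNoQuestion and the functions render and parse_answer are hypothetical names invented for the illustration. The dialog step is defined once, and only the rendering differs per channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YesNoQuestion:
    """A channel-neutral confirmation step in the dialog flow."""
    name: str
    text: str


def render(question, channel):
    """Present the same step in a channel-specific way."""
    if channel == "voice":
        return f"{question.text} Please say yes or no."
    if channel == "text":
        return f"{question.text} Reply YES or NO."
    if channel == "web":
        # On the mobile Web the same question becomes a pair of radio buttons.
        return (
            f"<p>{question.text}</p>"
            f'<input type="radio" name="{question.name}" value="yes"> Yes '
            f'<input type="radio" name="{question.name}" value="no"> No'
        )
    raise ValueError(f"unknown channel: {channel}")


def parse_answer(raw):
    """Map the input from any channel onto one boolean for the shared flow."""
    answer = raw.strip().lower()
    if answer in ("yes", "y"):
        return True
    if answer in ("no", "n"):
        return False
    raise ValueError(f"not a yes/no answer: {raw!r}")


confirm = YesNoQuestion("confirm_transfer", "Transfer 500 dollars to Alice?")
for channel in ("voice", "text", "web"):
    print(channel, "->", render(confirm, channel))
```

The flow logic only ever sees the boolean returned by parse_answer, so the commonalities stay in one place and the channel differences stay in render.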

Likewise, the communication with backend systems needs to be integrated across all phone channels. Since this is the place where most of the custom coding is required, seamless interaction with proven SOA frameworks must be ensured. Finally, interoperability with complementary tools to serve the needs of individual channels, such as audio file or speech recognition grammar management, is required.

Architecture

Through the success of standards like VoiceXML, the previously disparate worlds of Interactive Voice Response (IVR) and Web have merged. So not only do callers get to benefit from better applications, but developers’ lives have been made easier by a unified architecture that provides more flexibility, scalability, and interoperability while allowing for faster turnaround than ever before.

The standard architecture for multi-channel phone applications today consists of the following:

  • a unified service creation environment based on the Eclipse framework,
  • a unified service execution environment based on a phone application server, and
  • a unified backend infrastructure based on a service-oriented architecture (SOA).

In the remainder of this article we focus on the service creation environment to see how it helps developers create a better experience for callers.

Eclipse Framework

Eclipse provides an open framework to combine best-of-breed tools into a powerful integrated workbench. Instead of having to make compromises when selecting a monolithic IDE, developers can pick and choose what suits them best from a wide variety of open-source and commercial components. Acting as plug-ins within the Eclipse architecture, these components blend into the overall workbench and smoothly interact with each other.

The benefits of this approach are particularly strong when building multi-channel phone applications, since dedicated tools can be used for the specific technical needs of the different channels. Even better, many of these tools, including the ones we take a closer look at here, are available as free downloads.

VoiceObjects Developer Edition is a comprehensive multi-channel framework, which provides integrated support for the voice, video, text, and Web channels. Included are a graphical IDE as well as an embedded phone application server for one-click testing and deployment. Applications are built using an object-oriented approach on the basis of a set of core components modeling caller interactions as well as backend integration and application logic. Rapid prototyping and object re-use are facilitated by a drag-and-drop GUI.

Adaptive personalization is achieved through the concept of “layers”, which also covers topics such as multi-lingual or multi-persona applications. Integrated testing and debugging is available for all phone channels, including a Phone Simulator that shows text and Web applications as they would look on a variety of mobile handsets. To test voice applications end-to-end, Voxeo’s Prophecy is the ideal choice.
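
One way to picture layers is as a lookup that prefers the most specific variant of a prompt and falls back to a default otherwise. The Python sketch below illustrates that general idea only. The fallback order, the PROMPTS table and the function resolve_prompt are assumptions made for this example, not a description of how VoiceObjects resolves layers internally.

```python
DEFAULT = "Default"

# Hypothetical prompt variants keyed by (channel layer, language layer).
PROMPTS = {
    (DEFAULT, DEFAULT): "What would you like to do?",
    ("voice", DEFAULT): "Please tell me what you would like to do.",
    ("text", "de"): "Was möchten Sie tun?",
}


def resolve_prompt(prompts, channel, language):
    """Return the most specific variant, falling back towards Default."""
    candidates = [
        (channel, language),  # exact match on both layers
        (channel, DEFAULT),   # channel-specific, any language
        (DEFAULT, language),  # language-specific, any channel
        (DEFAULT, DEFAULT),   # the variant every application should define
    ]
    for key in candidates:
        if key in prompts:
            return prompts[key]
    raise KeyError("no Default variant defined")


print(resolve_prompt(PROMPTS, "voice", "en"))
print(resolve_prompt(PROMPTS, "text", "de"))
print(resolve_prompt(PROMPTS, "web", "fr"))
```

With a scheme like this, an application that never goes multi-channel only needs the Default entry, and further variants can be added later without touching the dialog flow.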

Grammars are an important aspect of voice application development. Things a caller might say, such as “There’s a problem with my bill” or “Transfer five hundred dollars”, must be modeled so that the speech recognition engine can successfully understand them. Nu Echo’s NuGram IDE offers a suite of tools to efficiently manage these grammars. Productivity features such as auto-completion and on-the-fly validation assist in building grammar rules. For testing and tuning, sample caller utterances can be parsed to analyze grammar coverage and ensure correct semantic interpretation.
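
To give a feel for what grammar coverage testing means, here is a deliberately tiny Python sketch that uses regular expressions in place of a real speech recognition grammar. It is unrelated to NuGram's actual formats or APIs; the rule table and the functions interpret and coverage are hypothetical and only handle the two sample utterances mentioned above.

```python
import re

UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}

# Each rule pairs an intent name with a pattern for what the caller might say.
RULES = [
    ("billing_problem", re.compile(r"\b(?:problem|issue) with my bill\b")),
    ("transfer", re.compile(
        r"\btransfer (?P<units>" + "|".join(UNITS) + r") hundred dollars\b")),
]


def interpret(utterance):
    """Return a semantic interpretation, or None if no rule covers the utterance."""
    text = utterance.lower()
    for intent, pattern in RULES:
        match = pattern.search(text)
        if match:
            result = {"intent": intent}
            if intent == "transfer":
                result["amount"] = UNITS[match.group("units")] * 100
            return result
    return None


def coverage(utterances):
    """Fraction of sample utterances that the rules can interpret."""
    covered = sum(1 for u in utterances if interpret(u) is not None)
    return covered / len(utterances)


samples = [
    "There's a problem with my bill",
    "Transfer five hundred dollars",
    "I want to talk to an agent",
]
for sample in samples:
    print(sample, "->", interpret(sample))
print("coverage:", coverage(samples))
```

Running sample utterances through the rules shows both which phrasings are not covered yet and whether the covered ones produce the intended interpretation, which is the kind of feedback a grammar tuning cycle relies on.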

Access to backend systems is a crucial part of development regardless of the channels served by an application. Within the Eclipse ecosystem, several frameworks are available to help with this task. Two important ones are the Web Tools Platform (WTP) and the SOA Tools Platform (STP).

For simple or one-off tasks, JavaServer Pages (JSPs) are often the solution of choice because of their low overhead and straightforward integration of static and dynamic content. The WTP offers a rich set of features to support their development, testing, and deployment.

For more complex and reusable tasks, Web services are the preferred way to go. The STP provides a broad scope of capabilities covering SOA aspects from business process modeling and service orchestration to code generation, deployment, testing, and documentation.

The Eclipse plug-ins highlighted here, apart from being excellent tools in their own right, offer the added benefit of smooth interoperability within the Eclipse workbench: You can check and tweak a speech recognition grammar while looking at the dialog flow that reacts to the corresponding caller input. You can adjust the application logic with simple drag-and-drops while building the Web service code that connects to the backend. For the first time, developers have simultaneous control of all application aspects without the need to switch between IDEs, or to compromise on features when selecting a single environment.

Summary

Mobile applications are here to stay.

Users have come to depend on retrieving information and performing transactions on-the-go. And they expect the same level of convenience and efficiency they know from home – regardless of whether they call an 800 number, send a text message, or visit a mobile Web site. The challenge lies with the developers to efficiently deliver multi-channel phone applications that adapt dynamically to each caller’s needs and expectations.

The merging of IVR and Internet technologies has made it possible to apply the lessons learned on the Web to all phone channels: Benefit from a scalable multi-tier architecture centered on an application server. Unify backend access through the use of Web services and SOA.

On the IDE side, the Eclipse framework has provided fertile ground for a multitude of interoperable plug-ins that together present developers with a comprehensive suite of capabilities. Every aspect of multi-channel application development can be addressed, not in isolation but in coordination with all the others.

Just as importantly, most of these Eclipse plug-ins can be downloaded for free, giving developers a choice and allowing them to evaluate each tool’s respective benefits.

Never before has it been easier to get from idea to implementation. The flexible and scalable infrastructure is in place, and the tools to realize innovation are in the developers’ hands.

The time for better phone applications has finally come.