1.4 Getting Started
It's "Hello World" time! There are two routes you can take to get up and running with VoiceXML. The first is to set up an account with one of the online VoiceXML development services and then set up a directory that can be accessed from the Internet where you will keep your VoiceXML files. The second is to download a VoiceXML interpreter, an ASR engine, and a TTS engine, and install these on your desktop PC.
Depending on which route you take, see either Section 1.4.1, "Setting up a remote hosted environment," or Section 1.4.2, "Setting up an IDE environment," to get your environment up and running.
1.4.1 Setting up a remote hosted environment
Using a remote hosted environment is probably the easiest way to get up and running quickly. It does, however, require both Internet and telephone access during the testing process.
Setting up this environment consists of two steps. First you need to find server space where you can put VoiceXML documents on the Internet. The best solution is to have access to a Web server that can be reached by domain name or IP address from the Internet. The next best solution is to use a free Web-hosting service. It is important to find one that doesn't litter your pages with advertisements, because the HTML code they inject is not VoiceXML and will cause the VoiceXML interpreter to reject your documents.
Let's go ahead and assume you don't have a Web server of your own and would like to develop on one of the free hosting services and VoiceXML development platforms. The following example will use GeoCities as our free Web server and VoiceGenie as our VoiceXML development platform.
We'll start by setting up an account on GeoCities. To do this, follow the "Sign up for a free website" link on their home page at http://www.geocities.com. This will take you through the registration process. Once you have completed this and are fully logged in, go to the File Manager application at http://geocities.yahoo.com/filemanager. Here you will see a Web-based file manager. Click on "New (Create a new HTML file)". This will open a Web-based text editor with some skeleton HTML markup. Delete all of this text and enter instead the contents of Example 11.
There should be a text field labeled "Filename:". Enter hello.xml. Now press the button labeled "Save". Note that we used the .xml extension instead of the more typical .vxml extension. This is because many of the free sites insist that you use a common Web extension and don't recognize .vxml. VoiceXML interpreters rarely care what the file is called and .xml certainly is as accurate as .vxml, if not as specific.
Example 11. hello.xml
    <vxml version="2.0">
      <form>
        <field name="hello" type="boolean">
          <prompt>Isn't this exciting?</prompt>
          <filled>
            <prompt>You said <value expr="hello"/></prompt>
          </filled>
        </field>
      </form>
    </vxml>
You have now published your first VoiceXML document to the Web. To verify this, visit your new VoiceXML website by typing http://www.geocities.com/yourname/hello.xml into the address field of your browser, replacing yourname with whatever account name you gave yourself when creating your GeoCities account. You should see more or less what you typed in. Depending on your browser settings you may see only:
Isn't this exciting? You said
If that happens, view the page source to see the whole document. In Internet Explorer, for example, select the View menu and then the Source menu item.
The next step is testing your application using one of the free VoiceXML development platforms. Let's use the VoiceGenie platform as it is relatively easy to use. This is a VoiceXML interpreter running on a computer with telephony hardware. You can call in to this machine over the telephone and interact with your VoiceXML application.
In order to test your application you will need to create an account on VoiceGenie's development server. You can do this by visiting http://developer.voicegenie.com and clicking on the "Register" link. This will guide you through the account creation process.
Next, you will need to assign an "extension" to your application. An extension is just a five-digit number that you dial after dialing in to the VoiceGenie server. To assign a new extension, click on the tab labeled "Tools". You will then see a link labeled "Extension Manager". Click on this link.
The table showing all of your extensions will be empty. At the bottom of this table is a text field labeled "Add:". Type into this field the same URL you used in your browser to look at hello.xml, namely http://www.geocities.com/yourname/hello.xml, and click the Add button. You should now see a five-digit number followed by that URL.
Now pick up the phone and dial the telephone number for one of their development boxes. They have two boxes configured with different TTS and ASR technologies. For this application, either one should work. When you are connected you will hear a welcome message and then you will be asked for your extension. When you say the five-digit extension shown in the table, you will be transferred to a VoiceXML interpreter running your application. Your dialog with the interpreter might be similar to the one in Example 12.
Example 12. Interaction with hello.xml
    Interpreter: Isn't this exciting?
    You:         Yes.
    Interpreter: You said yes.
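To see how this short exchange maps onto the markup, here is an annotated sketch of hello.xml. The XML declaration and the xmlns attribute are additions that do not appear in Example 11: the W3C VoiceXML 2.0 Recommendation requires the namespace on the vxml element, although the fact that Example 11 omits it suggests the platform used here is lenient. The comments are explanatory only and do not change the behavior of the document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Annotated sketch of hello.xml. The declaration above and the xmlns
     attribute below are additions, not part of Example 11. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <!-- ASR: the built-in boolean type listens for a yes/no answer -->
    <field name="hello" type="boolean">
      <!-- TTS: spoken to the caller when the field is visited -->
      <prompt>Isn't this exciting?</prompt>
      <!-- Call flow: runs once the caller's answer has been recognized -->
      <filled>
        <prompt>You said <value expr="hello"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```

If your platform rejects the shorter form in Example 11, trying this more explicit version is a reasonable first debugging step.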
You now have a hosted VoiceXML environment suitable for developing static VoiceXML applications. As we go on to the examples that require dynamic document generation technologies such as ASP or JSP, you will need to find a more sophisticated server environment.
In addition, you can try different VoiceXML hosts for your hello.xml file now that it is on the Web. The process for creating developer accounts and assigning extensions is pretty much the same for all of the voice-hosting service providers.
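To make the distinction between static and dynamic documents concrete, the sketch below is a variation on hello.xml that sends the caller's answer to a server-side script instead of speaking it back. The URL http://www.example.com/answer.jsp is hypothetical, as are the XML declaration and xmlns attribute, which are not part of the original example. A static host can serve this file, but it has nothing to run the script that receives the request and generates the next VoiceXML document, which is why a more capable server is needed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="hello" type="boolean">
      <prompt>Isn't this exciting?</prompt>
      <filled>
        <!-- Hypothetical URL: the script is assumed to return a new
             VoiceXML document built from the value of "hello" -->
        <submit next="http://www.example.com/answer.jsp"
                namelist="hello" method="get"/>
      </filled>
    </field>
  </form>
</vxml>
```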
1.4.2 Setting up an IDE environment
If you want to test right on your desktop PC, you'll need to download a VoiceXML IDE (Integrated Development Environment). This will need to include ASR, TTS, a VoiceXML interpreter, and optionally some sort of development tools. The two most mature candidates in this arena are IBM's WebSphere Voice Toolkit and Nuance's V-Builder.
There are a few caveats with this approach. First, the downloads are enormous! (On the order of hundreds of megabytes for all of the ASR and TTS data.) Second, they require considerable CPU power and RAM to run properly. A third issue is the tedium of installation. None of the packages has a "one-button" installer, but instead they require you to find the right packages and install them in the proper order. This can be time consuming and frustrating.
The advantage of this approach is that your development environment is completely self-contained. You don't need Internet connectivity, nor do you need to call into a VoiceXML interpreter every time you test your application, which over the long haul might prove to be more frustrating. These IDE products provide a telephone-simulation mode in which you do not have to use a telephone, though you will need a headset and microphone connected to your PC.
To install IBM's IDE, the Voice Toolkit, you will also need to download their Voice Server SDK.
You can start by going to http://www-3.ibm.com/pvc/products/voice/voice_technologies.shtml and scrolling down the page for information about both products. You must also download at least one language package along with the main installation package.
To launch the IBM WebSphere Voice Server SDK 2.0 Installation Wizard, run the setup.exe file, which is located in the directory where you unpacked the installation package. Follow the instructions in the Installation Wizard to install the SDK. After you have installed the Voice Server SDK, you can remove the download package(s) and the extracted installation program files.
Repeat the procedure for the Voice Toolkit: download it, run its setup.exe file to begin the Installation Wizard, and follow the instructions.
Using telephone simulation, you are now ready to develop VoiceXML applications on your PC without any telephone connectivity.
The Voice Toolkit IDE provides the following:
- VoiceXML editor,
- VoiceXML debugger,
- grammar editor,
- grammar test tool,
- pronunciation builder,
- built-in audio recorder,
- VoiceXML reusable dialog components,
- speech recognition engine,
- Text-To-Speech engine.
We've now gotten our first VoiceXML application to work. In a mere nine lines of code we've demonstrated TTS, ASR, and a trivial call flow. The next chapter will pick up where this one left off, starting with simple forms and going on to cover all of the major language features of VoiceXML.