Creating Dynamic Voice XML

Dynamic Voice XML takes us to the next power level, enabling you to create more robust and up-to-date applications by dynamically creating Voice XML using server data. With dynamic Voice XML, callers can reap the benefit of the latest updates to your data repository.

To generate Voice XML dynamically, we need a server technology that triggers the conversion of data in a repository into Voice XML. Server technologies capable of handling this include Java servlets, JavaServer Pages (JSPs), Active Server Pages (ASPs), PHP scripts, and many other server-side scripting technologies. The basic idea is to use a program to extract data from a repository and generate a valid Voice XML document.

This might sound complex, involving setting up a relational database and writing code to extract data and generate XML. But remember that we're in XML-land and have a variety of XML tools at our disposal. One of the most powerful is XSLT, the XML transformation language.

With XSLT you can take any XML document as input and generate literally anything that is text as output.

Setting things up for dynamic Voice XML generation is a multistep process:

  1. Define the structure of the Voice XML you want to generate. Work with sample data and write a series of forms and menus. Some parts will stay the same no matter how the data changes; in others, all or part of the content will depend on the data.
  2. Identify the parts that will be dependent on the data.
  3. Write XSLT to extract the data you need and insert it into the Voice XML. Test offline.
  4. Select a server technology to execute the transform. Options include Java servlets, JSPs, ASPs, PHP, and so on. Here we’ll be using a Java servlet.
  5. Map the application phone number to the URI of the servlet that will execute the transform. Done.

So let’s look at our boss example dynamically. As the boss is getting ready for the meeting, his staff is preparing data on the participants. All the staff has to do is modify an XML file on the server. No programming is required, since the XSLT will already have been written (we’ll see it shortly) based on the structure of the XML data.

The updatable XML data file looks like this:

<players>
 <player name="dave">Wife jane. Two kids, mark and tamara.</player>
 <player name="karen">Husband jimmy. Two kids, brett and courtney.</player>
 <player name="roger">Separated from wife marla. No kids.</player>
</players>

The goal is to have the boss dial in and hear a dialogue that might go like this:

C: who?
U: Karen (the boss can also press 2)
C: Husband Jimmy. Two kids, Brett and Courtney.

The Voice XML behind this dialogue is given in Listing 4.

Listing 4 Voice XML person lookup

<?xml version="1.0" encoding="UTF-8"?> 
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" 
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
 xsi:schemaLocation="http://www.w3.org/2001/vxml 
  http://www.w3.org/TR/voicexml20/vxml.xsd"> 
  
 
  <menu id="main" dtmf="true">
    <prompt> 
     who?
     <enumerate/>
    </prompt> 
    <choice next="#dave">dave</choice>
    <choice next="#karen">karen</choice>
    <choice next="#roger">roger</choice>
  </menu> 
  
  <form id="dave">   
   <block>
    Wife jane. Two kids, mark and tamara.
   </block> 
  </form> 
  
  <form id="karen">   
   <block>
    Husband jimmy. Two kids, brett and courtney.
   </block> 
  </form> 
  
  <form id="roger">   
   <block>
    Separated from wife marla. No kids.
   </block> 
  </form>   

</vxml>

The beauty of this approach is that we can test the XSLT offline to make certain that it generates the Voice XML we want. The critical piece is the XML data file, which requires very little technical expertise to maintain and can be managed with tools that ensure the XML stays well-formed and conforms to whatever DTD or schema we might have defined.
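As a rough illustration of the kind of tooling that can guard the data file, here is a minimal Java sketch that checks a players document before it is published. The class name PlayersFileCheck and the specific rules (root element players, at least one player, non-empty and unique name attributes) are assumptions made for this example, not part of the article's setup. The uniqueness rule follows from the fact that each name becomes a form id and a menu target in the generated Voice XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: sanity-checks the players data before it is transformed.
final class PlayersFileCheck {

    // Returns a list of problems; an empty list means the data passed these checks.
    static List<String> findProblems(String xml) {
        List<String> problems = new ArrayList<>();
        Document doc;
        try {
            // Parsing fails with an exception if the XML is not well-formed.
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            problems.add("not well-formed: " + e.getMessage());
            return problems;
        }

        Element root = doc.getDocumentElement();
        if (!"players".equals(root.getTagName())) {
            problems.add("root element should be <players>");
            return problems;
        }

        NodeList players = root.getElementsByTagName("player");
        if (players.getLength() == 0) {
            problems.add("no <player> elements found");
        }

        // Each name becomes a form id in the output, so it must be present and unique.
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < players.getLength(); i++) {
            String name = ((Element) players.item(i)).getAttribute("name").trim();
            if (name.isEmpty()) {
                problems.add("player " + (i + 1) + " has no name attribute");
            } else if (!seen.add(name)) {
                problems.add("duplicate player name: " + name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String sample = "<players><player name='dave'>Wife jane.</player></players>";
        System.out.println(findProblems(sample));
    }
}
```

A check like this only covers well-formedness and a few structural rules; validating against a DTD or schema, if one is defined, would be a separate step.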

Although this is not a series on XSLT, let’s look at the transformation code in Listing 5 that turns the XML data file into a Voice XML document.

Listing 5 XSLT Transform to create Voice XML

1  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
        version="1.0">
2   <xsl:output method="xml" 
    doctype-system="http://www.w3.org/TR/voicexml20/vxml.dtd"/>
  
3  <xsl:template match="/">

   <!-- all elements that do NOT begin with xsl: are output verbatim -->

4  <vxml version="2.0" 
5      xmlns="http://www.w3.org/2001/vxml" 
6      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      
    
7  <menu id="main" dtmf="true">
8   <prompt> 
9     who?
10     <enumerate/>
11     </prompt> 
     
     <!-- xslt action here: for each player element in data -->
12   <xsl:for-each select="/players/player">
13   <xsl:element name="choice">
14    <xsl:attribute name="next">#<xsl:value-of select="@name"/>
15      </xsl:attribute>
16     <xsl:value-of select="@name"/>
17   </xsl:element>
18   </xsl:for-each>
19   
20  </menu>
      
   <!-- now set up individual forms, one for each player  -->
21   <xsl:for-each select="/players/player">
22      <xsl:element name="form">
23    <xsl:attribute name="id"><xsl:value-of select="@name"/>
24      </xsl:attribute>

25    <block>
26     <prompt>
27    <xsl:value-of select="."/>          
28    </prompt>
29    </block>

30   </xsl:element>
        
31   </xsl:for-each>
      
32  </vxml>
  
33 </xsl:template>
 
34 </xsl:stylesheet>

Some things to note about the XSLT in Listing 5:

Line 1: XSLT is an XML vocabulary whose root element is <stylesheet>. We use xsl: as a prefix to distinguish the XSLT elements from the Voice XML elements.

Line 2: Here we tell the transform engine that our output will be XML and we provide the URI of the DTD for Voice XML.

Line 3: This begins the XSLT transformation. The transform engine matches against the root (/) of the XML data. Since there is always a root, the content of this template element, which includes just about the entire remainder of the document, will be output.

Lines 4–11: This begins the output. Note that what you see here is simply the first lines of the Voice XML in Listing 4.

Line 12: Here the XSLT engine stops to process one of its own elements (one that begins with xsl:). The for-each construct selects all the player elements in the XML data file and, for each one, outputs a choice element, setting up the initial dialogue.

Line 21: Another for-each, this time to construct a separate form for each player in the data file. We set the form's id to the value of the player's name attribute (@name).

Line 27: We insert the message by selecting the contents of the current player element (select=".").
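To try the transform offline, something like the following Java sketch could be used. It relies on the standard javax.xml.transform API that ships with the JDK. The class name VoiceXmlTransformSketch, the inline sample data, and the condensed stylesheet (the same two for-each loops as Listing 5, without the doctype or line numbers) are assumptions made for this example. In a servlet, the same transform call would presumably write to the response rather than to a string.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical offline harness: applies a stylesheet to the players data.
final class VoiceXmlTransformSketch {

    // Sample data in the same shape as the players file.
    static final String DATA =
          "<players>"
        + "<player name='dave'>Wife jane. Two kids, mark and tamara.</player>"
        + "<player name='karen'>Husband jimmy. Two kids, brett and courtney.</player>"
        + "</players>";

    // Condensed version of the Listing 5 logic: one choice and one form per player.
    static final String STYLESHEET =
          "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='xml'/>"
        + "<xsl:template match='/'>"
        + "<vxml version='2.0' xmlns='http://www.w3.org/2001/vxml'>"
        + "<menu id='main' dtmf='true'>"
        + "<prompt>who? <enumerate/></prompt>"
        + "<xsl:for-each select='/players/player'>"
        + "<xsl:element name='choice'>"
        + "<xsl:attribute name='next'>#<xsl:value-of select='@name'/></xsl:attribute>"
        + "<xsl:value-of select='@name'/>"
        + "</xsl:element>"
        + "</xsl:for-each>"
        + "</menu>"
        + "<xsl:for-each select='/players/player'>"
        + "<xsl:element name='form'>"
        + "<xsl:attribute name='id'><xsl:value-of select='@name'/></xsl:attribute>"
        + "<block><prompt><xsl:value-of select='.'/></prompt></block>"
        + "</xsl:element>"
        + "</xsl:for-each>"
        + "</vxml>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result as text.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrap the checked exception so callers can stay simple.
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Print the generated Voice XML for manual inspection.
        System.out.println(transform(DATA, STYLESHEET));
    }
}
```

Running main prints the generated document, which can then be compared by eye against Listing 4. Adding a player to the sample data should add one choice and one form to the output without any change to the stylesheet.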
