This chapter is from the book

Manipulating DOM Trees

The preceding section discussed traversing an already-extant DOM tree, demonstrating how the nodes of the tree can be processed in a recursive manner. That's not all you can do with the DOM, though; it's also possible to programmatically construct DOM trees from scratch, or modify existing tree structures, and save the result as one or more XML documents. This section discusses the details.

Creating New DOM Trees

If you go back to the section dealing with PHP's DOM classes, you'll see that both the DomDocument and DomElement objects include functions to create new documents, nodes, and attributes. The first of these is new_xmldoc(), which is called as a standalone function rather than on an existing object; it constructs and returns a new instance of the DomDocument object.

After a DomDocument instance is available, you can add new element and text nodes with the add_root() and new_child() methods. And why stop at elements? The set_attribute() method allows you to define and add attributes to specific elements as well. The following code snippet (see Listing 3.14) demonstrates all three by creating a complete XML document tree on the fly:

Listing 3.14 Creating an XML Document Tree


<?php

// create DomDocument object
$doc = new_xmldoc("1.0");

// add root node
$root = $doc->add_root("article");

// set attribute for root node
$root->set_attribute("id", "567");

// add children to the root
$title = $root->new_child("title", "Goat milk for dinner can cause insomnia");
$author = $root->new_child("author", "K. Kittle");

// note how I can programmatically generate node values!
// date() uses the current time when no timestamp is passed
$date = $root->new_child("date", date("d-M-Y"));

// dump the tree as a string
echo $doc->dumpmem();

?>

After the tree is constructed to your satisfaction, you need to output it, either for display or storage. The DomDocument object's dumpmem() method returns a representation of the current tree as a string. You can then format it for printing, save it to a file, or transmit it to another agent.

Here, Pretty!

Note that if you intend to print the dynamically generated DOM tree, it might be a good idea to run your own formatting functions on it first to pretty it up a little. This is because dumpmem() outputs the document tree as a single string, without formatting or indenting it; in the case of long and/or complex XML documents, it can be fairly difficult to read.
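As a rough sketch of that kind of tidying, here is one way to re-indent an unformatted XML string. It assumes a current PHP installation with the bundled DOM extension (the DOMDocument class), which is a different API from the dumpmem() method used in this chapter; the sample string is made up for illustration.

```php
<?php
// Assumption: PHP 5 or later with the bundled DOM extension (DOMDocument).
// $raw stands in for an unformatted string such as the one dumpmem() returns.
$raw = '<?xml version="1.0"?><article id="567"><title>Goat milk</title><author>K. Kittle</author></article>';

$pretty = new DOMDocument();

// discard whitespace-only text nodes so the serializer is free to indent
$pretty->preserveWhiteSpace = false;

// ask the serializer to add line breaks and indentation on output
$pretty->formatOutput = true;

$pretty->loadXML($raw);

// serialize the tree again, this time one element per line
$formatted = $pretty->saveXML();
echo $formatted;
```

Both properties are set before the string is loaded, because preserveWhiteSpace only takes effect at parse time.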

The ability to construct new DOM trees on the fly comes in particularly handy if you need to build a customized DOM tree from another data source. This data source may be a text file that needs to be parsed, a database that needs to be queried, or even another XML document tree that needs to be pruned or combined with other data. Consider Listing 3.15, which uses MySQL database records to construct an XML book catalog and display it to the user.

Listing 3.15 Constructing a DOM Tree from a MySQL Resultset


<?php

// create DomDocument object
$doc = new_xmldoc("1.0");

// add root node
$root = $doc->add_root("collection");

// query database for records
$connection = mysql_connect("localhost", "us8749", "secret") or die ("Unable to connect!");
mysql_select_db("db633") or die ("Unable to select database!");
$query = "SELECT id, title, author, price FROM books";
$result = mysql_query($query) or die ("Error in query: $query. " . mysql_error());

// iterate through resultset
while($row = mysql_fetch_object($result))
{
   $record = $root->new_child("record", "");
   $record->set_attribute("id", $row->id);
   $record->new_child("title", $row->title);
   $record->new_child("author", $row->author);
   $record->new_child("price", $row->price);
}

// close connection
mysql_close($connection);

// dump the tree as a string
echo $doc->dumpmem();

?>

Nothing too complicated here: I'm connecting to the database, extracting the ID, title, author, and price of each book, and creating an XML document from the result. After the document tree has been created in memory, I can either display it (which is what I've done) or save it to a file (demonstrated in Listing 3.17).
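The same record-by-record pattern can be tried without a database. The sketch below is an illustration only: it assumes a current PHP installation with the bundled DOM extension (DOMDocument) rather than the functions used in Listing 3.15, and the $rows array is made-up sample data standing in for a query result.

```php
<?php
// Assumption: PHP 5 or later with the bundled DOM extension (DOMDocument).
// $rows is hypothetical sample data standing in for database records.
$rows = array(
    array("id" => 1, "title" => "Sample Title One", "author" => "A. Author", "price" => "10.00"),
    array("id" => 2, "title" => "Salt & Pepper", "author" => "B. Writer", "price" => "12.50"),
);

// create the document and its root element
$doc = new DOMDocument("1.0");
$root = $doc->appendChild($doc->createElement("collection"));

// one <record> element per row, with the row's ID as an attribute
foreach ($rows as $row) {
    $record = $root->appendChild($doc->createElement("record"));
    $record->setAttribute("id", (string) $row["id"]);

    foreach (array("title", "author", "price") as $field) {
        $element = $record->appendChild($doc->createElement($field));
        // text nodes are escaped on output, so "&" becomes "&amp;"
        $element->appendChild($doc->createTextNode($row[$field]));
    }
}

// dump the tree as a string
$xml = $doc->saveXML();
echo $xml;
```

Adding the values as text nodes, rather than pasting them into markup by hand, keeps characters such as the ampersand in the second title from producing a malformed document.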

Manipulating Existing DOM Trees

It's also possible to use the functions described previously to modify an existing DOM tree. Consider the XML document in Listing 3.16, which contains the outline for a book chapter.

Listing 3.16 A Book Chapter Marked up in XML (ch9.xml)

<?xml version="1.0"?>

<chapter id="9">

   <!-- chapter 9 of a really bad pulp fiction novel -->

   <title>Luke Gets Angry</title>

   <para>As the black-suited warriors swarmed off the HUMVEE, Luke turned to Jo
and said quietly, "Don't go anywhere. I'll just be a minute."</para>

   <para>The first warrior reached Luke and aimed a roundhouse kick at his head.
Luke ducked easily, twisting under the leg and breaking it with a sharp crack.
The warrior moaned and tumbled backwards. Luke grinned. "Bring it on," he
hollered.</para>

   <para>The second soldier approached more cautiously. Moving carefully, he
crept up behind Luke and leaped at him. Sensing movement, Luke moved aside
at the last moment, knocked the soldier unconscious with a well-placed
punch and stripped him of his portable grenade launcher. A few seconds
later, the HUMVEE was in flames, and the soldiers had fled in panic.</para>

   <para>Luke laughed crazily. He was just beginning to enjoy himself.</para>

</chapter>

Now, let's suppose the author decides that "Luke" is actually a pretty wimpy name for the lead character. Instead, he decides to go with "Crazy Dan," which has a much more macho ring to it. Because he's already nine chapters into the book, he needs to change every previous occurrence of "Luke" to "Crazy Dan." All he needs to do is write a PHP program to construct a DOM tree from the XML file, scan through it for every occurrence of "Luke," alter it to "Crazy Dan," and save the result to a new file (see Listing 3.17).

Search and Destroy

I know, I know, he could use any text editor's search-and-replace function. But this chapter's about the DOM, smart guy.

Listing 3.17 Performing a Search-and-Replace Operation on a DOM Tree

<?php

// XML file
$xml_file = "/tmp/ch9.xml";

// parse document
if(!$doc = xmldocfile($xml_file))
{
   die("Error in XML document");
}

// get the root
$root = $doc->root();

// children of the root
$children = $root->children();

// start traversing the tree
search_and_replace($children, "Luke", "Crazy Dan");

// all done, save the new tree to a file
// or display it if file write not possible
if (is_writable(dirname($xml_file)))
{
   $filename = dirname($xml_file) . "/_new_" . basename($xml_file);
   $fp = fopen($filename, "w+");
   fputs($fp, $doc->dumpmem());
   fclose($fp);
}
else
{
   echo $doc->dumpmem();
}

// this is a recursive function to traverse the DOM tree
// when it finds a text node, it will look for the search string and replace with
// the replacement string
function search_and_replace($nodeCollection, $search, $replace)
{
   for ($x=0; $x<sizeof($nodeCollection); $x++)
   {
     if ($nodeCollection[$x]->type == XML_ELEMENT_NODE)
     {
        // if element, it may contain child text nodes
        // go one level deeper
        $nextCollection = $nodeCollection[$x]->children();
        search_and_replace($nextCollection, $search, $replace);
     }
     else if ($nodeCollection[$x]->type == XML_TEXT_NODE)
     {
        // if text node, perform replacement
        $str = str_replace($search, $replace, $nodeCollection[$x]->content);
        // remember to write the value of the text node back to the tree!
        $nodeCollection[$x]->set_content($str);
     }
   }
}

?>

This example is similar to Listing 3.10, in that it too uses a recursive function to process the DOM tree. In this case, though, the recursive function limits its activities to two types of nodes: element nodes and text nodes. If the node is an element node, I leave it unchanged and call the recursive function again on its children to move one level deeper into the tree; if it's a text node, I scan it for a match to the search string, substitute the replacement text, and write the new string back to the tree.
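For readers on a current PHP installation, the same element-or-text recursion can be sketched with the bundled DOM extension. This is an assumption-laden illustration rather than a drop-in replacement for Listing 3.17: it uses DOMDocument instead of the functions shown in this chapter, replace_in_text_nodes() is a hypothetical helper name, and the XML is a shortened version of Listing 3.16.

```php
<?php
// Assumption: PHP 5 or later with the bundled DOM extension (DOMDocument).
// replace_in_text_nodes() is a hypothetical helper written for this sketch.
function replace_in_text_nodes(DOMNode $node, $search, $replace)
{
    foreach ($node->childNodes as $child) {
        if ($child->nodeType == XML_ELEMENT_NODE) {
            // an element may contain text nodes, so go one level deeper
            replace_in_text_nodes($child, $search, $replace);
        } elseif ($child->nodeType == XML_TEXT_NODE) {
            // a text node: substitute and write the value back to the tree
            $child->nodeValue = str_replace($search, $replace, $child->nodeValue);
        }
    }
}

$doc = new DOMDocument();
$doc->loadXML('<chapter id="9"><title>Luke Gets Angry</title><para>Luke grinned.</para></chapter>');

// start at the root element and work down
replace_in_text_nodes($doc->documentElement, "Luke", "Crazy Dan");

// serialize only the root element, without the XML declaration
$result = $doc->saveXML($doc->documentElement);
echo $result;
```

Comments and attributes are skipped, just as in the listing, because the function reacts only to element and text nodes.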

After the process has concluded, the new DOM tree is written to a file (or, in the event that the directory is not accessible, displayed to the user).

If you examine the resulting output, you'll notice one interesting thing about the set_content() method—it automatically replaces special characters (such as the double quotation marks in Listing 3.16) with the corresponding XML entities (in this case, &quot;).
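To see what that kind of substitution looks like in isolation, here is a small sketch using PHP's general-purpose htmlspecialchars() function. It only illustrates entity escaping on a plain string; I'm not claiming it is what set_content() calls internally.

```php
<?php
// a sentence containing double quotation marks, adapted from Listing 3.16
$text = 'Luke said quietly, "Don\'t go anywhere."';

// ENT_COMPAT converts double quotes to &quot; and leaves single quotes alone
$escaped = htmlspecialchars($text, ENT_COMPAT);

echo $escaped;
```

The apostrophe in "Don't" survives untouched, while each double quotation mark comes out as &quot;.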

Going Native

You may sometimes come across situations that require you to convert raw XML markup into native data structures such as variables, arrays, or custom objects. For these situations, PHP's DOM parser includes a very specialized little function named xmltree().

The xmltree() function parses an XML string, and constructs a hierarchical tree of PHP objects representing the structured markup. This tree includes many of the same objects you've become familiar with—instances of the DomDocument, DomElement, DomText, and DomAttribute objects.

xmltree() provides an easy way to quickly see the structure of a complete XML document. For the moment, though, that's all it's useful for; it's not possible to write the tree back to a file, or to memory, after manipulating it.

Note also that, as of this writing, xmltree() only accepts an XML string. You cannot pass it a file name or file reference.
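If your markup lives in a file, one workaround is to read the file into a string first and hand that string to xmltree(). The sketch below is hedged accordingly: it writes its own tiny sample file so that it is self-contained, and it checks for xmltree() with function_exists(), because the function is only present where the DOM functions described in this chapter are installed.

```php
<?php
// write a tiny sample document so this sketch is self-contained
$path = tempnam("/tmp", "xml");
$fp = fopen($path, "w");
fwrite($fp, '<?xml version="1.0"?><chapter id="9"><title>Luke Gets Angry</title></chapter>');
fclose($fp);

// read the whole file into a single string;
// xmltree() wants the markup itself, not a file name
$xml_string = implode("", file($path));

// clean up the sample file
unlink($path);

// Assumption: xmltree() may be missing on this PHP build, so check first
if (function_exists("xmltree")) {
    $tree = xmltree($xml_string);
    print_r($tree);
} else {
    echo "xmltree() is not available in this PHP build\n";
}
```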
