Serialization and Deserialization

A more elegant solution is to use the serialization and deserialization features of XML. A serialization is a series of data values, usually text, which represents an object. It contains the object's public and private variables, and any other information needed to completely describe the object. For example, suppose a Student object contains LastName and FirstName variables, and an array of test scores. The XML serialization of a Student object might look like this:

<Student>
 <LastName>Stephens</LastName>
 <FirstName>Rod</FirstName>
 <TestScore TestNumber="1" Score="80" />
 <TestScore TestNumber="2" Score="92" />
 <TestScore TestNumber="3" Score="87" />
 <TestScore TestNumber="4" Score="94" />
</Student>

Building a serialization for a class is simple enough: you follow roughly the same steps you would take to save the object in a database, as described in the previous section. The class concatenates the object's variable values into a string and, to make an XML serialization, wraps each value in the proper tags. Later, it can parse the serialization string to restore the object.

This method works, but has some of the same drawbacks as the previous method. In particular, if you modify the class, you need to modify the code that reads and writes the serialization.
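As a rough illustration of the hand-built approach, the sketch below concatenates a Student object's values into XML by hand. The Student class, its ToXml method, and the use of SecurityElement.Escape to escape special characters are assumptions made for this example, not code from the previous section.

```vbnet
Imports System.Security
Imports System.Text

' Hypothetical Student class, used only for illustration.
Public Class Student
    Public LastName As String
    Public FirstName As String
    Public TestScores As Integer()

    ' Build the XML serialization by hand, one tag per variable.
    Public Function ToXml() As String
        Dim sb As New StringBuilder()
        sb.AppendLine("<Student>")
        sb.AppendLine(" <LastName>" & SecurityElement.Escape(LastName) & "</LastName>")
        sb.AppendLine(" <FirstName>" & SecurityElement.Escape(FirstName) & "</FirstName>")
        ' Test numbers start at 1, array indexes at 0.
        For i As Integer = 0 To TestScores.Length - 1
            sb.AppendLine(" <TestScore TestNumber=""" & (i + 1).ToString() & _
                          """ Score=""" & TestScores(i).ToString() & """ />")
        Next
        sb.Append("</Student>")
        Return sb.ToString()
    End Function
End Class
```

Adding a field to Student would mean editing ToXml, and any matching parsing code, by hand. That is the maintenance cost described above.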

Here's the cool trick: VB.NET's XmlSerializer class can serialize and deserialize your classes for you. XmlSerializer uses .NET's reflection tools to examine the class's public fields and properties and figure out what it needs to do to serialize and deserialize them. If you later modify your class, the XmlSerializer automatically changes the way it serializes and deserializes objects.

Serialization

The following code serializes a Person object named customer, and displays the serialization. It begins by creating an XmlSerializer object. The call to GetType(Person) tells the serializer about the Person class.

Next, the code calls the serializer's Serialize method. This method writes the serialization of the customer object into the StringWriter named string_writer.

The code finishes by using the StringWriter's ToString method to retrieve the serialization's value and display it in a message box.

' At the top of the file:
' Imports System.IO
' Imports System.Xml.Serialization

Dim customer As Person
Dim xml_serializer As XmlSerializer
Dim string_writer As New StringWriter()

' Initialize the customer object somehow.
...

' Create the XmlSerializer. GetType(Person) tells the
' serializer about the Person class.
xml_serializer = New XmlSerializer(GetType(Person))

' Make the serialization, writing it into the StringWriter.
xml_serializer.Serialize(string_writer, customer)

' Display the results.
MsgBox(string_writer.ToString)

Notice that this code contains no information about the Person class. When the code creates the XmlSerializer object, the call to GetType(Person) gives the serializer all the information it needs.

Here's what an actual serialization produced by the XmlSerializer might look like:

<?xml version="1.0" encoding="utf-16"?>
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <LastName>Stephens</LastName>
 <FirstName>Rod</FirstName>
 <EmailAddress>RodStephens@vb-helper.com</EmailAddress>
</Person>

Don't worry about the XML declaration at the beginning of the serialization and the namespace information inside the opening Person tag. The XmlSerializer knows how to read them when it deserializes the XML data. The important thing is that this text contains a Person tag. The Person tag holds LastName, FirstName, and EmailAddress tags that record the original Person object's data.
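For readers who want something they can compile, here is a minimal sketch of the serialization step with a concrete class. The Person class shown is a guess at the shape implied by the sample output, and the SerializeDemo module and ToXml function are hypothetical names introduced for this example.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical Person class matching the elements in the sample output.
' XmlSerializer works with public classes that have a parameterless
' constructor, and it serializes only public fields and properties.
Public Class Person
    Public LastName As String
    Public FirstName As String
    Public EmailAddress As String
End Class

Public Module SerializeDemo
    ' Return the XML serialization of a Person as a string.
    Public Function ToXml(ByVal customer As Person) As String
        Dim xml_serializer As New XmlSerializer(GetType(Person))
        Using string_writer As New StringWriter()
            xml_serializer.Serialize(string_writer, customer)
            Return string_writer.ToString()
        End Using
    End Function
End Module
```

Calling SerializeDemo.ToXml on a Person holding the values above should produce XML along the lines of the sample, although the exact declaration and namespace attributes may vary between .NET versions.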

Deserialization

The following code shows how a program can deserialize a serialization stored in a string variable. It begins by making a StringReader and filling it with the serialization.

It then calls the XmlSerializer's Deserialize method, passing it the StringReader. Deserialize creates and returns the new Person object. The code sets the variable customer to point to the new object.

Dim customer As Person
Dim serialization As String
Dim string_reader As StringReader
Dim xml_serializer As New XmlSerializer(GetType(Person))

' Load the serialization string from a database,
' text file, or somewhere.
...

' Make a StringReader holding the serialization.
string_reader = New StringReader(serialization)

' Create the new Person object from the serialization.
' Deserialize returns Object, so cast the result to Person.
customer = DirectCast(xml_serializer.Deserialize(string_reader), Person)
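A self-contained version of the deserialization step might look like the following sketch. The Person class and the FromXml helper are hypothetical names introduced here. The explicit DirectCast is needed when Option Strict is on, because Deserialize returns Object.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical Person class with the fields used in this article.
Public Class Person
    Public LastName As String
    Public FirstName As String
    Public EmailAddress As String
End Class

Public Module DeserializeDemo
    ' Rebuild a Person object from its XML serialization.
    Public Function FromXml(ByVal serialization As String) As Person
        Dim xml_serializer As New XmlSerializer(GetType(Person))
        Using string_reader As New StringReader(serialization)
            ' Deserialize returns Object, so cast the result to Person.
            Return DirectCast(xml_serializer.Deserialize(string_reader), Person)
        End Using
    End Function
End Module
```

Because the serialization arrives through a StringReader, the encoding named in the XML declaration does not matter here; the text has already been decoded into a String.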

The XmlSerializer is pretty amazing. Using it, you can build a database that stores any kind of object using its serialization. If you change the object's class, you can continue using the same storage method without any modifications to your code. You can even continue to use the same database tables to store the objects.

What's the Catch?

If this all sounds too good to be true, you're sort of right—there are two small catches. First, if a class definition changes, its serialization changes so the serializer may not know what to do with older serializations. Second, if you make changes to a class, you may want to change the way the database stores its objects. These two issues are described in the following sections.

Serialization Changes

If you change a class definition, the XmlSerializer will not know exactly what to do with old serializations. For example, suppose you have a Person class with a serialization that looks like this:

<?xml version="1.0" encoding="utf-16"?>
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <LastName>Stephens</LastName>
 <FirstName>Rod</FirstName>
 <EmailAddress>RodStephens@vb-helper.com</EmailAddress>
</Person>

Suppose you add a PhoneNumber field to the Person class. Now, suppose your program tries to load an older serialization that was generated before you added the PhoneNumber field. That serialization doesn't contain information about a PhoneNumber field, so the XmlSerializer cannot initialize that field from the information it is given.

Similarly, if you remove a field from the Person class definition, XmlSerializer won't know what to do with the information it sees in older serializations. For example, suppose you remove the EmailAddress field from the Person class. When XmlSerializer reads an old serialization, it won't know what to do with the EmailAddress information because Person no longer has an EmailAddress field.

Fortunately, the XmlSerializer does not fail in these cases. It skips elements it does not recognize and leaves fields it has no data for at their default values. In this example, the PhoneNumber field would keep its default value (Nothing for a String), and the obsolete EmailAddress element would be ignored.
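The behavior described above can be checked with a small experiment. The sketch below assumes a revised Person class that has gained PhoneNumber and lost EmailAddress, and feeds it an old-style serialization. The VersionDemo module and LoadOld function are hypothetical. The UnknownElement event is a real XmlSerializer event; it is used here only to record what was skipped.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical revised Person: EmailAddress removed, PhoneNumber added.
Public Class Person
    Public LastName As String
    Public FirstName As String
    Public PhoneNumber As String
End Class

Public Module VersionDemo
    ' An old serialization written before the class changed.
    Public Const OldXml As String = _
        "<Person>" & _
        "<LastName>Stephens</LastName>" & _
        "<FirstName>Rod</FirstName>" & _
        "<EmailAddress>RodStephens@vb-helper.com</EmailAddress>" & _
        "</Person>"

    ' Deserialize OldXml, adding the name of each unrecognized
    ' element to the skipped list.
    Public Function LoadOld(ByVal skipped As List(Of String)) As Person
        Dim xml_serializer As New XmlSerializer(GetType(Person))
        AddHandler xml_serializer.UnknownElement, _
            Sub(sender As Object, e As XmlElementEventArgs) skipped.Add(e.Element.Name)
        Using string_reader As New StringReader(OldXml)
            Return DirectCast(xml_serializer.Deserialize(string_reader), Person)
        End Using
    End Function
End Module
```

After LoadOld returns, the Person should have its name fields filled in and PhoneNumber left as Nothing, and the skipped list should name the EmailAddress element. Logging skipped elements this way is one option for spotting stale serializations.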

You can avoid some complications by making the class constructor initialize any fields that will later be required by the program. For instance, suppose you are building a company phone directory, and your program absolutely positively must have a value for the Person object's new PhoneExtension field. If the extension for the operator is 0, you could initialize new objects like this:

Public Sub New()
  PhoneExtension = "0"
End Sub

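The XmlSerializer creates each object through this parameterless constructor before filling in the values it finds. Now if it loads an old serialization that doesn't include PhoneExtension information, the new object's PhoneExtension value remains "0".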
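To see the constructor default in action, consider the sketch below. The Person class here is a hypothetical version with a PhoneExtension field, and DefaultDemo.Load is a helper name invented for the example.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical Person with a required PhoneExtension field.
Public Class Person
    Public LastName As String
    Public FirstName As String
    Public PhoneExtension As String

    ' Runs before the serializer fills in any values from the XML.
    Public Sub New()
        PhoneExtension = "0"
    End Sub
End Class

Public Module DefaultDemo
    ' Rebuild a Person from a serialization string.
    Public Function Load(ByVal serialization As String) As Person
        Dim xml_serializer As New XmlSerializer(GetType(Person))
        Using string_reader As New StringReader(serialization)
            Return DirectCast(xml_serializer.Deserialize(string_reader), Person)
        End Using
    End Function
End Module
```

Loading a serialization with no PhoneExtension element should leave the constructor's "0" in place, while a serialization that does include the element should overwrite it.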

Database Changes

Sometimes, when you make a change to a class, you will want to change the way objects are stored in the database.

The simplest way to store an object is to put its serialization in a database table. The table could have a single field that contains object serializations. You can retrieve the serializations and restore the objects they represent.

Although the database knows nothing about the class definition, it still gives you some simple options for finding specific objects. For example, suppose you want to load the Person object with the name Terry Pratchett. You could find that object's serialization using this database query:

SELECT Serialization
FROM People
WHERE Serialization LIKE '%<FirstName>Terry</FirstName>%'
 AND Serialization LIKE '%<LastName>Pratchett</LastName>%'

This query finds records in the People table in which the Serialization field has a value containing the strings "<FirstName>Terry</FirstName>" and "<LastName>Pratchett</LastName>".

This method is simple and effective. You can even use it to perform partial matches, as in this example, which finds records with first name Terry and last name starting with P.

SELECT Serialization
FROM People
WHERE Serialization LIKE '%<FirstName>Terry</FirstName>%'
 AND Serialization LIKE '%<LastName>P%</LastName>%'

Unfortunately, this kind of search can be slow. Because each pattern begins with a wildcard, the database generally cannot use an ordinary index on the Serialization field to narrow the search, so it needs to examine every record in the table individually.

This method also restricts the flexibility of your queries. For example, suppose you are building a personnel system, and the Person class includes a Salary field. You cannot use wildcard matches like those in the previous examples to select records in which the Salary is greater than 30,000, because LIKE compares text patterns rather than numeric values.

In cases like these, you may want to store key information in separate fields to use in searching. For example, you could add a Salary field to the table. Then, you can perform searches like this one:

SELECT Serialization
FROM People
WHERE Salary > 30000

Of course, if you add extra fields to the table, you will need to modify your code if you make changes to those fields. For example, if you remove the Salary field, you will need to modify your code to search on different table columns.
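The idea of keeping a searchable Salary value next to the serialization can be tried without a database server. The sketch below uses an in-memory DataTable as a stand-in for the People table; that substitution, the Person class, and the SalaryColumnDemo module are all assumptions made for illustration. With a real database you would run the SQL query shown earlier instead of DataTable.Select.

```vbnet
Imports System.Collections.Generic
Imports System.Data
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical Person with a Salary field.
Public Class Person
    Public LastName As String
    Public FirstName As String
    Public Salary As Decimal
End Class

Public Module SalaryColumnDemo
    ' Build an in-memory stand-in for the People table.
    Public Function MakePeopleTable() As DataTable
        Dim people As New DataTable("People")
        people.Columns.Add("Salary", GetType(Decimal))
        people.Columns.Add("Serialization", GetType(String))
        Return people
    End Function

    ' Store the searchable Salary next to the full serialization.
    Public Sub AddPerson(ByVal people As DataTable, ByVal employee As Person)
        Dim xml_serializer As New XmlSerializer(GetType(Person))
        Using string_writer As New StringWriter()
            xml_serializer.Serialize(string_writer, employee)
            people.Rows.Add(employee.Salary, string_writer.ToString())
        End Using
    End Sub

    ' Rough equivalent of:
    ' SELECT Serialization FROM People WHERE Salary > 30000
    Public Function FindHighEarners(ByVal people As DataTable) As List(Of Person)
        Dim results As New List(Of Person)()
        Dim xml_serializer As New XmlSerializer(GetType(Person))
        For Each row As DataRow In people.Select("Salary > 30000")
            Using string_reader As New StringReader(CStr(row("Serialization")))
                results.Add(DirectCast(xml_serializer.Deserialize(string_reader), Person))
            End Using
        Next
        Return results
    End Function
End Module
```

Note that AddPerson has to know about the Salary column. That dependency is the price of the faster, more flexible search.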
