This chapter is from the book

14.6 | Processing an E-mail Message by Using the rfc822 Module

The two functions DoLinuxMailBox and DoEudoraMailBox chop up mailboxes into individual messages that are processed by the ProcessMessage function. This function uses the rfc822 module to separate the headers from the body of the message.

CD-ROM reference=14007.txt
# Modules this function relies on.
import rfc822
import string
import StringIO

def ProcessMessage(lines,out):
      """
      Given the lines that make up an e-mail message,
      create an XML message element. Uses the rfc822
      module to parse the e-mail headers.
      """
      out.write("<message>\n")
      # Create a single string from these lines.
      MessageString = string.joinfields(lines,"")
      # Create a file object from the string for use
      # by the rfc822 module.
      fo = StringIO.StringIO(MessageString)
      m = rfc822.Message(fo)
      # The m object now contains all the headers.
      # The headers can be accessed as a Python dictionary.
      out.write("<headers>\n")
      for (h,v) in m.items():
            out.write("<field>\n")
            out.write("<name>%s</name>\n" % XMLEscape(h))
            out.write("<value>%s</value>\n" % XMLEscape(v))
            out.write("</field>\n")
      out.write("</headers>\n")
      out.write("<body>\n")
      out.write(XMLEscape(fo.read()))
      out.write("</body>\n")
      out.write("</message>\n")
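
The rfc822 module is not available in Python 3. As a rough sketch only, the same header/body split can be done with the standard email package. The name process_message is hypothetical, the sketch assumes a plain-text, single-part message, and it swaps the program's own XMLEscape for xml.sax.saxutils.escape, which also escapes the > character, so its output differs slightly from the listings in this section.

```python
import email
from xml.sax.saxutils import escape


def process_message(lines, out):
    """Write one e-mail message (a list of lines) to out as an XML message element.

    Hypothetical Python 3 counterpart of ProcessMessage; assumes the
    message is plain text with a single part.
    """
    # The parser splits the headers from the body at the first blank line.
    msg = email.message_from_string("".join(lines))
    out.write("<message>\n<headers>\n")
    for name, value in msg.items():
        out.write("<field>\n")
        # Lower-case the header name, as rfc822 did.
        out.write("<name>%s</name>\n" % escape(name.lower()))
        out.write("<value>%s</value>\n" % escape(value))
        out.write("</field>\n")
    out.write("</headers>\n<body>\n")
    # For a single-part message, get_payload() returns the body as a string.
    out.write(escape(msg.get_payload()))
    out.write("</body>\n</message>\n")
```

Unlike the rfc822 version, this sketch writes the headers in the order in which they appear in the message.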

Time to illustrate the program in action. The -l (Linux) or -e (Eudora) command-line switch tells the program what type of mailbox to process.
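
To give a feel for what the switch selects, the sketch below splits a list of mailbox lines into messages, using a different separator test for each mailbox type. The function name split_mailbox and both separator rules are assumptions made for illustration; DoLinuxMailBox and DoEudoraMailBox may well work differently.

```python
def split_mailbox(lines, mailbox_type):
    """Split mailbox lines into a list of messages (each a list of lines).

    mailbox_type is "-e" (Eudora) or "-l" (Linux), mirroring the
    command-line switches. The separator rules are assumptions.
    """
    if mailbox_type == "-e":
        # Assumed Eudora separator: the literal "From ???@???" prefix.
        is_separator = lambda line: line.startswith("From ???@???")
    elif mailbox_type == "-l":
        # Assumed Linux separator: any line starting with "From ".
        is_separator = lambda line: line.startswith("From ")
    else:
        raise ValueError("unknown mailbox type: %r" % mailbox_type)

    messages = []
    for line in lines:
        if is_separator(line):
            # A separator starts a new message and is itself dropped.
            messages.append([])
        elif messages:
            messages[-1].append(line)
    return messages


sample = [
    "From ???@??? Mon Sep 06 14:07:14 1999\n",
    "Subject: Hello\n",
    "\n",
    "World\n",
    "From ???@??? Mon Sep 06 14:13:41 1999\n",
    "Subject: Message 2\n",
    "\n",
    "From sean@digitome.com\n",
]
print(len(split_mailbox(sample, "-e")))  # prints 2
```

With the stricter Eudora rule, a body line such as "From sean@digitome.com" stays inside its message instead of starting a new one.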

Here is a small Eudora mailbox.

CD-ROM reference=14008.txt

C>type test.mbx

From ???@??? Mon Sep 06 14:07:14 1999
To: sean@p13
From: Sean Mc Grath <sean@digitome.com>
Subject: Hello
Cc:
Bcc:
X-Attachments:
In-Reply-To:
References:
X-Eudora-Signature: <Standard>

World

From ???@??? Mon Sep 06 14:07:31 1999
To: sean@p13
From: Sean Mc Grath <sean@digitome.com>
Subject: Message 2
Cc:
Bcc:
X-Attachments:
In-Reply-To:
References:
X-Eudora-Signature: <Standard>

Hello

From ???@??? Mon Sep 06 14:13:41 1999
To: sean@p13
From: Sean Mc Grath <sean@digitome.com>
Subject: Message 2
Cc:
Bcc:
X-Attachments:
In-Reply-To:
References:
X-Eudora-Signature: <Standard>

From sean@digitome.com

Hello

The file can be converted to XML as follows.

CD-ROM reference=14009.txt

C>python xmail.py -e test.mbx

<?xml version="1.0"?>
<!DOCTYPE xmail SYSTEM "xmail.dtd">
<xmail>
<message>
<headers>
<field>
<name>subject</name>
<value>Hello</value>
</field>
<field>
<name>references</name>
<value></value>
</field>
<field>
<name>bcc</name>
<value></value>
</field>
<field>
<name>x-attachments</name>
<value></value>
</field>
<field>
<name>cc</name>
<value></value>
</field>
<field>
<name>in-reply-to</name>
<value></value>
</field>
<field>
<name>x-eudora-signature</name>
<value>&lt;Standard></value>
</field>
<field>
<name>from</name>
<value>Sean Mc Grath &lt;sean@digitome.com></value>
</field>
<field>
<name>to</name>
<value>sean@p13</value>
</field>
</headers>
<body>
World

</body>
</message>
<message>
<headers>
<field>
<name>subject</name>
<value>Message 2</value>
</field>
<field>
<name>references</name>
<value></value>
</field>
<field>
<name>bcc</name>
<value></value>
</field>
<field>
<name>x-attachments</name>
<value></value>
</field>
<field>
<name>cc</name>
<value></value>
</field>
<field>
<name>in-reply-to</name>
<value></value>
</field>
<field>
<name>x-eudora-signature</name>
<value>&lt;Standard></value>
</field>
<field>
<name>from</name>
<value>Sean Mc Grath &lt;sean@digitome.com></value>
</field>
<field>
<name>to</name>
<value>sean@p13</value>
</field>
</headers>
<body>
Hello

</body>
</message>
<message>
<headers>
<field>
<name>subject</name>
<value>Message 2</value>
</field>
<field>
<name>references</name>
<value></value>
</field>
<field>
<name>bcc</name>
<value></value>
</field>
<field>
<name>x-attachments</name>
<value></value>
</field>
<field>
<name>cc</name>
<value></value>
</field>
<field>
<name>in-reply-to</name>
<value></value>
</field>
<field>
<name>x-eudora-signature</name>
<value>&lt;Standard></value>
</field>
<field>
<name>from</name>
<value>Sean Mc Grath &lt;sean@digitome.com></value>
</field>
<field>
<name>to</name>
<value>sean@p13</value>
</field>
</headers>
<body>
From sean@digitome.com

Hello

</body>
</message>
</xmail>

Notice how the < character has been escaped to &lt; wherever it occurs in a header or the body of an e-mail message.
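
A minimal sketch of such an escaping function follows. The name xml_escape is hypothetical, and the program's own XMLEscape may differ. The one fixed point is the order of the replacements: & must be handled before <, otherwise the ampersand introduced by &lt; would itself be escaped.

```python
def xml_escape(text):
    """Escape the characters that cannot appear literally in XML character data."""
    text = text.replace("&", "&amp;")  # must come first
    text = text.replace("<", "&lt;")
    return text


print(xml_escape("Sean Mc Grath <sean@digitome.com>"))
# prints: Sean Mc Grath &lt;sean@digitome.com>
```

The > character is left alone, which matches the output shown in this section.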

Here is a small, Linux-style mailbox.

CD-ROM reference=14010.txt

$cat test

From sean@digitome.com  Mon Sep  6 13:58:36 1999
Return-Path: <sean@digitome.com>
Received: from gateway ([100.100.100.105])
        by p13.digitome.com (8.9.3/8.8.7) with SMTP id NAA07403
        for <sean@p13>; Mon, 6 Sep 1999 13:58:36 GMT
Message-Id: <3.0.6.32.19990906140714.009b0ac0@p13>
X-Sender: sean@p13
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.6 (32)
Date: Mon, 06 Sep 1999 14:07:14 +0100
To: sean@p13.digitome.com
From: Sean Mc Grath <sean@digitome.com>
Subject: Hello
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"

World

From sean@digitome.com  Mon Sep  6 13:58:53 1999
Return-Path: <sean@digitome.com>
Received: from gateway ([100.100.100.105])
        by p13.digitome.com (8.9.3/8.8.7) with SMTP id NAA07407
        for <sean@p13>; Mon, 6 Sep 1999 13:58:52 GMT
Message-Id: <3.0.6.32.19990906140731.009b6a40@p13>
X-Sender: sean@p13
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.6 (32)
Date: Mon, 06 Sep 1999 14:07:31 +0100
To: sean@p13.digitome.com
From: Sean Mc Grath <sean@digitome.com>
Subject: Message 2
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"

Hello

It can be converted to XML with the following command.

CD-ROM reference=14011.txt

$python xmail.py -l test

<?xml version="1.0"?>
<!DOCTYPE xmail SYSTEM "xmail.dtd">
<xmail>
<message>
<headers>
<field>
<name>subject</name>
<value>Hello</value>
</field>
<field>
<name>x-sender</name>
<value>sean@p13</value>
</field>
<field>
<name>x-mailer</name>
<value>QUALCOMM Windows Eudora Light Version 3.0.6 (32)</value>
</field>
<field>
<name>content-type</name>
<value>text/plain; charset="us-ascii"</value>
</field>
<field>
<name>message-id</name>
<value>&lt;3.0.6.32.19990906140714.009b0ac0@p13></value>
</field>
<field>
<name>to</name>
<value>sean@p13.digitome.com</value>
</field>
<field>
<name>date</name>
<value>Mon, 06 Sep 1999 14:07:14 +0100</value>
</field>
<field>
<name>mime-version</name>
<value>1.0</value>
</field>
<field>
<name>return-path</name>
<value>&lt;sean@digitome.com></value>
</field>
<field>
<name>from</name>
<value>Sean Mc Grath &lt;sean@digitome.com></value>
</field>
<field>
<name>received</name>
<value>from gateway ([100.100.100.105])
 by p13.digitome.com (8.9.3/8.8.7) with SMTP id NAA07403
 for &lt;sean@p13>; Mon, 6 Sep 1999 13:58:36 GMT</value>
</field>
</headers>
<body>
World

</body>
</message>
<message>
<headers>
<field>
<name>subject</name>
<value>Message 2</value>
</field>
<field>
<name>x-sender</name>
<value>sean@p13</value>
</field>
<field>
<name>x-mailer</name>
<value>QUALCOMM Windows Eudora Light Version 3.0.6 (32)</value>
</field>
<field>
<name>content-type</name>
<value>text/plain; charset="us-ascii"</value>
</field>
<field>
<name>message-id</name>
<value>&lt;3.0.6.32.19990906140731.009b6a40@p13></value>
</field>
<field>
<name>to</name>
<value>sean@p13.digitome.com</value>
</field>
<field>
<name>date</name>
<value>Mon, 06 Sep 1999 14:07:31 +0100</value>
</field>
<field>
<name>mime-version</name>
<value>1.0</value>
</field>
<field>
<name>return-path</name>
<value>&lt;sean@digitome.com></value>
</field>
<field>
<name>from</name>
<value>Sean Mc Grath &lt;sean@digitome.com></value>
</field>
<field>
<name>received</name>
<value>from gateway ([100.100.100.105])
 by p13.digitome.com (8.9.3/8.8.7) with SMTP id NAA07407
 for &lt;sean@p13>; Mon, 6 Sep 1999 13:58:52 GMT</value>
</field>
</headers>
<body>
Hello

</body>
</message>
</xmail>