Decomposing a PDB File

My sample "PDB File Exploder," w2k_pdbx.exe, is a bare-bones Win32 console-mode utility that performs the following processing steps:

  • First, it allocates a virtual memory block large enough to hold the entire PDB file data, and copies the file from disk to memory.

  • Before attempting any interpretation, the data has to undergo a simple verification test, performed by the PdbValid() function in Listing 4. Given a pointer to the memory block, which is supposed to start with a PDB_HEADER structure (function argument pph), and the number of bytes read from the file (function argument dData), this function first ensures that there is at least enough space for a complete PDB_HEADER structure. Otherwise, accessing any of its members might cause an exception. Next, the presence of a PDB V2.00 signature is verified. Finally, PdbValid() computes the number of data bytes indicated by the page count and page size, and matches the result against the file size. Of course, this test is very crude; a good PDB reader should also verify that all page numbers in the header and root stream are within the proper range.

  • Depending on the user-supplied command switches, the utility writes the main components of the PDB file to separate files. It recognizes the options h (extract header), a (extract allocation bits), r (extract root stream), and d (extract data streams). The command line may comprise multiple PDB file paths, and for each file, these four options can be turned on or off by prefixing the option ID with a plus or minus sign. For example, the command w2k_pdbx +hard D:\WINNT\Symbols\exe\ntoskrnl.pdb extracts all valid PDB data that is buried in the Windows 2000 kernel's symbol file.
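The plus/minus switch handling described in the last step can be sketched as follows. This is a minimal sketch in portable C, not the actual w2k_pdbx.exe source: the PdbxOptions structure and the pdbx_apply_switch() function are hypothetical names, and allowing the sign to change within a single argument is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option state; the names are illustrative only. */
typedef struct {
    bool header;   /* h: extract header          */
    bool alloc;    /* a: extract allocation bits */
    bool root;     /* r: extract root stream     */
    bool data;     /* d: extract data streams    */
} PdbxOptions;

/* Applies one switch argument such as "+hard" or "-ad" to *opt.
   Returns false, leaving *opt unchanged, if arg does not start with
   '+' or '-' or contains an unknown option letter. */
bool pdbx_apply_switch(PdbxOptions *opt, const char *arg)
{
    if (opt == NULL || arg == NULL || (arg[0] != '+' && arg[0] != '-'))
        return false;

    PdbxOptions tmp = *opt;   /* work on a copy, commit only on success */
    bool enable = true;

    for (const char *p = arg; *p != '\0'; p++) {
        switch (*p) {
        case '+': enable = true;       break;
        case '-': enable = false;      break;
        case 'h': tmp.header = enable; break;
        case 'a': tmp.alloc  = enable; break;
        case 'r': tmp.root   = enable; break;
        case 'd': tmp.data   = enable; break;
        default:  return false;
        }
    }
    *opt = tmp;
    return true;
}
```

An argument for which this function returns false would then be treated as a target directory or a PDB path by the caller.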

Listing 4 Simple PDB Sanity Check

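BOOL PdbValid (PPDB_HEADER pph,
               DWORD       dData)
  {
  // valid if: header present and complete, V2.00 signature found,
  // and page count * page size matches the number of bytes read
  return (pph != NULL) && (dData >= PDB_HEADER_) &&
         (!lstrcmpA (pph->abSignature, PDB_SIGNATURE_200)) &&
         ((DWORD) pph->wFilePages * pph->dPageSize == dData);
  }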
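The stricter page-number verification mentioned before Listing 4 is not part of PdbValid(). A minimal sketch of such a range check, written in portable C with the hypothetical name pdb_pages_in_range(), might look like this. Using a lower bound (for example, the header's wStartPage value) in addition to the file's page count is an assumption about what "proper range" means.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if every entry of pages[0..count-1] satisfies
   first_page <= entry < file_pages.
   first_page is an assumed lower bound; pass 0 to check the upper bound only. */
bool pdb_pages_in_range(const uint16_t *pages, size_t count,
                        uint16_t first_page, uint16_t file_pages)
{
    if (pages == NULL)
        return count == 0;   /* an empty list is trivially in range */

    for (size_t i = 0; i < count; i++) {
        if (pages[i] < first_page || pages[i] >= file_pages)
            return false;
    }
    return true;
}
```

The same function could be applied first to the root page list in the header and then to each stream's page list in the root stream.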

If the +h option is specified, the PDB_HEADER portion is saved, including all valid root stream page numbers. In this case, w2k_pdbx.exe simply writes the first PDB page to disk, which assumes that the header never exceeds one page. That assumption can fail in principle. In a contrived scenario with 65,536 zero-length data streams, the root stream would occupy 524,292 bytes, or 513 pages in 1-KB mode. Because each page number takes up 16 bits in the header's awRootPages[] array, the page list alone would need 1,026 bytes, so the header size would exceed 1,024 bytes. The situation gets even worse if the data streams are not empty. Frankly, I currently don't know how the Microsoft PDB tools handle this special case. However, I doubt that you will ever run into such a pathological file in real life.
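The arithmetic behind this pathological case can be reproduced with a few lines of portable C. This is only a sketch: the function names are hypothetical, and the 4-byte fixed part and 8-byte directory entry are inferred from the figures quoted in the text, not taken from a header file.

```c
#include <stdint.h>

/* Sizes inferred from the article's figures (assumptions):
   4 fixed bytes, 8 bytes per stream entry, 2 bytes per page number. */
enum { ROOT_FIXED_BYTES = 4, STREAM_ENTRY_BYTES = 8, PAGE_NUMBER_BYTES = 2 };

/* Root stream size when every stream is empty, so that no page
   numbers follow the stream directory. */
uint32_t empty_root_bytes(uint32_t streams)
{
    return ROOT_FIXED_BYTES + streams * STREAM_ENTRY_BYTES;
}

/* Pages needed to hold `bytes` bytes, rounded up; page_size must be nonzero. */
uint32_t pages_for(uint32_t bytes, uint32_t page_size)
{
    return bytes ? (bytes - 1) / page_size + 1 : 0;
}

/* Bytes that the root page list would occupy in awRootPages[]. */
uint32_t root_page_list_bytes(uint32_t root_bytes, uint32_t page_size)
{
    return pages_for(root_bytes, page_size) * PAGE_NUMBER_BYTES;
}
```

For 65,536 streams and 1,024-byte pages, these functions yield 524,292 bytes, 513 pages, and a 1,026-byte page list, which no longer fits into a single header page.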

Saving the allocation bits on behalf of the +a option is also quite simple. The number of pages occupied by the bit array is given by the wStartPage member of the PDB_HEADER minus 1, again assuming that the header doesn't exceed the one-page limit.

Extracting the root stream (command option +r) requires a bit more work because the program must first find out the size of the root stream. This calculation is not trivial because the size depends on the number and size of the data streams, and you must take into account that the root stream may span multiple pages that are not necessarily contiguous. Listing 5 shows a possible iterative solution. The PdbRoot() function uses a three-step approximation procedure to find out the exact size in bytes, and copies the data to a contiguous memory block.

Listing 5 Copying the Root Stream

PPDB_ROOT PdbRoot (PPDB_HEADER pph,
                   PDWORD      pdBytes)
  {
  DWORD     dBytes, i, n;
  PPDB_ROOT ppr = NULL;

  // step 1: load the fixed-size part of PDB_ROOT (first root page)
  if ((ppr = PdbRead (pph, PDB_ROOT_, pph->awRootPages)) != NULL)
    {
    // step 2: size of PDB_ROOT including all aStreams[] entries
    dBytes = PDB_ROOT__ ((DWORD) ppr->wCount);
    free (ppr);

    if ((ppr = PdbRead (pph, dBytes, pph->awRootPages)) != NULL)
      {
      // step 3: add one WORD per page of every data stream
      for (n = i = 0; i < (DWORD) ppr->wCount; i++)
        {
        n += PdbPages (pph, ppr->aStreams [i].dStreamSize);
        }
      dBytes += n * sizeof (WORD);
      free (ppr);

      // final read: the complete root stream
      ppr = PdbRead (pph, dBytes, pph->awRootPages);
      }
    }
  *pdBytes = (ppr != NULL ? dBytes : 0);
  }

  • The first approximation is based on the fact that the root stream starts with the fixed-size portion of a PDB_ROOT structure, which always fits into a single page. Therefore, PdbRoot() uses the general-purpose PdbRead() function defined in Listing 6 to load the first root stream page. PdbRead() is the workhorse of the w2k_pdbx.exe utility: it copies pages from the PDB memory image to a contiguous memory block, given a page number array and the number of bytes to copy. It relies on the PdbPages() function at the top of Listing 6, which computes the number of stream pages from the stream size in bytes and the current page size.

  • In step 2, PdbRoot() can compute the size of the PDB_ROOT structure including all aStreams[] entries, but not including the following page number array. Although not very probable, this data might already exceed one page. In 1-KB page mode, this would happen as soon as the stream directory contained 128 or more data streams. However, PdbRead() comes to the rescue, and builds a faithful and contiguous copy in memory.

  • Now that the entire PDB_STREAM array of the PDB_ROOT structure is in memory, it is easy to find out the overall size of the root stream by adding up the number of pages taken up by each data stream, yielding the required size of the page number array following the PDB_ROOT data. Again, PdbRead() is employed to reshuffle all root stream pages into a newly allocated memory block.
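The size calculation performed in these three steps can be condensed into a single portable C function for illustration. This is a sketch with hypothetical names, operating on a plain array of stream sizes instead of a PDB_ROOT structure. The 4-byte fixed part and the 8-byte directory entry are assumptions inferred from the 128-stream threshold mentioned in step 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout sizes: 4 fixed bytes, 8 bytes per stream entry,
   2 bytes (one WORD) per page number. */
enum { ROOT_FIXED_BYTES = 4, STREAM_ENTRY_BYTES = 8, PAGE_NUMBER_BYTES = 2 };

/* Same rounding rule as PdbPages(); page_size must be nonzero. */
uint32_t pdb_pages(uint32_t bytes, uint32_t page_size)
{
    return bytes ? (bytes - 1) / page_size + 1 : 0;
}

/* Total root stream size: fixed part + stream directory + one page
   number for every page of every data stream. */
uint32_t pdb_root_bytes(const uint32_t *stream_sizes, size_t count,
                        uint32_t page_size)
{
    uint32_t bytes = ROOT_FIXED_BYTES + (uint32_t) count * STREAM_ENTRY_BYTES;
    uint32_t pages = 0;

    for (size_t i = 0; i < count; i++)
        pages += pdb_pages(stream_sizes[i], page_size);

    return bytes + pages * PAGE_NUMBER_BYTES;
}
```

Fed with the eight stream sizes listed in Example 2 and a page size of 1,024 bytes, pdb_root_bytes() returns 1,456, which matches the root stream size reported there.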

Listing 6 Joining Stream Pages in a Contiguous Memory Block

DWORD PdbPages (PPDB_HEADER pph,
        DWORD    dBytes)
  {
  return (dBytes ? (((dBytes-1) / pph->dPageSize) + 1) : 0);
  }

// -----------------------------------------------------------------

PVOID PdbRead (PPDB_HEADER pph,
        DWORD    dBytes,
        PWORD    pwPages)
  {
  DWORD i, j;
  DWORD dPages = PdbPages (pph, dBytes);
  PVOID pPages = malloc (dPages * pph->dPageSize);

  if (pPages != NULL)
    {
    for (i = 0; i < dPages; i++)
      {
      j = pwPages [i];

      CopyMemory ((PBYTE) pPages + (i * pph->dPageSize),
            (PBYTE) pph  + (j * pph->dPageSize),
            pph->dPageSize);
      }
    }
  return pPages;
  }
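PdbRead() trusts the page numbers it is given. A more defensive variant could reject out-of-range pages before copying anything. The following portable C sketch illustrates the idea; pdb_gather() and its parameters are hypothetical names, and it uses memcpy() in place of CopyMemory() so that it does not depend on the Win32 headers.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copies the pages listed in pages[] from a PDB memory image into one
   contiguous, newly allocated block that the caller must free().
   Returns NULL on bad arguments, out-of-range pages, or allocation failure. */
uint8_t *pdb_gather(const uint8_t *image, uint32_t image_pages,
                    uint32_t page_size, const uint16_t *pages,
                    uint32_t bytes)
{
    if (image == NULL || pages == NULL || page_size == 0)
        return NULL;

    /* Number of pages needed for `bytes` bytes, rounded up. */
    uint32_t count = bytes ? (bytes - 1) / page_size + 1 : 0;

    /* Validate every page number before allocating anything. */
    for (uint32_t i = 0; i < count; i++) {
        if (pages[i] >= image_pages)
            return NULL;
    }

    /* Allocate at least one byte so an empty stream yields a valid pointer. */
    uint8_t *copy = malloc(count ? (size_t) count * page_size : 1);
    if (copy == NULL)
        return NULL;

    for (uint32_t i = 0; i < count; i++) {
        memcpy(copy + (size_t) i * page_size,
               image + (size_t) pages[i] * page_size,
               page_size);
    }
    return copy;
}
```

Like PdbRead(), this sketch always copies whole pages, so the returned block may be larger than the requested byte count.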

We are now almost done. Saving the data streams is straightforward once the root stream is assembled in memory, and once again the PdbRead() function does the hard work. Listing 7 shows the PdbStream() function, which produces a contiguous memory copy of the data stream identified by the zero-based dStream index. Before calling PdbRead(), the function locates the page number subarray associated with the requested stream by looping through the stream directory and skipping the page numbers of all preceding streams, and passes this pointer to PdbRead() as its third argument. PdbStream() returns the size of the stream via its output parameter pdBytes.

Listing 7 Copying Data Streams

PVOID PdbStream (PPDB_HEADER pph,
         PPDB_ROOT  ppr,
         DWORD    dStream,
         PDWORD   pdBytes)
  {
  DWORD dBytes, i;
  PWORD pwPages;
  PVOID pPages = NULL;

  if (dStream < (DWORD) ppr->wCount)
    {
    pwPages = (PWORD) ((PBYTE) ppr +
              PDB_ROOT__ ((DWORD) ppr->wCount));

    for (i = 0; i < dStream; i++)
      {
      pwPages += PdbPages (pph,
                 ppr->aStreams [i].dStreamSize);
      }
    dBytes = ppr->aStreams [dStream].dStreamSize;
    pPages = PdbRead (pph, dBytes, pwPages);
    }
  *pdBytes = (pPages != NULL ? dBytes : 0);
  return pPages;
  }
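The pointer arithmetic in the loop of Listing 7 amounts to summing the page counts of all preceding streams. The following portable C sketch isolates that calculation; the function names are hypothetical, and it works on a plain array of stream sizes instead of the PDB_ROOT structure.

```c
#include <stdint.h>

/* Same rounding rule as PdbPages(); page_size must be nonzero. */
uint32_t pdb_pages(uint32_t bytes, uint32_t page_size)
{
    return bytes ? (bytes - 1) / page_size + 1 : 0;
}

/* Index, within the page number array that follows the stream directory,
   of the first page number belonging to the given stream.
   The caller must ensure that `stream` is a valid index into stream_sizes[]. */
uint32_t pdb_first_page_index(const uint32_t *stream_sizes,
                              uint32_t stream, uint32_t page_size)
{
    uint32_t index = 0;

    for (uint32_t i = 0; i < stream; i++)
        index += pdb_pages(stream_sizes[i], page_size);

    return index;
}
```

With the stream sizes from Example 2 and 1,024-byte pages, stream 3 starts at index 4 (after 2 + 1 + 1 pages), and the empty stream 4 contributes no page numbers at all, so streams 4 and 5 share the same start index.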

If the w2k_pdbx.exe utility is run without any command arguments, it displays the help screen shown in Example 1. By default, the various output files are written to the current directory. However, you can override this setting by explicitly specifying a target directory. This can be a relative or absolute path, with or without a trailing backslash. In any case, this directory must exist, and the path specification must be prefixed with a forward slash (/).

Example 1: The w2k_pdbx.exe Command Help Screen

D:\tmp>w2k_pdbx

// w2k_pdbx.exe
// SBS Program Database Exploder V1.00
// 07-07-2001 Sven B. Schreiber
// sbs@orgon.com

Usage: w2k_pdbx { [+-hard] [/<target>] <PDB path> }

    +  enable subsequent options
    -  disable subsequent options
    h  extract header
    a  extract allocation bits
    r  extract root stream
    d  extract data streams

Target paths:
    +h <target>\<PDB file>.header
    +a <target>\<PDB file>.alloc
    +r <target>\<PDB file>.root
    +d <target>\<PDB file>.<###>

    <###> = 0-based stream number.

If /<target> is omitted, the files are
written to the current directory.
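The naming scheme for the data stream files can be sketched in portable C as follows. The pdbx_stream_path() function is a hypothetical helper, not part of the utility's published source. The three-digit, zero-padded stream number is an assumption based on the <###> placeholder in the help screen.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Builds "<target>\<PDB file>.<###>" in out[].
   target may be empty (current directory) and may or may not end
   with a backslash. Returns false if the result does not fit. */
bool pdbx_stream_path(char *out, size_t out_size,
                      const char *target, const char *pdb_name,
                      unsigned stream)
{
    size_t len = strlen(target);

    /* Add a separator only if the target is non-empty and does not
       already end with a backslash. */
    const char *sep = (len > 0 && target[len - 1] != '\\') ? "\\" : "";

    int n = snprintf(out, out_size, "%s%s%s.%03u",
                     target, sep, pdb_name, stream);

    return n >= 0 && (size_t) n < out_size;
}
```

The leading slash that marks a target on the command line is assumed to have been stripped before this helper is called.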

Example 2 is another sample run of the w2k_pdbx.exe utility, this time specifying all available options (+h, +a, +r, and +d), as well as the path of the ntoskrnl.exe symbol file on the command line. Before writing the output files, w2k_pdbx.exe displays a summary of PDB file properties extracted from the file header and the root stream.

Example 2: Parsing the ntoskrnl.exe Symbol File

w2k_pdbx +hard e:\winnt\symbols\exe\ntoskrnl.pdb

// w2k_pdbx.exe
// SBS Program Database Exploder V1.00
// 07-07-2001 Sven B. Schreiber
// sbs@orgon.com

Properties of "e:\winnt\symbols\exe\ntoskrnl.pdb":

   67108864 bytes maximum size
    738304 bytes allocated
    706239 bytes used by 8 data streams
     1456 bytes used by the root stream
     1024 bytes per page
     721 pages allocated
     694 pages used by 8 data streams
      2 pages used by the root stream

Saving "ntoskrnl.pdb.header"... 1024 bytes

Saving "ntoskrnl.pdb.alloc"... 8192 bytes

Saving "ntoskrnl.pdb.root"... 1456 bytes

Saving "ntoskrnl.pdb.000"... 1456 bytes

Saving "ntoskrnl.pdb.001"... 58 bytes

Saving "ntoskrnl.pdb.002"... 56 bytes

Saving "ntoskrnl.pdb.003"... 262825 bytes

Saving "ntoskrnl.pdb.004"... 0 bytes

Saving "ntoskrnl.pdb.005"... 16388 bytes

Saving "ntoskrnl.pdb.006"... 106164 bytes

Saving "ntoskrnl.pdb.007"... 319292 bytes
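
Some of the figures in Example 2 can be cross-checked against the rule given earlier that the allocation bits occupy wStartPage minus 1 pages. The portable C sketch below uses hypothetical function names. The wStartPage value of 9 is inferred from the 8,192-byte .alloc file and is not shown in the output, and treating each allocation bit as one page is an assumption that happens to reproduce the reported maximum size.

```c
#include <stdint.h>

/* Size of the allocation bit array in bytes, assuming that the header
   occupies exactly one page and the bits fill pages 1 .. start_page-1. */
uint32_t alloc_bits_bytes(uint16_t start_page, uint32_t page_size)
{
    return start_page > 0 ? (uint32_t) (start_page - 1) * page_size : 0;
}

/* Maximum file size if every allocation bit stands for one page
   (an assumption, not a documented rule). */
uint64_t max_file_bytes(uint32_t alloc_bytes, uint32_t page_size)
{
    return (uint64_t) alloc_bytes * 8u * page_size;
}
```

With a start page of 9 and 1,024-byte pages, alloc_bits_bytes() yields 8,192, and max_file_bytes(8192, 1024) yields 67,108,864, the same values that appear in the summary and in the size of the .alloc file.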