Data Structures

Some of the sample code later in this chapter deals with low-level memory management and peeks inside the mechanisms outlined above. For convenience, I have defined several C data structures that make this task easier. Because many data items inside the i386 CPU are concatenations of single bits or bit groups, C bit-fields come in handy. Bit-fields are an efficient way to access individual bits of larger data words or to extract contiguous bit groups from them. Microsoft Visual C/C++ generates quite clever code for bit-field operations. Listing 4-2 is part one of a series of CPU data type definitions, containing the following items:

  • X86_REGISTER is a basic unsigned 32-bit integral type that can represent various CPU registers. This comprises all general-purpose, index, pointer, control, debug, and test registers.

  • X86_SELECTOR represents a 16-bit segment selector, as stored in the segment registers CS, DS, ES, FS, GS, and SS. In Figures 4-1 and 4-2, selectors are depicted as the upper third of a logical 48-bit address, serving as an index into a descriptor table. For computational convenience, the 16-bit selector value is extended to 32 bits, with the upper half marked "reserved." Note that the X86_SELECTOR structure is a union of two structures. The first one specifies the selector value as a packed 16-bit WORD named wValue, and the second breaks it up into bit-fields. The RPL field specifies the Requested Privilege Level, which is either 0 (kernel-mode) or 3 (user-mode) on Windows 2000. The TI bit switches between the Global and Local Descriptor Tables (GDT/LDT).

  • X86_DESCRIPTOR defines the format of a table entry pointed to by a selector. It is a 64-bit quantity with a very convoluted structure resulting from its historic evolution. The linear base address defining the start location of the associated segment is scattered among three bit-fields named Base1, Base2, and Base3, with Base1 being the least significant part. The segment limit specifying the segment size minus one is divided into the pair Limit1 and Limit2, with the former representing the least significant part. The remaining bit-fields store various segment properties (cf. Intel 1999c, pp. 3-11). For example, the G bit defines the segment granularity. If zero, the segment limit is specified in bytes; otherwise, it is specified in units of 4 KB, so that the 20-bit limit can span the entire 4-GB address space. Like X86_SELECTOR, the X86_DESCRIPTOR structure is composed of a union to allow different interpretations of its value. The dValueLow and dValueHigh members are helpful if you have to copy descriptors without regard to their internal structure.

  • X86_GATE looks somewhat similar to X86_DESCRIPTOR. In fact, the structures are related: X86_DESCRIPTOR is a GDT entry and describes the memory properties of a segment, and X86_GATE is an entry inside the Interrupt Descriptor Table (IDT) and describes the memory properties of an interrupt handler. The IDT can contain task, interrupt, and trap gates. (No, Bill Gates is not stored in the IDT!) The X86_GATE structure matches all three types, with the Type bit-field determining the identity. Type 5 identifies a task gate; types 6 and 14, interrupt gates; and types 7 and 15, trap gates. The most significant type bit specifies the size of the gate: 16-bit gates have this bit set to zero; otherwise it is a 32-bit gate.

  • X86_TABLE is a tricky structure that is used to read the values of the GDTR or IDTR by means of the assembly language instructions SGDT (store GDT register) and SIDT (store IDT register) respectively (cf. Intel 1999b, pp. 3-636). Both instructions require a 48-bit memory operand, where the limit and base address values will be stored. To maintain DWORD alignment for the 32-bit base address, X86_TABLE starts out with the 16-bit dummy member wReserved. Depending on whether the SGDT or SIDT instruction is applied, the base address must be interpreted as a descriptor or gate pointer, as suggested by the union of PX86_DESCRIPTOR and PX86_GATE types. The wLimit member is the same for both table types.
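The selector layout described above can also be checked with plain shifts and masks, which do not depend on how a particular compiler lays out bit-fields. The following is only a minimal sketch: the selector_parts type and the decode_selector() and selector_table_offset() helpers are hypothetical names introduced for illustration and are not part of Listing 4-2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of Listing 4-2. */
typedef struct
    {
    unsigned rpl;   /* requested privilege level, bits 1..0      */
    unsigned ti;    /* table indicator, bit 2: 0 = GDT, 1 = LDT  */
    unsigned index; /* index into descriptor table, bits 15..3   */
    }
    selector_parts;

/* Split a 16-bit selector into the three fields of X86_SELECTOR. */
static selector_parts decode_selector (uint16_t value)
    {
    selector_parts parts;

    parts.rpl   =  value       & 0x0003u;
    parts.ti    = (value >> 2) & 0x0001u;
    parts.index = (value >> 3) & 0x1FFFu;

    assert (parts.index < 8192u); /* a 13-bit index selects one of 8,192 entries */
    return parts;
    }

/* Byte offset of the selected entry inside its descriptor table.
   Each descriptor is 8 bytes, so clearing RPL and TI yields index * 8. */
static uint32_t selector_table_offset (uint16_t value)
    {
    return (uint32_t) (value & 0xFFF8u);
    }
```

For the sample value 0x001B, decode_selector() reports index 3, TI 0 (GDT), and RPL 3, and selector_table_offset() returns 0x18, that is, the fourth 8-byte entry of the table.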

Listing 4-2. i386 Registers, Selectors, Descriptors, Gates, and Tables

// =================================================================
// INTEL X86 STRUCTURES, PART 1 OF 3
// =================================================================

typedef DWORD X86_REGISTER, *PX86_REGISTER, **PPX86_REGISTER;

// -----------------------------------------------------------------

typedef struct _X86_SELECTOR
    {
    union
        {
        struct
            {
            WORD wValue;            // packed value
            WORD wReserved;
            };
        struct
            {
            unsigned RPL      :  2; // requested privilege level
            unsigned TI       :  1; // table indicator: 0=gdt, 1=ldt
            unsigned Index    : 13; // index into descriptor table
            unsigned Reserved : 16;
            };
        };
    }
    X86_SELECTOR, *PX86_SELECTOR, **PPX86_SELECTOR;

#define X86_SELECTOR_ sizeof (X86_SELECTOR)

// -----------------------------------------------------------------

typedef struct _X86_DESCRIPTOR
    {
    union
        {
        struct
            {
            DWORD dValueLow;        // packed value
            DWORD dValueHigh;
            };
        struct
            {
            unsigned Limit1   : 16; // bits 15..00
            unsigned Base1    : 16; // bits 15..00
            unsigned Base2    :  8; // bits 23..16
            unsigned Type     :  4; // segment type
            unsigned S        :  1; // type (0=system, 1=code/data)
            unsigned DPL      :  2; // descriptor privilege level
            unsigned P        :  1; // segment present
            unsigned Limit2   :  4; // bits 19..16
            unsigned AVL      :  1; // available to programmer 
            unsigned Reserved :  1;
            unsigned DB       :  1; // 0=16-bit, 1=32-bit
            unsigned G        :  1; // granularity (1=4KB)
            unsigned Base3    :  8; // bits 31..24
            };
        };
    }
    X86_DESCRIPTOR, *PX86_DESCRIPTOR, **PPX86_DESCRIPTOR;

#define X86_DESCRIPTOR_ sizeof (X86_DESCRIPTOR)

// -----------------------------------------------------------------

typedef struct _X86_GATE
    {
    union
        {
        struct
            {
            DWORD dValueLow;          // packed value
            DWORD dValueHigh;
            };
        struct
            {
            unsigned Offset1    : 16; // bits 15..00
            unsigned Selector   : 16; // segment selector
            unsigned Parameters :  5; // parameters
            unsigned Reserved   :  3;
            unsigned Type       :  4; // gate type and size
            unsigned S          :  1; // always 0
            unsigned DPL        :  2; // descriptor privilege level
            unsigned P          :  1; // segment present
            unsigned Offset2    : 16; // bits 31..16
            };
        };
    }
    X86_GATE, *PX86_GATE, **PPX86_GATE;

#define X86_GATE_ sizeof (X86_GATE)

// -----------------------------------------------------------------

typedef struct _X86_TABLE
    {
    WORD wReserved;                   // force 32-bit alignment
    WORD wLimit;                      // table limit
    union
        {
        PX86_DESCRIPTOR pDescriptors; // used by sgdt instruction
        PX86_GATE       pGates;       // used by sidt instruction
        };
    }
    X86_TABLE, *PX86_TABLE, **PPX86_TABLE;

#define X86_TABLE_ sizeof (X86_TABLE)

// =================================================================
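To illustrate how the scattered descriptor and gate fields fit together, here is a small sketch that reassembles them from the two packed 32-bit halves (the counterparts of dValueLow and dValueHigh). It uses shifts and masks instead of the bit-fields, and all function and type names are hypothetical helpers, not part of the original sample code. The effective limit for G = 1 follows the usual i386 rule that the 20-bit limit counts 4-KB units and the low 12 bits are treated as ones.

```c
#include <assert.h>
#include <stdint.h>

/* Segment base: Base1 (low 31..16), Base2 (high 7..0), Base3 (high 31..24). */
static uint32_t descriptor_base (uint32_t low, uint32_t high)
    {
    uint32_t base1 = low  >> 16;
    uint32_t base2 = high &  0xFFu;
    uint32_t base3 = high >> 24;

    return base1 | (base2 << 16) | (base3 << 24);
    }

/* Effective segment limit, i.e. the last valid byte offset.
   Limit1 is low 15..0; Limit2 already sits at bits 19..16 of high. */
static uint32_t descriptor_limit (uint32_t low, uint32_t high)
    {
    uint32_t raw = (low & 0x0000FFFFu) | (high & 0x000F0000u);
    unsigned g   = (high >> 23) & 1u; /* granularity bit */

    assert (raw <= 0xFFFFFu);
    return (g ? (raw << 12) | 0xFFFu : raw);
    }

/* Handler offset of a gate: Offset1 (low 15..0), Offset2 (high 31..16). */
static uint32_t gate_offset (uint32_t low, uint32_t high)
    {
    return (low & 0x0000FFFFu) | (high & 0xFFFF0000u);
    }

typedef enum { GATE_OTHER, GATE_TASK, GATE_INTERRUPT, GATE_TRAP } gate_kind;

/* Classify a gate by its 4-bit Type field (high 11..8). */
static gate_kind gate_classify (uint32_t high)
    {
    switch ((high >> 8) & 0xFu)
        {
        case  5:          return GATE_TASK;
        case  6: case 14: return GATE_INTERRUPT;
        case  7: case 15: return GATE_TRAP;
        default:          return GATE_OTHER;
        }
    }

/* The most significant Type bit distinguishes 32-bit from 16-bit gates. */
static int gate_is_32bit (uint32_t high)
    {
    return (int) ((high >> 11) & 1u);
    }
```

As a usage example, the packed pair 0x0000FFFF / 0x00CF9B00 (a typical flat code segment encoding, chosen here only as sample data) yields base 0x00000000 and an effective limit of 0xFFFFFFFF, and a gate with the halves 0x00081234 / 0x80468E00 is classified as a 32-bit interrupt gate with handler offset 0x80461234.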

The next set of i386 memory management structures, collected in Listing 4-3, relates to demand paging and contains several items illustrated in Figures 4-3 and 4-4:

  • X86_PDBR is, of course, a structural representation of the CPU's CR3 register, also known as the page-directory base register (PDBR). The upper 20 bits contain the page-frame number (PFN), which is an index into the array of physical 4-KB pages. PFN=0 corresponds to physical address 0x00000000, PFN=1 to 0x00001000, and so forth. Twenty bits are just enough to cover the entire 4-GB address space, because 2^20 pages of 4 KB each add up to 4 GB. The PFN in the PDBR is the index of the physical page that holds the page-directory. Most of the remaining bits are reserved, except for bit #3, controlling page-level write-through (PWT), and bit #4 (PCD), disabling page-level caching if set.

  • X86_PDE_4M and X86_PDE_4K are alternative incarnations of page-directory entries (PDEs) for 4-MB and 4-KB pages, respectively. A page-directory contains a maximum of 1,024 PDEs. Again, PFN is the page-frame number, pointing to the subordinate page. For a 4-MB PDE, the PFN bit-field is only 10 bits wide, addressing a 4-MB data page. The 20-bit PFN of a 4-KB PDE points to a page-table that ultimately selects the physical data pages. The remaining bits define various properties. The most interesting ones are the "Page Size" bit PS, controlling the page size (0 = 4-KB, 1 = 4-MB), and the "Present" bit P, indicating whether the subordinate data page (4-MB mode) or page-table (4-KB mode) is present in physical memory.

  • X86_PTE_4K defines the internal structure of a page-table entry (PTE) contained in a page-table. Like a page-directory, a page-table can contain up to 1,024 entries. X86_PTE_4K differs from X86_PDE_4K in only two bit positions: It has a dirty bit D where the PDE has a reserved bit, and it lacks the PS bit, which is not required because the page size must be 4-KB, as determined by the PDE's PS bit. Note that there is no such thing as a 4-MB PTE, because the 4-MB memory model doesn't require an intermediate page-table layer.

  • X86_PNPE represents a "page-not-present entry" (PNPE), that is, a PDE or PTE in which the P bit is zero. According to the Intel manuals, the remaining 31 bits are "available to operating system or executive" (Intel 1999c, pp. 3-28). If a linear address maps to a PNPE, this means either that this address is unused or that it points to a page that is currently swapped out to one of the pagefiles. Windows 2000 uses the 31 unassigned bits of the PNPE to store status information of the page. The structure of this information is undocumented, but it seems that bit #10, named PageFile in Listing 4-3, is set if the page is swapped out. In this case, the Reserved1 and Reserved2 bit-fields contain values that enable the system to locate the page in the pagefiles, so it can be swapped in as soon as one of its linear addresses is touched by a memory read/write instruction.

  • X86_PE is included for convenience. It is merely a union of all possible forms a page entry can take, comprising the PDBR contents, 4-MB and 4-KB PDEs, PTEs, and PNPEs.
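The rules in the list above can be condensed into a few mask operations on a raw 32-bit entry. The following sketch is an illustration only: all names are hypothetical, it assumes that 4-MB pages are enabled so that the PS bit is meaningful, and the pagefile test merely encodes the undocumented observation about bit #10 made above.

```c
#include <assert.h>
#include <stdint.h>

/* Physical address of the 4-KB page with the given page-frame number. */
static uint32_t pfn_to_physical_4k (uint32_t pfn)
    {
    assert (pfn <= 0xFFFFFu); /* the PFN is a 20-bit quantity */
    return pfn << 12;
    }

typedef enum { PDE_NOT_PRESENT, PDE_PAGE_TABLE, PDE_LARGE_PAGE } pde_kind;

/* Decide which of the PDE interpretations applies to a raw entry. */
static pde_kind classify_pde (uint32_t pde)
    {
    if (!(pde & 0x001u)) return PDE_NOT_PRESENT;           /* P = 0: PNPE */
    return (pde & 0x080u) ? PDE_LARGE_PAGE : PDE_PAGE_TABLE; /* PS, bit 7  */
    }

/* Physical address a present PDE points to: a 4-MB data page
   (10-bit PFN) or a 4-KB page-table (20-bit PFN). */
static uint32_t pde_target_physical (uint32_t pde)
    {
    assert (pde & 0x001u); /* only meaningful for present entries */
    return (pde & 0x080u) ? (pde & 0xFFC00000u) : (pde & 0xFFFFF000u);
    }

/* Non-present entry with bit #10 set; per the text above, this seems
   to indicate a page swapped out to a pagefile (undocumented). */
static int pnpe_in_pagefile (uint32_t entry)
    {
    return !(entry & 0x001u) && ((entry >> 10) & 1u);
    }
```

For instance, classify_pde (0x004001E3) reports a 4-MB page located at physical address 0x00400000, whereas classify_pde (0x00030063) reports a page-table residing at 0x00030000. Both values are made-up sample data.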

Listing 4-3. i386 PDBR, PDE, PTE, and PNPE Values

// =================================================================
// INTEL X86 STRUCTURES, PART 2 OF 3
// =================================================================

typedef struct _X86_PDBR // page-directory base register (cr3)
    {
    union
        {
        struct
            {
            DWORD dValue;            // packed value
            };
        struct
            {
            unsigned Reserved1 :  3;
            unsigned PWT       :  1; // page-level write-through
            unsigned PCD       :  1; // page-level cache disabled
            unsigned Reserved2 :  7;
            unsigned PFN       : 20; // page-frame number
            };
        };
    }
    X86_PDBR, *PX86_PDBR, **PPX86_PDBR;

#define X86_PDBR_ sizeof (X86_PDBR)

// -----------------------------------------------------------------

typedef struct _X86_PDE_4M // page-directory entry (4-MB page)
    {
    union
        {
        struct
            {
            DWORD dValue;            // packed value
            };
        struct
            {
            unsigned P         :  1; // present (1 = present)
            unsigned RW        :  1; // read/write
            unsigned US        :  1; // user/supervisor
            unsigned PWT       :  1; // page-level write-through
            unsigned PCD       :  1; // page-level cache disabled
            unsigned A         :  1; // accessed
            unsigned D         :  1; // dirty
            unsigned PS        :  1; // page size (1 = 4-MB page)
            unsigned G         :  1; // global page
            unsigned Available :  3; // available to programmer
            unsigned Reserved  : 10;
            unsigned PFN       : 10; // page-frame number
            };
        };
    }
    X86_PDE_4M, *PX86_PDE_4M, **PPX86_PDE_4M;

#define X86_PDE_4M_ sizeof (X86_PDE_4M)

// -----------------------------------------------------------------

typedef struct _X86_PDE_4K // page-directory entry (4-KB page)
    {
    union
        {
        struct
            {
            DWORD dValue;            // packed value
            };
        struct
            {
            unsigned P         :  1; // present (1 = present)
            unsigned RW        :  1; // read/write
            unsigned US        :  1; // user/supervisor
            unsigned PWT       :  1; // page-level write-through
            unsigned PCD       :  1; // page-level cache disabled
            unsigned A         :  1; // accessed
            unsigned Reserved  :  1; // reserved (dirty bit in PTEs and 4-MB PDEs)
            unsigned PS        :  1; // page size (0 = 4-KB page)
            unsigned G         :  1; // global page
            unsigned Available :  3; // available to programmer
            unsigned PFN       : 20; // page-frame number
            };
        };
    }
    X86_PDE_4K, *PX86_PDE_4K, **PPX86_PDE_4K;

#define X86_PDE_4K_ sizeof (X86_PDE_4K)

// -----------------------------------------------------------------

typedef struct _X86_PTE_4K // page-table entry (4-KB page)
    {
    union
        {
        struct
            {
            DWORD dValue;            // packed value
            };
        struct
            {
            unsigned P         :  1; // present (1 = present)
            unsigned RW        :  1; // read/write
            unsigned US        :  1; // user/supervisor
            unsigned PWT       :  1; // page-level write-through
            unsigned PCD       :  1; // page-level cache disabled
            unsigned A         :  1; // accessed
            unsigned D         :  1; // dirty
            unsigned Reserved  :  1;
            unsigned G         :  1; // global page
            unsigned Available :  3; // available to programmer
            unsigned PFN       : 20; // page-frame number
            };
        };
    }
    X86_PTE_4K, *PX86_PTE_4K, **PPX86_PTE_4K;

#define X86_PTE_4K_ sizeof (X86_PTE_4K)

// -----------------------------------------------------------------

typedef struct _X86_PNPE // page not present entry
    {
    union
        {
        struct
            {
            DWORD dValue;            // packed value
            };
        struct
            {
            unsigned P         :  1; // present (0 = not present)
            unsigned Reserved1 :  9;
            unsigned PageFile  :  1; // page swapped to pagefile
            unsigned Reserved2 : 21;
            };
        };
    }
    X86_PNPE, *PX86_PNPE, **PPX86_PNPE;

#define X86_PNPE_ sizeof (X86_PNPE)

// -----------------------------------------------------------------

typedef struct _X86_PE // general page entry
    {
    union
        {
        DWORD      dValue; // packed value
        X86_PDBR   pdbr;   // page-directory Base Register
        X86_PDE_4M pde4M;  // page-directory entry (4-MB page)
        X86_PDE_4K pde4K;  // page-directory entry (4-KB page)
        X86_PTE_4K pte4K;  // page-table entry (4-KB page)
        X86_PNPE   pnpe;   // page not present entry
        };
    }
    X86_PE, *PX86_PE, **PPX86_PE;

#define X86_PE_ sizeof (X86_PE)

// =================================================================
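The point of pairing a packed dValue with a bit-field view is that both refer to the same 32 bits. The sketch below demonstrates this with a trimmed-down, hypothetical counterpart of X86_PTE_4K named pte_view. It relies on an assumption: the C standard leaves the bit-field allocation order to the implementation, and the code expects the least significant bit to be allocated first, as Microsoft Visual C/C++ and GCC do on x86. The shift-based function serves as a layout-independent cross-check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for X86_PTE_4K. The bit-field struct is
   named ("bits") so the code stays within standard C. */
typedef union
    {
    uint32_t dValue;             /* packed value */
    struct
        {
        unsigned P         :  1; /* present */
        unsigned RW        :  1; /* read/write */
        unsigned US        :  1; /* user/supervisor */
        unsigned PWT       :  1; /* page-level write-through */
        unsigned PCD       :  1; /* page-level cache disabled */
        unsigned A         :  1; /* accessed */
        unsigned D         :  1; /* dirty */
        unsigned Reserved  :  1;
        unsigned G         :  1; /* global page */
        unsigned Available :  3; /* available to programmer */
        unsigned PFN       : 20; /* page-frame number */
        } bits;
    }
    pte_view;

/* Read the PFN through the bit-field view (layout is compiler-dependent). */
static uint32_t pte_pfn_from_bitfield (uint32_t raw)
    {
    pte_view view;

    assert (sizeof view == sizeof raw); /* the union must stay 32 bits wide */
    view.dValue = raw;
    return view.bits.PFN;
    }

/* Read the PFN with a plain shift (independent of bit-field layout). */
static uint32_t pte_pfn_from_mask (uint32_t raw)
    {
    return raw >> 12;
    }

/* Read the present bit through the bit-field view. */
static int pte_present_from_bitfield (uint32_t raw)
    {
    pte_view view;

    view.dValue = raw;
    return (int) view.bits.P;
    }
```

On a compiler that matches the stated assumption, both PFN functions return 0x12345 for the sample value 0x12345067. If they ever disagree, the bit-field declarations have to be reordered for that compiler, or replaced by masks.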

In Listing 4-4, I have added structural representations of linear addresses. These structures are formal definitions of the "Linear Address" boxes in Figures 4-3 and 4-4:

  • X86_LINEAR_4M is the format of linear addresses that point into a 4-MB data page, as shown in Figure 4-4. The page-directory index PDI is an index into the page-directory currently addressed by the PDBR, selecting one of its PDEs. The 22-bit Offset member points to the target address within the corresponding 4-MB physical page.

  • X86_LINEAR_4K is the 4-KB variant of a linear address. As outlined in Figure 4-3, it is composed of three bit-fields: As in a 4-MB address, the upper 10 PDI bits select a PDE. The page-table index PTI has a similar duty, pointing to a PTE inside the page-table addressed by this PDE. The remaining 12 bits are the offset into the resulting 4-KB physical page.

  • X86_LINEAR is another convenience structure that simply unites X86_LINEAR_4M and X86_LINEAR_4K in a single data type.
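The two address formats described above boil down to simple shift and mask arithmetic. The following sketch splits a linear address both ways. It is an illustration under my own naming, with hypothetical types and functions that are not the structures of Listing 4-4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result types for illustration. */
typedef struct { uint32_t pdi, pti, offset; } linear_4k_parts;
typedef struct { uint32_t pdi, offset;      } linear_4m_parts;

/* 4-KB format: 10-bit PDI, 10-bit PTI, 12-bit offset. */
static linear_4k_parts split_linear_4k (uint32_t linear)
    {
    linear_4k_parts parts;

    parts.pdi    =  linear >> 22;
    parts.pti    = (linear >> 12) & 0x3FFu;
    parts.offset =  linear        & 0xFFFu;
    return parts;
    }

/* 4-MB format: 10-bit PDI, 22-bit offset. */
static linear_4m_parts split_linear_4m (uint32_t linear)
    {
    linear_4m_parts parts;

    parts.pdi    = linear >> 22;
    parts.offset = linear & 0x3FFFFFu;
    return parts;
    }

/* Inverse of split_linear_4k(): rebuild the linear address. */
static uint32_t join_linear_4k (linear_4k_parts parts)
    {
    assert (parts.pdi <= 0x3FFu && parts.pti <= 0x3FFu && parts.offset <= 0xFFFu);
    return (parts.pdi << 22) | (parts.pti << 12) | parts.offset;
    }
```

For the sample address 0x80401234, the 4-KB interpretation yields PDI 0x201, PTI 0x001, and offset 0x234, while the 4-MB interpretation yields the same PDI with the 22-bit offset 0x001234.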
