Home > Articles > Data > DB2

  • Print
  • + Share This
This chapter is from the book

2.6 Database Partitioning Feature

In this section we introduce you to the database partitioning feature (DPF) available on DB2 UDB Enterprise Server Edition (ESE). DPF lets you partition your database across multiple servers or within a large SMP server. This allows for scalability, since you can add new machines and spread your database across them. That means more CPUs, more memory, and more disks from each of the additional machines for your database!

DB2 UDB ESE with DPF is ideal to manage large databases, whether you are doing data warehousing, data mining, online analytical processing (OLAP), or working with online transaction processing (OLTP) workloads. You do not have to install any new code to enable this feature, but you must purchase the license before enabling the database partitioning feature. Users connect to the database and issue queries as usual without the need to know the database is spread among several partitions.

Up to this point, we have been discussing a single partition environment, and all of those concepts apply to a multipartition environment as well. We will now point out some implementation differences and will introduce a few new concepts, including database partitions, partition groups, and the coordinator partition, that are relevant only to a multipartition environment.

2.6.1 Database Partitions

A database partition is an independent part of a partitioned database with its own data, configuration files, indexes, and transaction logs. You can assign multiple partitions across several physical servers or to a single physical server. In the latter case, the partitions are called logical partitions and they can share the machine's resources.

A single-partition database is a database with only one partition. We described the DB2 environment for this type of database in section 2.3, The DB2 Environment. A multipartition database (also referred to as a partitioned database) is a database with two or more database partitions. Depending on your hardware environment, there are several topologies for database partitioning. Figure 2.8 shows configurations of physical partitions, one partition per machine. The illustration at the top of the figure shows an SMP machine with one partition (single-partition environment). This means the entire database resides on this one machine. The illustration at the bottom shows two SMP machines, one partition per machine (multipartition environment). This means the database is split between the two partitions.

02fig08.gif

Figure 2.8 Database partition configurations with one partition per machine

Figure 2.9 shows multipartition configurations with multiple partitions per machine. Unlike Figure 2.8 where there was only one partition per machine, this figure illustrates two (or more) partitions per machine.

02fig09.gif

Figure 2.9 Database partition configurations with multiple partitions per machine

To visualize how a DB2 environment is split in a DPF system, Figure 2.10 illustrates a partial reproduction of Figure 2.3, and shows it split into three physical partitions, one partition per server. (We have changed the machine in the original Figure 2.3 to use the Linux operating system instead of the Windows operating system.)

02fig10.jpg

Figure 2.10 The DB2 environment in DB2 UDB ESE with DPF

In Figure 2.10, the DB2 environment is "split" so that it now resides on three servers running the same operating system (Linux, in this example). The partitions are also running the same DB2 version, but it is important to note that different FixPak levels are allowed. This figure shows where files and objects would be located on a new installation of a multipartition system.

It is also important to note that all of the machines participating in a DPF environment have to be interconnected by a high-speed communication facility that supports the TCP/IP protocol. TCP/IP ports are reserved on each machine for this "interpartition" communication. For example, by default after installation, the services file on Linux (/etc/services) is updated as follows (assuming you chose to create the db2inst1 instance):


   DB2_db2inst1         60000/tcp

   DB2_db2inst1_1       60001/tcp

   DB2_db2inst1_2       60002/tcp

   DB2_db2inst1_END     60003/tcp

   db2c_db2inst1        50000/tcp

This also depends on the number of partitions on the server. By default, ports 60000 through 60003 are reserved for interpartition communication. You can update the services file with the correct number of entries to support the number of partitions you are configuring.

When the partitions reside on the same machine, communication between the partitions still requires this setup. You force interpartition communication to be performed in memory by setting the DB2 registry variable DB2_FORCE_FCM_BP to YES.

For a DB2 client to connect to a DPF system, you issue catalog commands at the client to populate the system and node directories. In the example, the port number to use in these commands is 50000 to connect to the db2inst1 instance, and the host name can be any of the servers participating in the DPF environment. The server used in the catalog command becomes the coordinator, unless the DBPARTITIONNUM option of the connect statement is used. The concept of coordinator is described later in this section. Chapter 6, Configuring Client and Server Connectivity, discusses the catalog command in detail.

2.6.2 The Node Configuration File

The node configuration file (db2nodes.cfg) contains information about the database partitions and the servers on which they reside that belong to an instance. Figure 2.11 shows an example of the db2nodes.cfg file for a cluster of four UNIX servers with two partitions on each server.

02fig11.gif

Figure 2.11 An example of the db2nodes.cfg file

In Figure 2.11, the partition number, the first column in the db2nodes.cfg file, indicates the number that identifies the database partition within DB2. You can see that there are eight partitions in total. The numbering of the partitions must be in ascending order, can start from any number, and gaps between the numbers are allowed. The numbering used is important as it will be taken into consideration in commands or SQL statements.

The second column is the hostname or TCP/IP address of the server where the partition is created.

The third column, the logical port, is required when you create more than one partition on the same server. This column specifies the logical port for the partition within the server and must be unique within a server. In Figure 2.11, you can see the mapping between the db2nodes.cfg entries for partitions 2 and 3 for server myserverb and the physical machine implementation. The logical ports must also be in the same order as in the db2nodes.cfg file.

The fourth column in the db2nodes.cfg file, the netname, is required if you are using a high-speed interconnect for interpartition communication or if the resourcesetname column is used.

The fifth column in the db2nodes.cfg file, the resourcesetname, is optional. It specifies the operating system resource that the partition should be started in.

On Windows, the db2nodes.cfg file uses the computer name column instead of the resourcesetname column. The computer name column stores the computer name for the machine on which a partition resides. Also, the order of the columns is slightly different: partition number, hostname, computer name, logical port, netname, and resourcesetname.

The db2nodes.cfg file must be located

  • Under the SQLLIB directory for the instance owner on Linux and UNIX
  • Under the SQLLIB\instance_name directory on Windows

In Figure 2.10 this file would be on the Linux3 machine, as this machine is the Network File System (NFS) source server, the server whose disk(s) can be shared.

On Linux and UNIX you can edit the db2nodes.cfg file with any ASCII editor or use DB2 commands to update the file. On Windows, you can only use the db2ncrt and db2ndrop commands to create and drop database partitions; the db2nodes.cfg file should not be edited directly.

For any platform, you can also use the db2start command to add and or remove a database partition from the DB2 instance and update the db2nodes.cfg file using the add dbpartitionnum and the drop dbpartitionnum clauses respectively.

2.6.3 An Instance in the DPF Environment

Partitioning is a concept that applies to the database, not the instance; you partition a database, not an instance. In a DPF environment an instance is created once on an NFS source server machine. The instance owner's home directory is then exported to all servers where DB2 is to be run. Each partition in the database has the same characteristics: the same instance owner, password, and shared instance home directory.

On Linux and UNIX, an instance maps to an operating system user; therefore, when an instance is created, it will have its own home directory. In most installations /home/user_name is the home directory. All instances created on each of the participating machines in a DPF environment must use the same name and password. In addition, you must specify the home directory of the corresponding operating system user to be the same directory for all instances, which must be created on a shared file system. Figure 2.12 illustrates an example of this.

02fig12.gif

Figure 2.12 An instance in a partitioned environment

In Figure 2.12, the instance myinst has been created on the shared file system, and myinst maps to an operating system user of the same name, which in the figure has a home directory of /home/myinst. This user must be created separately in each of the participating servers, but they must share the instance home directory. As shown in Figure 2.12, all three Linux servers share /home/myinst, and it resides on a shared file system local to Linux3. Since the instance owner directory is locally stored on the Linux3 machine, this machine is considered to be the DB2 instance-owning server.

Figure 2.12 also shows that the Database Administration Server user db2as is created locally on each participating server in a DPF environment. There can only be one DAS per physical server regardless of the number of partitions that machine contains. The DAS user's home directory cannot be mounted on a shared file system. Alternatively, different userids and passwords can be used to create the DAS on different machines.

2.6.4 Partitioning a Database

When you want to partition a database in a DPF environment, simply issue the CREATE DATABASE command as usual. For example, if the instance owner home directory is /home/myinst, when you execute this command:


   CREATE DATABASE mydb2

the structure created is as shown in Figure 2.13.

Example 2.13. A partitioned database in a single file system

/home
  /myinst
   /NODE0000
     /SQL00001
   /NODE0001
     /SQL00001
   /NODE0002
     /SQL00001

If you don't specify a path in your CREATE DATABASE command, by default the database is created in the directory specified by the database manager configuration parameter DFTDBPATH, which defaults to the instance owner's home directory. This partitioning is not optimal because all of the database data would reside in one file system that is shared by the other machines across a network.

We recommend that you create a directory with the same name, locally in each of the participating machines. For the environment in Figure 2.12, let's assume the directory /data has been created locally on each machine. When you execute the command:


   CREATE DATABASE mydb2 on /data

the following directory structure is automatically built for you:

/data/instance_name/NODExxxx/SQLyyyyy

The /data directory is specified in the CREATE DATABASE command, but the directory must exist before executing the command. instance_name is the name of the instance; for example, myinst. NODExxxx distinguishes which partition you are working with, where xxxx represents the number of the partition specified in the db2nodes.cfg file. SQLyyyyy identifies the database, where yyyyy represents a number. If you have only one database on your system, then yyyyy is equal to 00001; if you have three databases on your system, you will have different directories as follows: SQL00001, SQL00002, SQL00003. To map the database names to these directories, you can review the local database directory using the command:


   list db directory on /data

Inside the SQLyyyyy directories are subdirectories for table spaces, and within them, files containing database data&8212;assuming all table spaces are defined as system-managed space (SMS).

Figure 2.14 illustrates a partitioned database created in the /data directory

02fig13.gif

Figure 2.14 A partitioned database across several file systems

Partitioning a database is described in more detail in Chapter 8, The DB2 Storage Model.

2.6.5 Configuration Files in a DPF Environment

As shown in Figure 2.10, the Database Manager Configuration file (dbm cfg), system database directory, node directory, and DCS directory are all part of the instance-owning machine and are not partitioned. What about the other configuration files?

  • Environment variables: Each participating server in a partitioned environment can have different environment variables.
  • Global-level profile registry variable: This is stored in a file called default.env that is located in a subdirectory under the /var directory. There is a local copy of this file on each server.
  • Database configuration file: This is stored in the file SQLDBCON that is located in the SQLyyyyy directory for the database. In a partitioned database environment, a separate SQLDBCON file is created for each partition in every database.
  • The local database directory: This is stored in the file SQLDBDIR in the corresponding directory for the database. It has the same name as the system database directory, which is located under the instance directory. A separate SQLDBDIR file exists for each partition in each database.

2.6.6 Logs in a DPF Environment

The logs on each database partition should be kept in a separate place. The database configuration parameter Path to log files (LOGPATH) on each partition should point to a local file system, not a shared file system. The default log path in each partition includes a NODE000x subdirectory. For example, the value of this parameter in the DPF system shown in Figure 2.10 could be:

  • For Partition 0: /datalogs/db2inst1/NODE0000/SQL00001/SQLOGDIR/
  • For Partition 1: /datalogs/db2inst1/NODE0001/SQL00001/SQLOGDIR/
  • For Partition 2: /datalogs/db2inst1/NODE0002/SQL00001/SQLOGDIR/

To change the path for the logs, update the database configuration parameter NEWLOGPATH.

2.6.7 The Catalog Partition

As stated previously, when you create a database, several table spaces are created by default. One of them, the catalog table space SYSCATSPACE, contains the DB2 system catalogs. In a partitioned environment SYSCATSPACE is not partitioned, but resides on one partition known as the catalog partition. The partition from which the CREATE DATABASE command is issued becomes the catalog partition for the new database. All access to system tables must go through this database partition. Figure 2.10 shows SYSCATSPACE residing on server Linux1, so the CREATE DATABASE command was issued from this server.

For an existing database, you can determine which partition is the catalog partition by issuing the command list db directory . The output of this command has the field Catalog database partition number for each of the entries, which indicates the catalog partition number for that database.

2.6.8 Partition Groups

A partition group is a logical layer that provides for the grouping of one or more database partitions. A database partition can belong to more than one partition group. When a database is created, DB2 creates three default partition groups, and these partition groups cannot be dropped.

  • IBMDEFAULTGROUP: This is the default partition group for any table you create. It contains all database partitions defined in the db2nodes.cfg file. This partition group cannot be modified. Table space USERSPACE1 is created in this partition group.
  • IBMTEMPGROUP: This partition group is used by all system temporary tables. It contains all database partitions defined in the db2nodes.cfg file. Table space TEMPSPACE1 is created in this partition.
  • IBMCATGROUP: This partition group contains the catalog tables (table space SYSCATSPACE). It only includes the database's catalog partition. This partition group cannot be modified.

To create new database partition groups, use the CREATE DATABASE PARTITION GROUP statement. This statement creates the database partition group within the database, assigns database partitions that you specified to the partition group, and records the partition group definition in the database system catalog tables.

The following statement creates partition group pgrpall on all partitions specified in the db2nodes.cfg file:


   CREATE DATABASE PARTITION GROUP pgrpall ON ALL DBPARTITIONNUMS

To create a database partition group pg23 consisting of partitions 2 and 3, issue this command:


   CREATE DATABASE PARTITION GROUP pg23 ON DBPARTITIONNUMS (2,3)

Other relevant partition group statements/commands are:

  • ALTER DATABASE PARTITION GROUP (statement to add or drop a partition in the group)
  • DROP DATABASE PARTITION GROUP (statement to drop a partition group)
  • LIST DATABASE PARTITION GROUPS (command to list all your partition groups; note that IBMTEMPGROUP is never listed)

2.6.9 Buffer Pools in a DPF Environment

Figure 2.10 shows buffer pools defined across all of the database partitions. Interpreting this figure for buffer pools is different than for the other objects, because the data cached in the buffer pools is not partitioned as the figure implies. Each buffer pool in a DPF environment holds data only from the database partition where the buffer pool is located.

You can create a buffer pool in a partition group using the CREATE BUFFERPOOL statement with the DATABASE PARTITION GROUP clause. This means that you have the flexibility to define the buffer pool on the specific partitions defined in the partition group. In addition, the size of the buffer pool on each partition in the partition group can be different. The following statement will create buffer pool bpool_1 in partition group pg234, which consists of partitions 2, 3, and 4.


   CREATE BUFFERPOOL bpool_1 DATABASE PARTITION GROUP pg234
       
   SIZE 10000
       
   EXCEPT ON DBPARTITIONNUM (3 TO 4) SIZE 5000

Partition 2 in partition group pg234 will have a buffer pool bpool_1 defined with a size of 10,000 pages, and Partitions 3 and 4 will have a buffer pool of size 5,000 pages.

As an analogy, think of it as if you were issuing the CREATE BUFFERPOOL statement on each partition separately, with the same buffer pool name for each partition but with different sizes. That is:

  • On partition 2: CREATE BUFFERPOOL bpool_1 SIZE 10000
  • On partition 3: CREATE BUFFERPOOL bpool_1 SIZE 5000
  • On partition 4: CREATE BUFFERPOOL bpool_1 SIZE 5000

Note that we use these statements only to clarify the analogy; they will not work as written. Executing each of these commands as shown will attempt to create the same buffer pool on all partitions. It is not equivalent to using the DATABASE PARTITION GROUP clause of the CREATE BUFFERPOOL statement.

Buffer pools can also be associated to several partition groups. This means that the buffer pool definition will be applied to the partitions in those partition groups.

2.6.10 Table Spaces in a Partitioned Database Environment

You can create a table space in specific partitions, associating it to a partition group, by using the CREATE TABLESPACE statement with the IN DATABASE PARTITION GROUP clause. This allows users to have flexibility as to which partitions will actually be storing their tables. In a partitioned database environment with three servers, one partition per server, the statement:


   CREATE TABLESPACE mytbls IN DATABASE PARTITION GROUP pg234
       
   MANAGED BY SYSTEM USING ('/data') 
       
   BUFFERPOOL bpool_1

creates the table space mytbls, which spans partitions 2, 3, and 4 (assuming pg234 is a partition group consisting of these partitions). In addition, the table space is associated with buffer pool bpool_1 defined earlier. Note that creating a table space would fail if you provide conflicting partition information between the table space and the associated buffer pool. For example, if bpool_1 was created for partitions 5 and 6, and table space mytbls was created for partitions 2, 3, and 4, you would get an error message when trying to create this table space.

2.6.11 The Coordinator Partition

In general, each database connection has a corresponding DB2 agent handling the application connection. An agent can be thought of as a process (Linux/UNIX) or thread (Windows) that performs DB2 work on behalf of the application. There are different types of agents. One of them, the coordinator agent, communicates with the application, receiving requests and sending replies. It can either satisfy the request itself or delegate the work to multiple subagents to work on the request.

The coordinator partition of a given application is the partition where the coordinator agent exists. You use the SET CLIENT CONNECT_NODE command to set the partition that is to be the coordinator partition. Any partition can potentially be a coordinator, so in Figure 2.10 we do not label any particular partition as the coordinator node. If you would like to know more about DB2 agents and the DB2 process model, refer to Chapter 14, The DB2 Process Model.

2.6.12 Issuing Commands and SQL Statements in a DPF Environment

Imagine that you have twenty physical servers, with two database partitions on each. Issuing individual commands to each physical server or partition would be quite a task. Fortunately, DB2 provides a command that executes on all database partitions.

2.6.12.1 The db2_all command

Use the db2_all command when you want to execute a command or SQL statement against all database partitions. For example, to change the db cfg parameter LOGFILSIZ for the database sample in all partitions, you would use:


   db2_all ";db2 UPDATE DB CFG FOR sample USING LOGFILSIZ 500"

When the semicolon (;) character is placed before the command or statement, the request runs in parallel on all partitions.

2.6.12.2 Using Database Partition Expressions

In a partitioned database, database partition expressions can be used to generate values based on the partition number found in the db2nodes.cfg file. This is particularly useful when you have a large number of database partitions and when more than one database partition resides on the same physical machine, because the same device or path cannot be specified for all partitions. You can manually specify a unique container for each database partition or use database partition expressions. The following example illustrates the use of database partition expressions.

On Linux/UNIX, here are sample contents of a db2nodes.cfg file:

0    myservera      0
1    myservera      1
2    myserverb      0
3    myserverb      1

This shows two servers with two database partitions each. The command:


   CREATE TABLESPACE ts2
  
   MANAGED BY DATABASE USING
  
   (file '/data/TS2/container $N+100' 5000)

creates the following containers:

  • /data/TS2/container100 on database partition 0
  • /data/TS2/container101 on database partition 1
  • /data/TS2/container102 on database partition 2
  • /data/TS2/container103 on database partition 3

You specify a database partition expression with the argument $N (note that there must be a space before $N in the command). Table 2.1 shows other arguments for creating containers. Operators are evaluated from left to right, and % represents the modulus (the remainder of a division). Assuming the partition number to be evaluated is 3, the value column in Table 2.1 shows the result of resolving the database partition expression.

Table 2.1. Database Partition Expressions

Database Partition Expressions

Example

Value

[blank]$N

$N

3

[blank]$N+[number]

$N+500

503

[blank]$N%[number]

$N%2

1

[blank]$N+[number]%[number]

$N+15%13

5

[blank]$N%[number]+[number]

$N%2+20

21

2.6.13 The DB2NODE Environment Variable

In section 2.3, The DB2 Environment, we talked about the DB2INSTANCE environment variable used to switch between instances in your database system. The DB2NODE environment variable is used in a similar way, but to switch between partitions on your DPF system. By default, the active partition is the one defined with the logical port number of zero (0) in the db2nodes.cfg file for a server. To switch the active partition, change the value of the DB2NODE variable using the SET command on Windows and the export command on Linux/UNIX. Be sure to issue a terminate command for all connections from any partition to your database after changing this variable or the change will not take effect.

Using the settings for the db2nodes.cfg file shown in Table 2.2, you have four servers, each with two logical partitions. If you log on to server myserverb, any commands you execute will affect partition 2, which is the one with logical port of zero on that server, and the default coordinator partition for that server.

Table 2.2. Sample Partition Information

Partition

Server Name

Logical Port

0

myservera

0

1

myservera

1

2

myserverb

0

3

myserverb

1

4

myserverc

0

5

myserverc

1

6

myserverd

0

7

myserverd

1

If you would like to make partition 0 the active partition, make this change on a Linux/UNIX system:


   DB2NODE=0

   export DB2NODE

   db2 terminate

Note that partition 0 is on server myservera. Even if you are connected to myserverb, you can make a partition on myservera the active one. To determine which is your active partition, you can issue this statement after connecting to a database:


   db2 "values (current dbpartitionnum)"

2.6.14 Partitioning Maps and Partitioning Keys

By now you should have a good grasp of how to set up a DPF environment. It is now time to understand how DB2 distributes data across the partitions. Figure 2.15 shows an example of this distribution.

02fig14.gif

Figure 2.15 Distributing data rows in a DPF environment

A partitioning map is an internally generated array containing 4096 entries for multipartition database partition groups or a single entry for single-partition database partition groups. The partition numbers of the database partition group are specified in a round-robin fashion in the array.

A partitioning key is a column (or group of columns) that determines the partition on which a particular row of data is physically stored. You define a partitioning key explicitly using the CREATE TABLE statement with the PARTITIONING KEY clause.

When you create or modify a database partition group, a partitioning map is associated with it. A partitioning map in conjunction with a partitioning key and a hashing algorithm determine which database partition will store a given row of data.

For the example in Figure 2.15, let's assume partition group pg0123 has been defined on partitions 0, 1, 2, and 3. An associated partitioning map is automatically created. This map is an array with 4096 entries containing the values 0, 1, 2, 3, 0, 1, 2, 3. . . . (note that this is shown in Figure 2.15 as p0, p1, p2, p3, p0, p1, p2, p3 . . . to distinguish them from the array entry numbers). Let's also assume table mytable has been created with a partitioning key consisting of columns col1, col2, and col3. For each row, the partitioning key column values are passed to the hashing algorithm, which returns an output number from 0 to 4095. This number corresponds to one of the entries in the array that contains the value of the partition number where the row is to be stored. In Figure 2.15, if the hashing algorithm had returned an output value of 7, the row would have been stored in partition p3.

  • + Share This
  • 🔖 Save To Your Account