Home > Articles > Software Development & Management > Object Technology

  • Print
  • + Share This
This chapter is from the book

Object-Relational Extensions

With the object extensions of DB2, you can incorporate object-oriented concepts and methodologies into your relational database by extending DB2 with richer sets of data types and functions. With those extensions, you can store instances of object-oriented data types in columns of tables and operate on them using functions in SQL statements. In addition, you can control the types of operations that users can perform on those data types. The object extensions that DB2 provides are

  • Large objects (LOBs)
  • Distinct types
  • UDFs

Schemas

Schemas are qualifiers used for the object-relational extensions as well as stored procedures. They are used to qualify

  • User-Defined Distinct Types
  • UDFs
  • Stored Procedures
  • Triggers

All of the objects qualified by the same schema name can be thought of as a group of related objects. A schema name has a maximum length of 8 bytes. The schema name SYSIBM is used for built-in data types and built-in functions, and SYSPROC is used for some stored procedures delivered by IBM in support of the control center as well as Visual Explain.

Schema Names

The schema name can be specified explicitly when the object is referenced in the CREATE, ALTER, DROP, or COMMENT ON statement. If the object is unqualified and the statement is dynamically prepared, the SQL authorization ID contained in the CURRENT SQLID special register is used for the schema name of the objects.

Schema Privileges

There are certain authorities associated with schemas. Schema privileges include CREATEIN, ALTERIN, and DROPIN. This allows you to create, alter, or drop objects in the identified schema. If the schema name is an authorization ID, then that authorization ID has those privileges implicitly.

PATH Bind Option

The PATH bind option is applicable to BIND PLAN, BIND PACKAGE, REBIND PLAN, and REBIND PACKAGE. The list of schemas specified is used to resolve unqualified references to user-defined distinct types and UDFs in static SQL statements. It is also used to resolve unqualified stored procedure names when the SQL CALL statement specifies a literal for the procedure name. It specifies an ordered list of schemas to be searched to resolve these unqualified references.

CURRENT PATH Special Register

There is also a corresponding special register for the PATH. The SET CURRENT PATH statement changes the value of the PATH special register. This PATH special register is used in the same way as the PATH bind option—to resolve unqualified references in dynamic SQL. It can also be used to resolve unqualified stored procedure names in CALL host-variable statements. The PATH bind option is used to set the initial value for this special register. SYSIBM and SYSPROC do not need to be specified as part of the PATH—they are implicitly assumed as the first schema.

User-Defined Data Types

We looked briefly at what user-defined data types (UDTs) are and how they are implemented using Data Definition Language (DDL) in Chapter 2, "Database Objects." In this section we take a closer look at UDTs and some of their benefits.

By using UDTs, you can avoid some excess code in order to support data typing that is not included in the DB2 product. You enable DB2 to do strong-typing, which says that only functions and operations defined on the UDT can be applied to instances of the UDT. This is beneficial for applications so that you do not have to code for comparison errors.

Once the UDT is defined, column definitions can reference that type during the issuing of the CREATE or ALTER statements the same as they would any DB2 built-in data type. If a distinct type is specified without a schema name, the distinct type is resolved by searching the schemas in the current path.

UDTs allow you to use DB2 built-in data types in special ways. UDTs are built off of the DB2 built-in types. UDTs allow you to extend these types and declare specialized usage on them. DB2 then enforces these rules by performing only the kinds of computations and comparisons that you have defined for the data type.

Defining

In order to use a UDT, you must first create it. The UDTs are created by using one of the DB2 built-in types as a base. You create them using the CREATE DISTINCT TYPE statement, as shown below.

CREATE DISTINCT TYPE distinct-type-name
   AS source-data-type
   WITH COMPARISONS 

The name of the distinct type is a two-part name, which must be unique within the DB2 subsystem. The qualifier is a schema name. The distinct type shares a common internal representation with its source data type. However, the distinct type is considered to be an independent data type that is distinct from the others.

NOTE

LONG VARHCAR and LONG VARGRAPHIC cannot be used as source types.

An instance of a distinct type can be compared only with another instance of the same distinct type. The WITH COMPARISONS clause is to allow for comparison only between the same distinct type. This phrase is required if the source data type is not a LOB type such as BLOB, CLOB, or DBCLOB. Comparisons for these types are not allowed.

NOTE

If you specify the WITH COMPARISONS on a distinct type with an LOB source type, you will get a warning message, but the comparisons are still not allowed.

Casting

There are two operations that are allowed on distinct types: comparisons and casting. You can compare the values of distinct types (non-LOB), or you can cast between the distinct type and the source type.

NOTE

Character and arithmetic operators that are used in built-in functions that are used on a source type are not automatically inherited by the distinct type. These operators and functions need to be created explicitly.

Comparison operators such as those listed in Table 15–2 are allowed on UDTs.

Table 15–2. Comparison Operators for UDTs

=

>=

¬>

IN

<>

<=

BEWTEEN

NOT IN

>

¬=

NOT BETWEEN

IS NOT NULL

<

¬<

IS NULL

 


NOTE

LIKE and NOT LIKE are not supported.

Casting functions are used to convert instances of source data types into instances of a different data types. These functions have the name of the target data type and will have a single parameter, which is the source data type. They will return the target data type. Two cast functions are generated by DB2 when the CREATE DISTINCT TYPE is issued. These are used to convert between the distinct type and its source type. They will be created in the same schema as the distinct type. The following shows an example of creating a UDT and then using it both with and without casting:

CREATE DISTINCT TYPE EURO AS DECIMAL (9,2) WITH COMPARISONS
   EURO(DECIMAL)
   -- where EURO is the target and DECIMAL is the source
   DECIMAL (EURO)
   -- where DECIMAL is the target and EURO is the source

Without casting – using the function name
   SELECT ITEM
   FROM INVENTORY
   WHERE COST > EURO (1000.00)

With casting – using cast function
   SELECT ITEM
   FROM INVENTORY
   WHERE COST > CAST (1000.00 AS EURO)

NOTE

Constants are always considered to be source-type values.

If you want to find all items that have a cost of greater than 1000.00 Euros, you will have to cast, because you cannot compare the data of type EURO with data of the source data type of the EURO —which is DECIMAL. You will need to use the cast function to cast data from DECIMAL to EURO. You can also use the cast function DECIMAL to cast from EURO to DECIMAL and cast the column COST to type DECIMAL. Depending on the way you choose to cast—from or to the UDT—you can use the function name notation data-type(argument) or the cast notation CAST(argument AS data-type).

Built-In Functions for UDTs

The built-in data types come with a collection of built-in functions that operate on them. Some of these functions implement operators, such as the arithmetic operators on numeric data types and substring operators on character data types. Other functions include scalar and column functions, which are discussed in Chapter 4, "Advanced SQL Coding."

When you create a UDT, you can also specify that it inherit some or all of the functions that operate on the corresponding source type. The is done by creating new functions called sourced functions that operate on the UDT and duplicate the semantics of the built-in functions that work on the source type. The following example shows how to create a sourced function.

CREATE FUNCTION '+' (EURO, EURO) RETURNS EURO
        SOURCE SYSIBM.'+' (DECIMAL(9,2), DECIMAL(9,2))

You can also give UDTs distinct semantics of their own by creating external functions that you write in a host language, which will operate on your UDTs.

Privileges

You need to have privileges granted in order to use UDTs. The GRANT USAGE ON DISTINCT TYPE is used in order to grant privileges to use the UDT as a column data type in a CREATE or ALTER statement, or to use the UDT as a parameter in a stored procedure or UDF. The GRANT EXECUTE ON enables users to cast functions on a UDT.

Catalog Information

There is information about the UDTs stored in the DB2 catalog. The following tables contain information about UDTs.

  • SYSIBM.SYSDATATYPES: ROW FOR EACH UDT

  • SYSIBM.SYSROUTINES: ROW FOR EACH CAST FUNCTION

  • SYSIBM.SYSROUTINEAUTH: AUTHORIZATIONS FOR EXECUTE PRIVILEGE

  • SYSIBM.SYSRESTAUTH: AUTHORIZATIONS FOR USAGE PRIVILEGE

UDFs

UDFs form the basis of object-relational extensions to the SQL language along with UDTs and LOBs. Fundamentally, a database function is a relationship between a set of input data values and a result value. DB2 Universal Database comes with many built-in functions; however, it is possible to create your own column, scalar, and table functions.

While there are three types of functions in DB2, there are only two UDFs: scalar and table. There is no concept of a column function being user-defined.

NOTE

The built-in column and scalar functions are covered in Chapter 3, "Basic SQL Coding."

In DB2, you can create your own scalar or table functions. A UDF can be written in a high-level programming language such as COBOL, C, C++, or Java, or you can use a single SQL statement. There are three types of UDFs:

  • Sourced: A UDF based on another scalar UDF or on a built-in scalar or column function. This concept is similar to overloading classes in object-oriented programming.

  • SQL Scalar: A scalar UDF that is based upon a single SQL expression. The source code for an SQL UDF is contained entirely within the UDF definition.

  • External: A UDF that you write in a supported language. An external UDF can be further categorized as a scalar or table function.

External UDFs

An external UDF is similar to any other program written for the OS/390 or z/OS platform. External UDFs may or may not contain SQL statements, IFI (Instrumentation Facility Interface), or DB2 commands. They may be written in assembler, COBOL, C, C++, PL/I, or Java. External UDFs, once written and generated as dynamically loadable libraries or classes, must be registered with the database. An external function is defined to DB2 with a reference to an OS/390 load module that DB2 should load when the function is invoked. The OS/390 load module contains the object code for the application program that contains the logic of the external function. If the program contains SQL statements, then there is an associated package that contains the DBRM. External functions cannot be column functions. The congrat function shown below is an external scalar UDF and is registered using the CREATE FUNCTION statement.

CREATE FUNCTION congrat(VARCHAR(30),VARCHAR(40))
   RETURNS VARCHAR(30)
   EXTERNAL NAME 'CONGRAT'
   LANGUAGE C
   PARAMETER STYLE DB2SQL
   DETERMINISTIC
   FENCED
   READS SQL DATA
   COLLID TEST
   NO EXTERNAL ACTION
   DISALLOW PARALLEL;

DB2 passes parameters to external UDFs in a standard manner much in the same way that parameters are passed to stored procedures. DB2 uses the following structure:

Address of Parameter 1 Address of Parameter 2 Address of Parameter 3 ? Address of Result 1 Address of Result 2 Address of Result 3 ? Address of Parameter 1 Null Indicator Address of Parameter 2 Null Indicator Address of Parameter 3 Null Indicator ? Address of Result 1 Null Indicator Address of Result 2 Null Indicator Address of Result 3 Null Indicator ? Address of SQLSTATE Address of Procedure Name Address of Specific Name Address of Message Text Address of the Scratchpad (if SCRATCHPAD specified in DDL) Address Call Type Indicator (if FINAL CALL specified in DDL) Address of DBINFO (if DBINFO specified in DDL)

NOTE

A scalar function can return only a single result parameter, while table functions return multiple result parameters each representing a column in a row of the table being returned.

The SQLSTATE can be returned from the external UDF to DB2 to indicate a condition that DB2 can then act upon. It is highly recommended that the UDF return a SQLSTATE to the caller. The following SQLSTATEs can be returned:

  • 00000 returns SQLCODE 0

  • 01Hxx returns SQLCODE +462

  • 02000 returns SQLCODE +100

  • 38001 returns SQLCODE –487

  • 38002 returns SQLCODE –577

  • 38003 returns SQLCODE –751

  • 38004 returns SQLCODE –579

  • 38yxx returns SQLCODE –443

If the UDF returns a SQLSTATE that is not allowed, DB2 replaces the SQLSTATE with 39001, and returns a SQLCODE of –463.

Sourced UDFs

Sourced UDFs are registered simply by specifying the DB2 built-in source function. Sourced functions can be scalar functions or column functions, but cannot be table functions. Sourced functions are often helpful when there is a need allow for the use of a built-in function on a UDT (see previous section in this chapter for information on UDTs).

This example allows us to create an AVG function for the SCORE data type:

CREATE FUNCTION AVG (SCORE)
   RETURNS SCORE
   SOURCE SYSIBM.AVG(DECIMAL);

These CREATE FUNCTION statements place an entry for each UDF in the SYSIBM.SYSROUTINES catalog table, and the parameters are recorded in SYSIBM.SYSPARMS. These catalog tables can be queried for information about the UDFs.

SQL Scalar UDFs

An SQL scalar function is a UDF in which the entire functionality of the function is a single SQL expression and is coded into the CREATE FUNCTION statement. The function is identified as an SQL scalar function by coding the LANGUAGE SQL option of the CREATE FUNCTION statement. This enables you to code an expression used commonly within more than one statement and modularize that expression by storing it separately as a UDF. Any SQL statement can then reference the UDF in the same manner in which any scalar function can be invoked. This enables common expressions to be coded only once and stored separately in the DB2 catalog, centralizing the coding and administration of these types of functions.

The SQL expression is specified in the RETURN clause of the CREATE FUNCTION statement, and can contain references to the function input parameters as in the following example that computes the total number of months between two dates:

CREATE FUNCTION TOTMON (STARTX DATE, ENDY DATE)
RETURNS INTEGER
LANGUAGE SQL
CONTAINS SQL
NO EXTERNAL ACTION
DETERMINISTIC
RETURN ABS( (YEAR(STARTX - ENDY)*12) + MONTH(STARTX - ENDY) );

The expressions contained in the SQL scalar UDF cannot contain references to columns names or host variables. However, an SQL scalar UDF can invoke other UDFs, which may be external UDFs that can be an SQL program.

The source code for an SQL scalar function is actually stored in the SYSIBM.SYSVIEWS DB2 catalog table. When an SQL statement referencing a SQL scalar function is compiled, the function source from the SYSIBM.SYSVIEWS catalog table is merged into the statement. Package and plan dependencies on the SQL scalar functions, as with all UDFs, are maintained in the SYSIBM.SYSPACKDEP and SYSIBM.SYSPLANDEP tables respectively.

Table Functions

With DB2, you can also create another type of UDF called a table function. A table function is a UDF that returns a table to the SQL statement that calls it. This means that a table function can only be referenced in the FROM clause of a SELECT statement. The table function provides a means of including external data or complex processes in SQL queries. Table functions can read non-DB2 data—for instance, a file on the operating system or over the World Wide Web—tabularize it, and return the data to DB2 as a relational table that can subsequently be treated like any other relational table. For example, the APPFORM table function in the next example takes in a candidate application form, processes it, and returns the data in an appropriate format to be inserted in the CANDIDATE table (except for the candidate ID, which is generated):

CREATE FUNCTION APPFORM(VARCHAR(30))
RETURNS TABLE
  (LNAME VARCHAR(30),
   FNAME VARCHAR(30),
   INITIAL CHAR(1),
   HPHONE PHONE,
   WPHONE PHONE,
   STREETNO VARCHAR(8),
   STREETNAME VARCHAR(20),
   CITY VARCHAR(20),
   PROV_STATE VARCHAR(30),
   CODE CHAR(6),
   COUNTRY VARCHAR(20))
EXTERNAL NAME APPFORM
LANGUAGE C
PARAMETER STYLE DB2SQL
NO SQL
DETERMINISTIC
NO EXTERNAL ACTION
FINAL CALL
DISALLOW PARALLEL
CARDINALITY 20;

If we wanted to insert a new candidate into the CANDIDATE table based on his or her application form, we could use the following SELECT statement:

INSERT INTO CANDIDATE
   SELECT CID,
      LNAME, FNAME, INITIAL,
      HPHONE, WPHONE,
      STREETNO, STREETNAME, CITY,
      PROV_STATE, CODE, COUNTRY
   FROM TABLE(APPFORM(' \DOCS\NEWFORM.TXT')) AS AP

Invoking User-Defined Functions

Scalar UDFs are invoked much in the same way as any built-in DB2 scalar function. A function name identifies the function, and one or more parameters pass information from the invoking SQL statement to the UDF. The parameters passed can be table columns, constants, or expressions. If an expression is passed to an external UDF, DB2 resolves the expression and then passes the result to the UDF. The result of the UDF execution replaces the function invocation at execution time. In the following example, the SQL scalar function TOTMON is used to calculate the number of months between two dates:

SELECT  HIREDATE, BIRTHDATE,
        TOTMON(HIREDATE, BIRTHDATE) as total_months,
FROM    DSN8710.EMP;

Here, the HIREDATE and BIRTHDATE columns are selected from the EMP table, and the TOTMON function (previously defined in this chapter) determines the total number of months between the two dates, which were fed to the function as parameters. In this case, the TOTMON function, being an SQL scalar function, is merged with the statement during statement compilation as if the expression itself were coded within the SQL statement.

A UDF can be defined as deterministic or not deterministic. A deterministic function is one that will return the same result from one invocation to the next if the input parameter values have not changed. While there is no mechanism within DB2 to "store" the results of a deterministic function, the designation can impact the invoking query execution path relative to materialization. In a situation in which a table expression has been nested within an SQL statement, a nondeterministic function may force the materialization of the inner query. For example,

SELECT   WORKDEPT, SUM(TOTAL_MONTHS), AVG(TOTAL_MONTHS)
FROM
(SELECT  WORKDEPT,
         TOTMON(HIREDATE, BIRTHDATE) as total_months,
 FROM    DSN8710.EMP) AS TAB1

GROUP BY WORKDEPT;

Here, if there is an index on the WORKDEPT column of the EMP table, then the inner table expression called TAB1 may not be materialized, but rather be merged with the outer select statement. This is possible because the TOTMON function is deterministic. If the TOTMON function was not deterministic, then DB2 would have to materialize the TAB1 table expression, possibly storing it in the DSNDB07 temporary table spaces and sorting to perform the desired aggregation. However, its not exactly clear as to whether or not having TOTMON be deterministic is a good thing. If the TOTMON function is CPU-intensive, it may be better to materialize the result of the inner table expression. This is because the merged TOTMON function (if it is deterministic) will actually be executed twice in the outer query, once per reference (for the SUM and AVG functions in this case).

A table function can be referenced in an SQL statement anywhere that a table can normally be referenced. The table function, or a nested table expression, is identified in the query by the use of the TABLE keyword as in the following example:

SELECT TAB1.EMPNO, TAB2.TEMPURATURE, TAB2.FORECAST
FROM   EMP, TABLE(WEATHERFUNC(CURRENT DATE)) AS TAB2

In the above query, the TABLE keyword was used to identify a nested table expression called TAB2 that was an invocation of the table UDF called WEATHERFUNC. The query returns the employee number along with some weather information in some of the columns that are returned from the WEATHERFUNC table function. This is a fairly simple invocation of a table function.

More importantly, you can embed correlated references within a nested table expression. While the weather may not be useful information to return with employee data, perhaps retrieving the resume and credit information from an external source is. In this case, we can pass the employee number as a correlated reference into the table expression identified by the TABLE keyword and ultimately pass it into the table UDF:

SELECT TAB1.EMPNO, TAB2.RESUME, TAB2.CREDITINFO
FROM   EMP AS TAB1, TABLE(EMPRPT(TAB1.EMPNO)) AS TAB2

The TABLE keyword tells DB2 to look to the left of the keyword when attempting to resolve any otherwise unresolvable correlated references within the table expression. If the join was coded in reverse (that is, the invocation of the EMPRPT table UDF appears in the statement before the EMP table), then the correlated reference to the TAB1.EMPNO column would not have been resolved, and the statement would not have compiled successfully.

The use of the TABLE keyword can be expanded beyond that of correlated references as input into table UDFs. The same keyword can be utilized with a nested table expression that may benefit from a correlated reference. This can be especially useful when the nested expression is performing an aggregation and only needs to work on a subset of the data in the table it is accessing. In the following example, we need to list the employee number and salary of each employee, along with the average salary and head count of all employees in their associated departments. This is traditionally coded as a left outer join of two table expressions, the first getting the employee numbers and salaries, and the second calculating the head count and average salary for all departments. If there is filtering against the employee table, then the entire table might be unnecessarily read to perform the aggregations:

SELECT   TAB1.EMPNO, TAB1.SALARY,
         TAB2.AVGSAL,TAB2.HDCOUNT
FROM
   (SELECT EMPNO, SALARY, WORKDEPT
    FROM   DSN8610.EMP
    WHERE  JOB='SALESREP') AS TAB1
LEFT OUTER JOIN
   (SELECT AVG(SALARY) AS AVGSAL,
           COUNT(*)AS HDCOUNT,
           WORKDEPT
    FROM   DSN8610.EMP
    GROUP  BY WORKDEPT) AS TAB2
ON TAB1.WORKDEPT = TAB2.WORKDEPT;

Here, the entire EMP table has to be read in the TAB2 nested table expression in order to calculate the average salary and headcount for all departments. This is unfortunate, because we only need the departments that employ sales reps. We can use the TABLE keyword and a correlated reference to TAB1 within the TAB2 expression to perform filtering before the aggregation:

SELECT   TAB1.EMPNO, TAB1.SALARY,
         TAB2.AVGSAL,TAB2.HDCOUNT
FROM     DSN8610.EMP TAB1
,TABLE(SELECT AVG(SALARY) AS AVGSAL,
              COUNT(*) AS HDCOUNT
       FROM   DSN8610.EMP
       WHERE  WORKDEPT = TAB1.WORKDEPT) AS TAB2

WHERE TAB1.JOB = 'SALESREP';

Polymorphism and UDFs

DB2 UDFs subscribe to the object-orientated concept of polymorphism. Ad hoc polymorphism (better described as overloading) allows an SQL statement to issue the same function against varying parameter lists and/or data types. This overloading requires us to create a unique definition for each variation of a particular function in data types or number of parameters. Polymorphism basically means "many changes," and for DB2 functions it means that many functions can have the same name. These functions are identified by their signature, which is comprised of the schema name, the function name, the number of parameters, and the data types of the parameters. This enables us to create UDFs for all of our UDTs. These sourced UDFs can assume the same name as the UDFs of built-in functions they are sourced from, but are unique in the system due to the data type of their parameter(s). It also allows us to define SQL or external UDFs to accommodate any variation in data type or number of parameters. For example, if we need a variation of our TOTMON function that accommodates timestamps, we could create the following function:

CREATE FUNCTION TOTMON (STARTX TIMESTAMP, ENDY TIMESTAMP)
RETURNS INTEGER
LANGUAGE SQL
CONTAINS SQL
NO EXTERNAL ACTION
NOT DETERMINISTIC
RETURN ABS( (YEAR(STARTX - ENDY)*12) + MONTH(STARTX - ENDY) );

The only thing that differs between this TOTMON and the original TOTMON is that the input parameters here are TIMESTAMPs instead of DATEs. From the application programming point of view, this enables an SQL statement to issue a TOTMON function regardless of whether it is using a pair of dates or timestamps as input. However, it requires that the people responsible for deploying the UDFs do so with consistency of functionality for like-named functions.

External UDF Execution

The external scalar and table UDF programs execute in an OS/390 WLM environment, much in the same way as stored procedures. The WLM environment is supported by one or more WLM address spaces. The WLM keywords that name the WLM environment in which to execute should be specified when creating the function, or else the program defaults to the WLM environment specified at installation time. This default environment can be seen in the SYSIBM.SYSROUTINES catalog table. UDFs execute under the same thread as the invoking program and will run at the same priority, utilizing the WLM enclave processing.

Monitoring and Controlling UDFs

You can invoke UDFs in a SQL statement wherever you can use expressions or built-in functions. UDFs, like stored procedures, run in WLM-established address spaces. DB2 UDFs are controlled by the following commands.

The START FUNCTION SPECIFIC command activates an external function that has been stopped. You cannot start built-in functions or UDFs that are sourced on another function. You can use the START FUNCTION SPECIFIC command to activate all or a specific set of stopped external functions. To activate an external function that is stopped, you would issue the following command:

START FUNCTION SPECIFIC (function-name)

The SCOPE (GROUP) option can also be used on the START FUNCTION command to allow you to start a UDF on all subsystems in a data sharing group.

The DB2 command DISPLAY FUNCTION SPECIFIC will display statistics about external UDFs that are accessed by DB2 applications. This command will display an output line for each function that a DB2 application has accessed. The information that is returned by this command will reflect a dynamic status for a point in time and may change before another DISPLAY is issued. This command does not display information about built-in functions or UDFs that are sourced on another function. To display statistics about an external UDF accessed by DB2 applications, issue the following command:

- DISPLAY FUNCTION SPECIFIC (function-name)

Stopping UDFs

The DB2 command STOP FUNCTION SPECIFIC will prevent DB2 from accepting SQL statements with invocations of the specified functions. This particular command will not prevent SQL statements with invocations of the functions from running if they have already been queued or scheduled by DB2. Built-in functions or UDF that are sourced on another function cannot be explicitly stopped. While the STOP FUNCTION SPECIFIC command is in effect, any attempt to execute the stopped functions are queued. You can use the STOP FUNCTION SPECIFIC command to stop access to all or a specific set of external functions.

This command stops an external function. Built-in functions or UDFs that are sourced on another function cannot be stopped. Use the START FUNCTION SPECIFIC command to activate all or a specific set of stopped external functions.

To prevent DB2 from accepting SQL statements with invocations of the specified functions, issue the following statement:

STOP FUNCTION SPECIFIC (function-name)

UDF Statistics

The optimizer will use statistics, if available, for estimating the costs for access paths where UDFs are used. The statistics that the optimizer needs can be updated by using the SYSSTAT.FUNCTIONS catalog view. There is a field in the statistics report that allows you to view the maximum level of indirect SQL cascading. This includes cascading due to triggers, UDFs, or stored procedures (see Figure 15–4).

Figure 15-4Figure 15–4 Accounting Report Information for UDF Operations

Cost Information

User-defined table functions will add additional access cost to the execution of an SQL statement. In order for DB2 to determine the cost factor for the use of user-defined table functions in the selection of the best access path for an SQL statement, the total cost of the user-defined table function must be determined. This is determined by three components:

  • Initialization cost that results from the first call processing.

  • Cost that is associated with acquiring a single row.

  • Final call cost that performs the cleanup processing.

To determine the elapsed and CPU time spent for UDF operations, you can view an accounting report (see Figure 15–4).

Catalog Information

The SYSIBM.SYSROUTINES catalog table describes UDFs. To retrieve information about UDFs, you can use the following query:

SELECT SCHEME, NAME, FUNCTION_TYPE, PARM_COUNT
FROM SYSIBM.SYSROUTINES
WHERE ROUTINETYPE='F'

Large Objects

As of v6, DB2 allows us to store large objects. It has new datatypes to handle these objects, and there are several application processing issues associated with handling these new types of data.

LOB Datatypes

Three DB2 data types support LOBs:

  • BLOB data type (binary large object)

Binary strings (not associated to a CCSID)
Good for storing image, voice, and sound data
  • CLOB datatype (character large object)

Strings made of single-byte characters or single-/double-byte characters with an associated CCSID
Use if data is larger than VARCHAR allows
  • DBCLOB datatype (double-byte character large object)

Strings made of double-byte characters with an associated CCSID

Each of these can contain up to 2 gigabytes of data, although in most cases the amount of storage for individual columns will be considerably less (depending on the type of data stored). There is a large use today of the 32KB long VARCHAR column, which has limitations in both size and functionality. Most of this use will probably be replaced by using LOBs in the future, certainly for new applications and functions.

LOB Implementation

LOBs are implemented with structures that are different than normal tables and tablespaces. A LOB tablespace must be created for each column (or each column of each partition) of a base LOB table. This tablespace contains the auxiliary table, which must have an auxiliary index associated with it. The LOB tablespace has a different recovery scheme, optional logging, and different locking options. For information on LOB locking, refer to Chapter 16, "Locking and Concurrency."

If a table contains a LOB column and the plan or package is bound with SQLRULES(STD), then DB2 will implicitly create the LOB tablespace, the auxiliary table, and the auxiliary index. DB2 will choose the name and characteristics for these implicitly created objects.

NOTE

It is better to develop naming standards beforehand for these objects and control placement of them. This is critical for management and performance.

INSERTing and LOADing LOBs

LOB loading and insertion is also different than processes for non-LOB data. The methods are also entirely different depending on whether or not extenders are used. Without extenders, there are some real limitations that need to be addressed when inserting LOB data, primarily the 32KB limit and logging impacts. If the total length of the LOB column and the base table row is less than 32KB, then the LOAD utility can insert the LOB column. When the limits of LOAD are exceeded, then SQL INSERT or UPDATE statements need to be used. But the SQL INSERT has its own limitations in that enough memory needs to be available to hold the entire value of the LOB. The limitations are the amount of memory available to the application and the amount of memory that can be addressed by the language used. If the LOBs are all small, then it is not as much of an issue, as memory and language constructs would be available. But when dealing the very large LOBs, the differences can be seen easily when comparing the C language construct with COBOL.

C language for a LOB host variable,

SQL TYPE IS CLOB(20000K) my_clob;

is generated by DB2 as

   struct { unsigned long length;
            Char data[20960000];
          } my_clob;

COBOL language for a LOB host variable,

01 MY-CLOB     USAGE IS SQL TYPE IS CLOB(20000K).

is generated by DB2 as

01  MY-CLOB.
    02  MY-CLOB-LENGTH    PIC 9(9) COMP.
    02 MY-CLOB-DATA.
       49  FILLER     PIC  X(32767).
       49  FILLER     PIC  X(32767).
       49  FILLER     PIC  X(32767).
--Repeated 622 times

This is another area where extenders assist in solving the problem. When a table and column are enabled for an extender, the whole process changes. An INSERT statement can be used in the program to contain extender functions (UDFs) that allow the image to be loaded into the database directly from an external file. Actually with the image extender, for example, the image content is inserted into an administrative support table, and another record is then inserted into another administrative table describing the attributes of the image, such as number of colors, thumbnail-sized version, format characteristics (JPEG, TIF, etc.). The extenders require WLM to be installed in support of the extender UDFs and stored procedures, and specifically using WLM in goal mode for performance reasons.

Even though the LOB data is stored in an auxiliary table, the insert statement specifies the base table. You can read the LOB data from a file in your DB2 program and place the data into the declared DB2 LOB variable. The insert statement then simply references the LOB variable. For example, if you wanted to insert employee resumes into the EMP_RESUME table, which includes a 2MB CLOB data type to hold resumes, you declare the resume variable in your program:

SQL TYPE IS CLOB(2000K) resume;

Then you populate the resume variable with the CLOB data and perform the insert:

EXEC SQL INSERT INTO EMP_RESUME VALUES (:EMPNO, :RESUME);

DB2 uses contiguous storage in data spaces to store LOBs that your program is manipulating. Because LOBs can be quite large, DB2 avoids materializing LOBs until completely necessary. The amount of storage required depends on the size of the LOBs and the number of LOBs referenced within a statement. The amount of storage required by your program and DB2 can become quite large. For this reason, you can use LOB locators to manipulate LOB data without retrieving that data from the DB2 table. A LOB locator is declared in the application program:

SQL TYPE IS CLOB_LOCATOR resume_loc;

An SQL statement can reference the locator, and the LOB is not materialized in DB2 until absolutely necessary, and it is never moved into the application memory:

SELECT RESUME
INTO   :resume_loc
FROM   EMP_RESUME
WHERE  EMPNO=:empno;

Further SQL statements can reference the locator variable, allowing the LOB to be manipulated in various ways. One way would be by using SQL SET commands. DB2 will manipulate the LOB data within the table, and only materialize the LOB as needed.

LOBs and UDTs

Being able to store LOBs and manipulate them through extenders is only part of the story. The capability also exists to define new distinct data types based on the needs of particular applications. A UDT, also known as a distinct type, provides a way to differentiate one LOB from another LOB, even of the same base type, such as BLOB or CLOB. A UDT is not limited to objects and can be created for standard data types as well as LOBs.

Even though stored as LOBs (binary or character), image, video, spatial, XML, and audio objects are treated as types distinct from BLOBs and CLOBs and distinct from each other. For example, suppose an application that processes spatial data features needs a polygon data type. You can create a distinct type named polygon for polygon objects, as follows:

CREATE DISTINCT TYPE polygon AS BLOB (512K)

The polygon-type object is treated as a distinct type of object even though internally it is represented as a 512-KB binary object (BLOB). UDTs are used like SQL built-in types to describe the data stored in columns of tables.The extenders create distinct datatypes for the type of object they process, such as image, audio, and video, which makes it easier for applications to incorporate these types of objects.

CREATE TABLE DB2_MAG_DEMO
   (GEO_ID CHAR(6),
    EURO_ANNUAL    EURO_DOLLAR,
    OZ_ANNUAL      AUSTRALIAN_DOLLAR,
    US_ANNUAL      US_DOLLAR,
    DEMO_GRAPHIC   POLYGON)

Casting functions allow operations between different data types—for example, comparing a slide from a video UDT to an expression. You must cast the expression to a UDT (video in this example) in order for the comparison to work. There are casting functions supplied with DB2 (CHAR, INTEGER, and so on), and there are others created automatically whenever a UDT is created with the CREATE DISTINCT TYPE statement.

  • + Share This
  • 🔖 Save To Your Account