Identity Columns
Identity columns were available in DB2 Version 7 (but came into being after the V6 code refresh). The identity property is applied to a column using the IDENTITY parameter. A column defined with the identity property will cause DB2 to automatically generate a sequential value for that column, when a row is added to the table.
For example, identity columns might be used to generate primary key values.
Using identity columns helps to avoid some of the concurrency and performance problems that can occur when application programs are used to populate sequential values for a “counter” column.
When inserting data into a table that uses an identity column, the program or user will not provide a value for the identity column. Instead, DB2 automatically generates the appropriate value to be inserted.
Notes
- Only one identity column can be defined per DB2 table.
- The data type of the column must be SMALLINT, INTEGER or DECIMAL.
An example creating a table with an identity column follows:
CREATE TABLE SAMPLE
(ID_COL INTEGER NOT NULL
GENERATED ALWAYS AS IDENTITY
START WITH 10
INCREMENT BY 1
...);
In this example, the identity column is named ID_COL. The initial value stored in the table is 10, and each insert will add 1 to the last value. After five inserts, the values in the identity column will be 10,11,12,13,14 and 15.
The GENERATED parameter indicates how DB2 generates values for the column. GENERATED must be specified if the column is to be considered an identity column or the data type of the column is ROWID. This means that DB2 must be permitted to generate values for all identity columns. There are two options for the GENERATED parameter: ALWAYS and BY DEFAULT.
- GENERATED ALWAYS indicates that DB2 will always generate a value for the column when a row is inserted into the table. ALWAYS is specified for identity columns unless data propagation is being used.
- GENERATED BY DEFAULT indicates that DB2 will generate a value for the column when a row is inserted into the table unless a value is specified. If an explicit value is to be inserted into an identity column GENERATED BY DEFAULT must be specified.
When the maximum value of the column is reached, there are options that need to be considered to allow insertion to continue.
The CYCLE keyword will cause DB2 to begin generating values from the minimum value all over again. This can generate duplicate values in the column, and should only be used when unique values in the identity column are not a requirement.
To ensure uniqueness in an identity column, a unique index needs to be created on the identity column. The IDENTITY property alone does not guarantee uniqueness.
Without identity columns, an application program can implement similar functionality, (there is a performance consideration to account for, as an application program cannot do this as efficiently as DB2).
A common technique is to maintain a one-row table that contains the sequence number. Each transaction locks that table, increments the number, and then commits the change to unlock the table. Only one transaction at a time can increment the sequence number, which could lead to a bottleneck.
Experience of Identity Columns at a Client Site
At one client site, identity columns were not used because of operational restrictions and problems in managing the different environments.
On request of the development teams, data would be copied from PROD(uction) to DEV(elopment) and ACC(eptance) using fast load/DSN1COPY utilities instead of using SQL inserts to refresh some databases, and the data propagation mechanism needed to be taken into account. (See general notes above).
Other site standards meant that all DB2 tables had to have a natural key as a primary key, not an “artificial” (generated by DB2) key. Most applications at the client site managed this by keeping a “highest used key” table. (See notes above).
Sphere: Related Content
