kfogarty.com - Database Consultants

Website of Kenny Fogarty. An independent database consultant, based in London, England.

del.icio.us |Digg it |Earthlink |Furl |iFeedReaders |ma.gnolia |Maple.nu |Netvouz |Netscape |RawSugar |reddit |Scuttle |Shadows |Simpy |Spurl |StumbleUpon |Wink |Yahoo MyWeb |

SELECT *

Filed under: Basics, DB2 — Kenny at 3:23 pm on Tuesday, October 31, 2006

It is best enforce a standard whereby SELECT * is not used in any host programs.

The SELECT * statement is a temptation because it allows the programmer to avoid having to explicitly name each of the required columns. However, this convienience can be costly in terms of CPU time, elapsed time and maintenance of host programs.

When an SQL statement SELECTs data for use by a program, the program must have a host variable to receive a value from each column selected. If a column were added to the table, the embedded SELECT * would automatically pass its value to the program. If there is no host variable defined for the new column, there is no place to store the returned value for access by the program.

Similarly, if a column is deleted from the table, SELECT * would not return the values expected by the program, as one of the columns is no longer present.

If the columns required by the program are listed explicitly, any changed and/or deleted columns would not affect the program unless the program explicitly references a column that has been removed.

As stated above, SELECT * is a temptation to code because it avoids the programmer having to name each of the columns required by the program. However, this convenience can have a detrimental effect on CPU time, elapsed time, and maintenance of host programs (see above).

If SELECT * is used within the code of a host program, there could be instances where an unnecessary number of columns are fetched from the database and passed back to the host program. (The more columns that are processed, the more CPU is required). Also, if there is a large number of columns returned, it is less likely that a sort can be done in a sort pool in central storage, and will increase the number of work files that must be written to DASD and then merged, before the rows are returned to the host program.

If the rows returned by the SELECT * is from a distributed process, the rows must be returned over the network, which can be expensive, and cause bottlenecks and other network/performance issues.

It increases the likelihood that a compiled column assignment procedure is greater than 4 KB at bind time and thus not stored in the plan or package. A compiled assignment procedure is used rather than an interpretative process to move columns to host variable output area. If more than 100 rows are fetched, the procedure is compiled each time the SQL is executed rather than only at bind time, which increases the CPU time.

Sphere: Related Content

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.