Advice on Using Variable Character Columns in DB2

By | In Blog, DB2 | August 04th, 2015

db2_advice

One of the long-standing, troubling questions in DB2-land is when to use VARCHAR versus CHAR. The high-level advice for when to use VARCHAR instead of CHAR is for larger columns whose length varies considerably from row-to-row. Basically, VARCHAR should be used to save space in the database when your values are truly variable.

In other words, if you have a 10-byte column, it is probably not a good idea to make it variable… unless, of course, 90% of the values are only one or two bytes, then it might make some sense. Have you gotten the idea here that I’m not going to give any hard and fast rules? Hope so, because I won’t – just high-level guidance.

Another situation: say you have an 80 byte column where values range from 10 bytes to the full 80 bytes… and more than 50% of them are less than 60 bytes. Well, that sounds like a possible candidate for VARCHAR to me.

Of course, there are other considerations. Java programmers tend to prefer variable character columns because Java does not have a native fixed length character data type.

For traditional programming languages though, CHAR is preferred because VARCHAR requires additional programmatic handling (to set the length of each column when inserting or modifying the data).

Ok, so what if you are trying to determine, for an existing table, whether or not the appropriate decision was made to use VARCHAR columns instead of CHAR? You can use information from the DB2 Catalog to get a handle on the actual sizes of each VARCHAR column.

Using views and SQL it is possible to develop a report showing the lengths of the variable column values. First, determine which VARCHAR column you need information about. For the purposes of this example, let’s examine the NAME column of SYSIBM.SYSTABLES. This column is defined as VARCHAR(18). Create a view that returns the length of the NAME column for every row, for example:

Then, issue the following query using SPUFI to produce a report detailing the LENGTH and number of occurrences for that length:

This query will produce a report listing the lengths (in this case, from 1 to 18, excluding those lengths which do not occur) and the number of times that each length occurs in the table. These results can be analyzed to determine the range of lengths stored within the variable column. If you are not concerned about this level of detail, the following query can be used instead to summarize the space characteristics of the variable column in question:

The constant 18 will need to be changed in the query to indicate the maximum length of the variable column as defined in the DDL. This query will produce a report such as the one shown below:

SPACE
USED AS
CHAR(18)
SPACE
USED AS
VARCHAR(18)
TOTAL
SPACE
SAVED
AVERAGE
SPACE AS
CHAR(18)
AVERAGE
SPACE AS
VARCHAR(18)
AVERAGE
SPACE
SAVED
————– ——————— ———– ————— ——————— —————
158058 96515 61543 18 10 8

This information can then be analyzed to determine if the appropriate decision was made when VARCHAR was chosen. (Of course, the values returned will differ based on your environment and the column(s) that you choose to analyze.) Also, keep in mind that this report will not include the 2 byte prefix stored by DB2 for variable length columns.

I hope this high-level overview with advice on when to use VARCHAR versus CHAR has been helpful. If you have your own guidelines or queries that you use please feel free to post a comment to this blog and share them with everyone.

NOTE: You could skip the creation of the VIEW in the above query and just use a nested table expression (aka in-line view) instead.

This post was originally posted on Craig Mullins’ blog at http://db2portal.blogspot.com/2006/08/advice-on-using-variable-character.html

Craig Mullins
Consultant at Mullins Consulting, Inc
Craig S. Mullins is working with Datavail and its DB2 practice to expand offerings. He is president and principal consultant at Mullins Consulting, Inc. and the publisher of The Database Site. Mullins has 30 years of experience in all facets of database management and is the author of two books: “DB2 Developer’s Guide” currently in its 6th edition and “Database Administration: The Complete Guide to DBA Practices and Procedures,” the industry’s only guide to heterogeneous DBA.

Leave a Reply

Your email address will not be published.
Required fields are marked (*).