Read and Write LOBs from an RPG Pointer Field

Article ID: 56657

Q: I'd like to have a database with a very large VARCHAR field, but VARCHAR can apparently be only 32 KB long! I currently use pointer logic to handle the large variables, and that part works great, but how can I store the field in a database?

A: Large object fields (LOBs) are supported only through SQL interfaces, not through RPG's native I/O. Embedded SQL provides three methods of accessing LOBs, and they all work nicely in RPG. Unfortunately, none works with dynamic memory allocations. The SQL CLI, however, does provide a way of accessing LOBs using dynamic memory, and in fact, it's not really that much different from normal database access via the CLI.

More About LOBs

The maximum length of a record in a DB2 database on i5/OS is 32,766 characters. Because of this limit, having a character string longer than 32 KB in a traditional database field is simply infeasible. Indeed, even if you were happy with a 32 KB field, you'd have no space for anything else! If you'd like the ability to store data longer than 32 KB, you need a tool that stores data outside the standard record buffer. That's where LOBs come in.

Look at a record buffer that contains a LOB field, and you see that in place of actual field data, there's just a pointer. That pointer points to another spot on disk where the LOB field is stored. Because LOB data isn't stored inside the record itself, it can be much larger than 32 KB.

Three types of LOB fields are available:

  1. BLOB = Binary Large Objects.
  2. CLOB = Character Large Objects.
  3. DBCLOB = Double-Byte Character Large Objects.

A BLOB is designed for binary data, that is, data in which the raw bits and bytes don't represent EBCDIC, ASCII, or Unicode text values. Examples of data that you might want to store in a BLOB would be pictures, sounds, movies, PDF files, or Word documents. For instance, if you use ODBC to read BLOB data from a PC application, the data would not be translated from EBCDIC to ASCII, because it's not considered text data.

A CLOB is designed for single-byte character data. That means that the byte values in the field do represent text in a particular character set. In contrast to a BLOB, if you read CLOB data from an ASCII machine, it should be translated from EBCDIC to ASCII.

A DBCLOB is identical to a CLOB, except that it's for double-byte character sets. In a DBCLOB, two bytes are allocated to represent each character, whereas a CLOB allocates only one byte per character.

All three LOBs are read, updated, and written in pretty much the same way as one another. For example, if you already know how to read a BLOB field, you can also read a CLOB field or a DBCLOB field with almost identical code. The only modification that you need to make is changing the variable type in your program from BLOB to CLOB.

With that in mind, if you want to learn how to work with LOBs from RPG with embedded SQL, please read the following article:
http://systeminetwork.com/article/rpg-vs-blob

Although that article references BLOBs, you can work with CLOBs or DBCLOBs simply by changing the SQLTYPE keyword to read CLOB or DBCLOB instead of BLOB.

Of course, embedded SQL does not let you read and write data from a dynamically allocated pointer. To do that, you have to use the SQL Call Level Interface (CLI).

Accessing LOBs from RPG with SQL CLI

If you're already familiar with using the SQL CLI APIs from an RPG program, you'll notice that when you bind a column to an RPG variable, all you really do is pass the address (i.e., its location in memory), data type, and size of your variable to SQL. When SQL reads each column from the database, it transforms the data (if necessary) into the format of your variable and then saves it to the memory address that you specified.

Making that functionality work with a pointer instead of a variable is very easy. All you have to do is provide that pointer in place of your memory address, and you're done!

For example, let's say that you have a table in which you'd like to store some XML text that relates to items that you sell. You decide to use a CLOB field to store your XML, so you design a table (i.e., a physical file) that looks like this:

Create Table ItemLobs (
     ItemNo numeric(5,0) not null,
     Xml clob(2000000)
)

The preceding SQL statement creates a file with two fields. The ItemNo field is a simple numeric (aka "zoned decimal") field that contains the item number. The second field, named Xml, is a CLOB field that can hold values up to 2 MB. It's worth noting that even though I've defined the field as 2 MB, it doesn't always occupy that much disk space. The system uses only as much disk space as it actually needs for the field. So if you use only 500 bytes of the 2 million available, the system uses only 500 bytes of disk space. That's yet another advantage of storing the data outside the record buffer!

To insert a new record into that file, you run an SQL statement like the following one:

    SQLAllocStmt( conn : stmt );

    rc = SQLPrepare( stmt
                   : 'Insert Into ItemLobs +
                            (ItemNo,Xml) +
                            values(?,?)'
                   : SQL_NTS );

As you can see, I'm calling a subprocedure named SQLPrepare, and I'm passing a string containing an Insert SQL statement. It's really no different from any other SQL Insert statement, except that it contains question marks in place of the actual field data. These question marks are called parameter markers, and I tell the SQL CLI to insert data from my RPG program in place of those markers.

The SQLPrepare() API is part of the CLI APIs. I like to think of "prepare" as meaning "compile this SQL statement." When you call the SQLPrepare() API, you give it an SQL statement, and it analyzes it and generates optimized code inside the database. Now that I've done that, I can run that prepared statement many times if I like without having to re-prepare the statement.

To function properly, the SQLPrepare() API needs a connection handle. That connection handle had to be created earlier in the program -- and I excluded that part of the process from my code snippet. You can download the program in its entirety from the code download link (at the end of this article), and you can read more about SQL CLI by looking at my previous article on the subject. A link to that article is in the section titled "More Information" herein.

Anyway . . . that's the SQL Insert statement that inserts data into my CLOB. Now I have to tell SQL which RPG variables SQL should read to get the data that will be inserted into the file.

     D rc              s             10i 0
     D ItemNo          s              5s 0
     D Xml             s           1000a   based(p_Xml)
     D XmlLen          s             10i 0
     D XmlSize         s             10i 0
           .
           .
           rc = SQLBindParam( stmt
                            : 1
                            : SQL_NUMERIC
                            : SQL_NUMERIC
                            : %len(ItemNo)
                            : %decpos(ItemNo)
                            : %addr(ItemNo)
                            : 0 );
           check_error(rc:env:conn:stmt:'SQLBindParam(1)');

           rc = SQLBindParam( stmt
                            : 2
                            : SQL_CLOB
                            : SQL_CLOB
                            : XmlSize
                            : 0
                            : p_Xml
                            : XmlLen );
           check_error(rc:env:conn:stmt:'SQLBindParam(2)');

The SQLBindParam() API associates a parameter marker (question mark) with a variable in the RPG program. That's done by telling SQL the address in memory at which a variable resides, as well as the data type and length of that variable.

In the first call to SQLBindParam(), I tell it the address of an RPG field named ItemNo, and I tell it that the data type of the field is zoned decimal (which SQL refers to as "numeric"), and I provide the length of the field and the number of decimal positions. SQL now associates this ItemNo variable with the first question mark (parameter marker) in my SQL statement. Essentially, whatever's stored in ItemNo is inserted into the ItemNo field of my database.

The second call to SQLBindParam() is where it gets interesting. Instead of using %ADDR() to get the address of a variable in memory, I provide p_XML, which is a pointer. This example assumes that something has already set p_XML to point to the data that I want to write to my CLOB.

When associating a LOB with a parameter marker, the eighth parameter to SQLBindParam() is required. You have to tell it the size of your field twice! The fifth parameter should be the maximum length of your variable in characters. The eighth parameter should be the current length in bytes.

When using a BLOB or a CLOB, the length in bytes is the same as the length in characters, because each character occupies one byte of memory. When using a DBCLOB, however, the distinction between bytes and characters is important. Remember that the fifth parameter is measured in characters, and the eighth is measured in bytes.

The main reason that you have to specify the length twice is because parameter markers can potentially be used for output data as well as input. In this example, that's a non-issue, because the Insert statement uses them only to read data from the program and write it to the database. In an output parameter's case, however, SQL needs to know how much memory you've provided, plus it needs a way to tell you how much data it's stored in that memory. In other words, parameter 5 tells it how big the memory allocation is, and parameter 8 is used to explain how many bytes were actually used.

Now that I've associated my RPG variables with the parameter markers, I can insert data into the database easily:

           ItemNo = 1234;
           // p_XML and XmlLen were set to point to my large string that I want to write, as well
           rc = SQLExecute(stmt);
           check_error(rc:env:conn:stmt:'SQLExecute()');

This code executes the Insert statement, writing a record to the database. If I want to, I can write another record just by changing the values of ItemNo and p_XML and calling SQLExecute() again. I don't have to call SQLBindParam() again, and I don't have to call SQLPrepare() again.

Reading data from the CLOB is similar in concept but requires a slightly different API call:

           // dynamically allocate space for 2 million bytes
           XmlSize = 2000000;
           p_Xml   = %alloc(XmlSize);

           // associate that dynamic memory allocation with the
           // first column in the SQL result set
           SQLBindCol( stmt
                     : 1
                     : SQL_CLOB
                     : p_Xml
                     : XmlSize
                     : XmlLen );


          // -------------------------------------------------------
          //   run the SQL Select statement. The data will be
          //   fetched into the dynamic memory.
          // -------------------------------------------------------

           rc = SQLExecute(stmt);
           check_error(rc:env:conn:stmt:'SQLExecute()');

          dow '1';
             rc = SQLFetch(stmt);
             check_error(rc:env:conn:stmt:'SQLFetch()');
             if (rc = SQL_NO_DATA_FOUND);
                leave;
             else;
                //  At this point, the data will be in the dynamic memory
                //  pointed to by p_XML. (You can look at the 'XML' variable
                //  in the debugger to see the first 1,000 bytes of the data.)
                //
                //  Also, the XmlLen variable will be set to the length of the
                //  data retrieved from the CLOB.
             endif;
          enddo;

In this example, I know that my CLOB can have (at most) 2 million bytes of data stored in it, but I don't know how long the data stored in the CLOB is at this moment. So I use the %ALLOC() built-in function (BIF) to ask the operating system for 2 million bytes of memory, and I store the address of that memory in the p_XML pointer.

I also tell SQLBindCol() that the length of the CLOB field should be returned into my XmlLen RPG variable. When the SQLExecute() is run, the dynamic memory (p_XML) will contain the contents of the Xml field, and the XmlLen variable will tell me how much data was loaded into that dynamic memory.

Error Handling

Every SQL CLI API returns a code indicating whether it succeeded or failed. To determine whether a statement failed, interrogate that return code. You can call a particular API to which you can pass the statement handle and the return code, and it tells you what the SQLSTATE (or SQLSTT, in RPG lingo) and the SQLCODE (SQLCOD, in RPG lingo) values are. Because calling that API separately after every SQL statement is cumbersome, I put all my error checking logic into a subprocedure named check_error() that I call after each SQL statement to check whether my SQL is working properly.

The code for check_error() is included in the code download.

More Information

If you want to learn more about SQL CLI, you might be interested in the article I wrote about CLI two years ago. Although the main focus of that article was to teach how to read a result set from a stored procedure, it does go into some depth about how CLI works. Here's a link:
http://systeminetwork.com/article/retrieve-sql-result-set-rpg

You might also be interested in the SQL CLI information in the IBM Information Center.
http://publib.boulder.ibm.com/infocenter/iseries/v5r4/topic/cli/rzadpkickoff.htm

Code Download

You can download the sample code for this article from the following link:
http://www.pentontech.com/IBMContent/Documents/article/56657_600_CliClob.zip

ProVIP Sponsors

ProVIP Sponsors