The issues faced by programmers writing ODBC applications broadly fall into two areas:
The first issue - understanding how ODBC works (programmatically) and how to write interoperable ODBC applications - will be covered in this article, part 9. Issues specific to different languages will be covered in the following months. The plan is:
It may seem strange to be talking about ODBC programming in languages other than C, since the ODBC API is actually a C language API. But other languages have built interfaces to ODBC, interfaces that range from providing most of the power of the ODBC API, all the way down to just providing the ability to pass SQL statements and get results. The list above is not all-inclusive: for instance IBM released a rather short lived language called Visual Basic which was an attempt to create a version of BASIC with objects which had an ODBC interface.
In order to better understand what ODBC is and how ODBC works it helps to place the ODBC API in the context of other APIs, so we will look at embedded SQL programming and generic CLI programming before moving on to ODBC.
Since the goal was to create a language that simplified and standardized how to access data, the language was designed to be non-procedural. A procedural language is one where the desired result is defined by a number of ordered steps - the program. A non-procedural language is where the desired result is described by a single language statement. The thought was that many users of the data (lawyers, accountants, etc.) would not be capable of writing program loops in order to retrieve data. So the non-procedural nature of SQL allowed the user to ask a single question, albeit a sometimes complex question, in the form of a single SQL statement.
The non-procedural aspects of SQL pose some problems which were recognized early on by database vendors. There are some problems that are either impossible or very difficult to do with only SQL, which become possible or much easier within a procedural context. Therefore some database vendors created procedural "dialects" of SQL. For example Sybase has Transact-SQL, Oracle has PL/SQL and Informix has SPL; each of which has normal flow-control devices (IF-THEN, loops, etc.)
But even with procedural constructs added to SQL the language cannot stand on its own; it has to be employed within or by another programming language. SQL has no output facilities, no cout , printf() or say statements. The data retrieved from the database by SQL has to be transferred to another language in order to be sent to the printer, viewed on the screen, or to be otherwise generally useful.
There are generally two methods of interfacing SQL in a programming language.
The embedded method is the oldest method and is the method that SQL was designed around. Using functions to pass SQL statements to the database is a later development. This type of interface is called the Call Level Interface (CLI) and has also been standardized in both the X/Open CAE Specification "Data Management: SQL Call-Level Interface (CLI)" and ISO/IEC 9075-3:1995 (E) Call-Level Interface (SQL/CLI).
Writing programs is easier with the embedded method because it tracks closer to SQL than the function call method. However compiling programs is much easier with the function call/CLI method because the precompiler step is avoided, along with the expansion of the SQL statement(s) into multiple programming language statements and functions. Getting a programming language syntax error with the embedded method can be a nightmare to resolve because of all that extra stuff inserted into the program by the precompiler.
ODBC is based on the CLI method but it adds some additional features, which we will see below.
It helps to understand the added complexity of the CLI method by comparing it to embedded SQL programming. Before we can do that however we must stop for a moment and talk about the SQL SELECT statement. Although this is not a SQL tutorial I need to show you enough to be able to explain the embedded and CLI methods.
The SELECT statement in SQL retrieves data from a table in a database. The form of the statement is:
SELECT column_name,column _ name ,... FROM table_name WHERE condition
Column_name is the name(s) of the columns from the table you want to retrieve. The WHERE clause is optional and allows for specifying a condition that limits the rows from the table that are retrieved.
For example if we wanted to look at all names and addresses of all the democratic voters in the city of Littleton we could write a select statement that says
select first_name,last_name,address from voters where city='LITTLETON' and party = `DEMOCRAT'
This would retrieve about 5,856 rows from a table of 22,656 (voters.html) registered votes - those 5856 rows are called a "result set". So far so good. But how will we deal with 5856 rows of data? We really need to look at each individual row, or send each individual row to a printer. SQL has a construct called a cursor , which is a result set associated with a pointer. We can use the pointer to select single rows from the result set and get the data from the columns on that row. The way that is done is (and here SQL's non-procedural nature seems to change to something that sure seems procedural-like):
Each time the FETCH statement is executed the cursor pointer is moved to the next row and the data is copied to the host programming language variables that are specified. Traditionally the pointer can only move forward through the result set and can only move one row at a time. Newer databases add the ability to move the pointer either forwards or backwards and by some arbitrary number of rows in either direction. (ODBC has something called a cursor library that will provide the backward and arbitrary movement functions if the database doesn't support those features.)
Now lets look at an example of the embedded method.
The example below is a simple C program that opens a cursor requesting all the democratic voters from the city of Littleton, and prints out the results. This program would have to be run through a precompiler, supplied by the database vendor. The output from the precompiler would then be compiled.
EXEC SQL INCLUDE SQLCA;
EXEC SQL BEGIN DECLARE SECTION;
EXEC SQL END DECLARE SECTION;
int rn=0;
/* connect to the database */
EXEC SQL CONNECT "DBA"
/* create cursor for SELECT statement */
EXEC SQL DECLARE CURSOR c1 FOR select first_name,last_name,address from voters
where city='LITTLETON' and party = `DEMOCRAT';
EXEC SQL OPEN CURSOR c1;
while (1) /* start a loop */
{
}
With the embedded interface SQL integrates rather nicely with a programming language, in this example C. It is fairly straightforward to send SQL statements to the database and get the data from the database into the programming language (the host language) variables. The disadvantage is that the program will only run against the database it is precompiled for. This is because the precompiler is specific to a database; each database vendor uses their own precompiler that only works on their database. This also means that a single program cannot easily access more that one database; to access more than one vendor's database in a single program you would have to break the program into separate modules for each database, precompile and compile each module separately and then link the modules together into a single program.
The sequence is somewhat more involved. The basic steps are:
/* handles */
HENV henv;
HDBC hdbc;
HSTMT hstmt;
RETCODE rc;
/* variables to hold data returned from database */
char fname[254];
char lname[254];
char address[254];
/* stores length of data return from fetch */
SDWORD fnameLen;
SDWORD lnameLen;
SDWORD addressLen;
/* row counter variable */
int rn;
/* allocate an environment handle */
rc = SQLAllocEnv(&henv);
/* allocate a connection handle */
rc = SQLAllocConnect(henv,&hdbc);
/* make the connection */
rc = SQLConnect(hdbc,"DBA",SQL_NTS,NULL,NULL,NULL,NULL);
/* allocate a statement handle */
rc = SQLAllocStmt(hdbc,&hstmt);
/* now execute the select statement */
rc = SQLExecDirect(hstmt,"select first_name,last_name,address from voters
where city='LITTLETON' and party = 'DEMOCRAT',SQL_NTS);
/* now specify which C variables are to hold the data */
rc = SQLBindCol(hstmt,1,SQL_C_CHAR,fname,254,&fnameLen);
rc = SQLBindCol(hstmt,2,SQL_C_CHAR,lname,254,&lnameLen);
rc = SQLBindCol(hstmt,3,SQL_C_CHAR,address,254,&addressLen);
while(1)
{
}
Notes on the code.
The CLI program requires more code to accomplish the same thing. For example, in embedded programming we can simply specify which host variable we want to use in the FETCH statement; with CLI programming we have to call a function for each host variable which is used to define the name, type and size of the host variable - things that the precompiler figures out (type and size anyway) for us automatically. The trade-offs are:
ODBC is also a superset of the Call-Level Interface, meaning that ODBC provides some additional functions and features that do not appear in the CLI standard. These additional features include: the Driver Manager, code-page translation libraries, functions that report the database driver's ANSI SQL conformance and support, escape sequences for various literal values and scalar functions, and standardized scalar functions. Each of these additional features is discussed below.
The Driver Manager is a library that manages communications between applications and drivers (meaning here CLI libraries). The Driver Manager exists mainly as a convenience to application writers and solves a number of problems common to all applications. These include determining which driver (CLI library) to load based on a data source name, loading and unloading drivers, and calling functions in drivers.
To see why the latter is a problem, consider what would happen if the application called functions in the driver directly - which is what happens in normal CLI programming. Unless the application was linked directly to a particular driver, it would have to build a table of pointers to the functions in that driver and call those functions by pointer. Using the same code for more than one driver at a time would add yet another level of complexity. The application would first have to set a function pointer to point to the correct function in the correct driver, and then call the function through that pointer.
The Driver Manager solves this problem by providing a single place to call each function. The application is linked to the Driver Manager, not the driver. The application identifies the target driver and builds a table of pointers to the functions in that driver. It uses the connection handle passed by the application to find the address of the function in the target driver and calls that function by address.
For the most part the Driver Manager just passes function calls from the application to the correct driver. However it also implements some functions itself - mainly those that query the list of data source names and drivers installed - and performs basic error checking. For example, the Driver Manager checks that handles are not null pointers, that functions are called in the correct order, and that certain function arguments are valid.
The final role of the Driver Manager is loading and unloading drivers. The application loads and unloads only the Driver Manager. When it wants to use a particular driver it calls a connection function in the Driver Manager and specifies a data source name (or a driver type). Using this name the Driver Manager searches the data source information (on OS/2 this is ODBC.INI) for the driver's file. It then loads the driver (assuming it is not already loaded), stores the address of each function in the driver, and calls the connection function in the driver, which then initializes itself and connects to the data source. When the application is done using the driver it calls a disconnect function in the Driver Manager which calls that function in the driver and then unloads the driver from memory.
Escape sequences are provided for date, time and timestamp values, scalar function calls, LIKE escape characters (a way of specifying _ or % as literal characters to be searched), outer joins and procedure calls.
Date escape sequence is {d `YYYY-MM-DD'}, example {d `2001-12-25'}
Time escape sequence is {t `hh:mm:ss'}, example {t `23:04:01'}
Timestamp sequence is {ts `YYYY-MM-DD hh:mm:ss'}, example {ts `2001-12-25 23:04:01'}
An example of a date escape sequence in a select statement is
select first_name, last_name from voters where registered_date after {d `1999-01-01'}
The escape sequence for ODBC defined scalar functions is {fn scalar-function }, for example {fn UCASE(NAME)}. The ODBC defined scalar functions are discussed below.
The outer join escape sequence is {oj outer-join } where outer-join is:
table-reference LEFT OUTER JOIN { table-reference | outer-join } ON search-condition
for example
select name, deptname from {oj employee left outer join dept on employee.deptid=dept.deptid} where employee.projnum = 544
Scalar functions are functions that can be used in place of a column name in a select statement.
I will just list the scalar functions here because the purpose and usage of most are so obvious explanation is not necessary.
ASCII(
string_exp
)
CHAR(
code
)
CONCAT(
string1, string2
)
DIFFERENCE(
string1, string2
)
INSERT(
string1, start, length string2
)
LCASE(
string
)
LEFT(
string, count
)
LENGTH(
string
)
LOCATE(
string1, string2[, start]
)
LTRIM(
string
)
REPEAT(
string,count
)
REPLACE(
haystack,needle, new_needle
)
RIGHT(
string, count
)
RTRIM(
string
)
SPACE(
count
)
UCASE(
string
)
ABS(
number
)
ACOS(
float
)
ASIN(
float
)
ATAN(
float
)
ATAN2(
float1, float2
)
CEILING(
number
)
COS(
float
)
COT(
float
)
DEGREES(
number
)
EXP(
float
)
FLOOR(
number
)
LOG(
float
)
LOG10(
float
)
MOD(
integer1, integer2
)
PI()
POWER(
number,integer
)
RADIANS(
number
)
RAND(
integer
)
ROUND(
number, integer
)
SIGN(
number
)
SIN(
float
)
SQRT(
float
)
TAN(
float
)
TRUNCATE(
number,integer
)
CURDATE()
CURTIME()
DAYNAME(
date
)
DAYOFMONTH(
date
)
DAYOFWEEK(
date
)
DAYOFYEAR(
date
)
HOUR(
time
)
MINUTE(
time
)
MONTH(
date
)
MONTHNAME(
date
)
NOW()
QUARTER(
date
)
SECOND(
time
)
TIMESTAMPADD(
interval, integer, timestamp
)
TIMESTAMPDIFF(
interval, timestamp1, timestamp2
)
WEEK(
date
)
YEAR(
date
)
DATABASE()
IFNULL()
USER()
CONVERT(
value, data_type)
An application written to the ODBC API of a certain version requires a Driver Manager at that version or higher and a database driver at that version or higher. Meaning that an application written and compiled to version 2.x will work with a version 3 Driver Manager and database driver; but an application written and compiled to version 3 will not work with a version 2.x Driver Manager or database driver.
The major differences between version 2.x and version 3.x are those needed to bring ODBC into compliance with the CLI standards, which mostly revolved around changing function names a number of #DEFINES changing from version 2 to version 3. Version 3 really doesn't add much in the way of additional capabilities.
Since there are a number of ODBC version 2.x database drivers in OS/2 that have never been updated to version 3, it is probably best to write applications to the version 2.5 API. This way the application will be compatible with the widest range of database drivers. The one exception to this might be if you are trying to write code that is source compatible with both the ODBC and the CLI standard, so the same application could be used with or without ODBC (although there is really not much advantage to not using ODBC). To be source compatible with most database vendors that conform to the CLI standards, other than DB2, you would have to code your application to ODBC version 3. With DB2 you can write the application to ODBC version 2.5 and still have it compile as a DB2 CLI application.
If you are using a language other than C or C++ you probably don't have a choice. The language will have its data access part written to a version of ODBC and that is what you are stuck with. But it won't matter to you anyway because the language will mask any differences between versions.
#define SQL_CHAR 1
#define SQL_NUMERIC 2
#define SQL_DECIMAL 3
#define SQL_INTEGER 4
#define SQL_SMALLINT 5
#define SQL_FLOAT 6
#define SQL_REAL 7
#define SQL_DOUBLE 8
#define SQL_VARCHAR 12
#define SQL_DATE 9
#define SQL_TIME 10
#define SQL_TIMESTAMP 11
#define SQL_LONGVARCHAR (-1)
#define SQL_BINARY (-2)
#define SQL_VARBINARY (-3)
#define SQL_LONGVARBINARY (-4)
#define SQL_BIGINT (-5)
#define SQL_TINYINT (-6)
#define SQL_BIT (-7)
ODBC functions that return the structure of a table and the characteristics of a result set, return both the database name for the column type and the ODBC type mapped to that column. This helps, but it is up to the driver how to map a particular database type to an ODBC type. For example some Oracle database drivers return SQL_DATE for an Oracle DATE column; others return the ODBC type as SQL_TIMESTAMP. So you end up having to know and program around the database vendor and driver vendor instead is blindly relying on the ODBC types to solve type mismatches.
Some database data types have no equivalent in C. For example DB2 has a decimal type that doesn't really map well into any existing C type; the value loses precision when placed into C float and double types.
When you map a host variable to a column in a result set, you specify the type of the host variable. The database driver is responsible for converting the data from the database type into the host variable type. Likewise if you are using input parameters the driver is responsible for converting from the host variable type to the database data type. (We haven't talked about input parameters, but the concepts are basically the same for input parameters as output parameters.)
This month we tried to show some of why ODBC is valuable from a programmer's perspective, some basic concepts behind ODBC and some features of ODBC that can and should be used regardless of the language used for writing an ODBC enabled database application. Next month we will discuss writing ODBC applications in C and C++, how to compile the application and where to get the headers and libraries needed. Our intent is not to provide a tutorial on ODBC programming, but rather to explain the mechanics of writing ODBC applications on OS/2. There are books that already exist on writing ODBC applications, and the ODBC reference manuals from Microsoft are actually pretty good themselves.
1 - Database Principles, Programming, and Performance page 26; by O`Neil and O`Neil ISBN: 1-55860-438-3.
This article is courtesy of www.os2ezine.com. You can view it online at http://www.os2ezine.com/20011216/page_10.html.