php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #57039 A new php_oci_lob_read using callbacks
Submitted: 2006-05-25 11:55 UTC Modified: 2006-06-08 14:41 UTC
From: msquillace at sogei dot it Assigned:
Status: Closed Package: oci8 (PECL)
PHP Version: Irrelevant OS: RedHat Enterprise Linux
Private report: No CVE-ID: None
 [2006-05-25 11:55 UTC] msquillace at sogei dot it
Description:
------------
We are using PHP 5.1.4 with OCI8 v.1.2.1 against Oracle 10g UTF-8 databases (v.10.2.0.1.0 and v.10.2.0.2.0), and have been experiencing problems related to CLOB and BLOB reading/loading via OCI-Lob->load and OCI-Lob->read.

Problems still happen with the latest CVS revisions of oci8_lob.c (Rev.1.13) and oci8.c (Rev.1.299).

I decided to try and modify the sources, and came up with a new version of php_oci_lob_read using callbacks that solves all our problems and should also be independent from CLOB encoding issues (Unicode included).

In the spirit of giving back something to the PHP community, and hoping it may be of interest, I append the patches (diff -uw; the CVS 1.13 revision is named oci8_lob.c.cvs) in the "reproduce code" section below.

I believe the new php_oci_lob_ctx typedef should really go in php_oci8_int.h but I put it in oci8_lob.c to minimize the diffs.

I found that the new PHP_OCI_LOB_BUFFER_SIZE (32768) is too small to guarantee optimal performance when reading multimegabyte LOBs, so initially built a version that computed (and cached in the LOB descriptor) the optimal LOB chunk size with OCILobGetChunkSize(); a multiple of this chunk size was then used to dynamically allocate the LOB transfer buffer.

In real use though, this appeared overkill; moreover in our setup we do not observe performance gains when the buffer size grows above 1MB, so I decided for the much simpler approach used in setting the bufsz variable.

Hope this helps.

PS: I know the patch code is far beyond the allowed length, but cannot provide an URL (we're behind firewalls). Alternatively, I can e-mail the patches (or the new oci8_lob.c) to the address of the OCI8 maintainer (maybe tony2001@php.net ?)

Reproduce code:
---------------
--- oci8_lob.c.cvs	2006-05-22 16:26:15.000000000 +0200
+++ oci8_lob.c	2006-05-25 16:35:04.000000000 +0200
@@ -144,25 +144,73 @@
 	return 0;
 } /* }}} */

+typedef struct { /* php_oci_lob_ctx {{{ */
+        char **d;            /* address of pointer to LOB data */
+        ub4 *dlen;           /* address of LOB length variable (bytes) */
+} php_oci_lob_ctx; /* }}} */
+
+/* {{{ php_oci_lob_callback()
+ Append LOB portion to a memory buffer */
+#if defined(HAVE_OCI_LOB_READ2)
+sb4 php_oci_lob_callback (dvoid *ctxp, CONST dvoid *bufxp, oraub8 len, ub1 piece, dvoid **changed_bufpp, oraub8 *changed_lenp TSRMLS_DC)
+#else
+sb4 php_oci_lob_callback (dvoid *ctxp, CONST dvoid *bufxp, ub4 len, ub1 piece TSRMLS_DC)
+#endif
+{
+        ub4 lenp = (ub4) len;
+        php_oci_lob_ctx *ctx = (php_oci_lob_ctx *)ctxp;
+
+	switch (piece)
+	{
+		case OCI_LAST_PIECE:
+			*(ctx->d) = erealloc(*(ctx->d), (size_t) (*(ctx->dlen) + lenp + 1));
+			memcpy(*(ctx->d) + *(ctx->dlen), bufxp, (size_t) lenp);
+			*(ctx->dlen) += lenp;
+			*(*(ctx->d) + *(ctx->dlen)) = 0x00;
+			return OCI_CONTINUE;
+
+		case OCI_FIRST_PIECE:
+		case OCI_NEXT_PIECE:
+			*(ctx->d) = erealloc(*(ctx->d), (size_t) (*(ctx->dlen) + lenp));
+			memcpy(*(ctx->d) + *(ctx->dlen), bufxp, (size_t) lenp);
+			*(ctx->dlen) += lenp;
+			return OCI_CONTINUE;
+
+		default:
+			php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unexpected LOB piece id received (value:%d)", piece);
+			efree(*(ctx->d));
+			*(ctx->d) = NULL;
+			*(ctx->dlen) = 0;
+			return OCI_ERROR;
+	}
+}
+
 /* {{{ php_oci_lob_read()
- Read specified portion of the LOB into the buffer */
+ Read specified LOB portion into a memory buffer */
 int php_oci_lob_read (php_oci_descriptor *descriptor, long read_length, long initial_offset, char **data, ub4 *data_len TSRMLS_DC)
 {
 	php_oci_connection *connection = descriptor->connection;
 	ub4 length = 0;
+	int is_clob = 0, bufsz = 0, stdbufsz = 1048576;
+	php_oci_lob_ctx ctx;
+	ub1 *bufp;
 #if defined(HAVE_OCI_LOB_READ2)
-	oraub8 bytes_read, bytes_total = 0, offset = 0;
+	oraub8 bytes_read, offset = 0;
 	oraub8 requested_len = read_length; /* this is by default */
 	oraub8 chars_read = 0;
 #else
-	int bytes_read, bytes_total = 0, offset = 0;
+	int bytes_read, offset = 0;
 	int requested_len = read_length; /* this is by default */
-	int chars_read = 0;
 #endif
-	int is_clob = 0;

+	/* Initialize ctx here ... callback may never receive OCI_FIRST_PIECE */
 	*data_len = 0;
 	*data = NULL;
+	ctx.dlen = data_len;
+	ctx.d = data;
+
+	/* Compute a good size for a dynamic LOB transfer buffer [avoids calling OCILobGetChunkSize()]. */
+	bufsz = (PHP_OCI_LOB_BUFFER_SIZE > stdbufsz) ? PHP_OCI_LOB_BUFFER_SIZE : stdbufsz;

 	if (php_oci_lob_get_length(descriptor, &length TSRMLS_CC)) {
 		return 1;
@@ -189,6 +237,8 @@
 		return 0;
 	}

+	offset = initial_offset;
+
 	if (descriptor->type == OCI_DTYPE_FILE) {
 		connection->errcode = PHP_OCI_CALL(OCILobFileOpen, (connection->svc, connection->err, descriptor->descriptor, OCI_FILE_READONLY));

@@ -198,7 +248,7 @@
 			return 1;
 		}
 	}
-#ifdef HAVE_OCI_LOB_READ2
+#if defined(HAVE_OCI_LOB_READ2)
 	else {
 		ub2 charset_id = 0;

@@ -214,56 +264,45 @@
 			is_clob = 1;
 		}
 	}
-#endif

-	*data = (char *)emalloc(requested_len + 1);
+	if (is_clob) {
+		chars_read = requested_len;
+		bytes_read = 0;
+	} else {
+		chars_read = 0;
 	bytes_read = requested_len;
-	offset = initial_offset;
-
-	/* TODO
-	 * We need to make sure this function works with Unicode LOBs
-	 * */
-
-#if defined(HAVE_OCI_LOB_READ2)
+	}

-	do {
-		chars_read = 0;
+	bufp = (ub1 *) ecalloc(1, bufsz);
 		connection->errcode = PHP_OCI_CALL(OCILobRead2,
 			(
 				connection->svc,
 				connection->err,
 				descriptor->descriptor,
 				(oraub8 *)&bytes_read,								/* IN/OUT bytes toread/read */
-				(oraub8 *)&chars_read,
+			(oraub8 *)&chars_read,							/* IN/OUT chars toread/read */
 				(oraub8) offset + 1,								/* offset (starts with 1) */
-				(dvoid *) ((char *) *data + *data_len),
-				(oraub8) requested_len,									/* size of buffer */
-				0,
-				NULL,
-				(OCICallbackLobRead2) 0,					/* callback... */
+			(dvoid *) bufp,
+			(oraub8) bufsz,								/* size of buffer */
+			OCI_FIRST_PIECE,
+			(dvoid *)&ctx,
+			(OCICallbackLobRead2) php_oci_lob_callback,				/* callback... */
 				(ub2) connection->charset,	/* The character set ID of the buffer data. */
 				(ub1) SQLCS_IMPLICIT					/* The character set form of the buffer data. */
 			)
 		);
-
-		bytes_total += bytes_read;
+	efree(bufp);
 		if (is_clob) {
-			offset += chars_read;
+		offset = descriptor->lob_current_position + chars_read;
 		} else {
-			offset += bytes_read;
+		offset = descriptor->lob_current_position + bytes_read;
 		}

-		*data_len += bytes_read;
-
-		if (connection->errcode != OCI_NEED_DATA) {
-			break;
-		}
-		*data = erealloc(*data, *data_len + PHP_OCI_LOB_BUFFER_SIZE + 1);
-	} while (connection->errcode == OCI_NEED_DATA);
-
 #else

-	do {
+	bytes_read = requested_len;
+
+	bufp = (ub1 *) ecalloc(1, bufsz);
 		connection->errcode = PHP_OCI_CALL(OCILobRead,
 			(
 				connection->svc,
@@ -271,28 +310,17 @@
 				descriptor->descriptor,
 				&bytes_read,								/* IN/OUT bytes toread/read */
 				offset + 1,								/* offset (starts with 1) */
-				(dvoid *) ((char *) *data + *data_len),
-				requested_len,									/* size of buffer */
-				(dvoid *)0,
-				(OCICallbackLobRead) 0,					/* callback... */
+			(dvoid *) bufp,
+			(ub4) bufsz,								/* size of buffer */
+			(dvoid *)&ctx,
+			(OCICallbackLobRead) php_oci_lob_callback,				/* callback... */
 				(ub2) connection->charset,	/* The character set ID of the buffer data. */
 				(ub1) SQLCS_IMPLICIT					/* The character set form of the buffer data. */
 			)
 		);
-
-		bytes_total += bytes_read;
-		offset += bytes_read;
-
-		*data_len += bytes_read;
-
-		if (connection->errcode != OCI_NEED_DATA) {
-			break;
-		}
-		*data = erealloc(*data, *data_len + PHP_OCI_LOB_BUFFER_SIZE + 1);
-	} while (connection->errcode == OCI_NEED_DATA);
-
+	efree(bufp);
+	offset = descriptor->lob_current_position + bytes_read;
 #endif
-
 	if (connection->errcode != OCI_SUCCESS) {
 		php_oci_error(connection->err, connection->errcode TSRMLS_CC);
 		PHP_OCI_HANDLE_ERROR(connection, connection->errcode);
@@ -301,7 +329,7 @@
 		return 1;
 	}

-	descriptor->lob_current_position = offset;
+	descriptor->lob_current_position = (int) offset;

 	if (descriptor->type == OCI_DTYPE_FILE) {
 		connection->errcode = PHP_OCI_CALL(OCILobFileClose, (connection->svc, connection->err, descriptor->descriptor));
@@ -315,9 +343,6 @@
 		}
 	}

-	*data = erealloc(*data, *data_len + 1);
-	(*data)[ *data_len ] = 0;
-
 	return 0;
 } /* }}} */



Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2006-05-29 07:31 UTC] tony2001 at phpclub dot net
Yes, please send the patch to me by mail.
Did you tests it yourself? With what Oracle server & Oracle client versions? On what platforms?
 [2006-05-29 11:37 UTC] msquillace at sogei dot it
The e-mail I sent you contains the patch, as requested.

The following is from the same mail, I repeat it here to further document the original submission.
----
We tested the patch ourselves, with selected "problem" CLOBs and (B)LOBs; our application programmers have since migrated several gigabytes of old (non-LOB) Oracle 8i data to a new Oracle 10g database schema using CLOBs/BLOBs without further hurdles.

Our Red Hat Enterprise Linux 3.0 (x86) web servers are all using Oracle 10g Client v.10.1.0.3.0 with PHP 5.1.4.

One database server is on a Solaris 9 machine and is an "Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production With the Partitioning, Oracle Label Security, OLAP and Data Mining Scoring Engine options".

The other database server under test is on a Red Hat Enterprise Linux 3.0 (x86) machine and is an "Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production With the Partitioning, OLAP and Data Mining options".
 [2006-06-08 08:53 UTC] michael at six dot de
The patch works great with php5.2-200606071030 using oracle 9.2 client or 10.1 instantclient (old suse linux 8.2) accessing utf8 oracle instances (oracle 9.2.0.5.0 on suse 9.3 and oracle 10.2.0.1.0 on solaris 10 sparc).
 [2006-06-08 14:41 UTC] tony2001 at phpclub dot net
Patch committed.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Dec 26 10:01:29 2024 UTC