Bug #57125 Persistent connections misbehave when Apache process times out
Submitted: 2006-07-04 05:27 UTC
Modified: 2006-08-21 12:42 UTC
From: msquillace at sogei dot it
Assigned:
Status: Closed
Package: oci8 (PECL)
PHP Version: Irrelevant
OS: RedHat Enterprise Linux
Private report: No
CVE-ID: None
 [2006-07-04 05:27 UTC] msquillace at sogei dot it
Description:
------------
In both PHP 5.1.4 with oci8.c v1.299 (2006/05/18 13:20:00) and the current OCI8 extension from php5.2-200607031230.tar.bz2 the active persistent connection misbehaves if the Apache process times out in the middle of a DB operation.

Specifically, suppose a (first) PHP script run by Apache process N and (re)using a persistent connection is executing a very long-running DB operation, or is waiting on a lock, when Apache times out. The script sends no output to the client (expected). Note that PHP itself doesn't time out in this situation (expected).

Now, if a new (second) PHP script is fed to the same Apache process and tries to reuse the same persistent connection it will get warnings similar to the following:

Warning: oci_execute(): ORA-24909: call in progress. Current operation cancelled in /php/prova_oracle.php on line 15
Warning: ocifetchinto(): ORA-24374: define not done before fetch or execute and fetch in /php/prova_oracle.php on line 17

A third PHP script will get slightly different warnings:

Warning: oci_execute(): ORA-24909: call in progress. Current operation cancelled in /php/prova_oracle.php on line 15
Warning: ocifetchinto(): ORA-24338: statement handle not executed in /php/prova_oracle.php on line 17

What happens next (fourth PHP script on the same Apache process) is quite bad and differs in the two above mentioned PHP configurations:

PHP 5.1.4 -> the Apache process segfaults;
current PHP snapshot -> the Apache process freezes FOREVER, ignoring any Apache timeout.

In order to solve the problem I believe PHP_RSHUTDOWN_FUNCTION(oci) should be made aware of PHP state when Apache times out (first script), and destroy the persistent connection(s) used by the timed-out script.

To make a long debugging session brief, I found a couple of ways to determine whether an OCI function was executing at the time of an Apache timeout; the one requiring the fewest code changes is implemented in the following patch to function php_oci_persistent_helper() in oci8.c:

diff -u old/oci8.c new/oci8.c
--- old/oci8.c  2006-06-28 18:30:51.000000000 +0200
+++ new/oci8.c  2006-07-03 16:17:03.000000000 +0200
@@ -1724,6 +1724,7 @@
 {
        time_t timestamp;
        php_oci_connection *connection;
+       char *function;

        timestamp = time(NULL);

@@ -1731,6 +1732,11 @@
                connection = (php_oci_connection *)le->ptr;

                if (connection->used_this_request) {
+                       function = get_active_function_name(TSRMLS_C);
+                       if(function && (2 < strlen(function)) && (!strncasecmp(function, "oci", 3))) {
+                               return 1; /* OCI call still in progress, close all (used) persistent connection(s). */
+                       }
+
                        if (connection->descriptors) {
                                zend_hash_destroy(connection->descriptors);
                                efree(connection->descriptors);


The above should work as long as the current naming convention holds; it is based on the (empirical) observation that get_active_function_name() normally returns NULL in a module's RSHUTDOWN function, but returns the interrupted PHP function name after an Apache timeout.
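The name test at the heart of the patch can be tried in isolation. The following is only a sketch: looks_like_oci_call() is a hypothetical helper name that does not exist in oci8.c, and the string passed in stands for whatever get_active_function_name() returned.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp() (POSIX) */

/*
 * Hypothetical helper (not part of oci8.c): returns 1 when the name of the
 * interrupted PHP function starts with "oci", compared case-insensitively,
 * and 0 otherwise. A NULL name (the normal RSHUTDOWN case described above)
 * yields 0.
 */
int looks_like_oci_call(const char *function)
{
    return function != NULL
        && strlen(function) > 2          /* same length guard as the patch */
        && strncasecmp(function, "oci", 3) == 0;
}
```

Both naming styles used in the reproduce scripts (oci_execute and OCIFetchInto) match, since the comparison ignores case. Any other function whose name happens to begin with "oci" would match as well, which is why the approach depends on the naming convention.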

Reproduce code:
---------------
First of all, start Apache in single-process mode (I'd also set the default timeout in httpd.conf to something like 30s):

[root@websrv tmp]# /path/to/httpd/httpd -X

To reproduce the problem, we can lock a DB record using e.g. sqlplus without committing:

SQL> select campo1 from table1 for update;

In a second terminal session (e.g. via curl), or from a browser, we then run the following script once:

<?php
$pid=getmypid();
echo "pid:$pid<br>\n";
$conn = oci_pconnect("user", "pwd", "test");
$query = 'select campo1 from table1 for update';
$stid = OCIParse($conn, $query);
oci_execute($stid, OCI_DEFAULT);
while($succ = OCIFetchInto($stid, $row)) {
  foreach($row as $item) {
    echo $item." ";
  }
  echo "<br>\n";
}
?>

The script will time out and return nothing (expected).

We then run the following script three times (any other script reusing the same persistent connection would do), resulting in the sequence of events described above:

<?php
$pid=getmypid();
echo "pid:$pid<br>\n";
$conn = oci_pconnect("user", "pwd", "test");
$query = "select table_name from user_tables where table_name='TABLE1'";
$stid = OCIParse($conn, $query);
oci_execute($stid, OCI_DEFAULT);
while($succ = OCIFetchInto($stid, $row)) {
  foreach($row as $item) {
    echo $item." ";
  }
  echo "<br>\n";
}
?>



History

 [2006-07-05 04:30 UTC] tony2001 at phpclub dot net
This solution is just a hack and I would prefer to have something more realistic.
Also, I don't actually see this as a major issue - you can always shoot yourself in the foot, nobody can prevent that.
If you intentionally LOCK the connection, it will be... yes, locked. And that's clearly a user error.
 [2006-07-05 10:57 UTC] msquillace at sogei dot it
The proposed solution may qualify as a hack, but the issue is quite real.

What is described in the Reproduce Code section is just an easy way to REPRODUCE the misbehavior, and certainly NOT a piece of actual working code.

In my experience, however careful the design and skilled the developers may be, when a complex PHP application is eventually benchmarked and/or deployed unexpected and/or untested problems appear, and they almost invariably come from the database.

The actual working code that prompted my analysis runs quickly on the development servers but, depending on the specific request, MAY time out on the validation servers where a complete database resides, for reasons still under investigation.

What I don't like is PHP's behavior in this case; I believe it should never segfault nor freeze, and the "hack" is nonetheless a working solution.

There may be other, more elegant ways around the problem, of which I am sadly NOT aware.

What I also (successfully) tried is a redefinition of the PHP_OCI_CALL macro to set/reset a global variable that will be tested in php_oci_persistent_helper().
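
The flag-based alternative can be outlined roughly as follows. This is a sketch under assumptions, not the actual macro: SKETCH_OCI_CALL, sketch_in_call and the other sketch_* names are hypothetical, and sketch_fake_oci_call() merely stands in for a real OCI library call.

```c
/* Hypothetical global flag: non-zero only while an OCI call is running. */
int sketch_in_call = 0;

/* Records what the flag looked like from inside the wrapped call. */
int sketch_seen_during_call = -1;

/* Stand-in for a real OCI library function. */
void sketch_fake_oci_call(void)
{
    sketch_seen_during_call = sketch_in_call;
}

/* Set the flag, run the call, reset the flag once the call returns. */
#define SKETCH_OCI_CALL(call)    \
    do {                         \
        sketch_in_call = 1;      \
        call;                    \
        sketch_in_call = 0;      \
    } while (0)

/*
 * Stand-in for the test in php_oci_persistent_helper(): a flag that is
 * still set at request shutdown means the call never returned, so the
 * persistent connection should be destroyed.
 */
int sketch_helper_should_destroy(void)
{
    return sketch_in_call != 0;
}

/* A call that completes normally leaves the flag cleared. */
int sketch_run_normal_call(void)
{
    SKETCH_OCI_CALL(sketch_fake_oci_call());
    return sketch_helper_should_destroy();
}
```

Compared with the function-name test, a flag like this does not depend on the naming convention, but every OCI call has to go through the macro.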
 [2006-08-09 06:16 UTC] tony2001 at phpclub dot net
This bug has been fixed in CVS.

In case this was a documentation problem, the fix will show up at the
end of next Sunday (CET) on pecl.php.net.

In case this was a pecl.php.net website problem, the change will show
up on the website in short time.
 
Thank you for the report, and for helping us make PECL better.


 [2006-08-11 08:42 UTC] msquillace at sogei dot it
I tested the patch with php5.2-200608100830.tar.gz and the "Reproduce code". The misbehaviour is still there.

This happens because PG(connection_status) & PHP_CONNECTION_TIMEOUT evaluates to FALSE in php_oci_persistent_helper: a trace shows that, when the Apache process times out, PG(connection_status)=0 while PHP_CONNECTION_TIMEOUT=2.
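
The failing test is a plain bit-mask check, sketched below. The SKETCH_* names and sketch_timed_out() are hypothetical stand-ins; the values mirror PHP's connection status bit field (0 normal, 1 aborted, 2 timeout), of which only the timeout value is quoted in this report.

```c
/* Stand-ins for PHP's connection status values. */
#define SKETCH_CONNECTION_NORMAL  0
#define SKETCH_CONNECTION_ABORTED 1
#define SKETCH_CONNECTION_TIMEOUT 2

/*
 * Same shape as the test "PG(connection_status) & PHP_CONNECTION_TIMEOUT":
 * non-zero only when the timeout bit is set in the status word.
 */
int sketch_timed_out(int connection_status)
{
    return (connection_status & SKETCH_CONNECTION_TIMEOUT) != 0;
}
```

With the traced status of 0 the expression is 0 & 2, i.e. false, so a check of this kind never sees the Apache timeout.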

When I first investigated the problem I ruled out OCIBreak() since the Oracle manual states it is not supported for Windows servers (I am told the latest version may have changed this, but I wished to preserve backward compatibility).

Nonetheless, since your patch calls it and the approach would allow keeping the persistent session running, I replaced the non-working PG(connection_status) test with my proposed test on get_active_function_name() so I could study its effect.

Well, when OCIBreak() gets called at the end of the timed-out script, the connection state (as seen from sqlplus) changes from ACTIVE to INACTIVE, *BUT* the next script I call still gets ORA-24909, ORA-24374 and ORA-24338 errors. The Apache process doesn't abend, though: when OCIBreak() is called twice for the same connection it returns an error, which forces php_oci_persistent_helper to return 1 and so destroys the persistent connection, as my original code would always do.

To summarize:
1) The PG(connection_status) test fails when the Apache process times out;
2) OCIBreak() is a disappointing substitute for persistent connection destruction, which AFAIK is the only clean solution to this misbehaviour.

Bug status (re)set to OPEN.
 [2006-08-11 08:55 UTC] tony2001 at phpclub dot net
>a trace shows that, when the Apache process times out, PG(connection_status)=0 
Tested with CGI, Apache1 and Apache2.
In all cases PG(connection_status) is PHP_CONNECTION_TIMEOUT.
 [2006-08-11 10:04 UTC] tony2001 at phpclub dot net
Ok, I tested with some different code:
<?php $c = oci_pconnect(); for (;;) {} ?>
In this case it DOES time out and PG(connection_status) has the correct value.

But, for example, with $query = "begin dbms_lock.sleep(5); end;"; it doesn't time out; it waits for the statement to finish executing, so PG(connection_status) is 0.
At the same time (because the statement was successfully executed) the connection can be reused again.

So the only thing I can say is "cannot reproduce".
 [2006-08-11 10:07 UTC] msquillace at sogei dot it
Running mod_php with Apache 1.3.32 here.
Rebuilt php5.2-200608100830 ten minutes ago with:

./configure --prefix=/opt/www --mandir=/usr/man \
--with-config-file-path=/opt/www/conf --disable-debug \
--enable-safe-mode --with-exec-dir=/opt/web/php \
--enable-track-vars --enable-magic-quotes \
--with-apxs=/opt/www/bin/apxs --enable-bcmath \
--with-pdflib=no --enable-ftp --with-gd \
--with-jpeg-dir --with-png-dir --with-zlib-dir \
--with-xpm-dir --with-ttf --with-freetype-dir \
--enable-gd-native-ttf --enable-pcntl --enable-soap \
--with-mysql=no --without-mm --with-ldap --with-bz2 \
--enable-sysvsem --enable-sysvshm --enable-wddx \
--enable-xml --enable-sockets --with-curl=/opt/curl \
--enable-trans-sid --with-openssl --enable-calendar \
--enable-sigchild --enable-shmop --with-zlib \
--with-oci8=/home/oracle --enable-mbstring=all \
--with-gmp --with-mcrypt=/opt/mcrypt \
--with-mhash=/opt/mhash --with-zip=/opt/zziplib \
--with-iconv --with-pthread --with-dom --with-libxml-dir=/opt/libxml2

and re-ran the test scripts; PG(connection_status)=0, again.

I tested with "/path/to/httpd/httpd -X", "Timeout 30" in httpd.conf and "max_execution_time = 30" in php.ini, but also with set_time_limit(10) and set_time_limit(40) with no difference in the end result, as expected.

Are you using the "Reproduce code"?
 [2006-08-11 10:38 UTC] msquillace at sogei dot it
Looks like we were both writing at the same time ...

To reproduce you can simply lock a record with sqlplus, or with a PHP CLI script that "never" terminates e.g. by calling sleep(3600) after the select for update.

You then use the browser (or curl) to invoke another script trying the same select for update; it will time out (Apache, not PHP timeout).

A last script can now try to reuse the same persistent connection for whatever activity, and it will get the errors which prompted me to open the bug.
 [2006-08-15 08:17 UTC] tony2001 at phpclub dot net
If I lock the record in sqlplus, Apache doesn't time out at all.
The process keeps running forever.
This happens with CGI and Apache1/Apache2.
 [2006-08-16 06:25 UTC] msquillace at sogei dot it
We do not use PHP in CGI mode here, only mod_php (and Apache1 so far) so I can't confirm the Apache behaviour you observe.

Nonetheless, in the past we experienced situations in which Apache didn't time out when blocked in Oracle operations, and even devised a PHP daemon that kills "runaway processes" of that kind after the Apache timeout has expired.

This does not happen in this situation, though; when "strace"ing the process I see it stop for the duration of the Apache timeout, then the process receives a SIGALRM and processing resumes in the PHP shutdown function I register in the auto_prepend script to do standardized cleanup.

I believe Apache starts a timer and, when that is triggered, the PHP script is simply interrupted; Apache then gives control back to PHP so it can process the RSHUTDOWN functions, but apparently the engine still believes it is executing an OCI function. That's why my patch works.
 [2006-08-16 06:59 UTC] tony2001 at phpclub dot net
http://tony2001.phpclub.net/dev/tmp/oci8_in_call.diff
Please try this patch (with latest OCI8 CVS).
 [2006-08-17 03:47 UTC] m dot squillace at flashnet dot it
Thank you for the patch, I will try it as soon as I get back to work next Monday.
The new PHP_OCI_CALL_RETURN macro looks a lot like the modified PHP_OCI_CALL macro I mentioned on 2006-07-05, so I am confident it will work.
You had to update several source files to switch macros, though; by leveraging a ternary operator "trick" my version of PHP_OCI_CALL sets the global variable and maintains compatibility with the original macro.
If interested, I will mail it to you for revision (don't know if such tricks are allowed in PHP's sources).
 [2006-08-21 12:01 UTC] msquillace at sogei dot it
Downloaded php5.2-200608211430.tar.bz2 and applied your patch.
Everything is now working as it should, bug fixed!
 [2006-08-21 12:42 UTC] tony2001 at phpclub dot net
Patch committed.
 