Bug #74120 SyncEvent fires only when someone is waiting
Submitted: 2017-02-17 13:58 UTC Modified: 2017-02-20 05:38 UTC
From: amancio at prjc dot com dot br Assigned: cubic (profile)
Status: Closed Package: PECL (PECL)
PHP Version: 5.6.30 OS: Linux
Private report: No CVE-ID: None
 [2017-02-17 13:58 UTC] amancio at prjc dot com dot br
Description:
------------
Using the PECL 'sync' extension.

When one SyncEvent instance calls fire() in a thread, another instance of the same event is signaled only if it is already blocked in wait(). If that other thread is busy with something else at the time, its next call to wait() blocks indefinitely.

This behaviour was observed only on Linux with sync 1.1.0. On Windows, sync 1.1.0 behaves correctly (the next call to wait() knows the event was already fired). Sync 1.0.1 (the previous version) behaves correctly on both Windows and Linux.

Test script:
---------------
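// A single-threaded example:
$event = new SyncEvent("Test-123", true);  // second argument true = manual-reset event
$event->fire();
// wait(0) returns immediately; it should report that the event has already fired.
if ($event->wait(0))
  echo "Fired, ok";
else
  echo "Not fired?";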

Expected result:
----------------
Every version should print "Fired, ok" on every platform.

Actual result:
--------------
On Windows, Sync 1.0.1 and 1.1.0 both print "Fired, ok".
On Linux, Sync 1.0.1 prints "Fired, ok", but Sync 1.1.0 prints "Not fired?".

 [2017-02-17 15:38 UTC] cubic@php.net
-Status: Open +Status: Assigned
 [2017-02-17 15:38 UTC] cubic@php.net
Thanks for the bug report and test case.  Looks like a probable regression in manual reset event objects.  I'll look into this.

Note that any bug in the PECL package will almost always also be present in the upstream library:

https://github.com/cubiclesoft/cross-platform-cpp

Version 1.1.0 represents a major rewrite to bypass POSIX semaphore issues on some platforms (e.g. Mac) and added support for PHP 7.  Windows saw no significant changes other than PHP 7 support.  That at least explains why any regressions took place.
 [2017-02-18 03:11 UTC] cubic@php.net
-Assigned To: +Assigned To: cubic
 [2017-02-18 03:11 UTC] cubic@php.net
I am unable to replicate the problem on Ubuntu Linux.  However, after staring at the logic for a while, I did find two semi-related bugs with the underlying event object implementation.  But neither bug should have triggered in a truly isolated setup.

It would help me to know what var_dump($event->fire()) returns.  Useful tip:  var_dump() is frequently easier than writing if/else logic.

What flavor of Linux are you running?  There are subtle differences in POSIX implementations across even the most common Linux distros.

Does the object exist in /dev/shm/?  What permissions does it have (should be 0666)?  If you delete the object manually, does the test start working as expected?
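The /dev/shm check above can be scripted. The following is only a minimal sketch: list_sync_event_objects() is a hypothetical helper, not part of the sync extension, and the "Sync_Event-*" file name prefix is an assumption based on the file names reported in this thread.

```php
<?php
// Hypothetical helper (not part of the sync extension): lists event objects
// in a shared-memory directory together with their permission bits.
function list_sync_event_objects($dir = '/dev/shm')
{
    $result = array();
    // Assumption: event objects are stored as files named "Sync_Event-*".
    $paths = glob($dir . '/Sync_Event-*') ?: array();
    foreach ($paths as $path) {
        $perms = fileperms($path);
        $result[] = array(
            'path'  => $path,
            // Keep only the permission bits and format them as octal text.
            'perms' => ($perms === false) ? null : sprintf('%04o', $perms & 0777),
        );
    }
    return $result;
}

// Print one line per object found; prints nothing if there are none.
foreach (list_sync_event_objects() as $object) {
    $perms = ($object['perms'] === null) ? '????' : $object['perms'];
    printf("%s %s\n", $perms, $object['path']);
}
```

Any entry whose permissions are not 0666 is worth a closer look. Deleting the listed file by hand and rerunning the test script covers the last question above.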

Also of note:  ext/sync/tests/011.phpt runs named manual event objects through its paces as best as a single process can test them.  You should run the test suite to see if there are other object types broken on your platform.  This command should work:

pecl run-tests -p sync
 [2017-02-19 21:16 UTC] amancio at prjc dot com dot br
Hi Thomas, thank you for spending your time on this!

This time I did more extensive tests to get better details about the problem. I tested on 3 different Linux boxes:
* An Arch Linux, kernel 4.8.13-1-ARCH, apache 2.4.25, PHP 5.6.30;
* A CentOS 7.2.1511, kernel 3.10.0-327.36.3.el7.x86_64, apache 2.4.6, PHP 5.4.16;
* A CentOS 6.6, kernel 2.6.32-504.8.1.el6.x86_64, apache 2.2.15, PHP 5.3.3;

I slightly changed the test script and made two variants of it:

// Test script A
$event = new SyncEvent("Test-123", true);
var_dump($event->wait(0));  // pay attention here
var_dump($event->fire());
var_dump($event->wait(0));
var_dump($event->reset());

// Test script B
$event = new SyncEvent("Test-123", true);
var_dump($event->wait(1));  // you can put any number > 0 here
var_dump($event->fire());
var_dump($event->wait(0));
var_dump($event->reset());

On every Linux box, the first (clean) run of either script creates a /dev/shm/Sync_Event-* file with 0666 permissions.

The first run of test script A always works: it prints bool(false) bool(true) bool(true) bool(true). And in all cases, test script B always fails: it prints bool(false) bool(true) bool(false) bool(true).
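Since scripts A and B differ only in the timeout passed to the first wait(), the comparison can be folded into one helper. This is only a sketch: run_event_sequence() is a hypothetical name, and the SyncEvent part runs only if the sync extension is loaded.

```php
<?php
// Hypothetical helper: runs the wait/fire/wait/reset sequence from scripts A
// and B on any object exposing wait(), fire() and reset(), and returns the
// four results in call order.
function run_event_sequence($event, $firstWait)
{
    return array(
        $event->wait($firstWait), // script A uses 0, script B any value > 0
        $event->fire(),
        $event->wait(0),          // true is expected once the event has fired
        $event->reset(),
    );
}

// Only touch the real extension when it is available.
if (class_exists('SyncEvent')) {
    foreach (array('A' => 0, 'B' => 1) as $label => $firstWait) {
        $event = new SyncEvent("Test-123", true);
        echo "Script $label: ";
        var_dump(run_event_sequence($event, $firstWait));
    }
}
```

Both variants share the named object "Test-123", so a leftover /dev/shm/Sync_Event-* file from an earlier run can influence the result.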

On both CentOS boxes, repeated runs of script A always work, but script B never works. Even alternating the order of execution doesn't change this result.

On Arch Linux, however, a single run of script B breaks something. Any subsequent run of script A prints bool(false) bool(true) bool(false) bool(true), until I manually remove the /dev/shm/Sync_Event-* file.

I ran the tests as you suggested. On both CentOS boxes all 16 tests passed, but on Arch two tests failed:

FAIL [14/16] SyncReaderWriter - named reader-writer allocation, locking, and unlocking freeze test.[/usr/share/php56/pear/test/sync/tests/014.phpt]
FAIL [16/16] SyncSharedMemory - named shared memory allocation reuse test.[/usr/share/php56/pear/test/sync/tests/016.phpt]
2 FAILED TESTS:
/usr/share/php56/pear/test/sync/tests/014.phpt
/usr/share/php56/pear/test/sync/tests/016.phpt

If you need more information, I am willing to help.
 [2017-02-19 21:43 UTC] amancio at prjc dot com dot br
When I run 014.phpt manually, it runs without error and prints the expected output. I don't know why it fails when I run it with 'pecl run-tests'.

But 016.phpt fails when calling SyncSharedMemory->first(). All three calls return false, even after I remove the /dev/shm/Sync_SharedMem-* file.

Arch Linux, kernel 4.8.13-1-ARCH.
 [2017-02-20 05:11 UTC] cubic@php.net
Automatic comment on behalf of webmaster@cubiclesoft.com
Revision: http://git.php.net/?p=pecl/system/sync.git;a=commit;h=5af68db0545d67cca0e8a83183605adb15f72dac
Log: Fix bug #74120 - sync_WaitForUnixEvent() issues.
 [2017-02-20 05:11 UTC] cubic@php.net
-Status: Assigned +Status: Closed
 [2017-02-20 05:38 UTC] cubic@php.net
Thanks for the additional information - it really helped.  With the new information, I've been able to replicate the issue and track down the cause.  I've also pushed a set of fixes upstream into the cross platform library, ported the fixes to my local extension building environment, verified that the fixes in the extension correct the issue, and updated the test suite to better detect any future regressions with event objects.

The PECL sync 1.1.1 release contains the above fixes and should be available via "pecl upgrade".  Thanks for being patient and providing good, detailed information to replicate the problem.

I'll deal with the Arch-specific issues in a separate release.  Since part of the main test suite appears to fail on Arch, I don't think a separate bug needs to be opened.
 