[2018-09-20 20:41 UTC] jay at diablomedia dot com
Description:
------------
I was testing a fork of Zend Framework 1 that we maintain for our legacy applications against php 7.3.0rc1 and noticed the Zend_Validate component's tests were failing on php 7.3: https://travis-ci.org/diablomedia/zf1-validate/jobs/431170057

In this case, php 7.3 is allowing some hostnames through validation that are being rejected in php < 7.3. I dug a bit, and was able to reproduce this with a single regular expression that is evaluated differently in php 7.3 than it is in php 5.6 through php 7.2.

Here's a link to 3v4l.org demonstrating the difference (script is in the "Test script" section of this report): https://3v4l.org/DR3S5

Test script:
---------------
<?php
$domainPart = ' domain.com';
$regexChar = '/^[\x{0100}-\x{017f}]{1,63}$/iu';
preg_match($regexChar, $domainPart, $matches);
print_r($matches);

Expected result:
----------------
I would expect the $matches array in the test script to be empty here, similar to all PHP versions before 7.3.

Actual result:
--------------
The regex matches on " domain.com", which it doesn't in earlier versions of PHP.
Looks like a PCRE bug because turning off JIT makes it work correctly again. So that's the workaround.

ini_set("pcre.jit", 0);

PHP 7.3 is currently bundling PCRE 10.31. Version 10.32-RC1 includes a fix for something that looks related.

> 35. In a pattern such as /[^\x{100}-\x{ffff}]*[\x80-\xff]/ which has a repeated
> negative class with no characters less than 0x100 followed by a positive class
> with only characters less than 0x100, the first class was incorrectly being
> auto-possessified, causing incorrect match failures.

https://www.pcre.org/changelog.txt
https://bugs.exim.org/show_bug.cgi?id=2300

While your example doesn't have the negated set, the fix for the bug was very simple so it could affect less complicated situations like yours as well.

For the record, I can reproduce with pcre2test. (The backslash is required to escape the leading space.)

re> /^[\x{0100}-\x{017f}]{1,63}$/i,utf,jit=0
data> \ domain.com
No match
data>
re> /^[\x{0100}-\x{017f}]{1,63}$/i,utf,jit=1
data> \ domain.com
 0:  domain.com
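
For anyone wanting to try the pcre.jit workaround against the original test script, a minimal sketch might look like the following. It reuses the pattern and subject from the report unchanged. Calling ini_set() before the first preg_match() is an assumption on my part: PHP caches compiled patterns, so a pattern already compiled with JIT enabled may not be affected by a later change to the setting.

```php
<?php
// Workaround from the comment above: turn the PCRE JIT off for this request.
// Assumption: this runs before the pattern below is compiled for the first
// time, so no JIT-compiled copy of it is sitting in PHP's pattern cache.
ini_set('pcre.jit', '0');

// Subject and pattern copied from the report's test script.
$domainPart = ' domain.com';
$regexChar  = '/^[\x{0100}-\x{017f}]{1,63}$/iu';

// preg_match() returns 1 on a match, 0 on no match, false on error.
$matched = preg_match($regexChar, $domainPart, $matches);

var_dump($matched);
print_r($matches);
```

With JIT disabled, the behaviour described in this report suggests no match and an empty $matches array, the same as on PHP versions before 7.3. The setting can also be placed in php.ini as pcre.jit=0 if changing it per script is not practical.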