[2018-11-02 01:32 UTC] ryosuke dot kobayashi at fujisystems dot co dot jp
Description:
------------
mb_ereg_replace() returns null, without raising any error, when the given string contains certain 'SJIS-win' characters (e.g. 'ⅰ', '伃'...).
I believe the characters between 0xFA40 ('ⅰ') and 0xFC4B ('黑') cause this bug.
It also happens with mb_ereg_match*().
I confirmed that this happens with PHP 7.1 or higher. Here are the results of my tests:
PHP 7.0.32 with oniguruma 5.9.6 => It works.
PHP 7.0.32 with oniguruma 6.3.0 => It works.
PHP 7.1.23 with oniguruma 5.9.6 => It does not work.
PHP 7.2.11 with oniguruma 6.3.0 => It does not work.
Test script:
---------------
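// Try every valid SJIS-win byte sequence in [$a, $b) as both pattern and subject.
function chk($a, $b) {
    mb_internal_encoding('SJIS-win');
    $j = 0;
    for ($i = $a; $i < $b; $i++) {
        $s = sprintf('%x', $i);
        $hex = hex2bin($s); // raw bytes for this code
        if (mb_check_encoding($hex)) {
            // A character should always match itself; report it when it does not.
            if (!mb_ereg($hex, $hex)) {
                echo "$s($hex):NG\n";
            }
        }
        $j++;
    }
    echo "cnt:$j\n"; // number of codes scanned
}
chk(0xED40, 0xFC51);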
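The description also mentions mb_ereg_replace() returning null. A minimal sketch for checking a single character could look like the following. The helper name sjis_char() is hypothetical (it is not part of the report), and the sketch assumes, as the test script does, that setting the internal encoding is enough to select the regex encoding. The output is not shown because it depends on the PHP version.

```php
<?php
// Hypothetical helper: build the two raw bytes for a double-byte code such as 0xFA40.
function sjis_char(int $code): string
{
    return pack('n', $code); // 'n' = unsigned 16-bit, big-endian
}

if (extension_loaded('mbstring')) {
    mb_internal_encoding('SJIS-win');
    $ch = sjis_char(0xFA40); // first code named in the report

    // Is the byte sequence valid in the internal encoding?
    var_dump(mb_check_encoding($ch));
    // The report says this returns null on affected versions.
    var_dump(mb_ereg_replace($ch, '', $ch));
}
```

Running this on an affected and an unaffected PHP version should make the difference visible without scanning the whole range.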
I briefly checked the code. The difference seems to come from the encodings supported by mbstring and Oniguruma. mbstring has an 'SJIS-win' encoding, while Oniguruma has only 'SJIS', so any SJIS variant is validated as 'SJIS'. As a result, the current (newer) code tries to validate 'SJIS-win' input as 'SJIS', which fails in certain cases.
The following code should be fixed to address this bug, i.e. php_mb_check_encoding() needs to receive 'SJIS-win' from '_php_mb_regex_mbctype2name(MBREX(current_mbctype))' in this case, not 'SJIS'.
php_mbregex.c
if (!php_mb_check_encoding(
string,
string_len,
_php_mb_regex_mbctype2name(MBREX(current_mbctype))
)) {
Using 'SJIS' as the mbregex encoding wouldn't fix the issue.
https://3v4l.org/P56Zg
There is probably another issue.
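To examine the hypothesis above from userland, one could compare how mb_check_encoding() judges the same bytes under each encoding name. This is only a sketch: the function validity_by_encoding() is a hypothetical helper, not part of mbstring, and the results are printed rather than stated here because they may vary between PHP versions.

```php
<?php
// Hypothetical helper: validate the same bytes under several encoding names.
function validity_by_encoding(string $bytes, array $encodings): array
{
    $result = [];
    foreach ($encodings as $enc) {
        $result[$enc] = mb_check_encoding($bytes, $enc);
    }
    return $result;
}

if (extension_loaded('mbstring')) {
    // 0xFA40 and 0xFC4B are the range endpoints named in the report.
    foreach (["\xFA\x40", "\xFC\x4B"] as $bytes) {
        echo bin2hex($bytes), ': ';
        var_dump(validity_by_encoding($bytes, ['SJIS', 'SJIS-win']));
    }
}
```

If the two names disagree for these bytes, that would be consistent with the validation step rejecting input that the regex call was expected to accept.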