Bug #79118 mb_ereg_replace function cannot replace strings
Submitted: 2020-01-15 02:00 UTC Modified: 2020-01-15 04:43 UTC
From: minhlt82 at gmail dot com Assigned:
Status: Not a bug Package: mbstring related
PHP Version: 7.4.1 OS: Linux
Private report: No CVE-ID: None
 [2020-01-15 02:00 UTC] minhlt82 at gmail dot com
Description:
------------
mb_ereg_replace() returns NULL instead of the replaced string when the regex encoding is SJIS and the subject string contains certain characters.
The same script works on PHP versions before 7.1 but fails on PHP 7.1 and later.


Test script:
---------------
mb_regex_encoding('SJIS');
$string = "hello ここに文を入れる。";
$pattern = "hello";
$replacement = 'hi';
//expected result: hi ここに文を入れる。
//but it returns NULL
var_dump(mb_ereg_replace($pattern, $replacement, $string));

//The same error occurs with binary strings; more test cases are here:
//http://sandbox.onlinephpfunctions.com/code/1a741afb48b299c6f335d9175be730a5ec3854d5


 [2020-01-15 02:22 UTC] requinix@php.net
-Status: Open +Status: Feedback -Package: Regexps related +Package: mbstring related
 [2020-01-15 02:22 UTC] requinix@php.net
Seems to be working fine.
https://3v4l.org/Z575v
https://3v4l.org/ofV7G

The binary-string case fails because you can't pass raw binary data to something that operates on characters.
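
A minimal sketch of the point above, not taken from the linked 3v4l snippets: it assumes the source file is saved as UTF-8, so the string literal holds UTF-8 bytes rather than SJIS bytes. The helper name `sjis_ereg_replace` is hypothetical, and the pattern and replacement are assumed to be ASCII-only so they are valid in both encodings.

```php
<?php
// Hypothetical helper (not part of mbstring): run the replacement on an
// SJIS copy of a UTF-8 subject, then convert the result back to UTF-8.
function sjis_ereg_replace(string $pattern, string $replacement, string $utf8Subject): string
{
    mb_regex_encoding('SJIS');

    // Convert the subject so its bytes really are SJIS before matching.
    $sjis = mb_convert_encoding($utf8Subject, 'SJIS', 'UTF-8');

    $result = mb_ereg_replace($pattern, $replacement, $sjis);
    if (!is_string($result)) {
        // NULL or false: the subject was rejected or the pattern failed.
        throw new RuntimeException('mb_ereg_replace() did not return a string');
    }

    return mb_convert_encoding($result, 'UTF-8', 'SJIS');
}

// Assumption: this file is UTF-8, so the literal is UTF-8 encoded.
$utf8 = "hello ここに文を入れる。";

// Shows whether the raw UTF-8 bytes would pass as SJIS input.
var_dump(mb_check_encoding($utf8, 'SJIS'));

echo sjis_ereg_replace('hello', 'hi', $utf8), "\n";
```

Checking the subject with `mb_check_encoding()` before calling `mb_ereg_replace()` is one way to tell an encoding mismatch apart from a genuine regex failure.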
 [2020-01-15 03:29 UTC] minhlt82 at gmail dot com
@Requinix: 
Thank you for the super fast support.
Regarding the binary string replacement: following your link https://3v4l.org/ofV7G, I still have a problem. The hex value of the binary string changes from f8f7 to 3fc3b7 after conversion with mb_convert_encoding.
Could you please take a look at https://3v4l.org/JE5r8?
Thank you so much.
 [2020-01-15 04:43 UTC] requinix@php.net
-Status: Feedback +Status: Not a bug
 [2020-01-15 04:43 UTC] requinix@php.net
Sorry, but your problem does not imply a bug in PHP itself.  For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions.  Due to the volume
of reports we can not explain in detail here why your report is not
a bug.  The support channels will be able to provide an explanation
for you.

Thank you for your interest in PHP.

You cannot pass raw binary data to something that operates on characters. I don't know what 63735 means to you, I don't know why you're packing it, and I can't fathom why you're inserting that packed data into a human-readable string, but I do know that you can't put the bytes 0xF8F7 into a string and then safely convert it to any other encoding.

My first thought is that you need to use some sort of string templating system, as is typically done with i18n translations.
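
One possible reading of the templating suggestion, as a sketch only: keep the human-readable text as valid UTF-8 with an ASCII placeholder, do all character-level work on that text, and splice the raw bytes in only after the final conversion. The helper name `render_with_raw_bytes`, the `{emoji}` placeholder, and the `pack('n', ...)` format are assumptions made for illustration; the report does not show how the value was packed.

```php
<?php
// Hypothetical helper: convert a UTF-8 template to SJIS piece by piece and
// join the pieces with raw bytes that never pass through mbstring.
function render_with_raw_bytes(string $utf8Template, string $placeholder, string $rawBytes): string
{
    // Splitting on an ASCII placeholder is safe while the text is still UTF-8.
    $parts = explode($placeholder, $utf8Template);

    // Only real text is converted; the raw bytes are untouched.
    $converted = array_map(
        fn(string $part): string => mb_convert_encoding($part, 'SJIS', 'UTF-8'),
        $parts
    );

    return implode($rawBytes, $converted);
}

// Assumption: 63735 packed as a big-endian 16-bit value, i.e. bytes F8 F7.
$emoji = pack('n', 63735);

$out = render_with_raw_bytes("hi {emoji} ここ", '{emoji}', $emoji);
echo bin2hex($out), "\n";
```

Any call to mb_ereg_replace() would run on the UTF-8 template (with a UTF-8 regex encoding) before this step, so the function only ever sees valid characters.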
 [2020-01-15 06:21 UTC] minhlt82 at gmail dot com
Thanks for the reply. 
I pack the bytes to create emoji that are displayed on some phones in Japan. What puzzles me is that this works fine with PHP 7.0.14 and earlier, so I think some change introduced in version 7.1 causes the problem.
 