The URL for your bug report is: https://bugs.php.net/bug.php?id=34113.
Bug #34113 mb_ereg_replace does not function as it should under "UTF-8"
Submitted: 2005-08-13 01:15 UTC Modified: 2005-08-21 01:00 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: mlmlml at lily dot freemail dot ne dot jp Assigned:
Status: No Feedback Package: mbstring related
PHP Version: 4.3.11 OS: windows 2000 server
Private report: No CVE-ID: None
 [2005-08-13 01:15 UTC] mlmlml at lily dot freemail dot ne dot jp
Description:
------------
When "mb_ereg_replace" is used in a "UTF-8" source environment, it does not seem to recognize the "multi-byte space" (zenkaku space in Japanese). The same code runs correctly in an "EUC-JP" source environment, so I suspect this is a bug rather than misuse of the function.


[mbstring]
mbstring.language = Japanese
mbstring.internal_encoding = UTF-8
mbstring.http_input = auto
mbstring.http_output = UTF-8
;mbstring.encoding_translation = Off
mbstring.detect_order = auto
mbstring.substitute_character = none;
;mbstring.func_overload = 0


Reproduce code:
---------------
<?php
    /*
    All three var_dump() calls print "UTF-8", which verifies
    that the source is actually run under "UTF-8".
    */
    var_dump(mb_regex_encoding());
    var_dump(mb_internal_encoding());

    mb_regex_encoding(mb_internal_encoding());
    var_dump(mb_regex_encoding());

    /*
    This var_dump should give

    "**********　*****"

    by recognising "　" (the multi-byte space), but
    when it is run under "UTF-8", it gives

    "****************".

    This shows that the multi-byte space is not
    recognised as it should be.
    */
    $string = "multi-byte　space";
    var_dump(mb_ereg_replace('[^　]', '*', $string));
?>

Expected result:
----------------
The multi-byte space should not be converted to "*" by the reproduce code, since the regular expression is "[^　]".

Actual result:
--------------
The multi-byte space is not recognized and is converted to "*" like every other character.
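
Encoding-independent check (sketch):
-------------------------------------
One way to rule out the encoding of the saved source file as a factor is to build the zenkaku space from its UTF-8 byte sequence instead of typing it as a literal. The following is only a minimal sketch, not part of the original report: the variable names are hypothetical, and it assumes that the mbstring extension is loaded and that U+3000 (encoded in UTF-8 as E3 80 80) is the character meant by "multi-byte space".

```php
<?php
// Set both encodings explicitly so the check does not depend on php.ini.
mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');

// Assumption: the "multi-byte space" is U+3000 (IDEOGRAPHIC SPACE),
// written here as raw UTF-8 bytes so the file encoding cannot alter it.
$zenkakuSpace = "\xE3\x80\x80";

$string   = "multi-byte" . $zenkakuSpace . "space";
$pattern  = '[^' . $zenkakuSpace . ']';
$result   = mb_ereg_replace($pattern, '*', $string);
$expected = "**********" . $zenkakuSpace . "*****";

// Hex dumps make it visible whether the three-byte sequence survived.
var_dump(bin2hex($result));
var_dump($result === $expected);
```

If the comparison holds here but fails with the literal character in the script, the source file is probably saved in a different encoding than the one passed to mb_regex_encoding().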

History

 [2005-08-21 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".