Bug #34113 mb_ereg_replace does not function as it should under "UTF-8"
Submitted: 2005-08-13 01:15 UTC Modified: 2005-08-21 01:00 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: mlmlml at lily dot freemail dot ne dot jp Assigned:
Status: No Feedback Package: mbstring related
PHP Version: 4.3.11 OS: windows 2000 server
Private report: No CVE-ID: None
 [2005-08-13 01:15 UTC] mlmlml at lily dot freemail dot ne dot jp
Description:
------------
When "mb_ereg_replace" is used in a source file saved as "UTF-8", it does not seem to recognize the "multi-byte space" (zenkaku space in Japanese, U+3000). The same code runs correctly when the source is saved as "EUC-JP", so I suspect this is a bug rather than misuse of the function.


[mbstring]
mbstring.language = Japanese
mbstring.internal_encoding = UTF-8
mbstring.http_input = auto
mbstring.http_output = UTF-8
;mbstring.encoding_translation = Off
mbstring.detect_order = auto
mbstring.substitute_character = none;
;mbstring.func_overload = 0


Reproduce code:
---------------
<?php
    /*
    All three var_dump() calls print "UTF-8", which verifies
    that the script is actually run under "UTF-8".
    */
    var_dump(mb_regex_encoding());
    var_dump(mb_internal_encoding());

    mb_regex_encoding(mb_internal_encoding());
    var_dump(mb_regex_encoding());

    /*
    This var_dump() should print

    "**********　*****"

    because "　" (the multi-byte space, U+3000) is excluded
    by the character class. When run under "UTF-8", it prints

    "****************"

    instead, which shows that the multi-byte space is not
    recognised as it should be.
    */
    $string = "multi-byte　space";
    var_dump(mb_ereg_replace('[^　]', '*', $string));
?>

Expected result:
----------------
The multi-byte space should not be converted to "*" by the reproduce code, since the regular expression is "[^　]" (any character except the multi-byte space).

Actual result:
--------------
The multi-byte space is not recognised and is converted to "*" like every other character.
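
A possible way to narrow this down, sketched here as an assumption rather than as something taken from the report: build the multi-byte space from explicit UTF-8 bytes so that the result no longer depends on how the source file was saved, and dump the bytes to confirm the encoding. U+3000 is e3 80 80 in UTF-8, a1 a1 in EUC-JP and 81 40 in Shift_JIS. The helper name dump_bytes() is hypothetical, and the preg_replace() call with the /u modifier is only a comparison against a different regex engine, not a confirmed workaround for this bug.

```php
<?php
// Hypothetical helper (not from the original report): returns the hex
// bytes of a string so the encoding actually in use can be inspected.
function dump_bytes($s)
{
    return bin2hex($s);
}

// Ideographic (zenkaku) space, U+3000, written as explicit UTF-8 bytes
// so it does not depend on the encoding of this source file.
$zenkaku = "\xE3\x80\x80";
$string  = "multi-byte" . $zenkaku . "space";

mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');

// e38080 means UTF-8; a1a1 would mean EUC-JP, 8140 Shift_JIS.
var_dump(dump_bytes($zenkaku));

// Same replacement as in the reproduce code, with the pattern built
// from the explicit bytes instead of a literal in the source file.
$mb = mb_ereg_replace('[^' . $zenkaku . ']', '*', $string);
var_dump(dump_bytes($mb));

// Comparison using PCRE in UTF-8 mode (an assumption that PCRE is
// available; it is a separate engine from the mbstring regex functions).
$pcre = preg_replace('/[^\x{3000}]/u', '*', $string);
var_dump(dump_bytes($pcre));
?>
```

If the first dump is not "e38080" when the literal character is taken from the source file instead, the file is probably not saved as UTF-8, which would point to an encoding mismatch rather than to mb_ereg_replace itself.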

History

 [2005-08-21 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 