Bug #34113 mb_ereg_replace does not function as it should under "UTF-8"
Submitted: 2005-08-13 01:15 UTC Modified: 2005-08-21 01:00 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: mlmlml at lily dot freemail dot ne dot jp Assigned:
Status: No Feedback Package: mbstring related
PHP Version: 4.3.11 OS: windows 2000 server
Private report: No CVE-ID: None

 [2005-08-13 01:15 UTC] mlmlml at lily dot freemail dot ne dot jp
Description:
------------
When "mb_ereg_replace" is used in a "UTF-8" source environment, it does not seem to recognize the “multi-byte space” (zenkaku space in Japanese, U+3000). The same code runs correctly in an “EUC-JP” source environment, so I suspect this is a bug rather than misuse of the function.


[mbstring]
mbstring.language = Japanese
mbstring.internal_encoding = UTF-8
mbstring.http_input = auto
mbstring.http_output = UTF-8
;mbstring.encoding_translation = Off
mbstring.detect_order = auto
mbstring.substitute_character = none;
;mbstring.func_overload = 0


Reproduce code:
---------------
<?php
    /*
    All three var_dump() calls print "UTF-8", which verifies
    that the script is actually run under "UTF-8".
    */
    var_dump(mb_regex_encoding());
    var_dump(mb_internal_encoding());

    mb_regex_encoding(mb_internal_encoding());
    var_dump(mb_regex_encoding());

    /*
    This var_dump should give

    "**********　*****"

    by recognising "　" (multi-byte space, U+3000), but
    when it is run under "UTF-8", it gives

    "****************".

    This shows that the "multi-byte space" is not
    recognised as it should be.
    */
    $string = "multi-byte　space";
    var_dump(mb_ereg_replace('[^　]', '*', $string));
?>

Expected result:
----------------
The multi-byte space should not be converted to "*" by the reproduce code, since the regular expression is "[^　]".

Actual result:
--------------
The multi-byte space is not recognised and is converted to "*".
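
A possible way to narrow this down, assuming the byte encoding in which the script file was saved could be a factor (the report does not confirm this), is to write the multi-byte space as explicit UTF-8 bytes instead of a literal character. The variable name $zenkaku is only illustrative. This is a sketch for isolating the problem, not a confirmed fix:

```php
<?php
    // U+3000 (zenkaku space) written as explicit UTF-8 bytes, so the
    // test does not depend on how this source file was saved.
    // $zenkaku is a hypothetical name used only for this sketch.
    $zenkaku = "\xE3\x80\x80";

    mb_regex_encoding('UTF-8');

    $string = 'multi-byte' . $zenkaku . 'space';

    // Replace every character except the zenkaku space with "*".
    $result = mb_ereg_replace('[^' . $zenkaku . ']', '*', $string);

    // Show the bytes actually used for the space, then check whether
    // the space survived the replacement.
    var_dump(bin2hex($zenkaku));
    var_dump($result === str_repeat('*', 10) . $zenkaku . str_repeat('*', 5));
?>
```

If the last var_dump() prints true here but the original script still fails, comparing bin2hex() of the literal space in the original file against "e38080" would show whether the file really contains UTF-8 bytes.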

History

 [2005-08-21 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 