Bug #51880 Malfunction of mb_eregi() and mb_ereg()
Submitted: 2010-05-21 16:13 UTC
Modified: 2015-05-10 04:22 UTC
Votes: 1
Avg. Score: 3.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: tnpaulik at gmail dot com
Assigned: cmb
Status: No Feedback
Package: mbstring related
PHP Version: Irrelevant
OS: Windows, Linux, doesn't matter
Private report: No
CVE-ID: None

 [2010-05-21 16:13 UTC] tnpaulik at gmail dot com
Description:
------------
mb_eregi() does not match case-insensitively for non-ASCII characters in PHP 5.2 and 5.3.

Example:
mb_eregi('Ü','ü') returns false.


mb_ereg() is case-insensitive for non-ASCII characters if I put them in a character class ([]).

Example:
mb_ereg("[Ü]","ü") returns true.

Test script:
---------------
if (!mb_eregi("Ü","ü"))
    echo "THAT shouldn't be so...\n";

if (mb_ereg("[Ü]","ü"))
    echo "THAT shouldn't be so...\n";

Expected result:
----------------
no output

Actual result:
--------------
THAT shouldn't be so...
THAT shouldn't be so...

 [2010-05-22 16:15 UTC] felipe@php.net
-Status: Open
+Status: Assigned
-Assigned To:
+Assigned To: moriyoshi
 [2011-05-26 16:17 UTC] bgamrat at wirehopper dot com
This may be related.

I had a date string (YYYY-MM-DD HH:MM:SS) validation that gave inconsistent results. The code below runs the validation 100 times on the same values and regex. Most of the time mb_ereg matches; occasionally it does not.

Earlier issues with case sensitivity led me to add a case-insensitive fallback, and to work around this issue I added a further fallback that uses preg_match.

<?php
mb_internal_encoding('UTF-8');
mb_detect_order('UTF-8');
mb_regex_encoding('UTF-8');

echo date('r').'<br />';

for ($i=0;$i<100;$i++)
        filter ('^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2}$','2011-05-15 09:00:07');

function filter($sRegExp,$sInput)
{
        if (!isset($sInput))
                return false;
        $sInput=trim($sInput);
        /* mb_ereg functions don't use slashes */
        if ($sRegExp[0]=='/')
                $sRegExp=substr($sRegExp,1,-1);
        $aMatches=array();
        $iResult=mb_ereg($sRegExp,$sInput,$aMatches);
        echo 'Testing '.$sInput.' against '.$sRegExp.PHP_EOL.var_export($aMatches,true).' result '.$iResult.'<br />';
        if (strlen($sInput)!=$iResult)
        {
              $sLowerCaseRegExp=mb_strtolower($sRegExp);
              $sLowerCaseInput=mb_strtolower($sInput);
              $iResult=mb_ereg($sLowerCaseRegExp,$sLowerCaseInput,$aMatches);
              /* report the lowercased pattern here, not the input twice */
              echo 'Fallback Testing '.$sLowerCaseInput.' against '.$sLowerCaseRegExp.PHP_EOL.var_export($aMatches,true).' result '.$iResult.'<br />';
              if (strlen($sInput)!=$iResult)
              {
                      $bResult=preg_match('/'.$sRegExp.'/i',$sLowerCaseInput);
                      echo 'preg_match/i '.$bResult.'<br />';
                      return $bResult!=0;
              }
        }
        return true;
}

Linux domain.com 2.6.18-164.9.1.el5 #1 SMP Tue Dec 15 21:04:57 EST 2009 i686 i686 i386 GNU/Linux

Apache/2.2.3 

PHP version 5.1.6

mbstring
Multibyte Support                    enabled
Multibyte string engine              libmbfl
Multibyte (japanese) regex support   enabled
Multibyte regex (oniguruma) version  3.7.1

mbstring extension makes use of "streamable kanji code filter and converter", which is distributed under the GNU Lesser General Public License version 2.1. 

Directive                      Local Value  Master Value
mbstring.detect_order          no value     no value
mbstring.encoding_translation  Off          Off
mbstring.func_overload         0            0
mbstring.http_input            pass         pass
mbstring.http_output           pass         pass
mbstring.internal_encoding     no value     no value
mbstring.language              neutral      neutral
mbstring.strict_detection      Off          Off
mbstring.substitute_character  no value     no value
 [2011-05-27 00:19 UTC] bgamrat at wirehopper dot com
Some of my coworkers found that the options set with mb_regex_set_options() are global to the thread. If the options are set differently on different threads, the mb_ereg functions will behave differently on each of them as well.

To verify, echo mb_regex_set_options() on each request of the earlier submitted test code. If the reported options are incompatible with the regex, mb_ereg will not match.
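
A minimal sketch of that check, for anyone who wants to try it. The variable name and the log format are made up for illustration; the only assumption is that mbstring is built with regex support, so that mb_regex_encoding() and mb_regex_set_options() can be called without arguments to read the current state.

```php
<?php
// Hypothetical diagnostic, not part of the test code submitted earlier.
// Called without arguments, both functions return the current setting
// instead of changing it.
$regexState = sprintf(
    'pid=%d mb_regex_encoding=%s mb_regex_set_options=%s',
    getmypid(),
    mb_regex_encoding(),
    mb_regex_set_options()
);

// Write one line per request so requests that fail to match can be
// compared with the state they ran under.
error_log($regexState);
```

If the logged values differ between a request that matches and one that does not, that would point at per-thread state rather than at the regex itself.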
 [2015-04-28 23:03 UTC] cmb@php.net
-Status: Assigned
+Status: Feedback
-Package: *Regular Expressions
+Package: mbstring related
-Assigned To: moriyoshi
+Assigned To: cmb
 [2015-04-28 23:03 UTC] cmb@php.net
I can't reproduce this issue when the internal and the regex
encoding are properly set: <http://3v4l.org/I4Ksj>.

@tnpaulik: Can you?

@bgamrat: the issue you have reported seems unrelated to this
ticket. Please open a new ticket.
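
For reference, a rough sketch of what "properly set" could look like for the original test script. This is not the content of the linked snippet; it assumes the script file itself is saved as UTF-8 and that mbstring is built with regex support. The variable names are illustrative only.

```php
<?php
// Assumption: this file is encoded as UTF-8, so the literals below are
// UTF-8 byte sequences that agree with the encodings set here.
mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');

// Case-insensitive match: the report expects this to succeed.
$caseInsensitive = mb_eregi('Ü', 'ü');

// Case-sensitive match against a character class: the report expects
// this to fail.
$caseSensitive = mb_ereg('[Ü]', 'ü');

var_dump($caseInsensitive, $caseSensitive);
```

With both encodings set this way, the two results can be compared directly against the "Expected result" section of the report.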
 [2015-05-10 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.