Request #77819 Proposal
Submitted: 2019-03-29 06:19 UTC Modified: 2019-03-29 08:25 UTC
From: hen20-19 at yahoo dot co dot jp Assigned:
Status: Wont fix Package: mbstring related
PHP Version: 7.1.27 OS: Windows
Private report: No CVE-ID: None
 [2019-03-29 06:19 UTC] hen20-19 at yahoo dot co dot jp
Description:
------------
I propose restoring the ability of the mb_ereg() functions to handle illegal byte sequences.

Since PHP 7.1, the mb_ereg() functions reject illegal byte sequences.

I think this is perfectly fine for beginners, since it acts as a check on input strings.
For me, however, mb_ereg() served a different purpose.

I had been using illegal byte sequences as a simple homemade cipher,
e.g. UTF-8 with an inappropriate BOM,
which easily prevents careless beginners from unintentionally changing file contents with tools such as MS-Excel.

The mb_ereg() functions
enabled us to create and work with our own private file formats.
But that capability is gone.

I strongly hope this capability can come back,
perhaps behind a new option.



History

 [2019-03-29 08:25 UTC] nikic@php.net
-Status: Open +Status: Wont fix
 [2019-03-29 08:25 UTC] nikic@php.net
mb_ereg() is backed by the oniguruma library, which requires that regular expressions and subjects passed to it are valid under the given encoding. Strings with invalid encoding may lead to crashes and security issues.
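A minimal sketch of what that requirement means on the calling side, assuming the mbstring extension is loaded and the data is meant to be UTF-8. The helper name `safe_mb_ereg` is hypothetical and not part of PHP; it only illustrates validating both arguments before they reach mb_ereg().

```php
<?php
// Hypothetical helper (not part of PHP): run mb_ereg() only when both the
// pattern and the subject are valid UTF-8. Returns null for invalid input so
// the caller can fall back to treating the data as binary.
function safe_mb_ereg(string $pattern, string $subject): ?bool
{
    if (!mb_check_encoding($pattern, 'UTF-8') || !mb_check_encoding($subject, 'UTF-8')) {
        return null; // invalid encoding: do not hand this to mb_ereg()
    }

    mb_regex_encoding('UTF-8');

    // mb_ereg() returns false when there is no match; the exact "match"
    // return value differs between PHP versions, so compare against false.
    return mb_ereg($pattern, $subject) !== false;
}

$valid   = safe_mb_ereg('b+', 'abbc');          // valid UTF-8 subject
$invalid = safe_mb_ereg('b+', "\xFF\xFEabbc");  // 0xFF never occurs in valid UTF-8
```

Returning null rather than false keeps "no match" distinguishable from "input was not valid in the given encoding".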

If your string is not valid UTF-8, then you need to treat it as binary data and specify the encoding accordingly. In that case I'd recommend using preg_match() instead though, which is binary by default and uses PCRE, which is a superior regular expression library.
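As a rough illustration of the preg_match() suggestion, the sketch below matches a subject that starts with two bytes that are not valid UTF-8. The sample data and variable names are made up for the example. Without the /u modifier PCRE works on raw bytes, while adding /u makes the same subject fail validation.

```php
<?php
// Made-up sample: two bytes that are invalid in UTF-8, followed by ASCII text.
$subject = "\xFF\xFEpayload";

// No /u modifier: pattern and subject are treated as raw bytes.
// The single-quoted \xFF and \xFE escapes are interpreted by PCRE itself.
$matched = preg_match('/^\xFF\xFE(\w+)$/', $subject, $m);

// With /u the subject must be valid UTF-8, so matching fails with an error
// that can be inspected through preg_last_error().
$withU     = preg_match('/payload/u', $subject);
$lastError = preg_last_error();
```

Here `$m[1]` holds the text after the two leading bytes, and `$lastError` can be compared against `PREG_BAD_UTF8_ERROR` to tell a failed match apart from invalid input.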
 
Last updated: Thu Jan 02 19:01:28 2025 UTC