Bug #79188 Memory corruption in preg_replace/preg_replace_callback and unicode
Submitted: 2020-01-29 09:20 UTC Modified: 2020-01-29 10:01 UTC
From: cschneid@php.net Assigned:
Status: Closed Package: PCRE related
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None

 [2020-01-29 09:20 UTC] cschneid@php.net
Description:
------------
When preg_replace is called with an empty pattern and an empty replacement string on a Unicode subject string, the result buffer can overflow, corrupting memory.

The easiest way to reproduce this is to compile PHP with --enable-debug and run it under valgrind (see the test script below).

The bug was reproduced with PHP >= 7.0; older versions were not tested.

The patch adds extra checks that increase the result string buffer size, since I wasn't entirely sure how to fix the new_len calculation above them. Maybe someone with a deeper understanding of the code can come up with a better patch.

Test script:
---------------
USE_ZEND_ALLOC=0 valgrind --undef-value-errors=no sapi/cli/php -r 'preg_replace("//u", "", "a" . str_repeat("\u{1f612}", 10));'

and

USE_ZEND_ALLOC=0 valgrind --undef-value-errors=no sapi/cli/php -r 'preg_replace_callback("//u", function() { return ""; }, "a" . str_repeat("\u{1f612}", 10));'
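
For convenience, the two one-liners above can also be combined into a standalone script. This is only a sketch: the function name check_empty_pattern_roundtrip is hypothetical and not part of php-src, and the script assumes that replacing every empty match with an empty string should return the subject unchanged. It does not detect the overflow by itself; running it under valgrind as shown above is still what exposes the invalid write.

```php
<?php
// Sketch only: mirrors the subject string from the report's test script.
// check_empty_pattern_roundtrip() is a hypothetical helper, not php-src code.
function check_empty_pattern_roundtrip(string $subject): bool
{
    // An empty pattern matches at every character boundary; replacing each
    // empty match with an empty string should leave the subject unchanged.
    $viaReplace = preg_replace('//u', '', $subject);
    $viaCallback = preg_replace_callback('//u', function (array $m): string {
        return '';
    }, $subject);

    return $viaReplace === $subject && $viaCallback === $subject;
}

$subject = 'a' . str_repeat("\u{1f612}", 10);

// U+1F612 takes 4 bytes in UTF-8, so the subject is 1 + 10 * 4 = 41 bytes.
var_dump(strlen($subject));

// Should be true on a build that handles this case correctly.
var_dump(check_empty_pattern_roundtrip($subject));
```

Saved under a name of your choice, the script can be passed to sapi/cli/php in place of the -r argument, with the same USE_ZEND_ALLOC=0 and valgrind options.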


Actual result:
--------------
$ USE_ZEND_ALLOC=0 valgrind --undef-value-errors=no sapi/cli/php -r 'preg_replace("//u", "", "a" . str_repeat("\u{1f612}", 10));'
==21533== Memcheck, a memory error detector
==21533== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21533== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==21533== Command: sapi/cli/php -r preg_replace("//u",\ "",\ "a"\ .\ str_repeat("\\u{1f612}",\ 10));
==21533== 
==21533== Invalid write of size 1
==21533==    at 0x4C358EB: memmove (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21533==    by 0x504092: php_pcre_replace_impl (php_pcre.c:1737)
==21533==    by 0x50391C: php_pcre_replace (php_pcre.c:1544)
==21533==    by 0x504EB4: php_replace_in_subject (php_pcre.c:2127)
==21533==    by 0x50594B: preg_replace_common (php_pcre.c:2268)
==21533==    by 0x505D25: zif_preg_replace (php_pcre.c:2326)
==21533==    by 0x83E2B1: ZEND_DO_ICALL_SPEC_RETVAL_UNUSED_HANDLER (zend_vm_execute.h:1240)
==21533==    by 0x89C658: execute_ex (zend_vm_execute.h:51879)
==21533==    by 0x8A06D9: zend_execute (zend_vm_execute.h:55983)
==21533==    by 0x7C095E: zend_eval_stringl (zend_execute_API.c:1010)
==21533==    by 0x7C0AF7: zend_eval_stringl_ex (zend_execute_API.c:1051)
==21533==    by 0x7C0B6F: zend_eval_string_ex (zend_execute_API.c:1062)
==21533==  Address 0x6d56c30 is 0 bytes after a block of size 64 alloc'd
==21533==    at 0x4C308BF: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21533==    by 0x7A2004: __zend_realloc (zend_alloc.c:2994)
==21533==    by 0x7A0FF0: _realloc_custom (zend_alloc.c:2434)
==21533==    by 0x7A1140: _erealloc (zend_alloc.c:2556)
==21533==    by 0x4FF7BD: zend_string_extend (zend_string.h:205)
==21533==    by 0x503D4A: php_pcre_replace_impl (php_pcre.c:1668)
==21533==    by 0x50391C: php_pcre_replace (php_pcre.c:1544)
==21533==    by 0x504EB4: php_replace_in_subject (php_pcre.c:2127)
==21533==    by 0x50594B: preg_replace_common (php_pcre.c:2268)
==21533==    by 0x505D25: zif_preg_replace (php_pcre.c:2326)
==21533==    by 0x83E2B1: ZEND_DO_ICALL_SPEC_RETVAL_UNUSED_HANDLER (zend_vm_execute.h:1240)
==21533==    by 0x89C658: execute_ex (zend_vm_execute.h:51879)

Patches

pcre_unicode_memory_corruption.patch (last revision 2020-01-29 09:21 UTC by cschneid@php.net)

History

 [2020-01-29 09:21 UTC] cschneid@php.net
The following patch has been added/updated:

Patch Name: pcre_unicode_memory_corruption.patch
Revision:   1580289683
URL:        https://bugs.php.net/patch-display.php?bug=79188&patch=pcre_unicode_memory_corruption.patch&revision=1580289683
 [2020-01-29 09:29 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2020-01-29 10:01 UTC] nikic@php.net
I've put up an alternative patch at https://github.com/php/php-src/pull/5126, which I think integrates better with the general structure of the code, and can thus be more efficient. WDYT?
 [2020-01-29 11:06 UTC] cschneid@php.net
Looks good, seems to fix the bug and be more in line with the rest of the code.
It's a somewhat more complex change than my patch, so I can't really tell off-hand whether it has any side effects, but I trust you.
 [2020-02-05 10:22 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=13bfa9f5ac04a65300cf20211e2e3314e827595d
Log: Fixed bug #79188
 [2020-02-05 10:22 UTC] nikic@php.net
-Status: Verified +Status: Closed