Bug #79188 Memory corruption in preg_replace/preg_replace_callback and unicode
Submitted: 2020-01-29 09:20 UTC Modified: 2020-01-29 10:01 UTC
From: cschneid@php.net Assigned:
Status: Closed Package: PCRE related
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None
 [2020-01-29 09:20 UTC] cschneid@php.net
Description:
------------
When preg_replace is called with an empty pattern AND an empty replacement string on a Unicode string (pattern using the /u modifier), the result buffer can overflow, causing memory corruption.

The easiest way to reproduce is to compile PHP with --enable-debug and run it under valgrind; see the test script below.

The bug was reproduced with PHP >= 7.0; older versions were not tested.

The patch adds extra checks that grow the result string buffer when needed, since I wasn't entirely sure how to fix the new_len calculation above them. Maybe someone with a deeper understanding of the code can come up with a better patch.
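
For readers unfamiliar with this class of bug, the following is a minimal sketch in C of the "check capacity before every copy" idea. It is not the code from php_pcre.c, nor either of the patches discussed in this report: buf_t, buf_reserve, utf8_unit_len and copy_by_units are hypothetical names made up for illustration, and the input is assumed to be valid UTF-8. It emulates what an empty pattern with an empty replacement should do, namely step through the subject one whole character at a time and copy each character to the result, growing the buffer whenever the next multi-byte character would not fit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; NOT a php-src type. */
typedef struct {
    char  *data;
    size_t len;   /* bytes currently used */
    size_t cap;   /* bytes allocated */
} buf_t;

/* Byte length of the UTF-8 sequence introduced by lead byte c.
 * Assumes valid UTF-8; U+1F612 from the test script is 4 bytes. */
static size_t utf8_unit_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Make sure `extra` more bytes fit into the buffer. Returns 0 on success. */
static int buf_reserve(buf_t *b, size_t extra)
{
    if (b->len + extra <= b->cap) return 0;
    size_t new_cap = b->cap ? b->cap : 16;
    while (new_cap < b->len + extra) new_cap *= 2;
    char *p = realloc(b->data, new_cap);
    if (p == NULL) return -1;
    b->data = p;
    b->cap = new_cap;
    return 0;
}

/* Copy the subject one character at a time, as happens when every
 * match is empty and the replacement is empty. Returns a malloc'd,
 * NUL-terminated copy (caller frees), or NULL on allocation failure. */
static char *copy_by_units(const char *subject, size_t subject_len, size_t *out_len)
{
    buf_t b = {NULL, 0, 0};
    size_t pos = 0;

    while (pos < subject_len) {
        size_t unit = utf8_unit_len((unsigned char)subject[pos]);
        if (unit > subject_len - pos) unit = subject_len - pos; /* truncated tail */

        /* Reserve BEFORE copying; skipping this step is what lets a
         * multi-byte character run past the end of the allocation. */
        if (buf_reserve(&b, unit) != 0) { free(b.data); return NULL; }
        assert(b.len + unit <= b.cap);

        memcpy(b.data + b.len, subject + pos, unit);
        b.len += unit;
        pos += unit;
    }

    if (buf_reserve(&b, 1) != 0) { free(b.data); return NULL; }
    b.data[b.len] = '\0';
    if (out_len != NULL) *out_len = b.len;
    return b.data;
}
```

Calling copy_by_units on "a" followed by several 4-byte characters returns a byte-for-byte copy of the input, which is also what the preg_replace call in the test script should produce once the overflow is fixed.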

Test script:
---------------
USE_ZEND_ALLOC=0 valgrind --undef-value-errors=no sapi/cli/php -r 'preg_replace("//u", "", "a" . str_repeat("\u{1f612}", 10));'

and

USE_ZEND_ALLOC=0 valgrind --undef-value-errors=no sapi/cli/php -r 'preg_replace_callback("//u", function() { return ""; }, "a" . str_repeat("\u{1f612}", 10));'


Actual result:
--------------
$ USE_ZEND_ALLOC=0 valgrind --undef-value-errors=no sapi/cli/php -r 'preg_replace("//u", "", "a" . str_repeat("\u{1f612}", 10));'
==21533== Memcheck, a memory error detector
==21533== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==21533== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==21533== Command: sapi/cli/php -r preg_replace("//u",\ "",\ "a"\ .\ str_repeat("\\u{1f612}",\ 10));
==21533== 
==21533== Invalid write of size 1
==21533==    at 0x4C358EB: memmove (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21533==    by 0x504092: php_pcre_replace_impl (php_pcre.c:1737)
==21533==    by 0x50391C: php_pcre_replace (php_pcre.c:1544)
==21533==    by 0x504EB4: php_replace_in_subject (php_pcre.c:2127)
==21533==    by 0x50594B: preg_replace_common (php_pcre.c:2268)
==21533==    by 0x505D25: zif_preg_replace (php_pcre.c:2326)
==21533==    by 0x83E2B1: ZEND_DO_ICALL_SPEC_RETVAL_UNUSED_HANDLER (zend_vm_execute.h:1240)
==21533==    by 0x89C658: execute_ex (zend_vm_execute.h:51879)
==21533==    by 0x8A06D9: zend_execute (zend_vm_execute.h:55983)
==21533==    by 0x7C095E: zend_eval_stringl (zend_execute_API.c:1010)
==21533==    by 0x7C0AF7: zend_eval_stringl_ex (zend_execute_API.c:1051)
==21533==    by 0x7C0B6F: zend_eval_string_ex (zend_execute_API.c:1062)
==21533==  Address 0x6d56c30 is 0 bytes after a block of size 64 alloc'd
==21533==    at 0x4C308BF: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21533==    by 0x7A2004: __zend_realloc (zend_alloc.c:2994)
==21533==    by 0x7A0FF0: _realloc_custom (zend_alloc.c:2434)
==21533==    by 0x7A1140: _erealloc (zend_alloc.c:2556)
==21533==    by 0x4FF7BD: zend_string_extend (zend_string.h:205)
==21533==    by 0x503D4A: php_pcre_replace_impl (php_pcre.c:1668)
==21533==    by 0x50391C: php_pcre_replace (php_pcre.c:1544)
==21533==    by 0x504EB4: php_replace_in_subject (php_pcre.c:2127)
==21533==    by 0x50594B: preg_replace_common (php_pcre.c:2268)
==21533==    by 0x505D25: zif_preg_replace (php_pcre.c:2326)
==21533==    by 0x83E2B1: ZEND_DO_ICALL_SPEC_RETVAL_UNUSED_HANDLER (zend_vm_execute.h:1240)
==21533==    by 0x89C658: execute_ex (zend_vm_execute.h:51879)

Patches

pcre_unicode_memory_corruption.patch (last revision 2020-01-29 09:21 UTC by cschneid@php.net)

Pull Requests

History

 [2020-01-29 09:21 UTC] cschneid@php.net
The following patch has been added/updated:

Patch Name: pcre_unicode_memory_corruption.patch
Revision:   1580289683
URL:        https://bugs.php.net/patch-display.php?bug=79188&patch=pcre_unicode_memory_corruption.patch&revision=1580289683
 [2020-01-29 09:29 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2020-01-29 10:01 UTC] nikic@php.net
I've put up an alternative patch at https://github.com/php/php-src/pull/5126, which I think integrates better with the general structure of the code, and can thus be more efficient. WDYT?
 [2020-01-29 11:06 UTC] cschneid@php.net
Looks good, seems to fix the bug and be more in line with the rest of the code.
It's a somewhat more complex change than my patch, so I can't really tell off-hand whether it has any side effects, but I trust you.
 [2020-02-05 10:22 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=13bfa9f5ac04a65300cf20211e2e3314e827595d
Log: Fixed bug #79188
 [2020-02-05 10:22 UTC] nikic@php.net
-Status: Verified +Status: Closed
 