Bug #75832 Segmentation fault after preg_match w/ utf8 (or extended ascii?)
Submitted: 2018-01-17 12:16 UTC
Modified: 2018-05-05 21:53 UTC
Votes: 1
Avg. Score: 4.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: stenschke at gyselroth dot com
Assigned:
Status: No Feedback
Package: PCRE related
PHP Version: 7.1Git-2018-01-17 (Git)
OS: Linux
Private report: No
CVE-ID: None
 [2018-01-17 12:16 UTC] stenschke at gyselroth dot com
Description:
------------
The following code crashes PHP with the error "Segmentation fault (core dumped)":

utf8_decode('<w:fldSimple w:instr=" MERGEFIELD Üüü \* MERGEFORMAT "><w:r><w:t xml:space="preserve">«Üüü»</w:t></w:r></w:fldSimple>');
preg_match('/<w:fldSimple w:instr="\s*MERGEFIELD\s*([a-zäöü|\d|_|\.]*)\s*\\\*\s*/u', $xml, $matches);

P.S. The exact PHP version used (not available in the bug tracker's options) is: PHP 7.1.11-1+ubuntu16.04.1+deb.sury.org+1 (cli) (built: Oct 27 2017 13:49:56)

Actual result:
--------------
Array()

History

 [2018-01-17 12:36 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2018-01-17 12:36 UTC] requinix@php.net
Thank you for this bug report. To properly diagnose the problem, we
need a backtrace to see what is happening behind the scenes. To
find out how to generate a backtrace, please read
http://bugs.php.net/bugs-generating-backtrace.php for *NIX and
http://bugs.php.net/bugs-generating-backtrace-win32.php for Win32

Once you have generated a backtrace, please submit it to this bug
report and change the status back to "Open". Thank you for helping
us make PHP better.


 [2018-01-17 12:51 UTC] stenschke at gyselroth dot com
-Status: Feedback +Status: Open
 [2018-01-17 12:51 UTC] stenschke at gyselroth dot com
I made a copy/paste error; the correct code is:

$xml = utf8_decode('<w:fldSimple w:instr=" MERGEFIELD Üüü \* MERGEFORMAT "><w:r><w:t xml:space="preserve">«Üüü»</w:t></w:r></w:fldSimple>');
preg_match('/<w:fldSimple w:instr="\s*MERGEFIELD\s*([a-zäöü|\d|_|\.]*)\s*\\\*\s*/u', $xml, $matches);
 [2018-01-17 12:57 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2018-01-17 12:57 UTC] requinix@php.net
Still want that backtrace.

But don't mix an ISO 8859-1 input string with a UTF-8 pattern and the /u flag. It won't work. (Shouldn't crash, though.)
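
To illustrate the encoding mismatch described above, here is a minimal sketch, assuming a PHP build where the call does not crash. The simplified pattern and the variable names are hypothetical and not taken from the report. The byte escapes spell out "Üüü" in each encoding so the example does not depend on how the source file is saved.

```php
<?php
// Hypothetical, simplified pattern (not the reporter's): with /u,
// \p{L} matches any Unicode letter, including Ü and ü.
$pattern = '/MERGEFIELD\s+([\p{L}\d_.]+)/u';

// "Üüü" as ISO 8859-1 bytes, i.e. what utf8_decode() would produce.
$latin1 = " MERGEFIELD \xDC\xFC\xFC ";

// "Üüü" as UTF-8 bytes, i.e. the string before utf8_decode().
$utf8 = " MERGEFIELD \xC3\x9C\xC3\xBC\xC3\xBC ";

// The ISO 8859-1 bytes are not valid UTF-8, so the /u match is
// rejected before any matching happens.
$r1 = preg_match($pattern, $latin1, $m1);
$e1 = preg_last_error();
var_dump($r1 === false, $e1 === PREG_BAD_UTF8_ERROR);

// Keeping the subject in UTF-8 lets the same pattern match.
$r2 = preg_match($pattern, $utf8, $m2);
var_dump($r2, $m2[1] ?? null);
```

Checking for a `false` return value and calling `preg_last_error()` separates "the subject is not valid UTF-8" from "the pattern did not match", which the empty `$matches` array alone does not show.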
 [2018-05-05 21:53 UTC] requinix@php.net
-Status: Feedback +Status: No Feedback
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Mon Dec 30 14:01:28 2024 UTC