Bug #75832 Segmentation fault after preg_match w/ utf8 (or extended ascii?)
Submitted: 2018-01-17 12:16 UTC Modified: 2018-05-05 21:53 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: stenschke at gyselroth dot com Assigned:
Status: No Feedback Package: PCRE related
PHP Version: 7.1Git-2018-01-17 (Git) OS: Linux
Private report: No CVE-ID: None
 [2018-01-17 12:16 UTC] stenschke at gyselroth dot com
Description:
------------
The following code causes PHP to fail with "Segmentation fault (core dumped)":

utf8_decode('<w:fldSimple w:instr=" MERGEFIELD Üüü \* MERGEFORMAT "><w:r><w:t xml:space="preserve">«Üüü»</w:t></w:r></w:fldSimple>');
preg_match('/<w:fldSimple w:instr="\s*MERGEFIELD\s*([a-zäöü|\d|_|\.]*)\s*\\\*\s*/u', $xml, $matches);

P.S. The exact PHP version used (not available in the bug tracker's options) is: PHP 7.1.11-1+ubuntu16.04.1+deb.sury.org+1 (cli) (built: Oct 27 2017 13:49:56)

Actual result:
--------------
Array()

History

 [2018-01-17 12:36 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2018-01-17 12:36 UTC] requinix@php.net
Thank you for this bug report. To properly diagnose the problem, we
need a backtrace to see what is happening behind the scenes. To
find out how to generate a backtrace, please read
http://bugs.php.net/bugs-generating-backtrace.php for *NIX and
http://bugs.php.net/bugs-generating-backtrace-win32.php for Win32

Once you have generated a backtrace, please submit it to this bug
report and change the status back to "Open". Thank you for helping
us make PHP better.


 [2018-01-17 12:51 UTC] stenschke at gyselroth dot com
-Status: Feedback +Status: Open
 [2018-01-17 12:51 UTC] stenschke at gyselroth dot com
I made a copy/paste error; the correct code is:

$xml = utf8_decode('<w:fldSimple w:instr=" MERGEFIELD Üüü \* MERGEFORMAT "><w:r><w:t xml:space="preserve">«Üüü»</w:t></w:r></w:fldSimple>');
preg_match('/<w:fldSimple w:instr="\s*MERGEFIELD\s*([a-zäöü|\d|_|\.]*)\s*\\\*\s*/u', $xml, $matches);
 [2018-01-17 12:57 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2018-01-17 12:57 UTC] requinix@php.net
Still want that backtrace.

But don't mix an ISO 8859-1 input string with a UTF-8 pattern and the /u flag. It won't work. (Shouldn't crash, though.)
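
A minimal sketch of the encoding mismatch described above, not a reproduction of the crash. It assumes a PHP build with the bundled PCRE; describe_match() is a hypothetical helper introduced only for this illustration, and the pattern is a shortened stand-in for the one in the report. With the /u flag, preg_match() is expected to return false and set PREG_BAD_UTF8_ERROR when the subject is not valid UTF-8, which is what ISO 8859-1 bytes such as those produced by utf8_decode() amount to.

```php
<?php
// "Üüü" as ISO 8859-1 bytes (what utf8_decode() would yield) and as UTF-8.
$latin1 = "\xDC\xFC\xFC";
$utf8   = "\xC3\x9C\xC3\xBC\xC3\xBC";

// Hypothetical helper (not part of PHP): run a pattern and report
// encoding errors explicitly instead of silently returning nothing.
function describe_match(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject, $matches);
    if ($result === false) {
        // preg_match() returns false on failure; preg_last_error() says why.
        return preg_last_error() === PREG_BAD_UTF8_ERROR
            ? 'invalid UTF-8 subject'
            : 'other PCRE error';
    }
    return $result === 1 ? 'matched: ' . $matches[0] : 'no match';
}

// Letters a-z plus ä ö ü Ä Ö Ü, written as code point escapes so the
// example does not depend on the encoding of the source file.
$pattern = '/[a-z\x{E4}\x{F6}\x{FC}\x{C4}\x{D6}\x{DC}]+/u';

echo describe_match($pattern, $latin1), "\n"; // subject is ISO 8859-1
echo describe_match($pattern, $utf8), "\n";   // subject is UTF-8
```

Keeping the subject in UTF-8 (that is, dropping the utf8_decode() call) or dropping /u and writing the pattern in ISO 8859-1 would make the two sides agree. Checking the return value against false is what distinguishes an encoding error from an ordinary non-match.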
 [2018-05-05 21:53 UTC] requinix@php.net
-Status: Feedback +Status: No Feedback
 
Last updated: Thu Oct 10 13:01:26 2024 UTC