Bug #75153 wrap words are getting utf8 error
Submitted: 2017-09-04 14:35 UTC Modified: 2017-09-04 16:17 UTC
From: patrykmoura at gmail dot com Assigned:
Status: Not a bug Package: Strings related
PHP Version: 7.0.23 OS: Unix and Windows
Private report: No CVE-ID: None
 [2017-09-04 14:35 UTC] patrykmoura at gmail dot com
Description:
------------
Hello,
When I run wordwrap, a preg_replace with a regex, or an implode over str_split with a limit of 10 on the word "Higienização", it returns "Higieniza??ão" instead of "Higienizaç ão". I guess it has something to do with the Ç character.

Thanks

Test script:
---------------
echo wordwrap("Higienização", 10, " ", true);
echo implode(PHP_EOL, str_split("Higienização", 10));
echo preg_replace('/([^\s]{10})(?=[^\s])/', '$1'.' ', "Higienização");

Expected result:
----------------
"Higienizaç ão"

Actual result:
--------------
"Higieniza??ão"

History

 [2017-09-04 15:57 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2017-09-04 15:57 UTC] requinix@php.net
As with many string functions, wordwrap() is not multibyte-safe. Unfortunately, the mbstring extension does not have an equivalent, but the user comments in the docs for wordwrap suggest a couple of PCRE-based solutions for UTF-8 encoding.
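
For illustration, a PCRE-based wrapper along those lines might look like the sketch below. This is not one of the solutions from the docs comments; utf8_wordwrap is a hypothetical name, and the code assumes valid UTF-8 input and a width of at least 1. It only approximates wordwrap() with the cut flag enabled: it splits on spaces only and collapses runs of spaces.

```php
<?php
// Hypothetical helper; PHP does not ship this function.
// Assumes $text is valid UTF-8 and $width >= 1.
function utf8_wordwrap(string $text, int $width = 75, string $break = "\n"): string
{
    // Count characters (code points), not bytes; /u puts PCRE in UTF-8 mode.
    $len = function (string $s): int {
        return (int) preg_match_all('/./us', $s);
    };

    $lines = [];
    $line = '';
    foreach (preg_split('/ +/u', $text, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        // Cut over-long words into pieces of at most $width characters.
        $pieces = preg_split('/(.{' . $width . '})/us', $word, -1,
            PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
        foreach ($pieces as $piece) {
            $candidate = $line === '' ? $piece : $line . ' ' . $piece;
            if ($len($candidate) <= $width) {
                $line = $candidate;   // still fits on the current line
            } else {
                $lines[] = $line;     // flush the current line, start a new one
                $line = $piece;
            }
        }
    }
    if ($line !== '') {
        $lines[] = $line;
    }
    return implode($break, $lines);
}

echo utf8_wordwrap("Higienização", 10, " "), PHP_EOL; // Higienizaç ão
```

The byte-based wordwrap() cuts after 10 bytes, which lands in the middle of the two-byte "ç"; counting with the /u modifier keeps the character intact.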
 [2017-09-04 16:13 UTC] patrykmoura at gmail dot com
Hi,

I understand the reason for wordwrap, but why do the other approaches I tried lead to the same error?
 [2017-09-04 16:17 UTC] nikic@php.net
Neither wordwrap() nor str_split() is multibyte-safe. Your PCRE version is just missing the /u modifier.
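
As a minimal sketch of that fix, the snippet below reuses the pattern from the test script with /u added, plus a character-based stand-in for str_split(). The name utf8_str_split is hypothetical (not a built-in), and both parts assume the input is valid UTF-8.

```php
<?php
$string = "Higienização";

// Same pattern as in the report, plus /u so that [^\s] matches whole
// UTF-8 characters instead of single bytes.
echo preg_replace('/([^\s]{10})(?=[^\s])/u', '$1 ', $string), PHP_EOL;
// Higienizaç ão

// Hypothetical replacement for str_split(): with /u, '.' matches one
// code point, so each chunk holds at most $length characters.
function utf8_str_split(string $text, int $length): array
{
    return preg_split('/(.{' . $length . '})/us', $text, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

echo implode(PHP_EOL, utf8_str_split($string, 10)), PHP_EOL;
// Higienizaç
// ão
```

On PHP 7.4 and later, mb_str_split() from the mbstring extension covers the str_split() case directly; the PCRE version above is for older releases such as the 7.0.23 in this report.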
 