Bug #75153 wrap words are getting utf8 error
Submitted: 2017-09-04 14:35 UTC Modified: 2017-09-04 16:17 UTC
From: patrykmoura at gmail dot com Assigned:
Status: Not a bug Package: Strings related
PHP Version: 7.0.23 OS: Unix and Windows
Private report: No CVE-ID: None
 [2017-09-04 14:35 UTC] patrykmoura at gmail dot com
Description:
------------
Hello,
When I break the word "Higienização" at 10 characters using wordwrap(), str_split() with implode(), or preg_replace() with a regex, I get "Higieniza??ão" instead of "Higienizaç ão". I suspect the problem is related to the "ç" character.

Thanks

Test script:
---------------
echo wordwrap("Higienização", 10, " ", true);
echo implode(PHP_EOL, str_split("Higienização", 10));
echo preg_replace('/([^\s]{10})(?=[^\s])/', '$1'.' ', "Higienização");

Expected result:
----------------
"Higienizaç ão"

Actual result:
--------------
"Higieniza??ão"

 [2017-09-04 15:57 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2017-09-04 15:57 UTC] requinix@php.net
As with many string functions, wordwrap() is not multibyte-safe. Unfortunately the mbstring extension does not have an equivalent, but the user comments in the docs for wordwrap suggest a couple of PCRE-based solutions for UTF-8 encoding.
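For illustration only, here is a minimal sketch of what a code-point-aware wrapper could look like. The function name utf8_wordwrap() is hypothetical (it is not a PHP built-in and not one of the solutions from the docs comments), it always cuts over-long words, and it assumes the input is valid UTF-8. It counts code points, so a decomposed character such as "c" followed by a combining cedilla would still count as two.

```php
<?php
// Hypothetical helper (not a PHP built-in): wordwrap()-like wrapping that
// counts UTF-8 code points instead of bytes and always cuts over-long words.
function utf8_wordwrap(string $text, int $width = 75, string $break = "\n"): string
{
    if ($width < 1) {
        throw new InvalidArgumentException('$width must be at least 1');
    }

    $lines = [];
    $line = '';     // line currently being built
    $lineLen = 0;   // its length in code points

    foreach (explode(' ', $text) as $word) {
        // The /u modifier makes PCRE split between code points, not bytes.
        $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
        if ($chars === false) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }

        // Words longer than $width are cut into pieces of at most $width.
        foreach (array_chunk($chars, $width) as $i => $chunk) {
            $piece = implode('', $chunk);
            $pieceLen = count($chunk);

            if ($line === '') {
                $line = $piece;
                $lineLen = $pieceLen;
            } elseif ($i === 0 && $lineLen + 1 + $pieceLen <= $width) {
                // A new word that still fits on the current line.
                $line .= ' ' . $piece;
                $lineLen += 1 + $pieceLen;
            } else {
                // Either the word does not fit, or this is a later piece
                // of a word that was cut: start a new line.
                $lines[] = $line;
                $line = $piece;
                $lineLen = $pieceLen;
            }
        }
    }

    if ($line !== '') {
        $lines[] = $line;
    }

    return implode($break, $lines);
}

echo utf8_wordwrap("Higienização", 10, " "), PHP_EOL; // Higienizaç ão
```

The explicit loop is longer than a single regex, but it makes the width accounting easy to follow and avoids a trailing break at the end of the result.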
 [2017-09-04 16:13 UTC] patrykmoura at gmail dot com
Hi,

I understand the reason for wordwrap, but why do the other approaches I tried lead to the same error?
 [2017-09-04 16:17 UTC] nikic@php.net
Neither wordwrap() nor str_split() is multibyte-safe. Your PCRE version just misses the /u modifier.
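As a rough sketch of that fix, assuming the script file and the string are UTF-8 encoded: the first call is the pattern from the report with only /u added, and the second is one possible code-point-aware stand-in for str_split() built on preg_match_all(). The variable names are illustrative.

```php
<?php
$string = "Higienização"; // assumes this file is saved as UTF-8

// Same pattern as in the report, with /u added so that [^\s] matches
// whole code points rather than single bytes.
$wrapped = preg_replace('/([^\s]{10})(?=[^\s])/u', '$1 ', $string);
echo $wrapped, PHP_EOL; // Higienizaç ão

// A code-point-aware stand-in for str_split($string, 10): each match is
// a run of up to 10 characters (the s flag lets . match newlines too).
preg_match_all('/.{1,10}/us', $string, $matches);
echo implode(' ', $matches[0]), PHP_EOL; // Higienizaç ão
```

Without /u, PCRE treats the subject as a sequence of bytes, so the tenth "character" is only the first byte of the two-byte "ç" and the inserted space splits it in half.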
 