Request #75214 trim&rtrim function makes some Chinese characters messy
Submitted: 2017-09-16 01:39 UTC Modified: 2017-09-17 23:11 UTC
Votes:1
Avg. Score:2.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: yzhsh89 at 126 dot com Assigned:
Status: Duplicate Package: *Unicode Issues
PHP Version: 5.6.31 OS: ANY
Private report: No CVE-ID: None
 [2017-09-16 01:39 UTC] yzhsh89 at 126 dot com
Description:
------------
Some Chinese characters come out garbled after passing through trim() or rtrim(), for example the character "言". Other characters have not been tested.
I am using PHP 5.6.30; other versions have not been tested.

Test script:
---------------
$str = "、汉语言、";
echo ltrim($str, '、')."<br/>"; //correct
echo rtrim($str, '、')."<br/>"; //messy
echo trim($str, '、')."<br/>"; //messy

Expected result:
----------------
汉语言、
、汉语言
汉语言

Actual result:
--------------
汉语言、
、汉语�
汉语�

History

 [2017-09-16 02:36 UTC] Wes dot example at example dot org
Hi, PHP strings are generic byte sequences, not character sequences.

This means the second parameter of trim(), ltrim() and rtrim() is treated as a set of single bytes, so you cannot pass a multibyte character there like that.

Look for a trim/ltrim/rtrim implementation that works on multibyte strings, and look at the functions bundled in the mbstring, intl or iconv extensions.
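
To make the byte-level behaviour concrete, here is a small sketch that inspects the strings from the test script. It assumes the source file is saved as UTF-8; the variable names are only for illustration. In UTF-8, "、" is the three bytes E3 80 81 and "言" ends in the byte 80, which is why rtrim() removes part of "言" as well.

```php
<?php
// Assumption: this file and all string literals are UTF-8 encoded.
$str  = "、汉语言、";
$mask = "、";

// The mask is not one character to rtrim(); it is the byte set {E3, 80, 81}.
echo bin2hex($mask), "\n";

// "言" is E8 A8 80, so its last byte (80) is also in the mask.
echo bin2hex("言"), "\n";

// rtrim() strips the trailing "、" and then the 0x80 byte of "言".
$trimmed = rtrim($str, $mask);
$tail    = bin2hex(substr($trimmed, -2));
echo $tail, "\n";

// preg_match() with the /u modifier returns false for invalid UTF-8 input,
// which serves here as a quick validity check.
$isValidUtf8 = (preg_match('//u', $trimmed) === 1);
var_dump($isValidUtf8);
```

ltrim() happens to work in this case only because the first byte of "汉" (E6) is not in the mask.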
 [2017-09-16 02:45 UTC] Wes dot example at example dot org
Here's a list of functions you can and cannot use with multibyte encodings:
http://www.phpwact.org/php/i18n/utf-8?rev=1165702501
 [2017-09-16 03:18 UTC] yohgaki@php.net
-Type: Bug +Type: Feature/Change Request
 [2017-09-16 03:18 UTC] yohgaki@php.net
Standard string functions basically do not support multibyte encodings.
You cannot use multibyte characters as trim characters.
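
As a possible workaround, a UTF-8-aware trim can be sketched with preg_replace() and the /u modifier. The helper name utf8_trim_chars() and its $side parameter are hypothetical and not part of PHP; the sketch assumes both arguments are valid UTF-8.

```php
<?php
// Hypothetical helper, not a PHP built-in: trims whole UTF-8 characters
// (rather than single bytes) from one or both ends of a string.
// Assumption: $str and $chars are valid UTF-8; preg_replace() returns null otherwise.
function utf8_trim_chars($str, $chars, $side = 'both')
{
    if ($chars === '') {
        return $str; // nothing to trim, and "[]" would be an invalid pattern
    }

    // Escape regex metacharacters so $chars is safe inside a character class.
    $class = '[' . preg_quote($chars, '/') . ']+';

    if ($side === 'left') {
        $pattern = '/^' . $class . '/u';
    } elseif ($side === 'right') {
        // \z anchors at the very end of the string.
        $pattern = '/' . $class . '\z/u';
    } else {
        $pattern = '/^' . $class . '|' . $class . '\z/u';
    }

    return preg_replace($pattern, '', $str);
}

$str = "、汉语言、";
echo utf8_trim_chars($str, '、', 'left'), "\n";
echo utf8_trim_chars($str, '、', 'right'), "\n";
echo utf8_trim_chars($str, '、'), "\n";
```

With the /u modifier the pattern and subject are matched as UTF-8 characters, so the character class contains "、" as one unit instead of three independent bytes.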
 [2017-09-16 13:28 UTC] cmb@php.net
Request #23501 already suggests adding mb_trim(). Shouldn't we
close this ticket as a duplicate?
 [2017-09-17 23:11 UTC] yohgaki@php.net
-Status: Open +Status: Duplicate
 [2017-09-17 23:11 UTC] yohgaki@php.net
I agree. Marking this as a duplicate of Request #23501.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Dec 21 18:01:29 2024 UTC