PHP :: Bug #76950 :: trim not working

Bug #76950	trim not working
Submitted:	2018-09-29 20:36 UTC	Modified:	2018-10-11 11:41 UTC
From:	tobias at tromm dot no-ip dot org	Assigned:
Status:	Not a bug	Package:	*General Issues
PHP Version:	7.2.10	OS:	Windows Server 2016
Private report:	No	CVE-ID:	None


[2018-09-29 20:36 UTC] tobias at tromm dot no-ip dot org

Description:
------------
Hi.

The function rtrim does not work correctly if a user uses characters from the ASCII table to cheat the system.

Please check the full example I am giving, which includes a fix [there is probably a better way to fix it, but it is working].


Test script:
---------------
https://www.papinho.com/PHP_bug.zip

Expected result:
----------------
Result String: "David"

This is the result with my fix script.

Actual result:
--------------
Given String: "           David           "

Given String with the use of php trim function: "          David          "


[2018-09-29 20:46 UTC] requinix@php.net

-Status: Open
+Status: Feedback

[2018-09-29 20:46 UTC] requinix@php.net

If you have a repro script then include it in the report. Don't link us a .zip to download and open.

[2018-09-29 20:48 UTC] tobias at tromm dot no-ip dot org

-Summary: rtrim not working
+Summary: trim not working
-Status: Feedback
+Status: Open

[2018-09-29 20:48 UTC] tobias at tromm dot no-ip dot org

fix title

[2018-09-29 20:52 UTC] tobias at tromm dot no-ip dot org

I am afraid that if I copy the script here the characters will be converted and the result will change.

That's why I provided a zip.

[2018-09-29 20:53 UTC] requinix@php.net

<?php

   $apelido = "           David           ";

   echo "Given String: \"".$apelido."\"<br><br>";
   echo "Given String with the use of php trim function: \"".trim($apelido)."\"<br><br>";

   echo "Now we will check the ASCII table to see which characters are being used:<br><br>";

   for ($cont = 0; $cont < strlen($apelido); $cont++) {
      echo "ORD: ".ord($apelido[$cont])."<br>";
   }

   $fazer = 1;

   do {
      $fazer = 0;

      //Check the beginning of the string
      if ($apelido[0] == chr(32) OR $apelido[0] == chr(194) OR $apelido[0] == chr(160)) {
         $fazer = 1;
         $apelido = ltrim($apelido, chr(32));
         $apelido = ltrim($apelido, chr(194));
         $apelido = ltrim($apelido, chr(160));
      }

      //Check the end of the string
      if ($apelido[strlen($apelido)-1] == chr(32) OR $apelido[strlen($apelido)-1] == chr(194) OR $apelido[strlen($apelido)-1] == chr(160)) {
         $fazer = 1;
         $apelido = rtrim($apelido, chr(32));
         $apelido = rtrim($apelido, chr(194));
         $apelido = rtrim($apelido, chr(160));
      }

   } while ($fazer == 1);

   echo "<br>Result String: \"".$apelido."\"<br><br>";

?>

[2018-09-29 20:56 UTC] tobias at tromm dot no-ip dot org

PLEASE DON'T COPY THIS CODE, IT WILL CHANGE THE RESULT COMPLETELY!!!

I just copied it and pasted it into my PHP editor and the result changed.

Use the zip file from the link I provided.

Otherwise, the space char will be converted!

[2018-09-29 20:59 UTC] requinix@php.net

-Status: Open
+Status: Not a bug

[2018-09-29 20:59 UTC] requinix@php.net

Yes, it was converted. But for next time, familiarize yourself with functions like addcslashes so that code CAN be copied and pasted.

$apelido = " \302\240 \302\240 \302\240 \302\240 \302\240 David \302\240 \302\240 \302\240 \302\240 \302\240 ";

The documentation for rtrim explicitly states what characters are trimmed by default. If you don't like that list then provide your own.
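As a rough illustration of both suggestions above, the sketch below rebuilds the string with explicit escapes, prints it with addcslashes so it can be pasted safely, and passes a custom character list to trim. The variable names $nbsp, $escaped, $default and $custom are illustrative only. One caveat: the character list is matched byte by byte, so listing \xC2 and \xA0 can also cut into other UTF-8 characters that happen to end in those bytes.

```php
<?php
// UTF-8 byte sequence for U+00A0 NO-BREAK SPACE, written as escapes
// so that copy and paste cannot silently convert it.
$nbsp = "\xC2\xA0";
$apelido = " {$nbsp} {$nbsp} David {$nbsp} {$nbsp} ";

// addcslashes() with the range \x80..\xFF turns every high byte into an
// octal escape, giving a representation that is safe to paste in a report.
$escaped = addcslashes($apelido, "\x80..\xFF");
echo $escaped, "\n"; // prints: \302\240 \302\240 David \302\240 \302\240 (with a space on each side)

// Default list: space, \t, \n, \r, \0 and \x0B only, so it stops at \xC2.
$default = trim($apelido);

// Custom list: the defaults plus the two bytes of the UTF-8 no-break space.
// This is byte based, not character based.
$custom = trim($apelido, " \t\n\r\0\x0B\xC2\xA0");

echo strlen($default), "\n"; // still longer than 5 bytes
echo $custom, "\n";          // prints: David
```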

[2018-09-29 21:29 UTC] requinix@php.net

You probably need the longer explanation.

Welcome to the world of character encodings. Nearly all of PHP's normal functions work on bytes. You are thinking about characters. Since PHP doesn't internally manage Unicode characters, the only reasonable way PHP can currently convert between the two is to look at the byte range that all ("all") character encodings can agree upon: the 0-127 range. That means functions like trim will only deal with the bytes \n\r\t\v and space, and \0 for the fun of it, and they will not cover anything above \x7F.

Those bytes blocking trim from reducing the entire string to "David" are above \x7F. The exact interpretation of what characters those bytes are depends on the character encoding. In Latin1, \xA0 (\240) is a non-breaking space, but in UTF-8 it is not a character at all - instead it is part of a 2-4 byte sequence that represents a character (and the sequence for a non-breaking space is \xC2\xA0 or \302\240).
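A small sketch of the byte view described here, assuming the input is meant to be UTF-8. It only uses core functions; the variable names are illustrative. The two byte values it prints are the same 194 and 160 that the reporter's script compares against with chr().

```php
<?php
// The UTF-8 encoding of U+00A0 NO-BREAK SPACE is two bytes.
$nbspUtf8 = "\xC2\xA0";

// Byte values as ord() sees them: 194 (\xC2) and 160 (\xA0).
$bytes = array_map('ord', str_split($nbspUtf8));
echo implode(' ', $bytes), "\n"; // prints: 194 160

// strlen() counts bytes, so one visible character reports a length of 2.
echo strlen($nbspUtf8), "\n"; // prints: 2

// An empty pattern with the /u modifier makes PCRE validate the subject
// as UTF-8. A lone \xA0 is a continuation byte and is rejected
// (preg_match returns false), while the full pair is accepted.
$loneIsValid = preg_match('//u', "\xA0") === 1;
$pairIsValid = preg_match('//u', $nbspUtf8) === 1;
var_dump($loneIsValid, $pairIsValid); // bool(false), bool(true)

// trim() with its default list leaves the sequence untouched.
$untouched = trim($nbspUtf8 . "David");
echo strlen($untouched), "\n"; // prints: 7
```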

There is no mb_trim function but you can use preg_replace with \s and the /u option.
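One possible shape for that regex-based approach is sketched below. The function name trim_unicode is hypothetical, not a PHP built-in, and the sketch assumes the input is UTF-8. The \x00 in the class is there because \s does not match the NUL byte that trim() removes by default, and the fallback exists because preg_replace returns null when the subject is not valid UTF-8.

```php
<?php
/**
 * Hypothetical helper: strips leading and trailing whitespace, including
 * Unicode whitespace such as U+00A0, from a UTF-8 string.
 */
function trim_unicode(string $value): string
{
    // With the /u modifier the pattern and subject are treated as UTF-8 and
    // \s also covers Unicode whitespace such as the no-break space.
    $result = preg_replace('/^[\s\x00]+|[\s\x00]+$/u', '', $value);

    // preg_replace() returns null on error, for example invalid UTF-8 input.
    // Fall back to the byte-based trim() in that case.
    return $result === null ? trim($value) : $result;
}

echo trim_unicode(" \xC2\xA0 \xC2\xA0 David \xC2\xA0 \xC2\xA0 "), "\n"; // prints: David
```

Whitespace inside the string is left alone, so a no-break space between two words survives.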

[2018-09-29 21:40 UTC] tobias at tromm dot no-ip dot org

Thank you a lot @requinix for the explanation.

Maybe in the future PHP could have a function to remove all non-breaking spaces, if that's possible.

Thank you again.

[2018-09-29 23:54 UTC] a at b dot c dot de

For what it's worth, 

preg_replace('/(^[\t\n\r\000\v\pZ]+)|([\t\n\r\000\v\pZ]+$)/u', '', $apelido);

Removes leading/trailing instances of everything that trim() removes and also everything that Unicode considers a "whitespace" character (assumes UTF-8 encoding).

[2018-10-11 11:41 UTC] tobias at tromm dot no-ip dot org

Unfortunately, some user is still finding a way to insert a database value with spaces at the end of the string.

If someone has any idea, I will be grateful.

I also tried preg_replace('/(^[\t\n\r\000\v\pZ]+)|([\t\n\r\000\v\pZ]+$)/u', '', $apelido);

In my tests everything works, but the user is somehow getting around it, with the following ord sequence at the end:

ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32
ORD: 32


